Primitive Obsession

I just learned this expression... I was complaining about the code using only basic datatypes and my boss agreed that it looked like "primitive obsession". My reaction: What??? there's a term for that???

A lot of the time I recognise a pattern in coding that works, I see issues with other patterns, and I quickly develop rules in my head for good code. Sadly these rules do not necessarily exist in other people's heads. One of the things I struggle with is my general assumption that everyone sees the same things I do and experiences them as I do. They don't. I am really weird and often I need to remind myself of that.

Not speaking the lingo

A lot of the time when I try to explain the rules in my head to someone, they say "that's not a thing". But it is obvious to me... and then you end up disagreeing with someone. Here be monsters. The problem is that someone has just told me I'm wrong. I can see the logic in what I am saying and they don't agree, so my brain assumes they just can't see the logic I see, and I set out to show them. What they hear from me sounds a lot like "No! You're wrong". Suddenly two people are fighting to prove they are right, even if one of them realises they were wrong all along.

Alternatively, when you know the special name for something, e.g. "Primitive Obsession", you just need to flash it like a detective badge or a VIP access-all-areas pass. Other devs suddenly recognise what I am talking about, or otherwise look it up in a reputable source, and walk away impressed.

The frustrating part is that even though I didn't know the special jargon name for it... it was still "a thing". I just invented it, just like the person who invented it first and established the name. The logic was sound, it just needed to be spoken by Martin Fowler and not me.

I guess I am trying to find a way to cultivate the understanding of logic and the ability to express it in a way that other people can understand. Sadly the only person who seemed to really communicate things well to me was the architect in an old company who couldn't afford to pay me average wage.

Today, I learned about primitive obsession. This is where a developer sticks to the primitives of a programming language. Primitives are the basic building blocks of data within the language. For a language like C, these would be char, int, float... hopefully you get the picture. In C#, a string is technically a built-in reference type rather than a true primitive, but for the purposes of primitive obsession it counts as one.

One of the great things about primitives is that everyone who uses the language knows how they work. They don't need to do any reading. A string is a string and anyone who has used C# will know what a string is, unless they were doing something VERY niche in it.

On the other hand, when you only use primitives there are a few things I have spotted that can go wrong.

A String is a String

Let's imagine one of my most irritating issues... accidentally swapping one parameter for another. Imagine you write a method and all its parameters are strings (I was dealing with one of these recently). Now what if you wanted to rearrange the order of the parameters? In the places where the method is called you wouldn't get an error message, because the compiler sees login(string, string), not login(user, password). What if someone accidentally wrote login(password, user)? The system does not complain.
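Here is a rough sketch of how tiny wrapper types turn that mistake into a compile error. The names UserName, Password and Auth.Login are made up for illustration, and the credential check is only a stand-in for real logic. It assumes C# 9 or later for records.

```csharp
using System;

// Hypothetical wrapper types: each holds a single string, but the compiler
// treats them as two different types.
public sealed record UserName(string Value);
public sealed record Password(string Value);

public static class Auth
{
    // Hypothetical login method; the comparison is a placeholder for real logic.
    public static bool Login(UserName user, Password password) =>
        user.Value == "graeme" && password.Value == "hunter2";
}

public static class Program
{
    public static void Main()
    {
        var user = new UserName("graeme");
        var password = new Password("hunter2");

        // Arguments in the right order: this compiles and runs.
        Console.WriteLine(Auth.Login(user, password));

        // Swapped arguments no longer compile, because a Password is not a UserName:
        // Auth.Login(password, user);
    }
}
```

Each type still only wraps a string, so there is nothing new to learn, but the swapped call is now caught before the code ever runs.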

Complex datatypes mashed together

Often a complicated datatype containing all sorts of data ends up mashed together in a string. A perfect example of this is a URI. A URI consists of a scheme (such as https), a domain (such as google.com), sometimes a routing structure or directory structure (/search) and sometimes a query string (?source=hp&q=hello). The query string can also be broken down into multiple parameters, such as source and q, which have the values hp and hello. Now imagine you want to figure out what one of those is. You need to work through massive string manipulation to get at it. Take finding the value of q in the query string. You need to jump to the part of the string after the ?, then find a 'q=' that either immediately follows the question mark or immediately follows an ampersand somewhere after it, and the value ends at either the end of the string or the next ampersand. It's starting to sound complicated because it unnecessarily is. Compare that with this code:

Query.Parse(uri.Query)["q"]

It would be even better if the query string wasn't a bloody primitive either. Then it would just be:

uri.Query["q"]

And this is all on the assumption the string is correctly formatted as a Uri in the first place. That's another matter.
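For what it's worth, .NET already gets most of the way there. The sketch below uses the real System.Uri class and HttpUtility.ParseQueryString in place of my imaginary Query.Parse. It assumes System.Web.HttpUtility is available to your project (it ships with modern .NET; on the old .NET Framework you need a reference to System.Web).

```csharp
using System;
using System.Web; // HttpUtility

public static class Program
{
    public static void Main()
    {
        // The constructor throws UriFormatException if the string is not a valid absolute URI.
        var uri = new Uri("https://google.com/search?source=hp&q=hello");

        Console.WriteLine(uri.Scheme);       // https
        Console.WriteLine(uri.Host);         // google.com
        Console.WriteLine(uri.AbsolutePath); // /search
        Console.WriteLine(uri.Query);        // ?source=hp&q=hello

        // Uri.Query is still a plain string, so it needs one more parsing step.
        var query = HttpUtility.ParseQueryString(uri.Query);
        Console.WriteLine(query["q"]);       // hello

        // TryCreate returns false for bad input instead of throwing.
        bool valid = Uri.TryCreate("not a uri", UriKind.Absolute, out _);
        Console.WriteLine(valid);
    }
}
```

Notice that Uri.Query really is still a string, which is exactly the complaint above.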

Validation

Now imagine that you spend the beginning of every method that receives a password and a username as strings validating them. Is the password hashed correctly? Does it have the right check digit? Is it blank? Does the user name have an email format, with an @ and a domain, like it is supposed to? With a basic data transfer object (or a POCO class) you can do the validation once, when the object is created, and just assume it is valid from then on.
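A minimal sketch of what I mean, assuming the user name is meant to look like an email address. EmailUserName and Greeter are hypothetical names, and the @ check is deliberately crude. It is not a real email validator.

```csharp
using System;

// Hypothetical type: the checks run once, in the constructor.
public sealed class EmailUserName
{
    public string Value { get; }

    public EmailUserName(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
            throw new ArgumentException("User name must not be blank.", nameof(value));

        // Crude check for the sketch: exactly one '@' with text on both sides.
        int at = value.IndexOf('@');
        if (at <= 0 || at != value.LastIndexOf('@') || at == value.Length - 1)
            throw new ArgumentException("User name must look like name@domain.", nameof(value));

        Value = value;
    }
}

public static class Greeter
{
    // No validation here: an EmailUserName can only exist if it passed the checks.
    public static string Welcome(EmailUserName user) => "Welcome, " + user.Value;
}

public static class Program
{
    public static void Main()
    {
        var user = new EmailUserName("graeme@example.com");
        Console.WriteLine(Greeter.Welcome(user));

        try
        {
            var bad = new EmailUserName("not-an-email");
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
        }
    }
}
```

Every method that takes an EmailUserName gets the validation for free, instead of repeating it at the top of each one.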

Passed by reference

When you pass a primitive to a method, in most high-level languages the method gets its own copy of the value. With more complicated objects that might contain multiple pieces of data, most high-level languages pass a reference to the object instead. If that data is updated while the method is running, whether by the method itself or by other code holding the same reference, everyone sees the change: the executing code is always working with the latest version of the data. With primitives, it is working with whatever the value was when it was passed in.
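To make the difference concrete, here is a small sketch in C#. Counter and the two Bump methods are hypothetical names invented for the example.

```csharp
using System;

// Hypothetical reference type holding one piece of mutable data.
public sealed class Counter
{
    public int Count { get; set; }
}

public static class Demo
{
    // An int is copied on the way in, so this only changes the local copy.
    public static void BumpInt(int n) { n++; }

    // A Counter is passed as a reference, so this changes the caller's object.
    public static void BumpCounter(Counter c) { c.Count++; }
}

public static class Program
{
    public static void Main()
    {
        int number = 1;
        Demo.BumpInt(number);
        Console.WriteLine(number);        // 1, the caller's value is untouched

        var counter = new Counter { Count = 1 };
        Demo.BumpCounter(counter);
        Console.WriteLine(counter.Count); // 2, the caller sees the update
    }
}
```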

Conclusion

So there are a number of issues caused by using primitives for everything. Please don't be afraid of a POCO (plain old CLR object) class or a DTO (data transfer object) class.

These classes mostly consist of a collection of primitives, so you're not far from it. They don't have much logic in them, maybe a little validation, but they're pretty simple. They make sure the data you're using doesn't get sent as the wrong argument, help keep things valid, and give you simpler access to a complicated data structure. They also let you pass things by reference, so updates made elsewhere (in other threads, say) are visible to you. Don't do it for the word though... do it because you see the logic.

Graeme