Good Software Design - Part 2: Code Smells
Last month’s article (found here) introduced the notion of code smells. Code smells are names given to those instinctual thoughts you have whenever you look at a chunk of less-than-elegant code. Some are subtle, and some really stink.
Stinky Code
Each code smell represents something that would make understanding, extending, or maintaining the code just a bit more challenging than it needs to be.
How to use smells on your team? You and your teammates know what you don’t like to see in your code. You’ll look at a blob of code because you need to either understand or alter it, and the worst of those code smells will hit you right away.
Refactor away those, then look again. You refactor until you understand the code sufficiently, or (not XOR) adding some bit of new functionality is made easier. Refactoring is an “Incremental Design” practice that results in readable, sufficiently flexible “Emergent” designs.
This makes refactoring the indispensable engineering practice for successful iterative and incremental software development.
Let’s explore some common code smells. The following is Java or C# code (with Java naming idioms). The Connection object is this team’s home-brewed database interface.
Please don’t get your axle wrapped up in those details. You don’t have to understand a blob of code in order to detect code smells. In fact, if the code isn’t excruciatingly obvious, then there’s a smell hiding the intent. Also, you’re not looking for defects. Code smells are definitely not the same thing as functional defects, because behavior and software design are—albeit related—not the same thing.
Just follow your nose: trust your instincts and experience. Perhaps write down what you dislike about this code.
What’s wrong with this code? Nothing? Looks fine?
Or everything?
The detection of code smells is sensitive to team context and the context of the product’s existing functionality. Who is on the team? What experience have they had with procedural programming (it’s a static method, which is Java’s way of providing procedural programming structure in an object-oriented language)? Object-oriented programming? Patterns? Are you planning to fix a defect? Do you need to add a new record type? Is there similar code elsewhere that works with a different type of data-source? How old is this code?
Without knowing these things, you will not be able to guess what the “best” design will be. Once your team is familiar with many of the most common smells, and the refactorings needed to clean them up, you’ll see just how bad this code is because you’ll have seen much better code.
Code smells have names for the same reason design patterns and refactorings have names: So that you can efficiently communicate complex ideas to others on your team.
Here are some code smells I notice, above, and refactorings that could help.
Long Method
The first thing that strikes me is the length of the method. Right away I know it’s doing a lot because I have to take time to study it. A well-factored method is quickly read, digested, “grokked” (nerd-speak for “fully understood”).
Long Method is resolved mostly using the Extract Method refactoring. You take an identifiable, isolatable piece of the stuff between curly-braces and move that into its own well-named method.
Comments
Fowler described Comments as a code smell, but as a pleasant smell we often use to cover up other stinky code.
Again, Extract Method is often appropriate, and the comment becomes a template for the intention-revealing name of the new method. You’ll want to choose a method name that makes the code clearer in the calling code; more so than in the definition of the method.
Whitespace
My own flavor of the Comments code smell. Again, empty lines between paragraphs of code are not bad, per se, but they tell you that the method you’re looking at is doing more than one thing, which means it’s doing too much. (I.e., It lacks cohesiveness.)
Yep, Extract Method again.
Duplicated Code
This is the most common, and most odiferous of code smells. There are the obvious, character-for-character forms (e.g., .equals(x.getString(“RECORD_TYPE”) occurs in three different places). There are also more subtle forms. For example, notice how each if statement follows the same format: If the record-type is a certain value, get some other data from the record and do something with it.
There are numerous refactorings that help with these various forms of duplication. (Actually, most of the syntax in modern object-oriented languages is intended to reduce duplication. This reinforces the notion that software design is for the benefit of humans, not computers. We like to have one home for each idea in our system. “Once and Only Once” (OAOO), “Don’t Repeat Yourself” (DRY), et cetera…)
In the code above, you might imagine refactoring the if() statements into a switch() statement, assuming that your programming language can currently support switching on string values (this was not always so!).
Switches can also be a code smell, though, particularly if there are two or more switches on the same value. We don’t see that here, but you might have seen it in old ANSI C code. Two or more switch() statements switching on the same value is sometimes known as “Poor Man’s Polymorphism.” The solution is barely hiding in that smell name: Refactor to a simple, lightweight object hierarchy using Extract Class, Extract Superclass, and perhaps refactoring to a Strategy or a Template Method pattern.
If that seems like overkill, consider reviewing Kent Beck’s Four Rules of Simple Design: Passing tests (and testability) are of prime importance, and the objects that make up traditional Design Patterns are amazingly testable. With tiny, single-purpose objects, you might feel like you're barely testing anything at all. That’s the right feeling. We call them unit tests, or microtests, after all. Powerful, fluid, composed runtime behaviors are possible via discrete, encapsulated definitions.
With this particular sample of stinky code, you might also use lambda functions (aka closures, anonymous functions…), one per record-type. Or you might choose named methods or classes, depending on whether naming the behaviors would be beneficial elsewhere, or would simply aid in comprehension.
Another important consideration is what functionality you’re planning to add via TDD or BDD, and what behavioral variation is implied by that addition. For example, will you need to add processing for other record types? How often, and how is that configured? Is it per product, or per platform?
The variations are endless and unpredictable, so you will want to defer these design decisions until the user-story (Scrum’s “PBI”) arrives in your queue (aka sprint backlog).
Magic String
This is a cousin of Magic Numbers. Magic Numbers are hard-coded constants that aren’t declared as constants. They just appear in their raw, naked form in the code. For example, (x < 1000). Why 1000? What is that? What if I find a different 1000 elsewhere in the system? Are they related???
You are forced to read through much more of the code in order to determine why there is a 1000 in a particular place.
Extract Constant is a good way to deal with Magic Numbers, but you have to be careful not to let your powerful IDE replace all 1000s unless they all really represent the same thing. (It’s often a checkbox in the refactoring dialog.)
Similarly, the string “RECORD_TYPE” is a Magic String. Here, the dangers are a bit different. After all, this string is fairly self-explanatory, so what’s the concern?
Perhaps your team needs to change the contents of this string, but you miss one, and introduce a defect. Or, somehow your cursor accidentally landed inside that string and you inadvertently add a typo. (Sure, careful review of all changes when committing might catch it…but I’ve actually encountered a case where one developer typed ‘O’ and another one typed ‘0’ and it took 15 minutes for them to figure out why the test wasn’t passing. (No, really: True story!)
Extract Constant both reduces duplication and reduces the opportunity to break something.
Abbreviations, or Bad Names
pi?! x?! Need I say more? With today’s sophisticated IDEs, there’s never any reason to type a longer name more than once, so you’re free to name identifiers clearly. Avoid jargon, acronyms, and abbreviations (unless they’re clearly understood parts of your team’s and customers’ ubiquitous language). If your abstract class names contain the word Abstract, or a concrete class ends in Impl, you’re making things less readable by adding redundant noise. Put simply: avoid programmer-speak. Yes, this even applies to Hungarian Notation in languages with strong typing.
Rename Method, Rename Class, and Rename Variable, all allow you to improve readability and match your team’s evolving ubiquitous language.
Refactor
Follow your nose: When you identify a stinky smell, refactor it away; then take another sniff. Run the tests whenever they should pass, and commit (at least locally) whenever they do pass.
You don’t have to refactor away all code smells all at once. When something (a new user story, or a defect) causes you to examine or alter some code, strive to leave it as good as it was, or better.
You typically start with the really stinky smells first, then clean up the remaining smells that are preventing you from skillfully accomplishing your task. You commit frequently, so you can recover whenever a test fails. A failing test means the refactoring step you just took wasn’t a complete refactoring. Likely, you tried to change too much too soon, or too fast. Frequent commits allow you to rewind a set of refactorings in a controlled way, in case the design goes someplace that isn’t working out.
If you have fast, comprehensive tests and a flexible code repository, you can even experiment! Try out a few alternative design choices and see which one best follows Kent Beck’s four guidelines and Daniel Terhort-North’s five heuristics.
Keep in mind, we’re talking about minutes, or perhaps an hour in the worst case; not hours or days.
Next Up
In Part III we’ll tune in to our favorite developers, Alice and Bob, as they refactor the code above. Then, in Part IV, we’ll look more deeply at the role tests (or “specs”) play in creating good software design.