Jacques Mattheij

Technology, Coding and Business

Fix the bugs, and do not forget to fix the class of bug and the process too

When you’re in ticket-closing mode it is all too easy to find the cause of some issue, fix it, close the ticket and move on.

But that’s a little bit too fast. You’re missing out on three things when you go about it that way!

The first is relatively easy: the bug that you just fixed was not caught in testing, so before you actually fix it you need to write a test that fails, if only to prove that the issue really goes away after you’ve applied your fix. (You’ll probably be just as surprised as I sometimes am when it doesn’t! Easy fixes are rarely easy.)
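
To make that concrete, here is a minimal sketch in C++. Everything in it is made up for illustration: `last_n` is a hypothetical function, and the bug I am assuming is an unsigned wrap-around when the caller asks for more elements than the input holds. The first test is the one you would write before touching the code, so that you can watch it fail and then watch it pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical function under repair: returns the last n elements of v.
// The assumed buggy version computed v.size() - n directly, which wraps
// around for unsigned sizes when n > v.size().
std::vector<int> last_n(const std::vector<int>& v, std::size_t n) {
    const std::size_t count = std::min(n, v.size());  // the fix: clamp n first
    return std::vector<int>(v.end() - static_cast<std::ptrdiff_t>(count), v.end());
}

// Regression test, written before the fix. It reproduces the reported case:
// asking for more elements than exist should return the whole input.
void test_last_n_with_n_larger_than_input() {
    const std::vector<int> input{1, 2, 3};
    assert(last_n(input, 5) == input);
}

// Neighbouring cases, to check that the fix did not break normal behaviour.
void test_last_n_regular_cases() {
    const std::vector<int> input{1, 2, 3};
    assert((last_n(input, 2) == std::vector<int>{2, 3}));
    assert(last_n(input, 0).empty());
}
```

Call both test functions from whatever test runner you already use. The point is only that the failing case lives on in the suite after the ticket is closed.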

The second thing that you could take away from fixing this one bug is that it is possibly representative of a class of bugs that is repeated elsewhere in the codebase. Have a long hard look at the bug and the circumstances which caused it to appear. Do these circumstances repeat elsewhere in the codebase? If so, then it probably pays off quite handsomely to investigate those repetitions. Chances are you’ll be able to squash a bunch of bugs before they’ve even registered as such. Two or more for the price of one :) And in the age of cut-and-paste it is actually quite likely that you will find the same bug copied literally to another spot.

Finally, the hardest thing: the process. Somehow this bug managed to slip through your carefully crafted gauntlet of tests. That means that during the phase where this piece of code was first designed the requirements were not nailed down properly and that in turn suggests that there is another bug, a bug that is human rather than code. Either there is a problem in communicating requirements to the developers, a problem in designing test cases, a problem in nailing down those requirements in the first place or a problem in translating those requirements into code that implements them. You need to find out which of those it is and if you can backtrack further then do so!

Such errors of process are very easy to ignore; after all, bugs can be fixed and then you move on. But if you fix them at the process level you don’t just fix that single bug, or even that class of bugs.

You fix all the bugs that you could get by fixing the process once and (hopefully) for all. That’s a pretty big boost in productivity, and such backtracking to root causes can also help a lot in increasing the quality of the software that you produce.

Over the years I’ve slowly matured in this. I used to be happy simply fixing the bugs and calling it a day. Then at some point, after fixing two similar bugs in a row, I finally clued in to the various ‘classes’ of bugs that seem to crop up over and over again and how you can tackle a whole class at once by changing your style of working. A nice example is memory leaks and how to deal with them (and their closely related cousin the double free) in languages that require you to do your own memory and resource management. One way to tackle this kind of problem is to have a symmetric arrangement with constructors and destructors responsible for the allocation and freeing up of resources, and a very clear hand-off of the ownership of a resource. Such a style goes a long way towards solving that particular problem.
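
A rough sketch of what that style can look like in C++, under my own assumptions: the `Buffer` class, its `live_count` counter and the `demo_hand_off` function are all hypothetical names invented for this example. The constructor allocates, the destructor frees, copying is forbidden, and a move is the only way to hand ownership to someone else, so there is always exactly one owner responsible for the free.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical resource wrapper: allocation in the constructor, release in
// the destructor, so the two always come in pairs.
class Buffer {
public:
    explicit Buffer(std::size_t size) : data_(new char[size]), size_(size) {
        ++live_count;
    }
    ~Buffer() { release(); }

    // No copies: two owners of one pointer is how double frees happen.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Moving is the explicit hand-off: the source gives up the resource.
    Buffer(Buffer&& other) noexcept
        : data_(std::exchange(other.data_, nullptr)),
          size_(std::exchange(other.size_, 0)) {}

    Buffer& operator=(Buffer&& other) noexcept {
        if (this != &other) {
            release();  // free what we currently own before taking over
            data_ = std::exchange(other.data_, nullptr);
            size_ = std::exchange(other.size_, 0);
        }
        return *this;
    }

    std::size_t size() const { return size_; }
    bool owns_memory() const { return data_ != nullptr; }

    // Demo-only bookkeeping: number of allocations not yet freed.
    static int live_count;

private:
    void release() {
        if (data_ != nullptr) {
            delete[] data_;
            data_ = nullptr;
            --live_count;
        }
    }

    char* data_;
    std::size_t size_;
};

int Buffer::live_count = 0;

// Shows the hand-off: after the move only `second` owns the memory, and it
// is freed exactly once when `second` goes out of scope.
void demo_hand_off() {
    {
        Buffer first(16);
        Buffer second(std::move(first));
        assert(!first.owns_memory());
        assert(second.owns_memory());
        assert(Buffer::live_count == 1);
    }
    assert(Buffer::live_count == 0);
}
```

In real code you would more likely reach for `std::unique_ptr` or a standard container than write this by hand. The hand-rolled version is here only because it makes the symmetry and the ownership transfer visible.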

An example of fixing the process was when, at one company that I work for, QA was split out from the development process. It seems a pretty basic thing (it is quite hard to reliably judge the quality of your own work, if not impossible). But this particular company grew slowly from a one-man shop to one with lots of developers and nobody ever stopped to ask if this was the best way to do it. By taking a much more adversarial attitude towards integration and acceptance testing, the number of issues found before they hit production briefly went up to a very scary number, flushed out by the new way of doing things. In due course the frequency of production issues cropping up dropped to previously unseen (and likely unimaginable) levels. In hindsight it all seems pretty obvious.

In another, similar case backtracking to the root cause of a single bug actually meant going all the way back to the specification, which turned out to be insufficient. No debugger or nifty programming trick will fix a broken specification!

Developing software is hard enough without having to deal with problems that sneaked in during the specification phase or with repeated instances of similar issues.

Better fix those too while you’re at it!
