The Enterprise

Fix the cause

Michael Pankov

How do you usually react when you discover a small bug in your code? Do you fix it right away? Or do you stop and try to find out why it was introduced in the first place?

Consider a simple scenario: a developer builds the project and at some point discovers that not all files are rebuilt. There are two options. The first is filing a bug against the build system maintainer (or at least putting it on a post-it note). The second is exploiting the fact that the bug doesn't happen when you clean the project build directory first.

If one goes with the latter option, it's worth putting the `make clean; make` sequence into a script. A script that may accidentally get committed to the repository under the name ''. And someone else may start using this inherently broken solution.

It will take more time to build the program, but that's "lazy time": one doesn't really do anything while waiting for compilation to finish, at least not when the build time is on the scale of several seconds or minutes. So it's very easy to work around the problem. Moreover, the workaround comes with a false incentive to adopt it: it offers a free chance to procrastinate a bit more on the internet.
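To make the scenario concrete, here is a minimal Python sketch of the timestamp comparison that make-style build tools rely on. It is an illustration under assumptions, not the code of any real build system: the function `needs_rebuild` and the files `main.c`, `util.h`, and `main.o` are hypothetical. It shows one common cause of "not all files are rebuilt": a dependency that is missing from the build rules.

```python
import os
import tempfile


def needs_rebuild(target, declared_deps):
    """Return True if target is missing or older than any declared dependency."""
    if not os.path.exists(target):
        # A missing target is always rebuilt; this is why cleaning hides the bug.
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in declared_deps)


def demo():
    """Build a fake project where util.h changed after main.o was produced."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        # Hypothetical files with explicit modification times (seconds since epoch).
        for name, mtime in [("main.c", 1_000_000_000),
                            ("main.o", 1_000_000_100),
                            ("util.h", 1_000_000_200)]:
            path = os.path.join(tmp, name)
            with open(path, "w") as f:
                f.write("")
            os.utime(path, (mtime, mtime))
            paths[name] = path

        # Buggy rule: the header is not declared, so the stale object is kept.
        incomplete = needs_rebuild(paths["main.o"], [paths["main.c"]])
        # Fixed rule: the header is declared, so the change is noticed.
        complete = needs_rebuild(paths["main.o"], [paths["main.c"], paths["util.h"]])
        return incomplete, complete


if __name__ == "__main__":
    print(demo())
```

In this sketch, cleaning first "works" only because a missing target is always rebuilt. Declaring the missing dependency removes the need for the workaround.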

Of course, it's pretty obvious to any engineer that the "right" way to handle this is to fix the bug in the build system. The problem is that we are often forced to act in a "not right" way. Not because of our laziness, but because the business requires a quick solution, not the right one. The amount of "right" is hard to quantify, while the amount of time spent is easy to measure.

That actually got me thinking: what if we could unite the "right" and "quick" ways to fix the problem? It probably means going with one of two options, again. Either make the "quick" way the sole option, or only provide the "right" way.

What I'm talking about is a kind of "Python vs. Haskell" decision. With the first, you have "mostly sane" default behavior, and engineering is pretty straightforward, in the sense that it's designed to make progress easy. With the second, you are forced to solve small tactical problems the moment they're introduced, i.e. make the types correct. It may be hard to make progress at times, but the resulting program is more robust because you're not allowed to take shortcuts. You stop to think about what you're doing.

While I understand that's probably not the best possible analogy, I think the reader can grasp what I mean: one approach emphasizes the importance of good design, and the other strives to achieve the goal of program implementation no matter what. One is trying to make a good program, and the other is trying to solve the problem the program is intended to solve.

The two approaches perceive the implementation in a programming language very differently. For the "quick" one, the implementation itself is only a tool to reach a non-technical goal. A business goal, so to speak. For the "right" one, good program design is recognized as a separate goal, no less important than the business one.

The focus on the business problem sometimes goes as far as discarding the importance of tools altogether. People taking this point of view claim that technologies (programming languages, editors, databases, etc.) are mere appliances. Like a mixer in your kitchen. Except that learning to properly operate your high-tech byte mixer, the programming language, is an orders of magnitude harder and longer process.

We have wandered quite far from the main topic of discussion. But I believe the examples help to explain what I mean. When something breaks, it's an opportunity to reconsider whether you're on the right path at all. Maybe the stuff doesn't work because you're trying to implement some awful workarounds and there's actually a better way.

The occurrence of a bug is also a chance to think about preventing such bugs in the future. You discovered a goto fail? Introduce an automatic check of your code with a static analyzer. Your colleague found a memory leak? Urge him to start not by fixing the immediate problem, but by adding a valgrind run to the commit test suite. It's in the spirit of TDD: first you introduce a failing test, and only then make it pass.
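The test-first order can be sketched in Python with the standard `unittest` module. Everything here is hypothetical and only illustrates the workflow: `average_buggy` stands in for the code where a bug was found, `average_fixed` for the later fix, and the chosen behavior for empty input (returning `0.0`) is an assumption made for the example. The regression test is written once, fails against the buggy version, and passes only after the fix.

```python
import unittest


def average_buggy(values):
    # Hypothetical original implementation: crashes on an empty list.
    return sum(values) / len(values)


def average_fixed(values):
    # Hypothetical fix, written only after the regression test existed.
    if not values:
        return 0.0
    return sum(values) / len(values)


def make_regression_test(average):
    """Bind the same regression test to a given implementation."""
    class AverageRegressionTest(unittest.TestCase):
        def test_empty_input(self):
            # Reproduces the reported bug: empty input must not crash.
            self.assertEqual(average([]), 0.0)

    return AverageRegressionTest


def passes(average):
    """Run the regression test and report whether it succeeded."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(make_regression_test(average))
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("before fix:", passes(average_buggy))
    print("after fix:", passes(average_fixed))
```

Once such a test lives in the commit test suite, the same bug cannot quietly return.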

Behaving this way not only solves current issues, but also prevents many future ones. And the same bugs occur over and over again, more often than any individual developer notices. That's why it pays off to do the right thing when it becomes needed rather than postponing it "until next time". It's easy to become overwhelmed by recurring problems because you didn't implement a proper prevention measure when you first encountered a particular bug.

Don't fix the consequences of an issue. Fix the cause.
