Saturday, December 1, 2012

Write Code to Forget: 15 Relatively Specific Tips


[This is a rewritten and much improved version of a previous, now excised entry.]

Over a career, a professional programmer builds up a large body of code, often hundreds of thousands of lines.

Code has a strangely unpredictable lifespan. Off-the-cuff code can sometimes live and breathe for decades, while other carefully planned and meticulously constructed code may never see sunlight. As long as it lives, or might come back to life in a product some day, you or some hapless programmer who follows you will have to live with it.

Your body of code - call it your corpus - demands time and attention. Every day, week, or month, its maintenance needs pull away a little of the time that you could otherwise spend on exciting new software. As the body gets larger, it demands more and more time. How much more is largely under your control - you can build it upfront to either rest easy, or come to life and harass you constantly. Like zombies.

Take a simple measurement, such as the number of seconds you spend per line of code in your corpus per month. Probably a small fraction of a second, so it might take 50 or 100 lines to add up to one second per month. That's time you don't get back and can't spend on more interesting things. You could actually measure this, but it's only necessary to understand that you could measure it, and the phenomenon is real. Your body of code siphons your available time.
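
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 200,000-line corpus is an assumed figure standing in for "hundreds of thousands of lines", and the function name is made up for illustration; only the 50 and 100 lines-per-second figures come from the paragraph above.

```python
# Back-of-the-envelope estimate. The corpus size is an assumption,
# not a measurement; plug in your own numbers.

def monthly_upkeep_seconds(lines_of_code: int, lines_per_second: float) -> float:
    """Seconds of upkeep per month, if `lines_per_second` lines of code
    together cost one second of attention per month."""
    return lines_of_code / lines_per_second


corpus_lines = 200_000  # assumed: "hundreds of thousands of lines"

for lines_per_second in (50, 100):
    minutes = monthly_upkeep_seconds(corpus_lines, lines_per_second) / 60
    print(f"{lines_per_second} lines per second of upkeep: "
          f"about {minutes:.0f} minutes per month")
```

Under those assumptions the corpus costs somewhere between half an hour and a bit over an hour every month, forever, before you write a single new line.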

Some of that siphoned time is simply wasted. When you must revisit code you've already written, and you're not actually adding features, it's wasted. Examples include:
  • Bug fixes
  • Technical debt payment
  • Code cleanup, reformatting, etc.
  • Re-familiarizing yourself with the code
  • Patches and security improvements
  • Adding unit tests
  • Adding support for integration tests, new test frameworks, etc.
  • Nailing boards to the windows so zombies have something to take apart while you whack at them semi-helplessly with baseball bats, chairs, cashmere sweaters, etc.
Some time spent on the corpus is valuable and not wasted. Examples of useful time include:
  • Strictly adding new features
  • Teaching the code to someone else
  • Automated static analysis
As you can see, I think there are few legitimate reasons to revisit code. My rule of thumb is "if it's creating new value for customers - and not in the form of fixing mistakes - it's productive". Otherwise a revisit is just a distraction you might have prevented; the code base 'coming back to life' to attack your productivity.

The more likely a body of code is to cause wasted time, the higher its 'zombie factor'.

Low Zombie Factor Coding Practices

Assume the next programmer will not be as smart as you are, so code for slightly-dumber-than-you. This has the effect of producing code with a lighter cognitive load, so programmers can interpret and manage larger chunks of code in situ. That makes it easier to maintain and extend over time. When you return to look at the code you wrote after 16 months, you'll be happy you made it easier to understand, too.

Use unit tests. If you can't prove your code works, it does not work. There's really no need to defend this position, so I won't. For goodness' sake, if you're not writing unit tests, start writing them the next time you open up your IDE. Write a primitive test. You don't even need a framework. You'll feel good immediately.
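
A primitive test really can be this small. The sketch below uses Python's plain assert statement and no framework at all; slugify is a hypothetical function invented for the example, not something from a real library.

```python
def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase, words joined by hyphens."""
    return "-".join(title.lower().split())


def test_slugify() -> None:
    # Each assert is one expectation; a failure stops the run with a traceback.
    assert slugify("Write Code to Forget") == "write-code-to-forget"
    assert slugify("  extra   spaces  ") == "extra-spaces"
    assert slugify("") == ""


if __name__ == "__main__":
    test_slugify()
    print("all tests passed")
```

Once a file like this exists, moving it into a real test runner later is mostly a matter of renaming things.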

Consider test-driven development. This technique is not as widely practiced as unit testing, and experienced developers often tell me that they know perfectly well how they need to build a class, interface, etc., so building it piecemeal is a waste of time. That's a compelling argument. Still, many report success and happiness with this.

Give code to the customer as soon as possible. This is a powerful idea, wrapped up in a methodology sometimes called continuous delivery. It minimizes the "shelf life" of code - how long it sits around doing nothing for customers - which customers appreciate and which costs less. It also exposes problems quickly. The longer a particular problem is buried, the more it's going to distract you to fix it. Corpses rot. Or corpuses. You get it.

A corollary to giving out code as soon as possible is to optimize your build and deployment process. If it takes a week to build, test, and distribute your product, it really doesn't matter that it took you 90 seconds to find, fix, create tests for, peer review, verify, and check in a fix. You'll be whacking at that zombie for a very long time.

Anticipate and circumvent threading problems. Avoid threading any time you can. Use patterns such as Immutable Objects to minimize exposure to thread interaction issues. Have an expert evaluate your threading code. And defensively put in synchronization even if you don't plan to multi-thread. Yeah, there's a slight performance penalty. Consider this question carefully: are you worried about that?
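
As one way to picture the Immutable Objects idea, here is a minimal Python sketch using a frozen dataclass. Account and deposit are hypothetical names made up for the example. Because an instance can never change after construction, any number of threads can read it without locks.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Account:
    """Immutable value: fields cannot be reassigned after construction."""
    owner: str
    balance_cents: int


def deposit(account: Account, amount_cents: int) -> Account:
    """Return a new Account; the original is never modified."""
    return replace(account, balance_cents=account.balance_cents + amount_cents)


before = Account(owner="pat", balance_cents=1_000)
after = deposit(before, 250)

try:
    before.balance_cents = 0  # any attempt to mutate is rejected
except FrozenInstanceError:
    print("Account is immutable; 'before' still holds", before.balance_cents)
```

The cost is an extra allocation per change, which is usually the kind of performance penalty the question above is asking you to stop worrying about.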

Branching considered harmful. Use techniques such as the Null Object pattern to minimize the paths and forks your code can take. Each fork is something your unit tests might miss, and each path is potentially unused or rarely used. Rarely used code paths breed bugs.
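
Here is a minimal sketch of the Null Object pattern in Python. Logger, NullLogger, and process are hypothetical names invented for the example. The point is that process never has to ask whether it was given a logger, so there is one code path instead of two.

```python
class Logger:
    """Collects log lines in memory (kept simple for the example)."""

    def __init__(self):
        self.lines = []

    def log(self, message: str) -> None:
        self.lines.append(message)


class NullLogger(Logger):
    """Null Object: same interface as Logger, deliberately does nothing."""

    def log(self, message: str) -> None:
        pass


def process(order_id: int, logger: Logger = NullLogger()) -> int:
    # No "if logger is not None" fork: every caller takes the same path.
    logger.log(f"processing order {order_id}")
    return order_id * 2


recorder = Logger()
process(21, recorder)  # logs one line
process(21)            # logs nothing, same code path
```

The alternative, a None default with a check before every log call, adds a fork at each call site, and each of those forks is one more thing a test can miss.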

Permanently kill off dead code. Bugs love to breed in this stuff. If code isn't doing anything, if it's commented out... aggressively get rid of it. Your source control system will hold onto it on the remote chance you'll want to use or consult it again, which, honestly, you won't.

Manage comments as well as the rest of your code. Delete comments which describe something patently obvious - they unnecessarily add to a reader's cognitive load and have to be maintained. Keep and expand comments which explain references to unavailable source, non-obvious or legacy techniques, names of fundamental patterns used, and comments used by parsers and static analysis. Ruthlessly destroy outdated or incorrect comments, even in code you don't feel you own. They waste serious time and cause bugs.

The compiler is your best friend. Do everything you can to let it help you find bugs. If you're bypassing the compiler, or using loose types when you could be easily helping the compiler help you with stricter types, just stop that. Loading a dynamic type? Anything that references a class as a string name is totally not OK. Just don't do that, unless you also build a static analyzer that verifies it will continue to work forever.
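
If you truly can't avoid string names, the least you can do is check them mechanically. The Python sketch below is a test-time check rather than a real static analyzer, and PLUGIN_CLASSES, resolve, and unresolved are hypothetical names made up for the example. It resolves every dotted name in a registry and reports the ones that no longer point at a class.

```python
import importlib

# Hypothetical registry of classes referenced by string name.
PLUGIN_CLASSES = {
    "ordered": "collections.OrderedDict",
    "decoder": "json.JSONDecoder",
}


def resolve(dotted_name: str) -> type:
    """Import the module part and return the class it names."""
    module_name, _, class_name = dotted_name.rpartition(".")
    module = importlib.import_module(module_name)
    target = getattr(module, class_name)
    if not isinstance(target, type):
        raise TypeError(f"{dotted_name} is not a class")
    return target


def unresolved(registry: dict) -> list:
    """Return the registry keys whose string names no longer resolve."""
    broken = []
    for key, dotted_name in registry.items():
        try:
            resolve(dotted_name)
        except (ImportError, AttributeError, TypeError, ValueError):
            broken.append(key)
    return broken


print("broken entries:", unresolved(PLUGIN_CLASSES))
```

Run something like this in your test suite so a rename breaks the build instead of breaking production.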

Design your software before building it to whatever extent is possible. Model your design with tools that help you prevent rework later, such as threat modeling and failure mode analysis. It's surprisingly quick and easy and your software is better for it. That's a good deal.

Assume failure. Any given operation may fail, and your code should be robust enough to recover, isolate, report, or appropriately ignore as many failure modes as possible. Things go wrong when software runs, so assume that everything which could go wrong will. For example, you won't be able to open that file you just created (thank you virus scanners), or your allocation of a 4-byte heap variable will fail. Plan ahead for these things.
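
The file case might look like the following Python sketch. read_settings and the file names are hypothetical; the only real claim is that open can fail with OSError even for a file you believe exists, so the caller gets an explicit "no data" answer instead of a crash.

```python
import os
import tempfile
from typing import Optional


def read_settings(path: str) -> Optional[str]:
    """Return the file's text, or None if it cannot be opened or read."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError as error:
        # Missing file, permission denied, locked by another process, etc.
        print(f"could not read {path}: {error}")
        return None


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "settings.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("retries=3")

    present = read_settings(path)
    missing = read_settings(os.path.join(folder, "no-such-file.txt"))
```

Whether the right response is a default value, a retry, or a loud report depends on the program; what matters is that the decision was made on purpose.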

When totally unexpected things happen, your software will thrash and die anyway. A failure mode you have to consider is that the software you built is flawed in unknown ways. When unexpected, unrecoverable problems happen, do yourself a huge favor and make it trivial to pinpoint and reproduce errors. Use exceptions, logging, transactions. Label your error messages with GUIDs. Use every technique you've ever heard of, and innovate on this, so when your corpus does come alive, you know exactly where to shoot it so you can re-bury it and move on quickly.
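
Here is one reading of "label your error messages with GUIDs", sketched in Python: each error site gets its own fixed ID, generated once and pasted into the source, so a single search of the code base lands on the exact line that raised it. The ID value, LabeledError, and reserve_stock are all made up for the example.

```python
import logging

logger = logging.getLogger("corpus")

# Hypothetical hard-coded ID: generated once (for instance with
# uuid.uuid4()), pasted here, and never reused for another error site.
ERR_NEGATIVE_QUANTITY = "3f2b8c1e-6d4a-4e0b-9a57-1c2d3e4f5a6b"


class LabeledError(Exception):
    """Exception whose message starts with the ID of the site that raised it."""

    def __init__(self, error_id: str, message: str) -> None:
        super().__init__(f"[{error_id}] {message}")
        self.error_id = error_id


def reserve_stock(quantity: int) -> int:
    if quantity < 0:
        # The same ID goes to the log and into the exception text.
        logger.error("[%s] negative quantity: %d", ERR_NEGATIVE_QUANTITY, quantity)
        raise LabeledError(ERR_NEGATIVE_QUANTITY, f"negative quantity: {quantity}")
    return quantity
```

When a customer pastes that bracketed ID into a support ticket, finding the code takes seconds instead of an afternoon.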

Use machines to help you test and find bugs. You, or a dedicated tester, will not write tests that hammer every possible input into a non-trivial method or program. You just don't have time. So buy yourself some extensive coverage by investing in fuzz-testing technology.
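
Real fuzzing tools are far smarter than this, but the basic idea fits in a few lines of Python: generate lots of random inputs and check a property that must always hold. The run-length encoder is a hypothetical function invented as something to test, and the property is that decoding an encoded string gives back the original.

```python
import random


def encode_run_lengths(text: str) -> list:
    """Hypothetical function under test: 'aab' -> [('a', 2), ('b', 1)]."""
    runs = []
    for char in text:
        if runs and runs[-1][0] == char:
            runs[-1] = (char, runs[-1][1] + 1)
        else:
            runs.append((char, 1))
    return runs


def decode_run_lengths(runs: list) -> str:
    return "".join(char * count for char, count in runs)


def fuzz_round_trip(iterations: int = 1_000, seed: int = 2012) -> int:
    """Throw random strings at the encoder; fail on the first bad round trip."""
    rng = random.Random(seed)  # fixed seed so any failure can be reproduced
    for _ in range(iterations):
        length = rng.randint(0, 40)
        # Small alphabet on purpose, so repeated characters are common.
        text = "".join(rng.choice("ab \n\u00e9") for _ in range(length))
        assert decode_run_lengths(encode_run_lengths(text)) == text, repr(text)
    return iterations


if __name__ == "__main__":
    print("inputs survived:", fuzz_round_trip())
```

A thousand inputs you didn't have to think up is already more than most hand-written test suites cover, and the machine doesn't get bored.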

Fear and avoid zombie-awakening patterns and interfaces. Some kinds of software, such as document parsers, are both difficult to get right and security problems waiting to happen. Design your software not to need them, or if you do need them, write as little code as possible. Take off-the-shelf, well-tested, trusted software components and use those. Of course, you should be doing this anyway. Do it double for anything known to cause problems in general or for which your experience tells you that you, your team, or your organization are more likely than average to get wrong somehow. Then assume it's badly broken and put fences and barbed wire around it, metaphorically speaking. Unless you have fences and barbed wire readily available, then feel free to go literal.

Write software you can forget about, that isn't going to come back from the grave and interrupt the awesomely cool and revolutionary stuff you really should be working on. Because you're good, really good, and wasting your time is criminal.

Patrick
