Saturday, December 1, 2012

Write Code to Forget: 15 Relatively Specific Tips


[This is a rewritten and much improved version of a previous, now excised entry.]

A professional programmer builds up a large body of code, hundreds of thousands of lines.

Code has a strangely unpredictable lifespan. Off-the-cuff code can sometimes live and breathe for decades, while other carefully planned and meticulously constructed code may never see sunlight. As long as it lives, or might come back to life in a product some day, you or some hapless programmer who follows you will have to live with it and deal with it.

Your body of code - call it your corpus - demands time and attention. Every day, week, or month, its maintenance needs pull away a little of the time that you could otherwise spend on exciting new software. As the body gets larger, it demands more and more time. How much more is largely under your control - you can build it upfront to either rest easy, or come to life and harass you constantly. Like zombies.

Take a simple measurement, such as the number of seconds you spend per line of code in your corpus per month. Probably a small fraction of a second, so it might take 50 or 100 lines to add up to one second per month. That's time you don't get back and can't spend on more interesting things. You could actually measure this, but it's only necessary to understand that you could measure it, and the phenomenon is real. Your body of code siphons your available time.
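
To put rough numbers on it, here is a back-of-the-envelope sketch. The 300,000-line corpus and the two rates are illustrative assumptions picked to match the ranges above, not measurements.

```python
# Back-of-the-envelope estimate of monthly upkeep for a corpus.
# All numbers are illustrative assumptions, not measurements.

def monthly_upkeep_minutes(corpus_lines: int, lines_per_upkeep_second: int) -> float:
    """Minutes per month, if every `lines_per_upkeep_second` lines cost one second."""
    return corpus_lines / lines_per_upkeep_second / 60

corpus_lines = 300_000  # hypothetical "hundreds of thousands of lines"

for lines_per_upkeep_second in (100, 50):
    minutes = monthly_upkeep_minutes(corpus_lines, lines_per_upkeep_second)
    print(f"1 second per {lines_per_upkeep_second} lines: {minutes:.0f} minutes per month")
```

Under those assumptions the corpus costs somewhere between 50 and 100 minutes every month, before you write a single new line.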

Some of that siphoned time is simply wasted. When you must revisit code you've already written, and you're not actually adding features, it's wasted. Examples include:
  • Bug fixes
  • Technical debt payment
  • Code cleanup, reformatting, etc.
  • Re-familiarizing yourself with the code
  • Patches and security improvements
  • Adding unit tests
  • Adding support for integration tests, new test frameworks, etc.
  • Nailing boards to the windows so zombies have something to take apart while you whack at them semi-helplessly with baseball bats, chairs, cashmere sweaters, etc.

Some time spent on the corpus is valuable and not wasted. Examples of useful time include:
  • Strictly adding new features
  • Teaching the code to someone else
  • Automated static analysis

As you can see, I think there are few legitimate reasons to revisit code. My rule of thumb is "if it's creating new value for customers - and not in the form of fixing mistakes - it's productive". Otherwise a revisit is just a distraction you might have prevented; the code base 'coming back to life' to attack your productivity.

The more likely a body of code is to cause wasted time, the higher its 'zombie factor'.

Low Zombie Factor Coding Practices

Assume the next programmer will not be as smart as you are, so code for slightly-dumber-than-you. This has the effect of producing code with a lighter cognitive load, so programmers can interpret and manage larger chunks of code in situ. That makes it easier to maintain and extend over time. When you return to look at the code you wrote after 16 months, you'll be happy you made it easier to understand, too.

Use unit tests. If you can't prove your code works, it does not work. There's really no need to defend this position, so I won't. For goodness' sake, if you're not writing unit tests, start writing them the next time you open up your IDE. Write a primitive test. You don't even need a framework. You'll feel good immediately.
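
A primitive test really can be this small. The sketch below uses no framework at all; `word_count` is a hypothetical function standing in for whatever you actually need to prove works.

```python
# A primitive test: no framework, just assertions in a script you can run.
# `word_count` is a hypothetical function standing in for your own code.

def word_count(text: str) -> int:
    """Count whitespace-separated words in a string."""
    return len(text.split())


def test_word_count() -> None:
    assert word_count("") == 0
    assert word_count("zombie") == 1
    assert word_count("  shoot   it in the head ") == 5


if __name__ == "__main__":
    test_word_count()
    print("all tests passed")
```

Run the file directly. A failed assertion stops the script with a traceback pointing at the offending line, which is already more proof than you had before.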

Consider test-driven development. This technique is not as widely practiced as unit testing, and experienced developers often tell me that they know perfectly well how they need to build a class, interface, etc., so building it piecemeal is a waste of time. That's a compelling argument. Still, many report success and happiness with it.

Give code to the customer as soon as possible. This is a powerful idea, wrapped up in a methodology sometimes called continuous delivery. It minimizes the "shelf life" of code - how long it sits around doing nothing for customers - which customers appreciate and which costs less. It also exposes problems quickly. The longer a particular problem stays buried, the more fixing it is going to distract you. Corpses rot. Or corpuses. You get it.

A corollary to giving out code as soon as possible is to optimize your build and deployment process. If it takes a week to build, test, and distribute your product, it really doesn't matter that it took you 90 seconds to find, fix, create tests for, peer review, verify, and check in a fix. You'll be whacking at that zombie for a very long time.

Anticipate and circumvent threading problems. Avoid threading any time you can. Use patterns such as Immutable Objects to minimize exposure to thread interaction issues. Have an expert evaluate your threading code. And defensively put in synchronization even if you don't plan to multi-thread. Yeah, there's a slight performance penalty. Consider this question carefully: are you worried about that?
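
As one illustration of the Immutable Object idea, here is a minimal sketch using a frozen dataclass. The `Account` class and its fields are hypothetical. Note that `frozen=True` only blocks attribute assignment on the object itself; anything mutable stored inside it would still need care.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Account:
    """Hypothetical value object: once built, its fields cannot be reassigned."""
    owner: str
    balance_cents: int

    def deposit(self, amount_cents: int) -> "Account":
        # Return a new Account; the original is never modified, so threads
        # holding a reference to it cannot see a half-updated state.
        return replace(self, balance_cents=self.balance_cents + amount_cents)


before = Account(owner="pat", balance_cents=1_000)
after = before.deposit(250)

try:
    before.balance_cents = 0  # attempted mutation
except FrozenInstanceError:
    print("mutation rejected")
```

Because `deposit` hands back a fresh object, two threads can share `before` without a lock; they only need to coordinate on which reference is the current one.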

Branching considered harmful. Use techniques such as the Null Object pattern to minimize the paths and forks your code can take. Each fork is something your unit tests might miss, and each path is potentially unused or rarely used. Rarely used code paths breed bugs.
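
A minimal sketch of the Null Object pattern, with hypothetical names throughout: instead of passing `None` and checking for it before every call, pass an object with the same interface that deliberately does nothing.

```python
class ListLogger:
    """A real logger: remembers every message it is given."""

    def __init__(self):
        self.messages = []

    def log(self, message):
        self.messages.append(message)


class NullLogger:
    """Null Object: same interface as ListLogger, deliberately does nothing."""

    def log(self, message):
        pass


# Stateless, so one shared instance is safe to use as a default argument.
NULL_LOGGER = NullLogger()


def ship_order(order_id, logger=NULL_LOGGER):
    # No "if logger is not None" forks: one path, with or without logging.
    logger.log(f"shipping order {order_id}")
    tracking = f"TRACK-{order_id:06d}"
    logger.log(f"order {order_id} shipped as {tracking}")
    return tracking
```

Callers that want a record pass a `ListLogger`; everyone else passes nothing, and `ship_order` runs exactly the same statements either way.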

Permanently kill off dead code. Bugs love to breed in this stuff. If code isn't doing anything, if it's commented out... aggressively get rid of it. Your source control system will hold onto it on the remote chance you'll want to use or consult it again, which, honestly, you won't.

Manage comments as well as the rest of your code. Delete comments which describe something patently obvious - they unnecessarily add to a reader's cognitive load and have to be maintained. Keep and expand comments which explain references to unavailable source or non-obvious or legacy techniques, which name the fundamental patterns used, or which are consumed by parsers and static analysis tools. Ruthlessly destroy outdated or incorrect comments, even in code you don't feel you own. They waste serious time and cause bugs.

The compiler is your best friend. Do everything you can to let it help you find bugs. If you're bypassing the compiler, or using loose types when you could be easily helping the compiler help you with stricter types, just stop that. Loading a dynamic type? Anything that references a class as a string name is totally not OK. Just don't do that, unless you also build a static analyzer that verifies it will continue to work forever.
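
Here is a small sketch of the same idea in Python. Python has no ahead-of-time compiler in the sense meant above, so this assumes a static type checker such as mypy plays that role. The exporter classes and the `Format` enum are hypothetical. The point is that classes are referenced as objects and choices as enum members, never as string names looked up at run time.

```python
import json
from enum import Enum


class Format(Enum):
    CSV = "csv"
    JSON = "json"


class CsvExporter:
    def export(self, rows: list) -> str:
        return "\n".join(",".join(str(cell) for cell in row) for row in rows)


class JsonExporter:
    def export(self, rows: list) -> str:
        return json.dumps(rows)


# Classes are referenced directly, not by name. Renaming JsonExporter without
# updating this table is flagged by a type checker and fails at import time,
# instead of failing much later inside a string-based lookup.
EXPORTERS = {
    Format.CSV: CsvExporter,
    Format.JSON: JsonExporter,
}


def make_exporter(fmt: Format):
    return EXPORTERS[fmt]()
```

A typo such as `Format.JSNO` is caught by the tooling before the program runs, which a misspelled `"jsno"` string never would be.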

Design your software before building it to whatever extent is possible. Model your design with tools that help you prevent further re-work in the future, such as threat modeling and failure mode analysis. It's surprisingly quick and easy and your software is better for it. That's a good deal.

Assume failure. Any given operation may fail, and your code should be robust enough to recover from, isolate, report, or appropriately ignore as many failure modes as possible. Things go wrong when software runs, so assume that everything which could go wrong will. For example, you won't be able to open that file you just created (thank you virus scanners), or your allocation of a 4-byte heap variable will fail. Plan ahead for these things.
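
For the file example, one possible shape is a bounded retry that reports clearly when it gives up. This is a sketch under assumptions: the function name, the three attempts, and the delay are all hypothetical, and whether retrying is the right recovery depends on your situation.

```python
import time


def read_with_retry(path: str, attempts: int = 3, delay_seconds: float = 0.1) -> str:
    """Read a text file, retrying a bounded number of times on OSError."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            with open(path, encoding="utf-8") as handle:
                return handle.read()
        except OSError as error:
            # Something else (a virus scanner, say) may be holding the file.
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)
    # Give up loudly, keeping the original error attached as the cause.
    raise RuntimeError(
        f"could not read {path!r} after {attempts} attempts"
    ) from last_error
```

The caller gets either the file contents or a single exception that names the path and the number of attempts, with the underlying `OSError` chained to it.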

When totally unexpected things happen, your software will thrash and die anyway. A failure mode you have to consider is that the software you built is flawed in unknown ways. When unexpected, unrecoverable problems happen, do yourself a huge favor and make it trivial to pinpoint and reproduce errors. Use exceptions, logging, transactions. Label your error messages with GUIDs. Use every technique you've ever heard of, and innovate on this, so when your corpus does come alive, you know exactly where to shoot it so you can re-bury it and move on quickly.
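
One way to read "label your error messages with GUIDs" is to give each failure site its own fixed identifier, generated once and pasted into the source, so that searching the code base for the ID from a bug report lands on the exact line. The sketch below assumes that reading; the GUID value, the `LabeledError` class, and `reserve_stock` are hypothetical.

```python
import logging

logger = logging.getLogger("orders")

# Hypothetical identifier, generated once (for example with uuid.uuid4())
# and then hard-coded. It appears at exactly one failure site.
ERR_INSUFFICIENT_STOCK = "3b1f8a52-6c0d-4e7a-9f21-8d5c4b7e1a90"


class LabeledError(Exception):
    """An exception whose message always starts with its site's GUID."""

    def __init__(self, error_id: str, message: str) -> None:
        super().__init__(f"[{error_id}] {message}")
        self.error_id = error_id


def reserve_stock(on_hand: int, requested: int) -> int:
    """Return the stock left after a reservation, or fail with a labeled error."""
    if requested > on_hand:
        error = LabeledError(
            ERR_INSUFFICIENT_STOCK,
            f"requested {requested}, only {on_hand} on hand",
        )
        logger.error("%s", error)  # the same GUID ends up in the log
        raise error
    return on_hand - requested
```

When a customer pastes the message into a ticket, the bracketed ID is all you need to find both the log entries and the line of code that raised it.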

Use machines to help you test and find bugs. You, or a dedicated tester, will not write tests that hammer every possible input into a non-trivial method or program. You just don't have time. So buy yourself some extensive coverage by investing in fuzz-testing technology.
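
To show the principle only, here is a deliberately naive random fuzzer built from the standard library. Real fuzz-testing tools are far smarter about choosing inputs; `parse_percent` is a hypothetical function under test, and the alphabet and iteration count are arbitrary assumptions.

```python
import random
import string


def parse_percent(text: str) -> int:
    """Hypothetical function under test: parse '0%' .. '100%' into an int."""
    if not text.endswith("%"):
        raise ValueError("missing % sign")
    digits = text[:-1]
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("not a whole number")
    value = int(digits)
    if value > 100:
        raise ValueError("out of range")
    return value


def fuzz(iterations: int = 10_000, seed: int = 0) -> int:
    """Throw random short strings at parse_percent; return how many it accepted."""
    rng = random.Random(seed)  # seeded, so any failure can be reproduced
    alphabet = string.digits + "%-+. x"
    accepted = 0
    for _ in range(iterations):
        length = rng.randint(0, 5)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            value = parse_percent(candidate)
        except ValueError:
            continue  # rejecting bad input cleanly is the expected outcome
        # Any other exception type, or a bad value, surfaces as a failure here.
        assert 0 <= value <= 100, f"bad result {value} for {candidate!r}"
        accepted += 1
    return accepted


if __name__ == "__main__":
    print(f"accepted {fuzz()} of 10000 random inputs without incident")
```

The machine tries thousands of inputs you would never bother to type, and the fixed seed means a failure found today can be replayed tomorrow.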

Fear and avoid zombie-awakening patterns and interfaces. Some software, such as document parsers, is both difficult to get right and a security problem waiting to happen. Design your software not to need them, or if you do need them, write as little code as possible. Take off-the-shelf, well-tested, trusted software components and use those. Of course, you should be doing this anyway. Do it double for anything known to cause problems in general, or anything your experience tells you that you, your team, or your organization are more likely than average to get wrong somehow. Then assume it's badly broken and put fences and barbed wire around it, metaphorically speaking. Unless you have fences and barbed wire readily available, then feel free to go literal.

Write software you can forget about, that isn't going to come back from the grave and interrupt the awesomely cool and revolutionary stuff you really should be working on. Because you're good, really good, and wasting your time is criminal.

Patrick

5 Résumé Details to Never Read Again

Any manager who works for a growing company, especially in the tech sphere, knows that hiring is an ongoing task. You have new needs to fill, people leave for greener pastures, or you just finally fired an incompetent employee. You sigh, imagining the stack of résumés looming in your near future.

You know that it’s illegal to discriminate when hiring—against women, people of a certain color or religion or age, and more. “I’d never do that,” you think. Alas, even the best of us have trouble keeping subtle discrimination at bay, putting any hiring manager at risk of a lawsuit or simply narrowing the talent pool unnecessarily. So here’s how to keep bias out of your hiring decision.

Studies show that nearly everyone has biases, and that we usually don’t know we have them. These biases harm our decisions and may be the root of subtle but real and even pervasive discrimination. They often operate below the surface of conscious thought, so you may not even be able to detect them as they happen.

Problematic biases start with the first things you learn about a candidate, usually in the résumé, and may stop you from going any further. Without ever noticing it, you might reject one résumé but accept another that is otherwise identical and proceed to interviews; the two differ only in a few details that trigger associations and biases best kept away from your decision. You may lose out on great candidates before you even get started.

The best protection against accidentally discriminating, inappropriately or illegally, is never knowing the information in the first place. Sure, at some point you will inevitably know the race and probably the sex and approximate age of a candidate, but the longer you avoid knowing it the better and clearer your decisions will be. With a little preparation, you can avoid it from the point of first contact.

How? Take some information out of every résumé before you look at it. Of course you’ll have to draft someone else to do this, or ask your sources or candidates to do so. The benefit is that you won’t see bias-triggering information before you’ve made a decision on whether to proceed with a candidate, and what’s more, stripping out that irrelevant information will help the important things come into focus.

You or a third party can redact résumés in common electronic formats such as Word and PDF documents without much difficulty. Word documents can be directly edited and saved. For PDFs, in Adobe Reader you can highlight a section of text, right-click on the selection, choose properties, and change the highlight color to black. The data is still technically in the document, but all you need to do is hide it from sight. Place a comment next to the text with any substitution you want to make, as suggested below.

Here are five kinds of information that often appear on résumés that you can safely remove before seriously examining them.

Name

Names are packed with information about sex and ethnicity, social class, age, and more. Studies demonstrate that people form impressions of a person’s likeability, and even of whether they would hire that person, based on just a name. You don’t need to know any of this to make sound decisions, so replace names with neutral random identifiers.

Citizenship

Unless you legitimately require particular citizenship, it’s illegal to discriminate based on nationality. More obviously problematic than names, the information is nonetheless often stated or easily inferred from many résumés. Remove it when you can.

Of course there’s no hiding the fact that IIT Madras is in India and Ecole Polytechnique Fédérale de Lausanne is in Switzerland. Attending school in a country doesn’t mean you’re a citizen, but it’s an easy assumption to make because it’s usually true. It’s valuable to know that a candidate attended a prestigious school, but you might substitute “top 50 school” or “top 500 school” to learn what you need to know to make a sound decision.

Year of graduation

Ageism is an insidious aspect of the workforce and manifests strongly in the recruiting process. While it’s common and usually appropriate for candidates to say how many years of experience they have with the techniques or technologies relevant to the job, it is never necessary to know, when looking at a résumé, exactly when a person graduated. It’s too easy to infer age. While an older candidate may have recently graduated with a germane degree, he or she is more likely to have obtained a degree long ago, and the graduation year is a dead giveaway. Ask for a year of graduation later, when you need to conduct a background check.

Specific dates of employment

Specific dates show gaps. Readers notice gaps in employment history automatically, and that leads to speculation. A year-long or multi-year gap could be a sign of time off to raise children, indicating the candidate’s familial circumstances. It might also indicate unemployment due to any number of other reasons, but how could you know which it is by looking at dates on a page? You’ll find out later if you need to; at the résumé-reading stage, you don’t need to know, and it could mislead you. List the length of time in each position instead.

Religious and political clubs and affiliations

Candidates frequently list outside or personal activities on their résumés, typically at the very end. They may be intended to show good citizenship, or even some relevant experience, such as leadership, by running a club or campaigning for a political party. While possibly interesting, these personal activities are too likely to trigger biases on religious, political, ethnic, or other grounds. Especially relevant experience should be in the main body of the résumé, so it’s probably best to remove this section entirely.

Wrap up: Promoting sound decisions

Removing extraneous information reduces the risk of discriminating against applicants. Without this data, subtle or invisible biases don’t have a chance to harm your decision making ability, and with less text to read and think about, the most salient elements stand out. A short checklist like this one makes the redaction process quick and easy, increasing the clarity of your résumé analysis at minimal cost. 

New book available


In November, Apress published my book, How to Recruit and Hire Great Software Engineers.

http://www.amazon.com/dp/143024917X

I put in much of what I've learned to be effective, and why, and I hope you find it very useful.