Chad Perrin: SOB

3 March 2007

Lines of Code: the New Heresy

Filed under: Geek — apotheon @ 12:10

I have something heretical to say about measuring programmer productivity.

Back in the good ol’ days, corporations started doing something that made intuitive sense: they started measuring programmer productivity in lines of code. If you produced more lines of code, you were a more productive programmer. It seems like a quick, easy, simple, and objective way to solve an otherwise difficult problem.

Unfortunately, we found that it doesn’t work. The programmers realized it first, of course, because they were the ones in the best position to recognize quality of code. More lines of code is not necessarily equivalent to better productive effort. Sometimes, more lines of code just means that your program has more lines of code in it for the same functionality — and, of course, more bugs, more difficulty maintaining and extending it, and so on. When this idea first started to circulate, however, it was considered heretical. That was the Old Heresy.
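A minimal sketch of that point, in Python. The two function names are hypothetical, made up purely for illustration: both do the same job, but one spends roughly three times as many lines doing it.

```python
# Two hypothetical functions with identical behavior but different line counts.

def sum_even_squares_long(numbers):
    """Sum the squares of the even numbers, spelled out step by step."""
    total = 0
    index = 0
    while index < len(numbers):
        value = numbers[index]
        if value % 2 == 0:
            squared = value * value
            total = total + squared
        index = index + 1
    return total


def sum_even_squares_short(numbers):
    """Same result, expressed as a single generator expression."""
    return sum(n * n for n in numbers if n % 2 == 0)


sample = [1, 2, 3, 4, 5, 6]
# Check that both versions agree on the sample input.
print(sum_even_squares_long(sample) == sum_even_squares_short(sample))
```

A raw LOC count would score the first version as more "productive", even though the functionality delivered is the same.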

These days, there are still corporations that measure programmer productivity by lines of code per day (or week or month or year). There are even occasional job postings that, among their applicant requirements, list a minimum LOC rate. Anyone worth his salt as a programmer and looking for a decent job with good work conditions, of course, would take that as a sign to look elsewhere. Regardless of this fact, these corporations are quite set in their ways and unlikely to change their hiring practices any time soon. They’re anachronisms, in the new order. The idea that measuring programmer productivity by lines of code is outdated and illogical is so well and widely accepted that anyone who hasn’t caught up with the times is a dinosaur.

To suggest that LOC is a useful measure of programmer productivity, fully aware of the arguments against it, is the New Heresy.

I guess I’m a heretic. I just realized today that lines of code matter more as a measure of programmer productivity than most people realize or are willing to consider possible — more so than I realized.

Really, lines of code produced are about the only measure that works, all else being equal. That “all else being equal”, however, assumes equal quality in the lines of code produced. In other words, the “lines spent” inversion of the lines of code measure isn’t really all that effective a way of judging your code production. The only reason lines spent is at all a valuable measure of the quality of your code is the assumption that fewer lines of code are roughly equivalent to greater quality of code.

Sadly, that’s not always true. Sometimes, more lines of code are required for greater readability and maintainability. Quality of code is important, but it isn’t productivity. It modifies productivity in the long run, but knowing the level of quality still doesn’t tell you the actual productivity. Producing a particular number of lines of code on a given project doesn’t necessarily mean you’re more productive than someone else who hasn’t produced as many on a similar project, but finishing the project sooner with the same overall quality and having more time to write code for other projects does.

Every time I think about it from another angle, I realize that lines of code is even more relevant and central to measuring productivity than I’d previously thought.

There’s just one hitch: it’s almost impossible to reliably measure lines of code produced against the other factors involved. You won’t know for sure how readable the code is until someone else has to read it. You won’t know how maintainable it is until significant time is spent maintaining it. You won’t know its quality in operation until it has been used — a lot.

Unfortunately, you have to measure the quality of your programmer somehow, and you can’t do it by measuring lines of code, because there’s no way to reliably benchmark it outside of a very lengthy, extensive collection of statistical data. I guess, if you want to determine whether someone is a quality programmer, you have to go by the old saw: it takes one to know one. In other words, maybe Paul Graham was right.

Maybe I’m overestimating the heretical nature of my ideas here, too. I just know that I’ve never heard a “real” programmer in the last fifteen years ever suggest that LOC is the way to judge programmer productivity, even in a hypothetical sense.

All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License