Chad Perrin: SOB

3 March 2007

Lines of Code: the New Heresy

Filed under: Geek — apotheon @ 12:10

I have something heretical to say about measuring programmer productivity.

Back in the good ol’ days, corporations started doing something that made intuitive sense: they started measuring programmer productivity in lines of code. If you produced more lines of code, you were a more productive programmer. It seems like a quick, easy, simple, and objective way to solve an otherwise difficult problem.

Unfortunately, we found that it doesn’t work. The programmers realized it first, of course, because they were the ones in the best position to recognize quality of code. More lines of code is not necessarily equivalent to better productive effort. Sometimes, more lines of code just means that your program has more lines of code in it for the same functionality — and, of course, more bugs, more difficulty maintaining and extending it, and so on. When this idea first started to circulate, however, it was considered heretical. That was the Old Heresy.

These days, there are still corporations that measure programmer productivity by lines of code per day (or week or month or year). There are even occasional job postings that, among their applicant requirements, list a minimum LOC rate. Anyone worth his salt as a programmer and looking for a decent job with good work conditions, of course, would take that as a sign to look elsewhere. Regardless of this fact, these corporations are quite set in their ways and unlikely to change their hiring practices any time soon. They're anachronisms, in the new order. The idea that measuring programmer productivity by lines of code is outdated and illogical is so well and widely accepted that anyone who hasn't caught up with the times is a dinosaur.

To suggest that LOC is a useful measure of programmer productivity, fully aware of the arguments against it, is the New Heresy.

I guess I’m a heretic. I just realized today that lines of code matter more as a measure of programmer productivity than most people realize or are willing to consider possible — more so than I realized.

Really, lines of code produced are about the only measure that works, all else being equal. That “all else being equal”, however, assumes the quality of the lines of code produced. In other words, the “lines spent” inversion of the lines of code measure isn’t really all that effective a manner of judging your code production. The only reason lines spent is at all a valuable measure of the quality of your code is that it is assumed that fewer lines of code is roughly equivalent to greater quality of code.

Sadly, that's not always true. Sometimes, more lines of code are required for greater readability and maintainability. Quality of code is important, but that's not productivity. It modifies productivity in the long run, but given a particular level of quality one still hasn't determined actual productivity. Producing a particular number of lines of code on a given project doesn't necessarily mean you're more productive than someone else who hasn't produced as many on a similar project, but finishing the project sooner with the same overall quality and having more time to write code for other projects does.

Every time I think about it from another angle, I realize that lines of code is even more relevant and central to measuring productivity than I’d previously thought.

There’s just one hitch: it’s almost impossible to reliably measure lines of code produced against the other factors involved. You won’t know for sure how readable the code is until someone else has to read it. You won’t know how maintainable it is until significant time is spent maintaining it. You won’t know its quality in operation until it has been used — a lot.

Unfortunately, you have to measure the quality of your programmer somehow, and you can’t do it by measuring lines of code, because there’s no way to reliably benchmark it outside of a very lengthy, extensive collection of statistical data. I guess, if you want to determine whether someone is a quality programmer, you have to go by the old saw: it takes one to know one. In other words, maybe Paul Graham was right.

Maybe I’m overestimating the heretical nature of my ideas here, too. I just know that I’ve never heard a “real” programmer in the last fifteen years ever suggest that LOC is the way to judge programmer productivity, even in a hypothetical sense.

7 Comments

  1. In a hypothetical situation where a programmer was 100% responsible for maintenance of all code they wrote and for the fallout of changes to their code, then an average lines of new-code written every day would be a useful measure.

    Low quality, high volume programmers would very rapidly fall into a steady state with an average asymptotically approaching zero, because they have no time to write new code, being too busy mopping up after themselves.

    High quality programmers would reach some average depending on their speed, the quality of their management structure, and luck.

    When assumptions change from the top down, it will most significantly hinder the people who worked at the lowest levels, perhaps unfairly penalizing them, but it would always shake down in the long run.

    More abstractly, you could penalize somebody’s LOC figure when they incur maintenance costs or whatever. shrug

    Really, as far as I’m concerned, measuring productivity isn’t especially useful. A good manager will know their team, and will know if they’re going to be understaffed on an upcoming project, and really… what else matters?

    Comment by SLR — 3 March 2007 @ 01:47

  2. Really, as far as I’m concerned, measuring productivity isn’t especially useful.

    That might actually work as a reasonable interpretation and summary of the entire post, now that I think about it. Wow, I’m long-winded.

    really . . . what else matters?

    Hiring the right people — and the right number of people — I suppose.

    Comment by apotheon — 3 March 2007 @ 02:04

  3. If you could hold people responsible for maintaining their code, then the worse developers would still be stuck maintaining COBOL code, or DOS, or whatever was current when they started their careers. And you could just hire based on resume: anyone who got to use the latest technology has proved their productivity.

    Measuring productivity is still important. How else can I improve myself? and which better tools do I choose? how can I get more done? If productivity wasn’t important, we’d still all be coding in COBOL.

    LOC is an extremely bad metric. I'm now working on significantly downsizing a portion of our code base, so it will be much easier to maintain and live with. When downsizing it's always possible to create compact and cryptic code. Fortunately, we don't use LOC metrics, only "easier to live with" as judged by the team.

    But the point is, if more lines of code could be better or worse, and if less lines of code could be better or worse, then it’s evident that this metric doesn’t tell you anything useful about the result.

    Comment by assaf — 3 March 2007 @ 02:04

  4. Measuring productivity is still important.

    In broad strokes, yes. You need to be able to come up with general measures like "Is this programmer A) very productive, B) marginally productive, or C) fairly unproductive or even counterproductive?" as a project manager (or other position where measuring others' productivity is important).

    How else can I improve myself? and which better tools do I choose? how can I get more done? If productivity wasn’t important, we’d still all be coding in COBOL.

    That’s a completely different set of questions than those I think both Sterling and I were attempting to address. Of course we measure our own productivity levels, as we hackish types are (like artists) generally our own worst critics and biggest fans. It’s something that we can almost intuitively feel, however, based on fuzzy logic and heuristic analysis more than an attempt at objective, quantitative standards of the sort that is the holy grail of the performance review business. I suppose I could have provided a much clearer delineation in the original SOB entry but, frankly, the self-assessment aspect of productivity measurement was so far outside my focus in writing it that it didn’t occur to me. Mea culpa.

    LOC is an extremely bad metric. I'm now working on significantly downsizing a portion of our code base, so it will be much easier to maintain and live with.

    If measured in a vacuum, sure. If measured within a meaningful context (as I tried to convey in the original SOB entry), however, it’s pretty much the only objective metric. The key is not to use LOC as a goal, but to honestly strive to do good work, then (completely separately from that) measure your productivity after the fact with the assumption that the number of LOC produced/fixed in the project was the number needed. LOC isn’t usefully measured in terms of “more equals better” in a project, but rather “more equals more productive” in a given span of time spent working (regardless of the project).

    But the point is, if more lines of code could be better or worse, and if less lines of code could be better or worse, then it’s evident that this metric doesn’t tell you anything useful about the result.

    Sure it does — as long as you’re measuring LOC of productivity rather than LOC present at the end of the day. You might produce a grand total of 300 lines of code in one day (to pick an astronomically high number out of thin air), but if they’re mostly rewrites and you have four useful LOC when you’re done, you have a four LOC productivity rate at the end of the day. If on the other hand you remove 300 LOC from a project during refactoring and replace them with 4 LOC, you have a productivity rate for that time period that is measured somewhat differently — I’m not sure at the moment how that would be measured, but some possibilities include:

    $ ruby -e 'puts "#{ -300.abs + 4.abs } LOC"'
    304 LOC

    $ ruby -e 'puts "#{ -300.abs - 4.abs } LOC"'
    296 LOC

    . . . or, simply, either 300 LOC or 4 LOC, depending on what you find most useful. Maybe there's a division or multiplication involved there, because the 300 LOC removed might be worth a number divided by the number added, or because the four added may be worth a multiple of the number removed (that last is unlikely as a useful metric). I suspect some kind of statistical modeling is required that is beyond my make-it-up-on-the-fly ability in the midst of writing this, so I'll leave that as an exercise for the reader — if you're into statistical modeling. Get back to me in a couple years when you get it finished with statistically significant input for the model, adjusted for externalities.
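    The candidate tallies above can be sketched as a tiny Ruby method. To be clear, the method name and mode labels are my own on-the-spot invention, not an established metric — just the same arithmetic as the one-liners, given names:

    ```ruby
    # Candidate ways to score a refactoring that removed 300 LOC
    # and added 4 LOC. Each mode is one of the tallies mused about above.
    def loc_productivity(added, removed, mode)
      case mode
      when :sum          then added + removed  # count removal as work too
      when :difference   then removed - added  # net shrinkage of the code base
      when :added_only   then added
      when :removed_only then removed
      end
    end

    puts "#{loc_productivity(4, 300, :sum)} LOC"         # 304 LOC
    puts "#{loc_productivity(4, 300, :difference)} LOC"  # 296 LOC
    ```

    Which mode (if any) is meaningful is exactly the open question; the sketch just makes the options concrete.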

    Comment by apotheon — 3 March 2007 @ 10:49

  5. You improve yourself by learning more, and by always thinking of better ways to do things. Not by measuring. Measuring at MOST will tell you whether or not you've improved, and even then it will only tell if you have improved your speed. I, for one, program at my fastest when I'm re-solving a problem I've personally solved before — good for productivity, not so good for total ability. I find it unlikely that many programmers are more interested in the speed at which they program than the extent to which what they program meets desires (whether that's rote functionality, or speed of execution, or sweetness of design, etc. depends on the project and the programmer). The person who cares how much you write is a manager, or potential hirer.

    ‘Poth’s said that everything else must be held equal: readability, maintainability, functionality, everything else. After those things are equalized, LOC is a meaningful number for output, with no regard for (and no need to consider) whether a given piece of code is better or worse with more or fewer LOC. I don’t think output counting should be high on anybody’s list of tasks. Productivity is something to consider when a programmer isn’t finishing things in a timely fashion because they’re not working.

    Comment by SLR — 3 March 2007 @ 10:49

  6. Agreed, SLR — every word.

    In fact, you’ve brought up an excellent point about why actually measuring productivity in a quantitative fashion is typically an almost meaningless measure in any way that matters to those of us who are of hackish persuasion:

    I, for one, program at my fastest when I’m re-solving a problem I’ve personally solved before, good for productivity, not so good for total ability.

    A quantitative measure of productivity is almost entirely irrelevant to a qualitative measure of programming mastery. One’s status as a guru, wizard, expert, whatever, is secured by solving new and challenging problems, forging ahead into unknown territory, settling new lands and exploring new vistas (not, of course, necessarily Vistas) — taking the road less traveled by and making it look easy to hack out a path that others may follow to a new dawn, yadda yadda. One’s productivity in a corporate setting is often measured not by on-the-job performance, but by the experience listed on the resume — because having solved more, and more similar, problems in the past ensures that in some respects you’ll just be repeating effort so that you can solve the current problems more quickly in the future.

    That’s where the excellent programmers with weird ideas about what’s a good idea (based on challenge and fun factors) meet the mediocre programmers who have toiled away in a related industry for twenty years using the same language they’ll be using at the new job. Similar past experience leads to faster solutions; real talent leads to faster solutions by choosing shortcuts that don’t devalue the results (like code reuse from open source projects, breakthroughs that provide a simpler way to get things done, better understanding of underlying principles of algorithm design, et cetera). One is accomplished by rote, the other by intelligence, but both accomplish the same goal, probably with similar facility. That holds true right up to the point where either the expert burns out or the mediocrity runs up against something completely new and chokes (or both — but then, the similarities in results don’t end).

    I suppose I could be wrong. I’m just doing something like spilling stream of consciousness through my keyboard.

    Comment by apotheon — 3 March 2007 @ 10:59

  7. […] the number four result in a Google search was to Lines of Code: The New Heresy — which doesn’t even mention Dijkstra, let alone actually quote him — I figured I […]

    Pingback by Chad Perrin: SOB » lines spent — 4 December 2008 @ 10:58


All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License