Chad Perrin: SOB

11 June 2007

why open source code has to be better

Filed under: Cognition,Geek — apotheon @ 03:48

Just looking at the title of this post, before the post is even written, I’m struck by the myriad interpretations that could be applied to it. It can be read in any number of ways, spanning a wide and varied spectrum of meanings. We’ll see, by the time I’m done, how many of those might still apply:

I’ve written elsewhere about why open source code has to be more secure. I didn’t use a title so cryptic and protean in its meaning at the time, of course, because it was written for professional publication — where the kind of word play I used in the title here is generally a no-no. As such, I also constrained the content somewhat to avoid directly addressing many of the potential meanings of the phrase “why open source code has to be more secure”. Think about this, though: peer review means exactly that your code has to be more secure. What code must be as secure as code that could be viewed — and reviewed — by (almost) literally anyone, for a period without known limits? It has to be more secure, because you never know who’s going to see it, and your reputation as a programmer is attached to that code. Open source development is probably the fastest way to build a reputation for yourself as a programmer, but there’s nothing that says that reputation has to be a good one other than your ability to turn out good code.

Peer review keeps us honest.

That’s why open source code has to be better. You have to plan for the inevitable future when, if anyone ever cares at all about the code, other people will read it. You can’t just hide your obfuscated tangle of “we don’t know why it works, don’t change it or you’ll break it” spaghetti behind binary compilation and a big fat copyright notice. It doesn’t have to be perfect, but it sure as hell shouldn’t be embarrassing. When you’re releasing the code to the world for free distribution, you want to release quality. Why do you think neophytes in open source programming write tiny little snippets to fix bugs, while neophytes in closed source enterprise Java shops write massive, tightly coupled modules as their on-the-job training? Nobody outside the closed source chop shop is going to see that tangled mess of Java. That’s why.

. . . but wait, there’s more:

It’s increasingly an accepted truism that when you write software, you write it to be maintained. I’m convinced that’s one of the reasons so-called Agile Programming hit the big time: it’s a microcosm of the true life cycle of a piece of software. In the project manager’s notebook, and in the corporate meeting room, software is something that is built, polished, and put to use. At its least realistic extreme, this view of software leads to programs being considered atomic, finished products, manufactured and boxed up to be sold in units. This means there’s a beginning, middle, and end to a software development project, and that’s it. Voilà: you have a program.

That’s all poppycock, of course. As the aphorism goes, software is never finished — only abandoned. It’s true. It’s more true than even most people parroting that phrase (adapted from a similar one about art) realize. It’s more true of software by unimaginable degrees than it is even of art. Sure, art is always fiddled with until that critical “abandonment” occurs, the artist always seeking to perfect it, but in the end there really is an end. Beyond that point, it becomes useful. With software, abandonment means it has ceased to be useful. Software isn’t just fiddled and tweaked in an asymptotic approach to perfection, the way an oil painting might be — oh, no, it’s constantly developed, it continually evolves, and it changes. One single software life cycle might be half a dozen different applications entirely before it dies of loneliness after it is abandoned. The only thing that “finishes” software is stagnation. Software engineering is the manifestation of an “intelligent design” theory of evolution.

How does this make the code in an open source project better? Simply put, in the open source world people are far more aware of the fact that, when they write software, it will be read and rewritten by others. Software is written not simply to be compiled and shipped, but to be read. It must have understandable structure, pleasing form, and clarity. Clear code is good code. It’s not just a matter of knowing your audience — the difference between closed source software development and open source software is, more often than not, bound to the simple fact that open source developers are aware there’s an audience for code at all. Think about that — completely aside from why you want to write better code, if you’re developing open source software, you must be aware that other developers are your audience, not just users. Users are secondary. As long as the software does what you want it to do, (other) end users are largely irrelevant. It’s the source code that has an audience, and as such it must be valuable in its own right, completely aside from the sort of programmatic functionality it calls into the world as an “end” result.

Open source code has to be better because it’s not just written for the (perceived) quality of the functionality — it is also, and perhaps even primarily, written for the quality of the code itself. Perhaps surprisingly to those who apply Waterfall development methodologies, Microsoftian pseudo-Hungarian notation, and typical “enterprisey” concepts of object orientation*, the quality (meaning, mostly, “clarity”) of your source code bears a roughly direct relationship to the quality of the functional application itself.

You can blame the generally high quality of open source code on the fact that open source code is, generally, written specifically for someone else to be able to read and reason through.

. . . but wait, there’s still more:

Innovation is someone coming up with a great idea and making it happen. I won’t quibble with you over the use of the term “great”; it’s just a placeholder for that actual je ne sais quoi that fits the definition of “innovation”. Call it what you will, innovation is something that occurs to someone — to one individual — and gets turned into something tangible (or at least persistent). Innovation can happen anywhere. In software, it can happen in open source software development and in closed source software development with equal ease. Whether or not it ever sees the light of day tends to vary from one development model to the next, of course. Innovation survives to public release far more often in the average Agile Programming shop than the average Waterfall shop. It happens more often in open source development than in closed source development, too — and for much the same reasons (which I leave as an exercise for the reader, so that the reader’s brain doesn’t get too fat from sitting on the couch absorbing text all the time).

Implementation is part of innovation. The idea itself is without value to anyone but the guy who came up with the idea, and it doesn’t become innovative until it’s implemented. Innovators don’t have to be good programmers. They can turn out severe crap every day of their lives. As long as it implements the idea, and others take note of the value of the idea as implemented, it’s successful innovation. Implementation can even take the form of manipulating others into doing the scut work for you. Write a book about the perfect mousetrap, and let someone else build it and get all the retail sales profits of the thing — it’s you that gets the credit for the idea, and it’s your book people buy to try to learn how to be smart like you. You’ve innovated, but it wasn’t real until someone did something with it. Until then, you were just a blowhard. There was no proof you had any clue what you were talking about. As Eric Raymond or Linus Torvalds (depending on which version of the story you like) might say, “Show me the code.”

Ahh, right, the code. The truth is that they don’t have to see the code for it to be innovative (though it can help). What matters is that it works. Software isn’t a product, though; it’s a process. You don’t just create software and declare it finished, as I’ve already mentioned. It grows, guided by many hands (and many, many more if it’s open source). The more hands there are that have a hand in it (so to speak), the better it gets, as long as it’s allowed to. Think about that for a moment. In a closed-source shop, software is changed according to the corporate plan. In open source software, one group of developers may guide the software toward a particular end, but if that doesn’t make it better it either withers away and dies from lack of interest (as it might when released to resounding silence in the closed source world) or, as is often the case, gets forked and done right. Ooh, look: it got better despite the “best” efforts of the people “in charge”!

Truly innovative code, no matter how poorly written (as long as it actually works), keeps its innovations for as long as they’re relevant and valuable. The code may get refactored, but the innovation at the core is still there — even if differently expressed. It gets cleaned up and clarified. It gets better. It gets better just because changes that make it worse don’t survive in the long run.

Oh, sure, there are exceptions. I’ve been generalizing so much throughout this thing, though, that I hardly think this generalization goes far amiss:

Open source code has to be better because if it isn’t, it dies.

Survival of the fittest. See that? Darwinian evolution does work within the framework of intelligent design! You just have to give up on the idea of conscious, manipulative control to make it fit.

. . . kinda like the way open source code gets better.

(* “If our object model properly implements protection, our crappy developers cannot hurt each others’ code, and the app as a whole will probably work no matter how crappy the developers are!”)

All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License