Chad Perrin: SOB

11 June 2007

The DRY Principle and Documentation

Filed under: Cognition,Geek — apotheon @ 04:33

Someone recently wrote a rather narrow, uninsightful complaint about things that suck in open source development. The problems with that lengthy complaint, in short:

  1. Software packaging appears broken sometimes simply because a given piece of software hasn’t gained enough popularity for anyone to bother. Keep in mind that with open source software you get to see (and use) software before it’s completely ready for prime time, pretty much by definition. Once it hits the “big time” (or even the “fairly small, but at least noticeable, time”), software in the open source world gets incorporated into the best software packaging systems known to man.
  2. Documentation, for all its problems in the open source world, is actually better on the whole than in the closed source world, from what I’ve seen. By far, the best documentation for any OS I’ve come across is FreeBSD’s. OpenBSD’s has a stunning reputation for completeness and usefulness as well. Debian’s manpage coverage is so extensive it boggles the mind. The books you can get for Linux in general are many, varied, and extensive. Contrast this with the well-known problems of MS Windows documentation, for instance. MacOS X has great documentation — for a closed source, proprietary OS. That’s not saying much, though, when contrasted with the completeness and extensiveness of FreeBSD’s documentation.
  3. SourceForge isn’t exactly a place for comparing the state of documentation and packaging of open source software with that of closed source software. That’s like going around to all the proprietary software vendors and checking to see what they’ve got in their project queues, measuring the quality of documentation and packaging for software that isn’t even guaranteed to get continued development funding (let alone to be release-worthy).

That’s not to say that there aren’t problems with packaging and documenting software in general. If “taw” referred to software in general, rather than specifically singling out open source software as the “bad apple” (and thus ignoring the fact that closed source software seems, on the whole, to fare worse), I’d have simply nodded my head in agreement. Something needs to be done about packaging (for distribution and installation) and documenting software as it’s developed.

I’ve already talked about ensuring you achieve good results with software deployment by ceasing to consider deployment procedures as separate from development. Since packaging/installation, aka “deployment procedure”, actually involves thinking about the structure of the software and writing code, it’s an easy thing to incorporate deployment development in your application development process. The two are a natural fit, which might help people to realize that they are, in fact, one — not two — after all. The deployment procedure for a piece of software is part of its interface. Learn that simple truth, and you should be able to put it all together. Voila, you’re set.

Documentation doesn’t seem to be as easy — not by a long shot. Documentation, as it is practiced, is a long, drawn-out process of duplicating significant parts of the source code of your software in English. Add to that the fact that you must also essentially create an after-action application use flowchart using the English language (rather than one of those old plastic flowchart stencils), and it starts looking like a severe pain in the butt that will never get finished properly.

The DRY principle applies to documentation as much as to actual software development. The principle, at its core, is simply a statement of the fact that when you repeat yourself, you introduce inconsistency. Bugs are unavoidable with duplication. You don’t duplicate code because when one part changes, the other part gets out of sync. One could even consider pretty much every single advance in programming practice in the last forty or fifty years to be an attempt at solving the problem of duplication. That’s the whole point of programming in the first place — automation, which saves you from duplicating effort.
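To make the "out of sync" problem concrete, here is a minimal sketch in Python. The retry limit, the names MAX_RETRIES, help_text, and should_retry are all hypothetical, invented only for illustration. The point is that the user-facing sentence is built from the same value the code uses, so the two cannot disagree after a change.

```python
# Hypothetical example: the retry limit is stated once, and both the
# behaviour and the user-facing description are derived from it.
MAX_RETRIES = 3


def help_text() -> str:
    """Return the sentence shown to users, built from the real limit."""
    return f"A failed request is retried up to {MAX_RETRIES} times."


def should_retry(attempt: int) -> bool:
    """Return True while the zero-based attempt count is under the limit."""
    return attempt < MAX_RETRIES


if __name__ == "__main__":
    print(help_text())
```

Had the help text been a hand-written string saying "3 times", changing MAX_RETRIES would silently make the documentation wrong. This only covers the trivial case of a single value, of course; it says nothing about how to derive a tutorial from code.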

It’s no wonder that documentation development, as essentially a plain-English duplication of software development, gets neglected, or ends up out of date and sometimes worse than useless. The only real solution to the problem, it seems, is to try to figure out how to eliminate duplication without making either programming or documentation suck. There have been some abortive attempts to achieve this, or something akin to it (see the invention of the eminent Dr. Knuth known as “Literate Programming”, COBOL’s syntax, and RDoc), but they haven’t tended to be really successful as a means of producing end-user documentation, whether the failure in each case was one of popularity or of technical effectiveness.

I’m not sure how to solve the problem, personally. It needs to be fairly universal in its applicability (it has to work just as well in Visual Studio, Emacs, and ed — vi too, of course, but that goes without saying since vi is the One True Editor), and it needs to truly eliminate the duplication problem. Finally, it needs to provide good end-user documentation. It’s a real problem.

Does anyone out there have any ideas for solutions?


  1. Not to mention that ‘English’ is not enough. You have to manage translations as well.

    Comment by Norbert Klamann — 12 June 2007 @ 12:36

  2. Some time ago I had that vague idea about using the tests as part of the documentation. Examples are one of the most effective forms of documentation – and the test cases should fit as examples quite well.

    Comment by Zbigniew Lukasiak — 12 June 2007 @ 03:37

  3. Umm… docstrings? The Python standard library documentation is almost always complete.

    Comment by cw — 12 June 2007 @ 05:48

  4. Unless, by “documentation,” you mean things like tutorials, which aren’t even particularly redundant. Most programmers couldn’t write a decent tutorial to save their lives.

    Comment by cw — 12 June 2007 @ 05:51

  5. Being able to write technical documentation well and being able to write software well are distinct skills. You can improve the hell out of one of them without improving the other at all.

    The fundamental problem here is that there are lots of people in the world writing useful software and relatively few writing useful documentation for that software. How best to motivate people to practice the skill of documentation?

    I wonder, do we mostly agree on what a good piece of documentation even is?

    I issue the following challenge: Write a one or two page program that is complex enough to benefit from some documentation, but simple enough for most programmers to understand after 15 minutes of study, then write some documentation for it.

    One possible goal of the documentation could be to shorten the length of time it takes a programmer of average skill to understand how your code works from 15 minutes to 10 or 5.

    Comment by Chris Marshall — 12 June 2007 @ 11:36

  6. Norbert Klamann: Good point. Somehow, I hadn’t even thought of translations while I was writing this. I had a hint when I saw your name, but looking at your website confirmed that you have more personal reason than I would to think of translations, too. The fact I don’t speak languages other than English (at least, non-programming languages) well enough to write effective documentation in other languages isn’t much of an excuse, though.

    Zbigniew Lukasiak: That’s a good place to start with thinking about how to make documenting your software a more natural process, I think — for the technical documentation intended for the eyes of other programmers, at least. The real problem for most ideas seems to be with figuring out how to implement the idea, though.

    cw: I meant the entire deal, from one end (technical, developer-oriented documentation) to the other (manpages, “online” help pages, and tutorials). I think the documentation that gets the least attention is probably tutorials, but at least with relatively simple projects and clearly written code the documentation that is probably the most important is something along the lines of online help (where “online” means “on the computer”, not “on the Web”).

    Chris Marshall: Even poorly written, if at least largely complete, documentation would be an improvement over the average documentation (an average which suffers from a lot of “nothing got written” cases). Especially in the case of open source software, poorly written documentation is far better than nothing because it then becomes much easier for someone else to come along and improve on the documentation.

    You say “The fundamental problem here is that there are lots of people in the world writing useful software and relatively few writing useful documentation for that software. How best to motivate people to practice the skill of documentation?” and I agree. That was, in fact, basically the core point of what I posted in the above SOB entry, and that’s the question I want to answer. Unfortunately, I think you’re underestimating the problem because, like most developers, when I said “documentation” you thought of things like API documentation — and what I mean to suggest we need more of is documentation all the way to the end user.

    Comment by apotheon — 12 June 2007 @ 01:39

  7. Apotheon:

    You say: “and what I mean to suggest we need more of is documentation all the way to the end user”

    Well, both API docs and end user docs fill a definite need. The more of both get written, the better things will be.

    It’s not clear to me that end user documentation is the biggest lack.

    Comment by Chris Marshall — 12 June 2007 @ 06:56

  8. Chris Marshall:

    You say: “It’s not clear to me that end user documentation is the biggest lack.”

    It’s not clear to me it’s the biggest lack, either — but, in combination with API documentation and everything between, it’s still part of documentation, and all of documentation seems to be lacking (to varying degrees) in development practice. I didn’t mean to suggest it’s more important to fix manpage documentation than to fix API documentation. I only meant to suggest that all documentation is important and should be considered, and somehow unified more with the development process so that there’s less repeated effort.

    Comment by apotheon — 12 June 2007 @ 09:52

  9. For software libraries/components, taking a look through the unit tests usually sheds light on the most stable use cases.

    Comment by Chui Tey — 13 June 2007 @ 02:54
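Several of the comments above point at docstrings and at tests doubling as documentation. A minimal sketch of how the two can be combined in Python, using the standard library's doctest module: the examples written in a docstring are executed as tests, so the documented behaviour and the real behaviour cannot quietly diverge. The slugify function and the check_examples helper are hypothetical names made up for this illustration, not part of any project discussed here.

```python
import doctest


def slugify(title: str) -> str:
    """Turn a post title into a URL-friendly slug.

    The examples below are documentation and tests at the same time.

    >>> slugify("The DRY Principle and Documentation")
    'the-dry-principle-and-documentation'
    >>> slugify("  Hello,   World!  ")
    'hello-world'
    """
    # Replace every non-alphanumeric character with a space, then split
    # on whitespace so that runs of punctuation collapse to one hyphen.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return "-".join(cleaned.split())


def check_examples(func) -> doctest.TestResults:
    """Run the docstring examples of func and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Supply the function under its own name so the examples can call it.
    globs = {func.__name__: func}
    for test in finder.find(func, name=func.__name__, globs=globs):
        runner.run(test)
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    print(check_examples(slugify))
```

This addresses only the developer-facing end of the problem. If an example in the docstring goes stale, the check fails, but nothing here produces a tutorial or a manpage for an end user.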


All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License