Chad Perrin: SOB

6 September 2006

the One Important Factor of programming languages

Filed under: Cognition,Geek,Popular — apotheon @ 02:03

Perl is a favorite flogging target, for me and for others, for detractors and fans alike. So is Java. Both tend to become favorite flogging targets primarily because of two factors, I believe:

  1. each has a strong, widespread, zealous body of detractors who find it unutterably bad

  2. each has a strong, tightly-knit, zealous body of fans who find it unutterably divine

As such, both end up being example languages for all sorts of points (half-baked or otherwise). Favorite characteristics for maligning them are, in Java’s case, the immense scaffolding requirements of the language and, in Perl’s case, a harsh-looking syntax. Java’s noun-oriented philosophy is its excuse for the scaffolding. The design of Perl is often justified by explanations such as Paul Graham’s comment that “Real ugliness is not harsh-looking syntax, but having to build programs out of the wrong concepts.”

One of Java’s design philosophy principles is protection for the programmer from himself. It is considered a Good Thing that Java incorporates features intended to make it difficult to shoot yourself in the foot. One of Perl’s design philosophy principles, meanwhile, is that of enabling the programmer (TIMTOWTDI and “makes easy things easy, and difficult things possible”). In the process of trying to make dangerous things difficult, Java also makes things that should be easy rather more difficult. In the process of trying to make difficult things possible, Perl (the “Swiss Army Chainsaw”) also makes it easy to saw your own leg off. There’s a distinct trade-off going on, here. On the same subject, C and C++ are compared thusly, by Bjarne Stroustrup:

C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do, it blows your whole leg off.

For a better idea of how that’s relevant, feel free to substitute “Perl” for “C” and “Java” for “C++”.

Flon’s Law provides an interesting thought on this subject, as well — one quite worthy of consideration:

There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code.

So, we must ask: What is the important guiding factor toward which we should aim our efforts when designing, or simply choosing for a given project, a programming language?

The answer, I believe, is enabling the programmer to write good code. We cannot prevent him writing bad code. We cannot force him to write good code. We cannot get good code by simply making programming easy. What we must do is make writing good code* as rewarding as possible, and to hell with all other concerns.

The definition of “good code” will change from project to project, to be sure, but within general problem domains there will be definite common characteristics that will arise again and again in the language that best rewards writing good code. The definition of “writing” never changes, however, for this metric.

a note about frameworks and other peripheral tools: If you want to get something done Today, the frameworks and other peripheral tools associated with a language are likewise valuable in direct proportion to their facility for enabling your programmers to write good code, but such value cannot be usefully measured in a vacuum. It must be measured as part of the overall results of evaluating the language. In the here-and-now, your familiarity with a language, enjoyment of working with it, and ability to expand your knowledge of it quickly are additional factors to keep in mind.

If, on the other hand, you instead wish to invest in your future, you must consider not what frameworks, IDEs, libraries, and so on exist for a given language, but what among such things will likely exist when you need them at crunch-time, at some point in the future for which you are planning. This includes measuring your ability to generate tools for that language yourself.

* Writing good code should probably be defined in the following ways:
  1. writing: actually producing, measured not in lines or in syntactic elements or in time spent, but instead measured in useful progress toward the project’s goal — generally, actual functionality
  2. good: maintainable, readable, extensible, and above all well-behaved
  3. code: this includes source code, documentation, and delivery mechanisms, but not anything that ends up getting thrown away
As hinted above, that which does not qualify as “writing good code” should not be measured as a positive result, nor should it be measured as a negative result. It is irrelevant entirely. Only the “writing good code” metric itself is of value. This is, incidentally, also the central goal of hiring a programmer, so when performing employment interviews for programmers you should definitely keep this in mind.

16 August 2006

ITLOG Import: Elegance

Filed under: Cognition,Geek,Popular — apotheon @ 05:03

The following is imported from a now effectively defunct weblog of mine called ITLOG, and was a featured “Soapbox” item at TechRepublic. It is reproduced here with some modifications.

elegant (adj.): characterized by a lack of the gratuitous
C.A.R. Hoare: “There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.”

There’s a long tradition of referring to the elegance of a system. In the IT industry, this tends most commonly to be applied to source code, and it is generally accepted that the more elegant it is, the better. Elegance is differentiated from other superficially good things in a number of ways, including the common assumption that elegance goes deeper, and applies more universally, while these other “good” things are only good within certain constraints.

For instance, “clever” source code is good for its cleverness, but can be bad for maintainability — mostly because clever code is often difficult to understand. Cleverness also falls short because of a simple principle famously articulated in an email signature of Brian Kernighan’s: “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”

Another example is object oriented programming. Probably ninety-some percent of the competent programmers out there are thoroughly sold on the concept that OOP is the holy grail of programming techniques, and any further advances in programming techniques are just fine-tuning OOP techniques. I think this common perception is an outgrowth of twenty years of corporate influence on the evolution of programming, where large numbers of mediocre programmers end up handling the same codebase over the course of its lifespan. Two orthogonal systems of minimizing the damage a mediocre programmer can do to a project have been introduced to programming practice with a great deal of success: version control and object oriented programming.

Object oriented programming isn’t the holy grail, though. It doesn’t in any way aid with the creation of truly excellent code, in and of itself. It simply aids in the avoidance of truly atrocious code, and even then only in an aggregate view of a complex project. When you start drilling down to the individual bits and pieces of a complex software project that has passed through the hands of a great many mediocre programmers, you’ll start seeing atrocious bits of code that limp along just well enough to keep working, as long as they’re strictly encapsulated and separated from the rest of the codebase (except for its API, of course). Encapsulation and modularity are good things in general, but they aren’t immutable axioms of goodness.

One of the trade-offs with object oriented programming is that it encourages repetitive action and tedious effort in writing code. Have a look at some “enterprise” class Java source code some time and start paying attention to how much of it is actual program logic, as contrasted with how much of it is scaffolding imposed on the source by Java’s object-orientedness. In fact, if you really want to understand what’s going on with object oriented programming and other superficially “good” things in programming, I recommend you start comparing how easily one can produce short, elegant code in various languages, and pay attention to why one language produces a shorter, more elegant solution than another. I think you’ll find some surprising facts come to light.

Of course, it’s true that brevity is not strictly synonymous with elegance. In fact, Perl golf — the practice of passing code around between programmers to see how short a given algorithm can be made — is a thoroughly gratuitous sport, concerned little, if at all, with elegance. In pursuing elegance, it is more important to be concise than merely brief. In a general sense, however, brevity of code does account for a decent quick and dirty measure of the potential elegance that can be eked out of a programming language, with length measured in number of distinct syntactic elements rather than the number of bytes of code: don’t confuse the number of keystrokes in a variable assignment with the syntactic elements required to accomplish a variable assignment. Armed with that definition of the term “shorter”, you should be able to make some meaningful comparisons of the elegance possible when working with various programming languages.
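As a rough illustration of that measure, here is a minimal Python sketch that counts tokens instead of characters. It assumes that "distinct syntactic elements" can be approximated by lexer tokens, which is my simplification and not a definition taken from the text. The helper name `syntactic_length` and the two sample statements are made up for the example.

```python
import io
import tokenize

# Token types that are layout or commentary, not syntax, for this rough measure.
_IGNORED = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def syntactic_length(source: str) -> int:
    """Count the tokens in Python source, ignoring layout and comments."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(1 for tok in tokens if tok.type not in _IGNORED)


# Two assignments that differ only in how long the names are (hypothetical samples).
short_names = "n = a + b\n"
long_names = "number_of_items = first_batch_size + second_batch_size\n"

# Keystrokes differ a great deal; the token count is 5 for both.
print(len(short_names), syntactic_length(short_names))
print(len(long_names), syntactic_length(long_names))
```

By this count the two assignments are the same length, even though one takes several times as many keystrokes. A token count is still a crude proxy, so treat it as a quick and dirty comparison tool and nothing more.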

In particular, you might notice that without using any object oriented techniques, Common Lisp and Perl produce much shorter examples of certain algorithms than Java and C++. Even if you cut out all the object oriented scaffolding in the Java and C++ examples, you still typically end up with a lot more code, as measured in discrete syntactic elements. Things like lexical variables and anonymous blocks (or, roughly equivalently, lambdas) tend to make for much simpler, more elegant solutions than imposing rigorous OOP structure. In fact, the more you examine the matter and make such comparisons, the more I suspect you’ll come to realize that OOP itself has nothing to do with producing elegance, and everything to do with limiting opportunity for mediocre programmers to produce cruft and introduce bugs.
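To make the point about lexical variables and anonymous blocks concrete, here is a small Python sketch of the same piece of state kept two ways: once in a class, once in a closure over a lexical variable. The names `Counter` and `make_counter` are invented for the illustration. Python demands far less scaffolding than the Java the comparison has in mind, so the gap is smaller here than it would be there.

```python
class Counter:
    """Class-based version: state lives in an instance attribute."""

    def __init__(self, start=0):
        self._count = start

    def increment(self):
        self._count += 1
        return self._count


def make_counter(start=0):
    """Closure-based version: state lives in a lexical variable."""
    count = start

    def increment():
        nonlocal count  # rebind the enclosing function's variable
        count += 1
        return count

    return increment


# Both versions behave the same way from the caller's side.
counter_object = Counter()
next_value = make_counter()
print(counter_object.increment(), counter_object.increment())
print(next_value(), next_value())
```

The closure hides its state as thoroughly as the class does, without declaring a type, a constructor, or a method table to do it.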

Elegance is about the gratuitous or, rather, about avoiding the gratuitous. It’s true that sometimes people disagree about which of two or more things is the “most elegant”, but this arises from underlying assumptions rather than any true subjectivity of the principle. Each of us has a set of operating assumptions, some greater (meaning: bloated and cumbersome) than others. Where something conforms to one’s expectations and assumptions, it is seen as not lacking in elegance in that respect. Someone who does not share those assumptions might see the same thing as atrociously inelegant but, holding a different set of assumptions, would overlook similarly subjective quirks in another example that are, to the first person, inelegant.

Specifically, someone with assumptions derived from long indoctrination by the OOP crowd might overlook all the scaffolding imposed by a language like Java for using OOP techniques, and see something that takes up 50 lines of program logic and 150 lines of OOP scaffolding as elegant. Meanwhile, a long-time Perl hacker might take one look at that and see it as the inelegant monstrosity it is. This Perl hacker, on the other hand, might write 30 lines of procedural code to perform the same task, and the Java programmer might look at it and wonder why it isn’t more modular, simplifying the program logic itself and making the whole thing more scalable for future code maintenance, thus rightly seeing the inelegance of the procedural hack the Perl programmer threw together.

This doesn’t make elegance subjective: it only makes our individual perspectives on it subjective. If we can discard the assumptions of both the Java developer and the Perl hacker, and recognize the underlying principles of source code design that contributed elegance to each solution, we could probably turn the same set of solutions into something much, much simpler and more elegant, in terms of its program logic and cruft-weight. Unfortunately, languages like Java are not really suited to that sort of optimization for elegance: you really need a language more dynamic than that, such as Perl, Python, Ruby, or basically any Lisp. The more a language lets you define the language you’re using on the fly, the more likely it is to allow an excellent programmer to produce elegance, which should really be the end goal of writing code, generally speaking: elegant solutions.

All really useful principles of programming, or systems design in general, seem to be practical, case-specific extrapolations from my fundamental definition of elegance. In short, they all seem to boil down to this one instruction: If it’s gratuitous, find a way to get rid of it. For example, consider the Pragmatic Programmers’ DRY principle — Don’t Repeat Yourself. In short, it is a Good Thing to avoid repetitions of data and program logic in your code. Any time you find yourself having to repeat or rephrase something in your code, reinject data into your data model from wherever you have it stored, and so on, you’re screwing up. Ask yourself whether DRY is really useful by reducing repetition in and of itself, or by reducing gratuitous repetition. After all, recursion and looping behavior might also fall within the definition of “repeat yourself”, but I don’t think anyone (sane) would ever recommend eliminating all loops and recursion from all programs. Sometimes, you just need your program to perform a given set of instructions on a long list of slightly different items. Often, loops and recursion make source code more elegant.
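A short Python sketch may help separate the two kinds of repetition. The first function repeats itself in the source, which is the gratuitous kind. The second repeats itself only at run time, through a loop, which is the kind nobody sane wants to eliminate. Both function names and the sample data are hypothetical.

```python
def report_repetitive(width, height, depth):
    """Gratuitous repetition: the same formatting logic is written out three times."""
    lines = []
    lines.append("width: " + str(width) + " cm")
    lines.append("height: " + str(height) + " cm")
    lines.append("depth: " + str(depth) + " cm")
    return lines


def report_with_loop(**dimensions):
    """The formatting logic appears once; the loop does the repeating."""
    return [f"{name}: {value} cm" for name, value in dimensions.items()]


# Same output either way; only the second survives a change of unit or a new field
# without edits in several places.
print(report_repetitive(30, 20, 10))
print(report_with_loop(width=30, height=20, depth=10))
```

Changing the unit, or adding a fourth dimension, touches one place in the looped version and several in the repetitive one. That difference, not the mere presence of repetition, is what DRY is protecting against.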

This ties in very nicely with the more general, more philosophical concept of aesthetics, and that provides some understanding of why it is possible to look at source code and, without yet consciously knowing what’s wrong with it, have an immediate intuitive reaction to its inelegance. That’s not to say that something can’t be aesthetically pleasing without being perfect in its elegance, of course. Instead, the ability to recognize some characteristics of elegance is what leads to an aesthetically pleasing perception of the subject.

Elegance is not about aesthetics. Rather, aesthetics is about elegance. Ostentation lacks aesthetic appeal, and is inelegant (read: “tacky”), because it’s gratuitous. Simplicity is often not elegant either: if something is too simple, it is nonfunctional, and fails to achieve its aim. What makes something beautiful is not strictly simplicity, symmetry, complexity, or any other such characteristic. Instead, what makes something beautiful is that its characteristics are all appropriate to its purpose. Complexity can be exceedingly beautiful, as long as it’s not gratuitous complexity, which is just chaos and confusion. Likewise, simplicity can be exceedingly beautiful, but if you make something gratuitously simple, you get dullness rather than beauty. Gratuitous simplicity is merely boring.

When you’re writing source code, make it elegant. When you’ve written something, go back and look it over, and for each and every thing you’ve done you should take a moment to question whether it’s really necessary, or even functionally desirable, to have it in there. You probably won’t get it perfect, but you can at least make it awfully pretty, which is a good thing as long as you do so by addressing elegance rather than trying to disguise the inelegance of your code by conforming to formatting conventions without rethinking your program logic at all. Refactoring, in the end, is really just about looking with fresh eyes for any opportunities to introduce elegance by removing the gratuitous.

If you’re unlucky, you may discover that making your code significantly more elegant might require rewriting it in a different language.

The following definitions are from Princeton WordNet:

elegant (adj.): of seemingly effortless beauty in form or proportion
gratuitous (adj.): unnecessary and unwarranted


All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License