Chad Perrin: SOB

17 April 2007

OOP and the death of modularity

Filed under: Geek — apotheon @ 03:13

In recent years I've seen what looks like a slight resurgence of interest in software modularity. I've only really seen it in open source developer communities, and only in limited contexts, but it does seem to be a real, measurable increase. People are looking at programming style and saying "This needs to be decoupled," or at least "This needs to be more loosely coupled." This is a good thing, in and of itself — the tendency has for a long time been to increase coupling over the life of a given software project, and to increasingly fail to instill decoupled programming values in people new to established projects.

Of course, in the aggregate, that trend still seems to be just as strong as ever. I'm seeing more eddies at the edges of the mainstream, to stretch a metaphor mostly leached of its meaning by the contempt of long colloquial familiarity. Some of this can likely be attributed to excellent books about good programming practice from people who obviously know their craft, like The Pragmatic Programmer by Andrew Hunt and David Thomas. Some of the credit for the eddies can probably be laid at the feet of the increasing influence of the most experienced superhackers in the open source world, influence imparted in part by the facility of communication and community accretion provided by the Internet, in communities like PerlMonks. Other reasons certainly exist as well. The growing popularity of a (very) few languages that substantively get object oriented programming "more right" than others that have hogged the limelight for several years may well be a large part of that. I'm talking about languages like Ruby that encourage good programming practice without trying to force it by depriving the programmer of tools that may prove useful, and that implement excellent OOP constructs faithfully and well. Smalltalk may never gain the prominence once dreamt for it, nor even what prominence it once had, but it certainly is having some influence in other languages today.

Unfortunately, object oriented programming — good, bad, or indifferent — is also probably somewhat to blame for the tight coupling of code in larger software projects in the first place. Think about it for a moment: what's the most effective way to create a complex system with decoupled code? The answer is simple. Create small building blocks, and use them together to construct the system. Now think about what object oriented programming does: it creates a means of loosely coupling code in a given complex system. It doesn't decouple it — OOP just loosely couples it. No part of a single OO program is useful all by itself. The parts all rely on each other to be at all useful as anything other than an example of how to write code. The way to really decouple the code in a massive application is not to refactor its class hierarchy and redefine its object interfaces — it's to break the monstrosity up into individual programs that all do one job, do it well, and can be used together with glue code and other mortar for the bricks used to build the complex system.
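
To make the distinction concrete, here is a minimal sketch of "individual programs plus glue code", written in Python purely for illustration. The two stage scripts and the `run_pipeline` helper are hypothetical names invented for this example, and each stage runs as its own operating system process, so neither knows anything about the other beyond the text it reads and writes.

```python
import subprocess
import sys

# Two hypothetical single-purpose "programs". Each is a complete script that
# reads standard input and writes standard output, and is useful on its own.
UPPERCASE = "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"
NUMBER = (
    "import sys\n"
    "for i, line in enumerate(sys.stdin, 1): sys.stdout.write(f'{i}: {line}')"
)


def run_pipeline(stages, text):
    """Glue code: pass text through each stage as a separate process."""
    data = text
    for source in stages:
        result = subprocess.run(
            [sys.executable, "-c", source],  # a fresh interpreter per stage
            input=data,
            capture_output=True,
            text=True,
            check=True,  # raise if a stage exits with a nonzero status
        )
        data = result.stdout
    return data


if __name__ == "__main__":
    print(run_pipeline([UPPERCASE, NUMBER], "alpha\nbeta\n"), end="")
```

Either stage can be replaced, removed, or reused elsewhere without touching the other, which is the kind of independence a class hierarchy inside one program does not give you.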

The evidence is all around us. There are currently two major traditions of operating system design:

  • Microsoft Windows is the tradition of tightly coupled code, with more than 50,000,000 lines of code in the OS. Most of that code is very tightly coupled — pull out one brick and the whole house comes tumbling down.
  • Unix is the tradition of decoupled code, whence comes the very notion of a user environment composed entirely of programs that "do one thing well". This tradition, in general, is only more strictly enforced in open source unices.

OpenBSD, apparently, is exceedingly good at maintaining, and even improving on, the "do one thing well" tradition of the Unix Platonic ideal. Linux-based systems are in some ways quite good at it, and in some ways not quite so good (contrast APT and the extensive archives of separated little bits of software with the gooey, pun intended, blobs like KDE and GNOME). MacOS X is now composed of a unixlike core, a bit like the deeper layers of the Earth, with its GUI a massive, rigid surface like the Earth's crust (minus the shifting of tectonic plates) floating on a viscous mantle of glue code. Finally, we reach the avoidance of modularity with huge, proprietary, closed source, tightly coupled application suites (think of Adobe Creative Suite 3) and the tightly coupled operating system they love (the aforementioned MS Windows). These things happened in part because it became possible to build ever-larger programs (and maintain them, after a fashion) with lots of duct tape and baling wire, once object oriented programming increased the ability of programmers to loosely couple code into somewhat maintainable chunks within the larger project as a whole. It increased the maximum effective size of programs by abstracting away the hugeness of the application into smaller, more easily reasoned chunks of code.

Yes, that's a good thing, all else being equal. The thing that wasn't equal was the tendency of humans to fill a new development technique's capacity for scalability as long as that technique is used. As OOP provided greater ability to write ever-larger programs, humans wrote ever-larger programs — even when they didn't need to do so (and probably shouldn't have done so).

Tight coupling ruins stability. Loose coupling doesn't damage it as much. Decoupling — actual independence of modular parts — preserves stability (all else being equal, of course). The world at large continues to move toward larger programs, which means larger interconnected, if loosely coupled, code. This happens mostly because our ability to sustain development for ever-larger programs increases, and we rush to fill the potential with ever-longer feature additions. Yes, there are those eddies in the mainstream, where people talk about code modularity and really try to practice it, but the downside of that is that they're mostly talking about loose coupling within a program, of separating parts of a program into objects. In a few cases, things are actually decoupled, such as in CPAN, but even there things are getting tied together into larger modules as time progresses (even if they're object oriented modules that employ loose coupling). Some of those open source operating systems (Ubuntu springs to mind) are visibly abandoning their tradition of decoupled functionality a piece at a time, but it's in nontrivial part the tradition of small utilities that "do one thing well" that preserves the stability, and even the security, of the major open source operating systems.

What? You didn't think open source operating systems were usually more stable just because the developers are more virtuous, did you?

20 Comments

  1. This whole article makes no sense. What the hell does the size of the Windows OS have to do with whether or not OOP hurts modularity? How do UNIX's "do one thing well" programs speak against OOP?

    Comment by Jonathan Allen — 17 April 2007 @ 11:26

  2. I'll just assume I didn't explain things well enough for you to understand my points. I'll try to summarize it in clearer form for you:

    1. OOP allows for more maintainable code in larger projects.
    2. As technologies allow things to scale upward, people tend to scale them upward — even when they shouldn't be scaled upward.
    3. As a result of this, object oriented software projects like MS Windows sometimes get really, really big and bloated.
    4. That happened to MS Windows, where a better result would have been to include additional functionality the way the Unix tradition tends to do things — create small utilities that each do one thing well.
    5. Thus, the MS Windows user environment is full of huge, tightly coupled programs that are, in turn, tightly coupled with one another.

    Thus, MS Windows is not modular.

    You may disagree with my speculations and conclusions (though I hope that if you do so, you do so intelligently rather than merely as a knee-jerk reaction to something). After this summary, however, I really hope you can at least see that, given my premises, my conclusions aren't unexpected and everything ties together into one coherent whole.

    Comment by apotheon — 17 April 2007 @ 12:49

  3. I actually understood what you were saying, although I had to make allusion to something smaller than an OS (e.g. browser extensions and IM client plugins). OOP makes it easier for projects to manage more lines of source code, although it would be best applied to making things modular, and therefore more stable if something fails (such as if a Firefox extension fails, it doesn't crash the whole browser, it just gets disabled).

    Comment by Joseph A Nagy Jr — 17 April 2007 @ 04:14
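
The failure isolation described in this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not how Firefox actually loads extensions: `PluginHost`, `register`, and `dispatch` are hypothetical names, and the plugins here are plain Python callables.

```python
class PluginHost:
    """Hypothetical host that keeps running when one of its plugins fails."""

    def __init__(self):
        self.plugins = {}      # plugin name -> callable taking one event
        self.disabled = set()  # names of plugins that have raised an error

    def register(self, name, func):
        self.plugins[name] = func

    def dispatch(self, event):
        """Send an event to every enabled plugin and collect the results."""
        results = {}
        for name, func in self.plugins.items():
            if name in self.disabled:
                continue
            try:
                results[name] = func(event)
            except Exception:
                # A failing plugin is switched off; the host and the
                # remaining plugins carry on unaffected.
                self.disabled.add(name)
        return results


def broken_plugin(event):
    raise RuntimeError("plugin bug")


if __name__ == "__main__":
    host = PluginHost()
    host.register("shout", lambda event: event.upper())
    host.register("broken", broken_plugin)
    print(host.dispatch("page loaded"))
    print(sorted(host.disabled))
```

The host only depends on the narrow contract "a callable that takes an event", so a misbehaving module can be dropped without bringing down everything around it.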

  4. Indeed — and, it would be best to not combine different bits of functionality into a single program just because it's easier to do so with OOP techniques than without them.

    I guess, between what I just said here and what you said, Joseph, that's really the meat of my post.

    Comment by apotheon — 17 April 2007 @ 05:16

  5. I thought MS Windows was based on a microkernel architecture (= very modular) and, on the other hand, UNIX was based on a monolithic kernel. Another point of view when analyzing this.

    Comment by rb — 18 April 2007 @ 01:04

  6. Alas, no, MS Windows is not based on a microkernel architecture. Unix varies from one implementation to the next. Linux and MS Windows both employ a modular monolithic kernel architecture, but while that of Linux is designed to be able to swap out modules cleanly and at a moment's notice, that of MS Windows isn't designed to be altered at runtime at all. Technically, the parts of the MS Windows kernel that are toward the edges are "modular", but they are tightly coupled via their interfaces and the kernel image in memory cannot be altered at runtime without bringing the system to a screeching, crashing halt — and probably permanently hosing your system.

    The HURD kernel, while not exactly the best thing going, is a Unixlike OS kernel that is truly a modular microkernel architecture. It's also slow as crap on a cold day.

    The key here, in any case, is not the kernel architecture. It's kernel size and what parts of the system are necessarily included in the kernel. With Linux, you can (if you so desire) tie pretty much everything up to (but not including) high-level userspace stuff into the kernel by compiling a custom kernel, but the operational default Linux kernel doesn't even include many of the driver interfaces you're likely to use. They get attached to it when the system loads as modules, not as part of the kernel itself.

    Really, the quickest look you can get at what I'm talking about, if you want to use Linux and MS Windows kernels as your examples, is the lines of code count for a minimally functional system. Hint: Linux is far, far smaller, and Linux isn't even the best example from the free unix world.

    Comment by apotheon — 18 April 2007 @ 02:15

  7. Oh, I was completely wrong then. Thanks for sharing such an expert view, apotheon :)

    Comment by rb — 18 April 2007 @ 06:45

  8. Sorry, but that is just dumb. An OOP program should be built out of components (which are objects) which can often be reused. OOP and encapsulation is specifically intended to reduce the coupling between parts of the application.

    Just have a look at all of the libraries available for Java. Most Java applications are pretty much assemblies of a bunch of open source libraries.

    Another way of looking at it: a process is just an object that has a constructor that takes a string, and provides streams to write to and read from.

    Comment by CB — 18 April 2007 @ 08:39
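
CB's last remark, that a process is an object constructed from a string and exposing input and output streams, can be taken fairly literally. The sketch below is one hedged reading of it: the `Process` class and its `finish` method are hypothetical names for this example, built on Python's standard `subprocess` and `shlex` modules.

```python
import shlex
import subprocess


class Process:
    """Hypothetical wrapper: built from a command string, exposes two streams."""

    def __init__(self, command):
        # The constructor takes a single string, split the way a shell would.
        self._proc = subprocess.Popen(
            shlex.split(command),
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self.stdin = self._proc.stdin    # stream to write to
        self.stdout = self._proc.stdout  # stream to read from

    def finish(self):
        """Close both streams, wait for exit, and return the exit status."""
        self.stdin.close()
        status = self._proc.wait()
        self.stdout.close()
        return status
```

A caller would write to `stdin`, close it, read from `stdout`, and then call `finish()`. Seen this way, the difference between an object and a small program is mostly the width of the interface: a pair of text streams is about as narrow as an interface gets.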

  9. An aspect of complexity which has been bothering me for years is the role development tools play in enabling coupling. The 'power' of vi, make, cc and man pages pales in comparison to Visual Studio and MSDN, and yet I find my vi-written software has less sprawl and in the end is (for me at least) easier to understand. Or maybe I'm just a dinosaur. I'm not going to blame Intellisense for Vista, but I think there's an interesting intersection between technology, sociology and vision here. If I recall, wasn't one of the selling points of COM to reduce coupling between application components?

    It may also be unfair to single out Microsoft when we have DCE and CORBA as signposts on the road to modularity. I find it interesting that many of the same issues are resurfacing in the debate between REST and SOAP. It's also interesting to consider that a Rails application is in essence two fat apps (a browser and a database) glued together by a glorified shell script and two text-friendly pipes. It feels like we're back where it all started, only with everything super-sized!

    Comment by Rod — 18 April 2007 @ 09:33

  10. No problem, rb — I aim to please.

    It's easy to have the impression that MS Windows uses a "microkernel architecture", since Microsoft was talking about that idea so much in the '90s and claimed that NT used a microkernel architecture. To "prove" it, Microsoft started publishing a bunch of diagrams of the kernel architecture and used the term "microkernel architecture" in both its marketing materials and its certification texts (I have a bunch of those on my shelves).

    From a certain perspective, one might view the MS Windows NT kernels as using a microkernel architecture, if one tilts one's head slightly and squints, but when the term "microkernel" was coined the intent was not what Microsoft produced.

    Microsoft, I believe, has finally abandoned the term "microkernel architecture" in its descriptions of NT-based systems. I seem to recall seeing MS Windows Vista's kernel described as a "hybrid kernel", as the modular monolithic kernel it uses does bear some qualities of both microkernels and monolithic kernels. It gets a lot of flak even for using the term "hybrid kernel", however, as ultimately it's still more like a pure monolithic kernel in many ways than even the Linux kernel, and the Linux guys themselves don't bother to call the Linux kernel anything more like a microkernel than a "modular monolithic kernel".

    In essence, Microsoft's justification for calling its kernel a "hybrid kernel", or a "microkernel architecture", is that its monolithic kernel is internally organized a bit like a microkernel, with parts that are related to a specific subsystem being grouped together. It's a bit like the difference if we describe it with Lego storage methods:

    • microkernel: Get a bunch of boxes that interlock with each other, kind of like Legos themselves. Fill each one with a particular color of Lego blocks. Put the boxes together on the shelf. You can remove, replace, and rearrange boxes to get the combination you like for optimal storage.
    • monolithic kernel: Get one large box. Fill it with all your Legos. Stick it on a shelf. You're done.
    • modular monolithic kernel: Get a larger box and several smaller boxes. Put all of your standard Legos of all colors in the larger box. Organize the weird, nonstandard Lego blocks by type, and put them in smaller boxes. Put those together on the shelf. You can mix and match the smaller boxes of weirder Legos, and you can change your mind about which Lego blocks constitute "weird" so that different Legos are included in the main box or in the smaller boxes, but you still have to have a larger box in the middle and you have to take it off the shelf to make any changes to it.
    • Microsoft-style hybrid kernel: Get one large box. Put all the Legos in that one box. When you do so, however, make sure you first put in the red Legos, then the gray Legos, then the blue Legos, and so on.

    That's an off-the-cuff analogy, but I hope it helps explain what's going on with kernel architectures in sort of a general sense.

    Within the realm of microkernels, of course, there are different approaches — just as there are different approaches to monolithic kernels, like the Linux modular monolithic kernel. A couple of terms that might be worth looking up, if you're really interested in microkernels, are nanokernel and exokernel. If I recall correctly (and please don't shoot me if I don't), the HURD kernel is built on the Mach kernel, and is in some ways a fairly standard, canonical microkernel architecture (in the real sense of the term). MIT has an exokernel that it uses for research. I'm not aware of any nanokernels in particular off the top of my head, but suffice to say that the central kernel itself is very, very small, and doesn't even include stuff like primary memory management. It's functionally not much more than you get out of your system BIOS, though it does provide much greater ability to add to it, rather than replace it, in operation with the additional subsystems needed to run your computer in a useful state.

    Comment by apotheon — 18 April 2007 @ 10:50

  11. CB:

    An OOP program should be built out of components (which are objects) which can often be reused.

    Oh, I agree completely. I never said otherwise. In fact, I very specifically commented that OOP is good at what it does, and laid the blame for things firmly in place at the feet of those who have exploited it unnecessarily to create ever-larger applications when such larger applications are not optimal, or even desirable. I think you must be reading something into what I said that wasn't there.

    Compare Nero's CD-burning software with something like K3b, for instance. While I don't much like K3b, and it's too tightly coupled with a few other things — like the full run of KDE libraries, for instance — it does do a reasonably good job of building its core functionality from separate tools that all work individually to accomplish separate ends. They're glued together into a very effective centralized CD/DVD-burning tool, however, for people who like having everything together in a single graphical interface. Nero, meanwhile, is a truly integrated massive optical disk burning application with everything tied together in an inextricable collection of functionality, none of which exists separately from the greater Nero application. I'm sure the developers at Nero are excellent object oriented programmers, and did a fantastic job of doing good, internally modular OOP with the Nero product, but it's still a closely integrated (aka: tightly coupled) whole. It only looks loosely coupled if compared with something similar built using purely line-oriented imperative programming with the occasional procedural bit of code organization.

    Rod:

    I understand exactly what you're saying, and agree. Part of the problem, I think, is that when you say "application" most people think "program", when what they should really mean is "interface to functionality". If you have a single application, you have a single interface to a set of functionality. It can be made up of one program, or it can be made up of a collection of smaller programs tied together to provide that unified interface. I'm definitely of the opinion that the latter is generally the better option. It provides true modularity — not the sort of pseudo-modularity that you get from a massive, monolithic heap of OOP code like in Adobe Acrobat or MS Windows Vista.

    I, too, find that using Vim as my development environment helps me organize a project better than IDEs. It's just one reason of many that IDEs don't tend to make me feel any more productive, generally speaking. I'm also in an on-again off-again love-hate relationship with Rails, which keeps enticing me with its ease of use to generate web applications, but keeps driving me away with its enclosed, "I am the world" approach to web development, where I have to essentially learn a whole new language (the framework) to build something that seems far too integrated to make me happy with what I create.

    Comment by apotheon — 18 April 2007 @ 11:06

  12. The original promise of OOP was that if you use an OOP language then automagically you will get modularity and loose coupling and code reuse and all the good stuff buzzwords. We all knew it was a lie at the time, and now we can see that certain sectors of IT management went and believed it anyhow (or pretended to believe it, or whatever).

    Modularity and OOP are in the mind, not in the language. Choosing the ideal type of coupling is a design decision and choosing the ideal place to break a project into modules is also a design decision. The language will never fix bad design decisions for you.

    What produced MS-Windows is the management philosophy that you throw lots of programmers at a problem, give them a strict methodology and an OOP language and tell them to churn out heaps of code to a deadline. This approach fails with literature, it fails with music, and it fails for software too.

    The reason that Linux works is that they have a small core of very good programmers surrounded by a very large pool of reviewers and occasional programmers and they don't get hung up on strict methodologies or OOP languages or other distractions. Common sense and experienced leaders will make a stronger methodology than all the buzzwords in the world.

    They use a stable, well-proven language that everyone knows and they concentrate hard on getting the high-level design structure correct (even when that involves umpteen layers of redesign).

    Comment by Tel — 18 April 2007 @ 04:13

  13. Good thoughts here, apotheon, about how to do OOP (or not). You've definitely given me some ideas for my presentation next month.

    About Windows being OOP: most of it ain't. The vast majority is just APIs written in straight C without classes. And it is one tangled mess.

    Comment by Sterling Camden — 18 April 2007 @ 04:30

  14. Sterling — I'd love to see anything you have for your presentation when it's ready. I'm sure I'd learn a thing or two. Will you be making it available online at some point?

    Comment by apotheon — 20 April 2007 @ 01:30

  15. [...] Matz linked to me in regards to my SOB entry titled OOP and the death of modularity. [...]

    Pingback by Chad Perrin: SOB » Incoming links can surprise you. — 28 April 2007 @ 01:59

  16. Yes, I'll repost some of the content at least.

    Comment by Sterling Camden — 28 April 2007 @ 11:40

  17. Excellent. Thanks, Sterling.

    Comment by apotheon — 28 April 2007 @ 03:34

  18. [...] In OOP and the death of modularity, Chad Perrin notices a trend in object-oriented programming in which classes become coupled with one another, reducing modularity.  Sometimes, by thinking in terms of objects rather than starting with processes, you can bundle too much functionality together into one class.  Interfacing with such a complex class soon requires complex knowledge about how that class operates, creating unnecessary dependencies.  By contrast, keeping the design simple and atomic requires less intimate knowledge and maximizes reusability. [...]

    Pingback by Using OO in Business Applications -- Chip’s Quips — 24 May 2007 @ 12:44

  19. [...] Finally, the last thing on the first page of the programming reddit when I looked was 10 tips for developing deployment procedures (or: Deployment Is Development). Yes, I wrote it. In fact, it's the SOB post just before this one. I wasn't even aware it was on reddit until I saw it there tonight. I've also noticed, in the process of following links around, that both OOP and the death of modularity and The One Important Factor of programming languages are still getting some action, through links from various sources. Include Elegance (which I'm almost embarrassed to admit has been called a "seminal work"), and I might have the beginning of a "Joel On Software" or "Hackers And Painters" style of book in my future. Then again, maybe I'm just feeling too full of myself. [...]

    Pingback by Chad Perrin: SOB » the goings-on in coderspace — 2 June 2007 @ 12:46

  20. [...] ago), I wrote a consideration of the social effects of OOP on how software has "advanced" titled OOP and the death of modularity. In it, I may have seemed to say the opposite of what I've been implying so far, but the very [...]

    Pingback by Chad Perrin: SOB » Did OOP do this to us? — 20 February 2009 @ 06:54

All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License