Chad Perrin: SOB

15 August 2009

Learning Memory Management

Filed under: Geek — apotheon @ 05:05

A lot of people talk about the benefits of garbage collection in a programming language. It’s true — the benefits are significant, and worth having available. On the other hand, there’s something to be said for memory management the “hard way”, particularly if you ever want to be a competent programmer.

The problem with learning only a single language, and choosing one with garbage collection, is that your software will probably become more and more prone to memory management issues. While it’s easier to write code that doesn’t have significant memory management issues as more of the nitty-gritty gets automated by your language implementation, it’s also harder to use memory management well. Good memory management requires a lot more knowledge in a garbage collected language than in one where you need to allocate and deallocate “manually”.

With no automated memory management, you explicitly see everything that happens right there in front of you, and it’s a relatively simple matter to allocate and deallocate. With a few rules of thumb and half a brain, you can turn that into well managed memory. Generally speaking, the keys to good memory management when you’re doing it all yourself are making good decisions about how much to allocate at a time and not forgetting to deallocate.

With a simplified, but effective, automated memory management system — such as reference counting — you have to think more abstractly to make sure you’re using memory management effectively. Scoping, circular references, and other challenges arise that are not as direct and conceptually simple to deal with. Sure, you can just ignore the memory management and assume it’ll be done for you, but you’re likely to end up with less efficient code if you don’t know what the reference counting system is doing for you. In certain edge cases you can still manage to create memory leaks and other problems.
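As a minimal sketch of the circular reference problem, assuming CPython (whose reference counting is backed by a separate cycle collector), the following turns the cycle collector off so that only reference counting is at work. The `Node` class and the helper functions are hypothetical names made up for this illustration, and other Python implementations may behave differently.

```python
import gc
import weakref


class Node:
    """Hypothetical node type, used only for this illustration."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def make_single():
    # The only strong reference is the local variable, which vanishes on return.
    n = Node("solo")
    return weakref.ref(n)


def make_cycle():
    # Each node holds a strong reference to the other, so neither count
    # reaches zero when the local variables vanish.
    a = Node("a")
    b = Node("b")
    a.partner = b
    b.partner = a
    return weakref.ref(a)


was_enabled = gc.isenabled()
gc.disable()  # leave reference counting as the only active mechanism
try:
    solo_ref = make_single()
    solo_freed_immediately = solo_ref() is None

    cycle_ref = make_cycle()
    cycle_survives_refcount = cycle_ref() is not None

    gc.collect()  # an explicit collection still runs while automatic GC is off
    cycle_freed_after_collect = cycle_ref() is None
finally:
    if was_enabled:
        gc.enable()

print(solo_freed_immediately, cycle_survives_refcount, cycle_freed_after_collect)
```

The lone object goes away the moment its last reference does, while the pair that refer to each other hang around until something other than reference counting steps in.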

Finally, with an automated, full-featured garbage collector, you have to think even more abstractly, to the point where nobody is likely to get it right all the time. Have you ever seen a Java application suddenly slow down in mid-operation for no apparent reason? Yeah, I see that semi-regularly too. If you don’t know enough about how the garbage collector works (and plan around that painstakingly), you could very well see your application slow to a crawl when the garbage collector is running. That can be deadly for certain types of software, of course, which is why garbage collected languages are not typically used for those types of software. There are also edge cases, even rarer than with reference counting but typically more difficult to reason through, where things may not be deallocated properly, and may lead to bugs. Getting memory management right with any certainty is more difficult with a garbage collector than with manual memory management and reference counting put together. You can get it “good enough” to think you got it right pretty easily, though (if you don’t know anything about memory management other than “the garbage collector does it for me”).

This is why learning to work without an automated memory management safety net is important, in my opinion. This is also why, sometimes, reference counting is the best way to go: it’s almost as automated as a garbage collector, and almost as conceptually simple as doing the memory management yourself.

In short, sometimes garbage collection is the best way to go, but it is not an excuse to just ignore memory management and figure it’s being done for you. It is being done for you, of course, but it’s probably not being done right unless you manage the memory management as well. You need to know about memory management to really get the full benefits of automated memory management.

This, of course, is why I so lament the fact that I don’t know more about memory management than I do. I know enough to deal effectively with Perl’s reference counting, and use it to best effect. I don’t, however, know it well enough to manage Ruby’s garbage collection as well as I’d like. I’m sure I’ll feel a lot better about the whole thing when I get around to (re)learning C to a level of some competency some time in the next couple of years.

I think it was Joel Spolsky who called the relevant principle the Law of Leaky Abstractions.

It was Sean Reifschneider who said “If Java had real garbage-collection, it would delete most programs before it executed them.”


  1. The hard part about moving to a runtime without automatic memory management is that the threshold to utilize library code is much higher. Fewer guarantees about their safety and security can be made.

    Comment by Amit S. — 16 August 2009 @ 05:13

  2. This is true. Of course, I intended to specifically address making use of automated memory management in one’s code, and not so much using code others wrote that makes use of automated memory management.

    Comment by apotheon — 17 August 2009 @ 10:30

  3. That’s why I’m glad that Synergy/DE’s object cleanup uses references instead of GC. When the last reference is lost, the object is released immediately. You can work with that — if you don’t want all that cleanup to happen until later, just copy the reference to an array of “things to clean up later”, then release the array when you’re ready to drop the bomb.

    Comment by Chip Camden — 17 August 2009 @ 03:55

  4. That Law of Leaky Abstractions applies to so many things. I was just talking with a client less than half an hour ago about how the hardest thing for me about Visual Studio is figuring out what it does for you automatically, and how.

    Comment by Chip Camden — 17 August 2009 @ 03:57

  5. I do tend to like the fact that reference counting allows more straightforward options for specifying exact persistence periods than GC implementations tend to allow. In at least most cases (I haven’t worked with “all garbage collection implementations” by a long shot), it seems that the best one can usually manage with GC is to say “Don’t clean this up at this time,” which isn’t the same as being able to say “Clean this up now.”

    As for Visual Studio . . . the Law of Leaky Abstractions is a big chunk of the reason that much of what MS Windows does annoys the crap out of me. Visual Studio has that in spades, not only because of how VS itself works, but because of how the languages and frameworks and libraries (oh my!) for which VS was designed work, and also because of how the OS on which the software will run affects what the software will actually do come crunch time.

    Comment by apotheon — 18 August 2009 @ 12:37

  6. In my opinion, memory management is a mandatory skill every programmer should learn. I would throw that in one bowl of programming foundations together with pointers, recursion and similar topics. Many languages you might end up working in don’t even have a (good) garbage collector available.

    Comment by Tobias Svensson — 18 August 2009 @ 04:38

  7. You can force the .NET garbage collector to “collect now”: GC.Collect(); But I hate having to tell it to get off its butt and do its job all the time.

    Comment by Chip Camden — 18 August 2009 @ 09:51

  8. Tobias Svensson:

    Agreed. I wrote about memory management in particular because everyone who reads about this stuff on the Web already knows that scads of people recommend learning about recursion, and far more learned men than me have suggested more eloquently than I could at this point the best reasons for learning about pointers regardless of one’s ultimate, and most common, development tools. I’m amazed that nobody seems to care about learning anything about memory management beyond thinking that GC languages are always the best, and languages that don’t have them are primitive and should be avoided.

    Chip Camden:

    Is the .NET garbage collector prone to not cleaning up in a timely manner, then?

    Comment by apotheon — 18 August 2009 @ 11:18

  9. It depends on what you’re doing. Massively recursive routines that allocate and deallocate a lot of objects are most prone to swallowing the system whole, only to spend a long time later burping it up. Calling GC.Collect() after each recursive call often helps to smooth that out. I’m pretty sure that if you simply translated my Synergy/DE mergesort into C#, you’d have that problem unless you added the explicit call to Collect.

    Comment by Chip Camden — 18 August 2009 @ 12:21

  10. That seems like a pretty much ideal (if simple) case for demonstrating the point of this SOB entry.

    Comment by apotheon — 18 August 2009 @ 12:48

  11. Yes, because a merge sort must eventually create multiple lists for each element in the original list, and you’d like to have all those intermediate list objects cleaned up long before you’re done. Otherwise, the memory requirement becomes much greater than the O(n) advertised, approaching O(n²).

    Comment by Chip Camden — 18 August 2009 @ 01:30

  12. My <sup> tag got stripped. That 2 should be superscripted (squared, not times two).

    Comment by Chip Camden — 18 August 2009 @ 01:31

  13. fixed

    Comment by apotheon — 18 August 2009 @ 02:07

  14. Thanks!

    Comment by Chip Camden — 18 August 2009 @ 02:17
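The deferred cleanup trick described in comment 3 can be sketched in any reference counted runtime. What follows is a rough illustration in Python, assuming CPython's immediate release when the last reference is lost. It is not Synergy/DE code, and the `Resource` class, the `work` function, and the `released` log are hypothetical names invented for the example.

```python
released = []  # labels of objects, in the order they were actually released


class Resource:
    """Hypothetical object that records the moment it is released."""

    def __init__(self, label):
        self.label = label

    def __del__(self):
        # Under CPython's reference counting this runs as soon as the
        # last reference to the object is lost.
        released.append(self.label)


def work(label, clean_up_later=None):
    r = Resource(label)
    if clean_up_later is not None:
        # Copy the reference so the object outlives this function.
        clean_up_later.append(r)
    # The local reference r is dropped when the function returns.


work("immediate")
after_first_call = list(released)   # released as soon as work() returned

pending = []                        # the "things to clean up later" array
work("deferred", pending)
after_second_call = list(released)  # nothing new released yet

pending.clear()                     # drop the bomb
after_clear = list(released)

print(after_first_call, after_second_call, after_clear)
```

The point is that the program, not the runtime, decides when the cleanup cost gets paid.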


All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License