Chad Perrin: SOB

30 September 2007

I learn something new every day — this time about Python and Ruby.

Filed under: Geek — apotheon @ 03:34

It’s sort of a generally recognized “fact” that Python executes a little faster than Ruby. Generally speaking, nobody in the Ruby community seems to have any objections to that estimation of execution speed, at least up through Ruby v1.8.x (though 1.9, soon to be 2.0, is another story).

On the ruby-talk list, someone whose identifier string is “Ruby Maniac” (e.g. From: Ruby Maniac <foo@bar.baz>) started two or three threads about how Ruby isn’t as fast as Python, and therefore isn’t as good as Python, among other very trollish statements. This has spawned some actually useful discussion about Ruby performance and other characteristics of both the language and the implementations thereof, largely ignoring “Ruby Maniac”. It has also spawned a fair bit of discussion about how to properly handle trolls, with just a little bit of discussion about whether “Ruby Maniac” qualifies as a troll. I enjoyed some kudos for discussing the fact that, as much as I personally dislike Python and as much as “Ruby Maniac” has decreased the signal:noise ratio on ruby-talk, his(?) worst offense was really to the Python community by (mis)representing Python fans as trollish pricks.

Mostly, however, the resulting discussion has proved fruitful, in terms of addressing subjects of performance and other features of the language and implementations.

An interesting tidbit about the relative performance of Python and Ruby came up on Friday, but I haven’t gotten around to reading the interesting tidbit until today. For background, on Friday I saw that someone posted the following equivalent code snippets from Ruby and Python.

#!/usr/bin/env ruby

s = ""                                                                                                                   
i = 0
while line = gets
  s += line
  i += 1
  puts(i) if i % 1000 == 0
end


#!/usr/bin/env python
import sys

s = ""
i = 0
for line in sys.stdin:
  s += line
  i += 1
  if i % 1000 == 0: print i

The observation made about it was that the Python code was reasonably quick with output coming smoothly and at a fairly steady pace, while running the Ruby code resulted in output coming more and more slowly as execution progressed and the string got longer. I’ve tested it myself by piping a cat of several large text files into the script, and it does indeed slow down between every iteration of outputting a number incremented by a thousand, by a quite visible degree.

The reason, of course, is that in the above Ruby code a new object is being created for every single iteration of the string concatenation, which produces a fair bit of process overhead. I watched CPU usage jump up to about 85% for that process while watching the output of top in another term window. The suggested fix that came up on the list was to use the << operator rather than += for the string concatenation operation, so that instead of s += line it would say s << line. This actually uses a true in-place concatenation, rather than allocating a new string object that contains a concatenation of the previous string object’s value with the newly input string.

I already knew all that. I didn’t know using the += operator (well, technically it’s a method in Ruby, but it looks like an operator) would have that dramatic an effect, however. News to me, especially considering that the Python code was doing the same thing and didn’t have the same slow-down problems.

The optimized version of the Ruby code (using << instead of +=), then kicked the crap out of the Python code in a performance test. What surprised me about this is that Python apparently doesn’t provide an equivalent optimization option, however. This is the part I learned today — “Florian Frank” posted to ruby-talk the following as an explanation for why Python has no equivalent concatenation performance optimization:

Yes, for a + b python allocates a new string of length(a) + length(b), then a‘s and b‘s content are copied into the new string. There is only an optimization if a or b is empty. a << b in ruby first reallocates memory of length(b) to a‘s then copies only b‘s content to a. If b is empty ruby does nothing. The more complex concatenation operation in python is caused by python’s immutable strings.

The final lesson I get from this, at this point, is that there’s yet another reason for me to dislike Python’s immutable strings.

Granted, there may be some other “better” way to get similar optimization in Python that has not caught my attention. If so, I’d like to know about it. I certainly didn’t expect to find that something like an iteratively growing string operation would be so much more optimizable in Ruby than in Python.

It sorta flies in the face of the common “wisdom” about Python generally outperforming Ruby. Sure, it’s just one use case comparison out of many, but it seems like kind of a big one.

37 Comments

  1. The standard answer for years now is to append the strings to a list, and join them when you are ready to use them. Often, this is the implementation of +=; I believe that’s what Java did, for instance, switching += with a StringBuffer accumulation added by the compiler behind the scenes.

    Python “generally” outperforms Ruby right now on comparable algorithms. When you use two different algorithms in two different languages, all bets are always off; language differences are generally swamped by the differences in algorithms.

    If this post says anything, it speaks to the dangers of how high-level languages can obscure the underlying algorithms, and therefore obscure the performance implications of them. This goes for Python, Perl, Ruby, et al equally. One advantage of STL is that the standard specifies runtimes for its algorithms.

    Comment by Jeremy Bowers — 1 October 2007 @ 02:51

  2. Hi,

    I think, and for many reasons, Python code run faster than Ruby’s one.

    To do iterative concatenation in python you can use StringIO or cStringIO it is more efficient then += operator

    for example:

    #!/usr/bin/env python
    import sys
    from cStringIO import StringIO
    s = StringIO()
    i = 0
    for line in sys.stdin:
       s.write(line)    
       i += 1
       if i % 1000 == 0: print i
    

    and use s.getvalue() to get result

    Comment by Smel — 1 October 2007 @ 03:12

  3. WordPress, POS that it is, is gagging on the less-than-sign. I respect you too much to crap all over your comments for something you didn’t do… I’ll leave my frustration at that.

    J.Ja

    Comment by Justin James — 1 October 2007 @ 04:01

  4. It also says a lot about the current “state of the art”. To expect the run of the mill programmer to know the internal differences between “less than less than” and “+=”, or to even expect them to remember thousands of little rules like “use ‘less than less than’ instead of ‘+=’ under these circumstances” is rediculous. The mere mortals who make up the bulk of the population who become programmers are just not built this way. This isn’t even a Ruby slam… .Net suffers the exact same problem, and instead of using a different operator, you use the uncomfortably unweidly StringBuilder class instead of concatenation operators. And of course, using an object instead of an operator makes it much less likely to be used! Who in their right mind really feels like declaring an instance class and calling methods on it, instead of “sTemp += ‘blah'”? Umm… no one. This is the same reason why people use regex’s in Perl constantly and never in OO language (and halfway between “constantly” and “never” in PHP, where it seems to be a function call), even though the regex pattern syntax itselx is identical… doing it via object is a hassle and doing it through an operator is a cinch. Indeed, I will be posting about this tonight in my space, in the context of threads.

    In any event, I feel that this is the exact type of thing that optimizing compilers are for. Let’s get real, if your language (or even your compiler) get a bad rep for “slowness” when really, it’s just too obscure or difficult to write fast code, you’re language/compiler/etc. won’t do well, and people will avoid your product (who wants to try to market or sell or even open source a project in a language known to be slow? Look at how long Java took to gain traction due to its speed issues!). Of course, being “fast” in and of itself is not a guarantee of success (look at Python’s and Perl’s market share, despite being very fast for what they are).

    I ran on too long here…

    J.Ja

    Comment by Justin James — 1 October 2007 @ 04:01

  5. You may want to try in Python (untested):

    s = [] i = 0 for line in sys.stdin: s.append(line) i += 1 if i % 1000 == 0: print "".join(i)

    This should do a single concat at the end instead of copying every time. Join is also known to be faster than a direct concat above about 500 characters.

    Comment by James H — 1 October 2007 @ 04:11

  6. And of course WordPress killed the indentation on that. But the changes are pretty obvious.

    Comment by James H — 1 October 2007 @ 04:12

  7. Nobody write Python write Python like that. The idiomatic Python code is: s = [] for line in lines: s.append(line) s = ''.join(s)

    Comment by nirs — 1 October 2007 @ 05:08

  8. (Probably) the single most obvious Python efficiency tip is don’t concatenate strings. Seriously – I’m a Python newbie and I’ve seen this a bunch of places (most recently Alex Martelli’s Python Beginner’s talk at the Bay Area Python Interest Group).

    The most common replacement idiom is (indentation level indicated with “>”)

    >s = [] >i = 0 >for line in sys.stdin: >>s.append(line) >>i += 1 >>if i % 1000 == 0: print i >#and if you want the string back out at the end >print “”.join(s)

    Some people do more complicated things with the python module StringIO – use file semantics to read and write to a memory buffer like a file.

    Comment by metapundit — 1 October 2007 @ 05:10

  9. The CPython story has another couple twists in the eval loop (“Florian Frank”? was probably just looking in stringobject.c). If no one else has a reference to the ‘a’ string it is safe to realloc it and memcpy just the ‘b’ string onto the end so Python sometimes does this. The string also can’t be interned. Python uses a memory pool to allocate small objects. I’m not intimate with the pool allocs but I think the growing string might be moved around more than once before it gets too big for the object pool, after which it should be fine to realloc it.

    This optimization is NOT part of the python spec, other implementations (Jython, IronPython, PyPy) are free to do their own thing. For joining large numbers of strings it is idiomatic to do ”.join(strings_list) which gives the interpreter a chance to add up their lengths and do a single malloc (followed by memcpys). str.join works the same in all implementations.

    [and yes, WordPress seems to hate comments]

    Comment by Jack Diederich — 1 October 2007 @ 05:15

  10. [wordpress hates me, I keep getting database errors so I’m stripping this comment down until it works]

    The CPython story has another couple twists in the eval loop (Florian Frank was probably just looking in stringobject.c). If no one else has a reference to the A string it is safe to realloc it and memcpy just the B string onto the end so Python sometimes does this. The string also can’t be interned. Python uses a memory pool to allocate small objects. I’m not intimate with the pool allocs but I think the growing string might be moved around more than once before it gets too big for the object pool, after which it should be fine to realloc it.

    This optimization is NOT part of the python spec, other implementations (Jython, IronPython, PyPy) are free to do their own thing. For joining large numbers of strings it is idiomatic to do str.join(strings_list) which gives the interpreter a chance to add up their lengths and do a single malloc (followed by memcpys). str.join works the same in all implementations.

    Comment by Jack Diederich — 1 October 2007 @ 05:18

  11. The Python idiom for gluing together large strings is to do this:

    chunks = [] # An empty list chunks.append(‘Some string’) chunks.append(‘ another string’) … final_string = ”.join(chunks)

    In other words, build a Python list (which is a mutable, dynamically resizing data structure) and use it to collect a large number of strings, then join them all together in to a single string right at the end. You’ll find this is dramatically faster than assembling a string using the += operator.

    Comment by Simon Willison — 1 October 2007 @ 05:34

  12. Please look over here : http://wiki.python.org/moin/PythonSpeed/PerformanceTips#head-bcf69f4e2cacc9683c2f9a1f401e891cac50506f

    for how to make string concatenation in Python faster.

    Comment by Chmouel Boudjnah — 1 October 2007 @ 06:01

  13. I’m getting the hint here that strings and buffers are wished to be one in the same but there are very good reasons to have a simple string type and something like cStringIO (python) and StringBuilder (.net). The example above is clearly the case for a buffer like object. A happier programmer is he who doth not insert square pegs in round holes :)

    Comment by Troy Kruthoff — 1 October 2007 @ 08:40

  14. += should be avoided in if speed is an issue in both ruby and python. If we leave out printing the 1000s, then we can skip the loop and use list comprehensions in python:

    “”.join([line for line in sys.stdin])

    http://wiki.python.org/moin/PythonSpeed/PerformanceTips

    Comment by Brian — 1 October 2007 @ 09:34

  15. In Python one is taught not to concatenate strings using += but to use the join method instead. Your example would become something like the code below (but it re-concatenatess which I would usually avoid):

    import sys

    s = [] i = 0 for line in sys.stdin:   s.append(line)   i += 1   if i % 1000 == 0: print “”.join(s)

    It is not nice, but you are expected to know this aspect of Python, as well as others, as you progress in Python. Search for “Building Strings from Substrings” in this page: Code Like a Pythonista: Idiomatic Python

    • Paddy.

    Comment by Paddy3118 — 1 October 2007 @ 11:07

  16. This comment is completely stupid and asinine, Python users do not use regular string concatenation for long string accumulations (if only because regular concatenation does, as you mentioned, necessarily copy both strings to create a new string as python strings are immutable*). Nobody who’s ever seriously used python would ever do that kind of stuff past the concatenation of about a dozen string.

    The Pythonic Thing is to create a list() object, append() the strings to the list one after the other, and finally join() the list into a single string.

    (*): and it can easily be argued that Python’s string immutability is a Good Thing, as it prevents the need to perform defensive string copies.

    Comment by Masklinn — 1 October 2007 @ 11:25

  17. This shouldn’t be a surprise, but Python does have libraries for dealing with mutable strings, notably the StringIO library.

    Using concatenation, your example took 6.9 seconds on a large file. Using StringIO, it took 0.26 seconds. Using cStringIO, the fast version of StringIO written in C, it took 0.07 seconds.

    In just about any language with immutable strings, it’s also pretty simple to create a list and pop your string changes onto the end of the list. Then just combine the list into a new string when you’re done. It’s a simple optimization that’s always going to be faster than simple “+=” in languages with immutable strings. In my Python tests, that was even faster than using cStringIO.

    Comment by Chadwick Morris — 2 October 2007 @ 04:07

  18. It’s so hard not to come up with contrived examples here. For instance, just printing a number every 1000 iterations prevents you from writing from more optimized code, i.e. in Python I’d just use ‘s = sys.stdin.read()’ and be done if I didn’t have to uselessly output progress. One line is about as optimized as you can get.

    But keeping the spirit of the original code, the optimized Python version would also feel unnatural like the optimized Ruby. You’d use a list to collect the lines and then join them into one string when the loop exited. So it would look like this:

    import sys lst = [] i = 0 for line in sys.stdin: lst.append(line) i += 1 if i % 1000 == 0: print i print ''.join(lst)

    Sometimes you trade readability for speed when you optimize (in whatever language).

    Comment by Brandon Corfman — 2 October 2007 @ 06:21

  19. I’m using python 2.4.1 on debian, and this is somewhat faster: from cStringIO import StringIO f=StringIO() i=0 for line in sys.stdin:   f.write(line)   i += 1   if i % 1000 == 0: print i s=f.getvalue() But its not exactly the same either – you can’t read off the result until the end without killing performance. Another more memory intensive approach is to accumulate the list of strings and join them at the end. Using a generator comprehension is a lot faster, though less comparable still – it doesn’t print the progress report for example. s=''.join(line for line in sys.stdin)

    Comment by Spacebat — 2 October 2007 @ 06:52

  20. Python does have mutable strings. They’re called “lists”.

    Comment by Anonymous — 2 October 2007 @ 07:32

  21. The comparison is idiomatic python, but not idiomatic Ruby. The original author probably just literally translated Python code into Ruby. I have never seen any ruby code doing string concatenation with +=. And why use ‘while’ when you can just iterate over the IO object STDIN:

    s="" STDIN.each_with_index do |line,i|   s << line   puts i if i.modulo(1000).zero? end

    Doesn’t that look a bit more like Ruby? (OK, maybe I stretched it a bit with the modulo method chain :-)

    Comment by Mark Thomas — 2 October 2007 @ 08:13

  22. Modified python code:

    #!/usr/bin/env python
    import sys
    
    L = []
    i = 0
    for line in sys.stdin:
        L.append(line)
        i += 1
        if i % 1000 == 0:
            print i
    s = ''.join(L)
    

    All you do in Python is build lists of strings and then join them.

    ruby stuff.rb < /usr/share/dict/words 1.74s user 0.03s system 93% cpu 1.891 total python stuff.py < /usr/share/dict/words 0.77s user 0.06s system 95% cpu 0.869 total

    That’s using your optimization with the concatenation operator for the ruby code (otherwise the same), and the python code above which though it has slightly different semantics has the same effect.

    Comment by DDP — 2 October 2007 @ 08:52

  23. With Python, use a list, .append(line), and join after the loop. It runs faster in my benchmarks. I’d post the code, but it seems all the layout is messed up, and Python without layout is pretty useless.

    Ruby 0:00.62 real, 0.56 user, 0.05 sys

    Python 2.5 += 0:00.92 real, 0.22 user, 0.69 sys

    Python 2.5 .append() 0:00.39 real, 0.31 user, 0.09 sys

    Python 2.5 StringIO 0:01.00 real, 0.96 user, 0.03 sys

    Comment by Vincent Foley — 2 October 2007 @ 09:15

  24. In Python, you’d make a list of the individual strings, and then use ”.join() to convert them to a single string at the end.

    Python 2.5 does have an optimization for certain common cases (if the refcount on the string is 1, it will mutate it in-place), and that’s why Python is so much faster than the un-optimized Ruby code. String appends like that aren’t considered to be optimal code, though, and the optimization is there because it’s easy and helps a common case, not because that’s how you’re supposed to write a fast string append.

    Idiomatic Python code looks like this: lines = [] for idx, line in enumerate(sys.stdin): lines.append(line) if not idx % 1000: print ''.join(lines)

    WordPress is apparently stripping whitespace, even in code blocks, so you’ll have to get the indentation right yourself :P

    Comment by Anonymous — 2 October 2007 @ 09:30

  25. This only shows that you need to profile and now your libs. My profiling of Ruby 1.8.6 shows it to be 2 times slower than python using <<

    Python has StringIO or cStringIO for that kind of usage. Also you usually collect strings in an array and then join them via “”.join(array).

    Anyway, I like both languages, but ruby 1.8 definitly not for its speed :)

    Comment by Tom — 2 October 2007 @ 09:57

  26. This post has gotten a lot of traffic in comments. It’s too much for me to effectively address all of it here. As such, much of it will be addressed in new “top level” posts at SOB, rather than in comments.

    Comment by apotheon — 2 October 2007 @ 10:37

  27. FOR GETTING INDENTED PYTHON INTO COMMENTS TRY:

    http://paddy3118.blogspot.com/2007/10/indenting-python-for-blog-comments.html

    Comment by Paddy3118 — 2 October 2007 @ 11:07

  28. […] discussion in response to my I learn something new every day — this time about Python and Ruby has gotten a bit lengthy. 25 comments is too many to really address everything properly within […]

    Pingback by Chad Perrin: SOB » Python/Ruby string concatenation discussion — 2 October 2007 @ 11:51

  29. Paddy3118:

    . . . or try using Markdown syntax. Much easier, in my opinion, since all you need to do is attach four extra spaces at the beginning of each line of code. Of course, that only works here and in other places using a Markdown plugin.

    Comment by apotheon — 2 October 2007 @ 12:24

  30. Thanks apotheon for the tip on Markdown syntax.

    Somehow I missed the link to it in my original comment and now that I’ve read it,

    I can see where
        it might
            be useful!
    

    (so long as I indent by four spaces). :-)

    Comment by Paddy3118 — 2 October 2007 @ 02:02

  31. Paddy3118:

    Actually, you don’t have to indent by four spaces within your code — just add four spaces to the beginning of each line of code. For instance:

    this is indented by four spaces so it's rendered as code
      this is indented by six so it's rendered as code indented by two
    

    Comment by apotheon — 3 October 2007 @ 12:10

  32. […] few days ago, in I learn something new every day — this time about Python and Ruby, I posted on the subject of how a core language’s limitation to immutable strings can lead to […]

    Pingback by Chad Perrin: SOB » optimizing operations — 3 October 2007 @ 03:00

  33. for one thing… string concatenation in python using + is known to be slow. if you kept track of the lines in a list and then joined them all at the end of the loop, it would still be super fast. the point being that the problem is not that language x is better than language y or vice versa, it’s that your example is not a valid comparison. you have code that is almost syntactically identical, but each language has it’s “best” way to do certain things. the python example is certainly a good example of how NOT to do it.

    Comment by JMC — 4 October 2007 @ 10:30

  34. string concatenation in python using + is known to be slow.

    Yes, I’m aware of that. Also, there’s no string concatenation optimization in Python anyone has suggested, because of Python’s immutable strings.

    if you kept track of the lines in a list and then joined them all at the end of the loop, it would still be super fast.

    . . . which is not a string concatenation operation, per se, and thus immaterial to my point.

    the point being that the problem is not that language x is better than language y or vice versa

    Yes, that’s true — my point was never that one language is “better” than the other. You seem to have missed that, though.

    it’s that your example is not a valid comparison.

    Sure it is. I pointed out that string concatenation operations are optimizable in Ruby via a means that is disallowed in Python thanks to its immutable strings. How is that not valid? It’s not like I’m saying that you can’t get a larger string by attaching a bunch of smaller strings end-to-end with reasonable quickness in Python. I’m just saying there isn’t an optimized string concatenation operation in Python without reimplementing string concatenation yourself, and that the reason for this is a language design decision.

    you have code that is almost syntactically identical, but each language has it’s “best” way to do certain things.

    I chose almost syntactically identical code examples so the comparison would be understandable. I avoided language-specific idioms. I avoided work-arounds, because I’m not interested in trying to prove there are no work-arounds (since there are always work-arounds).

    the python example is certainly a good example of how NOT to do it.

    This is true. The Python example is a good example of how not to do it — because Python’s immutable strings make direct string concatenation operations a bad idea. That was my point. I was not saying Python is bad. I was not even saying it’s not as good as Ruby. I wasn’t even saying it can’t put together a long string from a bunch of smaller strings with relative quickness. If you think I was saying any of these things, that would explain your otherwise inexplicably contentious comment — but if that’s what you think, you’re wrong.

    Comment by apotheon — 4 October 2007 @ 11:04

  35. […] Read the comments on this post from Chad Perrin about Python and Ruby; these people are trying to optimize a trivial chunk of code. What is even more interesting is the point Chad makes over and over again in his responses (he posts as apotheon): Maybe the intention of the code was to produce that onscreen counter that is eliminated in most of the optimizations. Or maybe, despite it being slower due to the concatenation of immutable strings, there was a knock off effect of that as well. Who knows? The intentions cannot be made through code without having ridiculously verbose comments in the code or having an exacting product spec available. […]

    Pingback by Programming and Development mobile edition — 23 October 2007 @ 09:38

  36. Clearly, something must be wrong with my system, since Python is a lot faster than the Ruby code even after the optimization.

    time ./test.py < test.input

    real 0m1.356s user 0m1.184s sys 0m0.172s

    time ./test.rb < test.input

    real 0m3.048s user 0m2.956s sys 0m0.092s

    Important note: Perl is still faster than both of them. ;)

    #!/usr/bin/perl
    
    $s = '';
    $i = 0;
    while () {
        $s .= $_;
        ++$i;
        print "$i\n" if $i % 1000 == 0;
    }
    

    time ./test.pl < test.input

    real 0m1.092s user 0m0.936s sys 0m0.156s

    Comment by Nilson Santos Figueiredo Junior — 27 November 2007 @ 09:25

  37. That while loop should read: while (<STDIN>). The blog ate my code!

    Comment by Nilson Santos Figueiredo Junior — 27 November 2007 @ 09:27

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License