Chad Perrin: SOB

10 January 2007

Debatable: Spaces Between Sentences

Filed under: Writing — apotheon @ 05:42

A “rule” of English that is, or has become, debatable is the number of spaces that should follow a full stop (aka period) before the beginning of a new sentence. The same applies to exclamation points and question marks, of course, but I shall refer to full stops (a less ambiguous term than period, less prone to conflation with other uses of the term) to mean all sentence-terminating punctuation, including any quotation marks that trail the full stop or other terminating character.

There are those who believe that there should always be two spaces between sentences. The older the person, the more likely this tends to be. There was a time when pretty much every school in the United States taught “two spaces after a period”. Since then, things have changed.

More recently, especially among people raised in an Internet culture, it has become common to regard a single space after a full stop as the correct way to handle it. While a great many people are probably influenced in large part by the fact that HTML (like other, similar markup languages) defaults to turning every run of whitespace in text into a single space character, this connection between electronic text and use of a single space between sentences predates the World Wide Web.
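To make that whitespace behavior concrete, here is a minimal Python sketch of roughly what a browser does to ordinary running text. The function name collapse_whitespace is my own invention for illustration, and this only approximates normal HTML rendering: it ignores things like preformatted blocks and CSS settings that change whitespace handling.

```python
import re

# HTML treats space, tab, line feed, form feed, and carriage return as
# whitespace; any run of them is displayed as one space in ordinary text.
_HTML_WHITESPACE_RUN = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text: str) -> str:
    """Approximate how a browser displays whitespace in normal text flow.

    Hypothetical helper for illustration only, not part of any library.
    """
    return _HTML_WHITESPACE_RUN.sub(" ", text)


typed = "First sentence.  Second sentence.\n    Third sentence."
print(collapse_whitespace(typed))
# prints: First sentence. Second sentence. Third sentence.
```

However many spaces are typed after the full stop, exactly one survives to the rendered page.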

Anecdotally speaking, it seems that if this tendency did not start with word processor applications, it must have started with something that was closely linked to word processor software. The more fully-featured word processors these days tend to “prefer” a single space, though they also usually allow setting a configuration option to “prefer” two spaces instead. For instance, I recall that Microsoft Word’s grammar checker will by default actually flag two spaces as incorrect, and its auto-correct will eliminate the “redundant” space, but it can be configured to treat a single space between sentences as incorrect instead. Many of us, educated in more classical formal rules of English, find this default behavior quite distracting and downright aggravating. Of course, as I said, this is an anecdotal observation: it’s possible the marriage of electronic text with single-space sentence separation actually started with something entirely different, disparate, and disconnected from word processors. That’s really just the first place I saw it, so it’s what seems like the origin of the phenomenon to me.

In contexts like markup languages, there is a very good excuse for the two space rule being dropped: it’s a technical limitation. One certainly does not want a markup language that doesn’t allow for arbitrary whitespace to be used to format the markup source. Such a state of affairs would lead to markup that is fairly unreadable and unmaintainable most of the time. Markup languages, like programming languages, should be designed for human readability at least as much as for computer readability, and allowing arbitrary whitespace to be used for source formatting is to a great extent an unavoidable component of human readability. Word processors like Microsoft Word, on the other hand, generally have no such excuse. It is certainly not a technical limitation: the abandonment of the second space is more common the more complex and featureful your word processor application. The simplest word processors just leave two spaces after a full stop unmolested, don’t care how many spaces you use in your document, and generally settle the matter by ignoring it, thus leaving the number of spaces between sentences up to the writer. On the back end, each space character you type is dutifully entered into the file source when you press the spacebar (in some cases, like late-’90s and early-2000s MS Word, documents are saved as binary files, while in others the “file source” is just glorified plain text).

There’s some question in the minds of many, however, whether it is actually desirable in a technical analysis of the problem to use two spaces.

In handwriting, where spacing is arbitrary and flexible, it tends to be the case that people space things out just a touch more between sentences within a paragraph than between words within a sentence. This is certainly not justification for a rule, but it may have been an influencing factor in the early history of using two spaces between sentences. More importantly, however, and more directly applicable, is the fact that conscious decisions were made in early typesetting to separate sentences by greater spaces than words. This was done to aid readability, both by enhancing the sense of conceptual separation between sentences and by providing something of a natural resting point for the eyes while scanning text. This is supported in practice by the fact that people who contend with one of the more prominent hurdles to ease of reading text — dyslexia — tend to prefer two spaces after a full stop. Again, anecdotally speaking, I find sentences separated by two spaces to be easier to parse as well. The only objections relating to readability to using two spaces that I’ve ever encountered are objections to the waste of paper (when printed) in long documents, issues with justified text alignment (addressed below), and complaints that it “looks funny” to people who are more accustomed to a single space.

In days of auld, typesetters began to regard their space characters as precious. One reason for this, of course, is the cost of paper: when publishing, it’s usually in your best interests (all else being equal) to use less paper because of the cost involved. Another, more technical, limitation was the fact that typesetting was originally done by arranging little pieces of metal into a tray, including blank pieces that served as spaces. In a given document page, one would often use hundreds or even thousands of space pieces (depending on a number of factors, including page size; think of the size of a single sheet from a newspaper when unfolded). Using two space pieces after every full stop could lead to running out of spaces. Two different solutions to this problem arose.

In one case, typesetters used half-spaces, leading to a 1.5-space separation between sentences. This became the early de facto rule of sentence separation, as it gave clear visual cues to the structure of sentences and their separation from one another. In the other case, and mostly later on, it became more common to use a single space just as between words within a sentence. Much of the publishing industry uses this convention now, and has for years. Not everyone in publishing does so, however.

With the invention of tools such as typewriters, the real rise of monospace fonting was underway. On these devices, it simply wasn’t reasonable to expect to cram tabs, full-spaces, half-spaces, em-dashes, en-dashes, hyphens, and a number of other variations on very similar-looking characters into a single keyboard. As a result, we have exactly one dash-like character and two space character keys on QWERTY keyboards: the hyphen/underscore key, spacebar, and tab key.

The fact that there was no 0.5-space key on the early typewriters led to use of two spaces as the new standard in text composition. Again, readability played a role — rather than rounding down from 1.5, they rounded up. It is still more common to see two spaces used with monospace fonts than otherwise, even when such monospace fonts are being used in electronically generated text, such as when authoring documents in a word processor application. Proportional fonts, however, reintroduced the ability to use lesser spacing between sentences without rounding down to a single full-size space.

Designers of word processors, however, are largely ignorant of the history and reasoning behind the two-space convention. It gradually became the case that laziness and minor technical limitations in certain edge-case circumstances have won out, and word processors now tend to default to using a single space between sentences when they are designed to pay any attention at all. Some are even moving toward using a markup language that benefits technically from rendering arbitrary whitespace in the source as a single space character — for instance, MS Word is moving toward using XML for document source rather than the previous binary file format.

A minor technical issue that comes to mind when considering the use of two spaces between sentences is the matter of justified text alignment. With justified text, your text lines up neatly at both the left and right margins, and whitespace within the line of text is “stretched” to fit. SOB itself is a good example of this, if you’re using a browser that properly renders XHTML/CSS text alignment: this very paragraph uses justified text alignment, and if you give it a good, long, careful look, I’m sure you’ll be able to recognize the space “stretching” that causes the margins to line up so neatly. If I were using two spaces between sentences here (by inserting explicit nonbreaking space characters between sentences to pad the sentences out a bit), you would see even greater “stretching” in some cases, which would give the text an unnatural and even downright annoying appearance. Because spaces do not have a set, inflexible width in justified text, using a specific number of spaces greater than one to separate sentences loses its readability advantage.
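For anyone who wants to try the nonbreaking space trick on text that is not justified, here is a rough Python sketch of the idea. The function name pad_sentence_gaps and the regular expression are my own assumptions for illustration, not anything SOB actually uses. It simply trusts the writer: wherever two or more spaces were typed after terminating punctuation, it swaps in one nonbreaking space entity followed by one ordinary space, so the gap survives rendering while the line can still wrap there.

```python
import re

# A sentence terminator (., !, or ?), any trailing quotation marks or closing
# parenthesis, then two or more literal spaces as typed by the writer.
_SENTENCE_GAP = re.compile(r"([.!?][\"'”’)]*) {2,}")


def pad_sentence_gaps(text: str) -> str:
    """Replace a typed double space after a sentence with '&nbsp; '.

    Hypothetical helper for illustration only. It does not try to tell real
    sentence endings from abbreviations; it relies on the typed double space.
    """
    return _SENTENCE_GAP.sub(r"\1&nbsp; ", text)


print(pad_sentence_gaps("One sentence.  Another one!  Done."))
# prints: One sentence.&nbsp; Another one!&nbsp; Done.
```

Single spaces are left alone, so text typed with one space between sentences passes through unchanged.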

Even with proportional fonts, though, when text is aligned right, left, or center, rather than justified, spaces do have a specific, inflexible width in every implementation I recall seeing. In such instances, the case for using two (or 1.5) spaces between sentences to improve readability is strong (as is the case for avoiding centered alignment of text for blocks of text more than one or two lines long, but that’s another matter entirely).

I, for one, type everything with two spaces between sentences. Luckily, the technical limitations of XHTML serve to discard the second space when I’m typing text that will use justified text alignment online. By far, for me, the preferable behavior would be to somehow contrive to use two, or at least 1.5, spaces unless text is justified. Your mileage may vary.

All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License