Chad Perrin: SOB

14 January 2009

Smart Quotes Considered Harmful

Filed under: Cognition,Geek,RPG,Writing — apotheon @ 12:37

“Smart quotes” is a term often used to describe quotation marks that curve one way or the other depending on where they’re placed. They are also known as “educating quotes” and “curved quotes”. Specifically, when I use quotation marks as scare quotes around the words “smart quotes” (because I don’t think they’re very smart), the double-quote on the left should curve to the right, and the double-quote on the right should curve to the left, when using curved quotes.

This is the old typographical standard, and it’s perfectly reasonable when working with print media. When reading a novel, I expect to see curved quotes used, just as I expect to see curved apostrophes used. Technologies used to format text for print media should definitely include functionality that makes it easy to properly produce directional quotes.

The term “smart quotes” arose as a result of certain word processor programs using a function by that name to translate straight quotes to curved quotes. The idea is that the function is “smart” about curved quotation placement — and as the term became generally adopted to refer to curved quotes in colloquial use, the converse (straight quotes) has come, ironically, to be known by some as “dumb quotes”. The official term for the process of translating from straight quotes to curved quotes is “educating quotes”, however.

In the context of electronic communication where directional quotation marks are automatically selected by a function because the keyboard doesn’t supply directional quotes, I’ll refer to those directional quotes as “smart quotes”. The type of function used to produce smart quotes will be referred to as “educational quotes” or “educating quotes”, depending on context. Directional quotes in print media, on the other hand, are “directional quotes”, “book quotes”, or “curved quotes” when referring specifically to the curved variety.

The Problem:

With that in mind, smart quotes present a number of problems in electronic media:

  1. Certain character sets don’t support smart quotes at all — a real problem for ASCII compatibility across platforms.
  2. Certain character sets use different encodings for smart quotes, so that even if the platforms’ character sets support smart quotes, quotes may not show up properly across platforms.
  3. Educating quotes functions sometimes make errors — though this is a problem of diminishing frequency in common usage.
  4. Educating quotes functions are usually misapplied to code segments. WordPress, in fact, tends to apply its educating quotes function even to text inside <code></code> tags.
  5. It is pretty much impossible to create text search technology that automatically determines which way you want a quote to curve in (almost) all cases, so that text search capabilities are often broken in text with smart quotes in it. Try using the Ctrl-F text search in Firefox on a typical WordPress Weblog that uses smart quotes to find something with an apostrophe in it some time.
  6. Different languages actually use different types of directional quotes — not all use English-style curved quotes. For instance, some languages use smart quotes with the right-side double quote down at the comma level rather than up at the apostrophe level. Others use directional double-chevron quotes (guillemets). Basically everybody recognizes straight quotes, though, because of the de facto standard set by early computer use.
  7. Keyboard layouts assume straight quotes in many languages.
  8. Copying and pasting text from one electronic textual medium to another can often result in broken quote characters when smart quotes are used.
  9. As pointed out in a comment by medullaoblongata below, copy/paste problems with smart quotes can combine with code quote character issues to produce greater problems, as users who are not clearly aware of the problem can then end up entering smart quotes into an SQL database management client and causing errors. Beware of advocating for automatic translation of directional quotes to straight quotes, or other such hackery, with SQL database management clients, however — as this can serve to make input validation more tricky, and thus increase the likelihood of unintended SQL injection vulnerabilities.
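
Items 1, 2, and 8 can be reproduced in a few lines of Ruby. This is only a sketch, assuming UTF-8 text and a reader that wrongly assumes Windows-1252. The helper name misread_as_cp1252 is made up for the illustration.

```ruby
# Illustration only: how one curved quote fares across three encodings.
left_quote = "\u201C"   # LEFT DOUBLE QUOTATION MARK
apostrophe = "\u2019"   # RIGHT SINGLE QUOTATION MARK (the curved apostrophe)

# The same character is a different byte sequence in each encoding.
utf8_bytes   = left_quote.encode("UTF-8").bytes          # three bytes
cp1252_bytes = left_quote.encode("Windows-1252").bytes   # one byte

# ASCII has no curved quotes at all, so a strict conversion fails.
ascii_error =
  begin
    left_quote.encode("US-ASCII")
    nil
  rescue Encoding::UndefinedConversionError => e
    e.class
  end

# Hypothetical helper: reinterpret UTF-8 bytes as if they were Windows-1252,
# which is what a reader with the wrong encoding guess effectively does.
def misread_as_cp1252(text)
  text.dup.force_encoding("Windows-1252").encode("UTF-8")
end

garbled = misread_as_cp1252("don#{apostrophe}t")

puts utf8_bytes.inspect
puts cp1252_bytes.inspect
puts ascii_error.inspect
puts garbled
```

The straight apostrophe is a single byte with the same value in all three encodings, so none of these failure modes can touch it.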

If you’re using a keyboard layout with directional quotes of some sort, and you know for a fact that your intended readership will always enjoy full support for the kind of directional quotes you’re using, this obviously doesn’t apply to you. The point here is that the use of educational quote functions for electronic media — i.e., smart quotes — is harmful. It interferes with compatibility, portability, and readability, as well as correctness in many cases (e.g., code examples).
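
To make the correctness point concrete, here is a deliberately naive educating quotes pass in Ruby. It is not the algorithm used by WordPress, Word, or any other real program; the function name naive_educate and its single rule are assumptions chosen to show how a plausible heuristic goes wrong.

```ruby
# A deliberately naive "educating quotes" pass, for illustration only.
# Rule: a quote at the start of the text, or after whitespace or an opening
# bracket, is treated as an opening quote; every other quote is closing.
def naive_educate(text)
  text
    .gsub(/(\A|[\s(\[])"/) { "#{$1}\u201C" }   # opening double quotes
    .gsub(/"/, "\u201D")                        # remaining double quotes close
    .gsub(/(\A|[\s(\[])'/) { "#{$1}\u2018" }   # opening single quotes
    .gsub(/'/, "\u2019")                        # remaining singles and apostrophes
end

samples = [
  "She said \"hi\" and it's fine",   # ordinary prose: handled as intended
  "the '80s",                        # leading apostrophe: gets an opening quote
  "puts 'hello'"                     # code: string delimiters are replaced
]
samples.each { |sample| puts naive_educate(sample) }
```

The first sample comes out as intended. The second gets an opening single quote where an apostrophe belongs, and the third is no longer valid Ruby once its string delimiters have been curled. Real functions add more special cases, but each one is still a guess.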

Only in closed circuit communications where full support for directional quotes is known, or where the generated text will be circulated in print form, should smart quotes be used. On the Internet, however, and in documents otherwise intended to be distributed publicly or to people whose computing environment is unknown (such as the Web or sending English-language Word documents not intended for print to a group of people on different platforms), smart quotes should be considered harmful.

I’m personally sick and tired of text searches failing because I don’t have a curved apostrophe symbol on my QWERTY keyboard. Stop it.

The Solution:

As Fabien indicated in comments below, the solution to this is pretty obvious. Specifically, when dealing with electronic documents, meant to be read on a computer, for public distribution or otherwise without clear and certain knowledge of what’s on the recipient’s computer, the burden of quote translation should be on the recipient’s system — and not on the system that generates the documents for distribution. This way, those who prefer smart quotes for aesthetic reasons (or other reasons — I don’t discriminate) can have their smart quotes, and those who don’t or for some reason are not able to view them properly will not be burdened with documents that contain characters they cannot view as intended or do not care to view. Everybody gets to be happy, that way. This solution would even allow for text searches where the search string includes “straight” quotes, if desired, even while the screen displays curved quotes.
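
As a rough sketch of that last point, a viewer could compare text with every directional quote folded to its straight form, so that a straight-quote query still matches what is displayed. The names STRAIGHT_FOR, fold_quotes, and quote_insensitive_include? are hypothetical, the character list is deliberately short, and this is not a description of how any particular browser searches.

```ruby
# Hypothetical mapping from common directional quote characters to straight ones.
STRAIGHT_FOR = {
  "\u2018" => "'", "\u2019" => "'", "\u201A" => "'",   # single quotes, apostrophe
  "\u201C" => '"', "\u201D" => '"', "\u201E" => '"',   # double quotes
  "\u00AB" => '"', "\u00BB" => '"'                     # guillemets
}.freeze

QUOTE_PATTERN = Regexp.union(STRAIGHT_FOR.keys)

# Replace every directional quote with its straight equivalent.
def fold_quotes(text)
  text.gsub(QUOTE_PATTERN, STRAIGHT_FOR)
end

# Compare folded text, so a straight-quote query matches curved text on screen.
def quote_insensitive_include?(haystack, needle)
  fold_quotes(haystack).include?(fold_quotes(needle))
end

displayed = "I don\u2019t think they\u2019re very \u201Csmart\u201D"
puts displayed.include?("don't")                      # plain substring search: false
puts quote_insensitive_include?(displayed, "don't")   # folded comparison: true
```

If the document is distributed with straight quotes in the first place, the folding step is unnecessary: the viewer searches the raw text and curls quotes only when drawing them.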

So . . . for those of you distributing text documents for rendering to a screen, Smart Quotes Considered Harmful. Let the viewing software handle it. For those distributing in print form, as I already said above, go ahead and use directional quotes, because nobody’s going to be doing a select/cut/paste or browser-enabled text search on them (at current technological levels, anyway).

Note:

Please feel free to share, if you know of other reasons smart quotes are harmful.

12 Comments

  1. Oh whaaaaa. If you’d spent as much time thinking about how to deal with this issue as you did composing this rant, you’d realize it’s quite easy to work around this problem. Micro$oft has included “smart” quote capability in its word processing program for YEARS. Either get with the 21st century (computer) program, or sell your computer and buy an antique IBM Selectric.

    Oh, and FYI — smart quotes have been around a whole lot longer than word processing programs. They actually date back to hand-set type, back in the days when people actually cared how visually pleasing the stuff they read was.

    Comment by Surfergurl — 14 January 2009 @ 03:03

  2. If you spent a little time building your reading comprehension skills, perhaps you’d notice that what you’re saying doesn’t even address the bulk of the problems I cited for dealing with smart quotes.

    1. Yes, Microsoft Word has generated smart quotes for years. No, this does not solve problems like easy text searches.
    2. Yes, directional quotes have been around a lot longer than computers. No, this does not mean that there is no problem with using smart quotes for digital communication — especially when, as I pointed out above, “smart quotes” is essentially a digital document specific term (the print version is “directional quotes”, “curved quotes”, or even “typographic quotes”).
    3. “Visually pleasing” varies depending on context. For instance, serif fonts are easier to read on print media, while sans serif fonts are easier to read on electronic media.

    The fact you ignored the bulk of the issues I brought up doesn’t mean they don’t exist.

    Try again. Try harder.

    Comment by apotheon — 14 January 2009 @ 03:18

  3. i don’t think surfergurl read the post.

    Comment by Anon — 14 January 2009 @ 03:21

  4. I think you may be right about that.

    Comment by apotheon — 14 January 2009 @ 03:22

  5. Yes, I read the post. I don’t usually bother to respond to stuff I don’t read. Now that that’s out of the way…

    My original point is still the same: If your equipment, and the software you use, is up-to-date, then you shouldn’t be having anything but the occasional problem. If you can’t cope with an occasional problem, then you should eschew technology.

    Duh.

    P.S. There’s a HUGE difference between quote marks and apostrophes, typographically, and electronically, speaking.

    P.P.S. What in the hell makes you think sans serif typefaces are easier to read anywhere, let alone on a computer screen? Did you just make that up?? There have been studies done about this kind of stuff, and considered opinion among type experts is that serif typefaces are almost always easier to read, no matter the medium. (I say “almost always” because nobody wants to read more than a couple words at a time that are set in Bodoni Poster.) Actually, when people are presented with words flashed on a screen one at a time, reading comprehension goes through the roof. It matters not what typeface the words are set in. What does matter is consistency: the words should all be in the same typeface, same size, same amount of kerning between letters, same amount of distortion throughout all letters. But do you see anybody designing web pages with that in mind?? No. Because most people have never bothered to do a little research, relying on their own (often fallible) wisdom. Do some research: you’ll find that consistent line thicknesses and wide open spaces in the letters themselves and between letters makes for legibility.

    Comment by Surfergurl — 14 January 2009 @ 03:38

  6. Yes, I read the post. I don’t usually bother to respond to stuff I don’t read.

    I guess we’re back to “you need to learn some reading comprehension skills”, then.

    you shouldn’t be having anything but the occasional problem

    Thus my point — smart quotes are harmful. They cause problems.

    There’s a HUGE difference between quote marks and apostrophes, typographically, and electronically, speaking.

    Really? You don’t say . . .

    Educational quote functions typically deal with actual quote marks, apostrophes, and other punctuation characters. “Smart quotes” is a convenient term to refer to all these things under a single heading so that a list of all relevant punctuation characters isn’t necessary every time you wish to refer to the output of directional punctuation character redirection functions. Please try to reply to the points made, rather than making some unnecessarily pedantic attempt to derail the subject of discussion.

    What in the hell makes you think sans serif typefaces are easier to read anywhere, let alone on a computer screen?

    Studies conducted in the ’80s and ’90s, for one — starting around the time the monochrome color selection studies for readability and eye strain started winding down as 16-bit color became more widely available.

    Do some research

    I have, thanks.

    you’ll find that consistent line thicknesses and wide open spaces in the letters themselves and between letters makes for legibility.

    It is true that these factors contribute to readability. The fact they contribute does not prove that nothing else does, however. For instance, white-on-black increases readability for longer periods of reading (as compared with the opposite color scheme) due to decreased eye strain when dealing with typical CRT and LCD monitors, while black-on-white increases readability for print media. Note that this is a factor mostly unrelated to line thickness and wide open spaces in letters or between them.

    . . . and you’re still ignoring the bulk of the points brought up in the original SOB entry above. You claim to have read it, but you also appear to be perfectly willing to act as though you didn’t read it.

    Comment by apotheon — 14 January 2009 @ 03:56

  7. REAL BIG SIGH

    OK, whatever…

    1. Certain character sets don’t support smart quotes at all — a real problem for ASCII compatibility across platforms.

    Certain fonts don’t include smart quote characters in their designs. No problem. These fonts have single and double hash marks instead.

    2. Certain character sets use different encodings for smart quotes, so that even if the platforms’ character sets support smart quotes, quotes may not show up properly across platforms.

    Yep, you’re right. Because: In crap fonts there will be crap encoding. That’s one of the great truths of life. Big deal.

    3. Educating quotes functions sometimes make errors — though this is a problem of diminishing frequency in common usage.

    No, not true. What’s wrong about this statement is the “sometimes make errors” part. In such simple coding situations, there IS.NO.”SOMETIMES.” If that were true, then it would mean some devious coder is including “IF/THEN” situations just to piss off people like you.

    4. Educating quotes functions are usually misapplied to code segments. WordPress, in fact, tends to apply its educating quotes function even to text inside <code></code> tags.

    This may or may not be true. I don’t care. It’s up to you to lobby WordPress and other web sites that allow users to blog to re-write their code to make this all a non-issue. So, really, your beef is with coders, not bloggers.

    5. It is pretty much impossible to create text search technology that automatically determines which way you want a quote to curve in (almost) all cases, so that text search capabilities are often broken in text with smart quotes in it. Try using the Ctrl-F text search in Firefox on a typical WordPress Weblog that uses smart quotes to find something with an apostrophe in it some time.

    What’s the big deal here? If you want to search such text, then copy and paste it into a word processing program of your choice — but be sure to turn off the smart quote function before you do so. Either all quote marks will appear as hash marks or they’ll appear as odd character strings, which are easy to search and replace if your intent is to quote somebody’s writing. Remember, you’re only supposed to quote a little bit of somebody else’s writing, so if this is such a big problem for you, then you’re quoting too much and not giving credit where credit is due.

    6. Different languages actually use different types of directional quotes — not all use English-style curved quotes. For instance, some languages use smart quotes with the right-side double quote down at the comma level rather than up at the apostrophe level. Others use directional double-chevron quotes (guillemets). Basically everybody recognizes straight quotes, though, because of the de facto standard set by early computer use.

    What exactly is your point here? We’re all communicating in English here, aren’t we? This ain’t the same thing as the metric system controversy.

    7. Keyboard layouts assume straight quotes in many languages.

    Again…what’s your point?

    8. Copying and pasting text from one electronic textual medium to another can often result in broken quote characters when smart quotes are used.

    Be uber judicious when quoting someone else who has done all the work you couldn’t bring yourself to do. And if you can’t do that, then remember the old adage: Don’t bite the hand that feeds you.

    I have a Mac. It’s Micro$oft free. I have an occasional problem when I copy and paste text from an email that was generated on a PC. My way of coping with it is to utter a mild cuss word under my breath, copy and paste the text into my Mac word processor, do a search and replace, and go on with my life.

    You should too.

    Comment by Surfergurl — 14 January 2009 @ 04:29

  8. Certain fonts don’t include smart quote characters in their designs. No problem. These fonts have single and double hash marks instead.

    . . . which doesn’t help when you’re trying to read a document that has smart quotes in it. Then, you get something like the Unicode replacement character instead: �

    The problem isn’t only with being unable to type them easily, but being able to read them when delivered in an incompatible character set.

    Yep, you’re right. Because: In crap fonts there will be crap encoding. That’s one of the great truths of life. Big deal.

    So, basically, your answer is “Yes, there are problems, and in my day we all liked it that way. Get off my lawn!” I’m less than sympathetic.

    No, not true. What’s wrong about this statement is the “sometimes make errors” part. In such simple coding situations, there IS.NO.”SOMETIMES.” If that were true, then it would mean some devious coder is including “IF/THEN” situations just to piss off people like you.

    I don’t think you actually know how software works.

    Do you think these situations are only possible if code like the following is used?:

    # Rhetorical example: dwim() is an imaginary "Do What I Mean" function.
    if Time.new.to_s.match(/16/)
      puts '?'
    else
      puts dwim()
    end
    

    I don’t know about you, but I’ve never seen a universally applicable “Do What I Mean” function. Educational quote functions have to contain complex combinations of conditional code blocks that add up to what the programmer hopes will turn out to be a reasonably good guesser at what the user wants and expects. Sometimes, when software tries to guess what the user wants, it fails.

    This may or may not be true. I don’t care. It’s up to you to lobby WordPress and other web sites that allow users to blog to re-write their code to make this all a non-issue. So, really, your beef is with coders, not bloggers.

    My “beef” was never with “bloggers”, per se, anyway. It’s with people like you, who seem to think that curves are necessarily better than straight line segments, regardless of context or usage.

    It’s with heuristic attempts to solve a “problem” that doesn’t really need solving in a context where the consequences of error are worse than those of having done nothing in the first place, particularly in a manner that is difficult to undo. False positives are bad.

    What’s the big deal here? If you want to search such text, then copy and paste it into a word processing program of your choice

    Why should I have to install some bloated piece of crap word processor, wait the five minutes for it to load, and paste text into it, just so I can do a quick text search on a Web page? That’s especially asinine when I can probably find what I’m looking for much more quickly by scanning with the naked eye, though even that is slower and less convenient than using the browser’s built-in text search functionality.

    if your intent is to quote somebody’s writing

    What if that’s not my intent? What if my intent is, say, to find a typo in my own writing, in the edit dialog of a Web application, so I can correct it?

    Don’t offer a one-shot “This is what to do in that case!” and expect it to solve the world hunger problem, though. That’s just one example of other cases where this might be a problem. If you don’t have a solution that solves all such problems, you might as well keep your special-case solutions to yourself.

    Remember, you’re only supposed to quote a little bit of somebody else’s writing, so if this is such a big problem for you, then you’re quoting too much and not giving credit where credit is due.

    That’s an irrelevant, accusatory statement that adds nothing to the discussion. Please take your bilious attitude elsewhere.

    We’re all communicating in English here, aren’t we?

    In this discussion — yes. I’m talking about the general case. Please learn some reading comprehension skills.

    7. Keyboard layouts assume straight quotes in many languages.

    Again…what’s your point?

    I don’t even know how to answer such a clear demonstration of willful ignorance.

    Be uber judicious when quoting someone else who has done all the work you couldn’t bring yourself to do.

    Maybe you did read the whole thing — but you’re just incapable of making connections between related statements when they are separated by paragraph breaks. Abstract reasoning is one of the key indicators of human level intelligence, but many people do not apply such abstract reasoning skills very effectively, and many have difficulty applying them across domains. Might that be your problem?

    Maybe that’s not it. Maybe you’re just intentionally pretending every statement exists in a vacuum when you respond to it so you don’t have to do things like, y’know, offer reasonable counterarguments.

    I have an occasional problem when I copy and paste text from an email that was generated on a PC.

    Your tendency to occasionally undermine your own arguments is refreshing.

    My way of coping with it is to utter a mild cuss word under my breath, copy and paste the text into my Mac word processor, do a search and replace, and go on with my life.

    It’s easier to deal with if someone gives me a document using straight quotes.

    You should too.

    Saying it doesn’t make it so.

    Comment by apotheon — 14 January 2009 @ 04:57

  9. If the recipient finds smart quotes more visually pleasing, then the smart-quoting should be done at the recipient’s end, by the recipient’s reading software. That would please everyone, put less burden on the sender and the communication processes, and not break the “search” function, since the recipient’s reading software would receive the “raw” text.

    PS: While we’re talking about fonts: unfortunately, your website uses a weird font that’s very hard to read (at least on Firefox/Windows XP). I had to copy&paste to a text editor to be able to read the post. I definitely prefer (text-oriented) websites that don’t mess with fonts.

    Comment by Fabien — 14 January 2009 @ 06:00

  10. If the recipient finds smart quotes more visually pleasing, then the smart-quoting should be done at the recipient’s end . . .

    I agree with this statement, and your following explanation of how and why, 100%. Consider this Smart Quotes Considered Harmful to be a statement for the supply side of the relationship.

    your website uses a weird font that’s very hard to read

    The lineup of fonts used here, as displayed in CSS, is:

    font-family: 'Lucida Grande', 'Lucida Sans Unicode', Verdana, sans-serif;
    

    Lucida Grande, Lucida Sans Unicode, and Verdana are all common and highly readable fonts. The sans-serif part of that basically specifies that your browser should fall back to its configured default sans serif font when the others are not present. I can only think that, for some reason, your browser isn’t displaying one of those three specified fonts and has a poor-readability font set as its sans serif default.

    Please let me know if you have any more information on why your browser in particular might not be displaying fonts in a very readable form.

    I may consider removing all the specific font options from that list in the stylesheet, and just let all browsers choose — but in my experience a lot of people have browser font defaults set to things they don’t like, so I tend to prefer giving a couple of more-readable fonts before devolving to sans-serif in cases where they aren’t present.

    Comment by apotheon — 14 January 2009 @ 06:53

  11. I run into issues with smart quotes at work quite a bit and contrary to Surfergurl’s assertions, it can be a big pain in the butt.

    The software I support is built on top of an SQL database. Advanced features in the software allow users to enter an abbreviated form of SQL to accomplish various items. I often get clients asking me what code to enter and I will email them back the correct information.

    Well, here’s where those smart quotes start to wreak havoc. Microsoft Outlook (and I’m sure other email programs) will often automatically format any apostrophe or quotation mark into smart quotes. The client does a copy/paste from their email into the software. They think everything is all great because they received the code from a support technician. Well, SQL does not read smart quotes as the same thing as straight quotes so it errors out. To make matters worse, the software does not render smart quotes differently than straight quotes, so you can’t tell by looking that you’re using the wrong ones. It can cause quite a headache trying to figure out what went wrong when the code looks correct.

    To deal with this problem, I have three choices: 1. attach a text document with the code in it which the client will have to download from email, locate where they saved it on their computer, open up the text file, copy the text from the file, and then paste the code into the software; 2. instruct the client to open up Notepad, copy the text of the email into Notepad, then copy and paste from Notepad into the software; or 3. type the code from the email directly into the software instead of copying and pasting and hope the client doesn’t forget a period or a parenthesis. Each of those options has its drawbacks, especially since the people I support are not always the most savvy computer users, and it would be so much nicer if Microsoft didn’t insist on making everything pretty and converting everything to the way it thinks you’ll want it.
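
    A check along the following lines could at least make the invisible problem visible before pasted text is run. It is only a sketch in Ruby: the function name, the sample query, and the table and column names are all hypothetical, and it reports directional quotes rather than converting them, in keeping with the post's warning about automatic translation.

```ruby
# Hypothetical pre-flight check for pasted query text. It reports directional
# quote characters and leaves the text itself untouched.
DIRECTIONAL_QUOTES = /[\u2018\u2019\u201A\u201B\u201C\u201D\u201E\u201F]/

# Return [line_number, column, character] for every directional quote found.
def directional_quote_positions(snippet)
  positions = []
  snippet.each_line.with_index(1) do |line, line_no|
    line.each_char.with_index(1) do |char, col|
      positions << [line_no, col, char] if char.match?(DIRECTIONAL_QUOTES)
    end
  end
  positions
end

# Made-up example of text after an email client has curled its quotes.
pasted = "SELECT * FROM clients WHERE name = \u2018Smith\u2019"

directional_quote_positions(pasted).each do |line_no, col, char|
  puts format("line %d, column %d: U+%04X", line_no, col, char.ord)
end
```

    An empty result means the pasted text contains none of the listed characters; anything else tells the user exactly which characters to retype.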

    So, I agree that smart quotes are harmful and they annoy me a great deal.

    Comment by medullaoblongata — 14 January 2009 @ 07:22

  12. So how are you going to solve the issue that your smart-/educational-/whatever- quotes don’t match mine, and hence you can’t easily search my text? Because in Germany quotation marks are typically something like <>? (Those aren’t the exact characters used, but rather an example of what they look like, btw.)

    Comment by Ok, surfergurl — 15 January 2009 @ 08:59


All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License