Chad Perrin: SOB

29 April 2006

Reply-To Munging Considered “Big Fat Hairy Deal”

Filed under: Geek — apotheon @ 04:59

In discussion on a Perl mailing list, the subject of munging the Reply-To: headers on a mailing list came up. There’s a fair bit of disagreement on the subject.

One side of the debate consists of people who don’t want their common-case email replies to the mailing list to involve different, and more surprising, behavior than all the rest of their common-case email replies. In other words, they want to be able to use the same behavior to reply to the list as when they reply to other lists and when they reply to individual emails. They want to be able to avoid having to use group reply functionality every single time they reply to a list message, and even if they used group reply they’d like to not have to delete stuff from the recipient list to keep from sending duplicate emails to some people.

The other side of the debate mostly just keeps referring to a Web-posted document titled “Reply-To” Munging Considered Harmful, and makes vague references to loss of functionality and “experts” wanting things to work the way they’re “intended” to work instead of according to some arbitrary standard.

I’m very sympathetic to the argument formula used to support the “don’t munge the Reply-To: header” side of things. It draws on all the right principles of how to handle technological functionality and usability. Unfortunately, after reading that linked document that undergirds all these arguments several times, and pondering the whole thing at great length, I find that it mostly just seems to be an empty facade. Let’s examine the arguments of the document in some detail:

The Principle of Minimal Munging

The “Principle of Minimal Munging” is a good rule that will keep you out of trouble. It says you should not make any changes to an email header unless you know precisely what you want to do, why you want to do it, and what it will affect. Unless you can articulate a clear reason for munging and understand the full consequences of the action, you should not do it.

So far, so good. My question: What the hell does that have to do with it? There’s a clear benefit to munging the Reply-To: header, and its effects will be discussed shortly.

It Adds Nothing

Reply-To munging does not benefit the user with a reasonable mailer. People want to munge Reply-To headers to make “reply back to the list” easy. But it already is easy. Reasonable mail programs have two separate “reply” commands: one that replies directly to the author of a message, and another that replies to the author plus all of the list recipients.

Of course this is true, as far as it goes. On the other hand, when I reply to the list, I want to reply to the list, including the originator of the message, in one swell foop. I don’t want to be spamming the poor guy’s inbox with multiple copies of the same message, nor do I want to spam any others to whom his email might have been sent via CC. I want to reply to the list and only the list. Is that so difficult to understand?

In point of fact, I use an MUA (Mail User Agent) called Mutt that has a list reply capability in addition to reply and group reply. I can use that to reply only to the list. That’s great! Bad news: I keep accidentally sending emails only to the originator, skipping the list entirely, because when I reply to emails I usually just use reply. Muscle memory and default behaviors have trained me that way, and it’s to my advantage to stay trained that way rather than training myself to try the list reply option first every time: of all the mailing lists and individual correspondents with whom I deal regularly, only this one list fails to provide the most natural behavior as standard via the Reply-To: header.

If you use a reasonable mailer, Reply-To munging does not provide any new functionality. It, in fact, decreases functionality. Reply-To munging destroys the “reply-to-author” capability. Munging makes this command act effectively the same as the “reply-to-group” function.

Poppycock! Utter balderdash! What is this sophistry?

What actually happens with Reply-To: munging is as follows:

The Reply-To: acts exactly as it always has, in that it defaults to replying to a single recipient address, except that the recipient address is for the list instead of the author — the list of which the author is presumably a member. For most interactive mailing lists, this is the most common case for an email reply. It is also the most commonly desirable manner of handling an email reply for an interactive mailing list, because such mailing lists are designed for sharing information, and not for encouraging people to avoid using them. The ability to email the original author is now relegated to second-class citizen, rather than emailing the list being the second-class citizen: just as without munging you can normally only reply to the author or the list and the author, so now with munging you can normally only reply to the list or the author and the list. No functionality is really removed. You’ve just altered the defaults.

Anyone that thinks replying only to the list is exactly equivalent to replying to the list and the author at the same time with the same text probably hasn’t been the recipient of multiple copies enough to “get it”.
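The swap of defaults described above can be sketched in a few lines of Perl. This is a simplified model and not how any particular mail client is implemented: the `reply_targets` helper, the hash layout for a message, and the example.org addresses are all assumptions made for illustration. It assumes a plain reply honours Reply-To when present, and that a group reply adds the From, To and Cc addresses.

```perl
use strict;
use warnings;

# Hypothetical helper: which addresses does each reply command target?
# A message is a hash ref of header name => value (To and Cc as array refs).
sub reply_targets {
    my ($msg, $me) = @_;

    # Plain reply: use Reply-To if the header is set, else fall back to From.
    my $reply = $msg->{'Reply-To'} // $msg->{From};

    # Group reply (simplified): the plain reply target plus From, To and Cc,
    # with our own address and any duplicates dropped.
    my %seen = ($me => 1);
    my @group = grep { !$seen{$_}++ }
        ($reply, $msg->{From}, @{ $msg->{To} || [] }, @{ $msg->{Cc} || [] });

    return { reply => [$reply], group => \@group };
}

# The same list message, first as sent, then with a munged Reply-To header.
my %unmunged = (From => 'author@example.org', To => ['list@example.org']);
my %munged   = (%unmunged, 'Reply-To' => 'list@example.org');

for my $case ([unmunged => \%unmunged], [munged => \%munged]) {
    my ($label, $msg) = @$case;
    my $targets = reply_targets($msg, 'me@example.org');
    printf "%-8s reply: %s | group reply: %s\n",
        $label,
        join(', ', @{ $targets->{reply} }),
        join(', ', @{ $targets->{group} });
}
```

In this model both cases reach the same set of people with a group reply; only the target of the plain reply changes, from the author to the list.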

Freedom of Choice

Some administrators justify Reply-To munging by saying, “All responses should go directly to the list anyway.” This is arrogant. You should allow me to decide exactly how I wish to respond to a message.

Uh-oh. We’re getting politically spurious now. The word “freedom” has been brought to the table.

The truth is, you can direct it to whoever you like just the same as you always could. You just have to edit the recipients in the less common case for a mailing list now, rather than for the most common case, to avoid a faux pas — and yes, it is a faux pas to send me two copies of every single friggin’ email. Are you afraid one of them is going to get wet in transmission and be illegible when it gets here?

Can’t Find My Way Back Home

Reply-To munging can make it impossible to reach the sender of a message.

. . . unless, as you might have gathered from the above, you’re not too lazy to edit the group reply recipients.

Coddling the Brain-Dead, Penalizing the Conscientious

There are, unfortunately, poorly implemented mail programs that lack separate reply-to-author and reply-to-group functions. A user saddled with such a brain-dead mailer can benefit from Reply-To munging. It makes it easier for him or her to send responses directly to the list. This change, however, penalizes the conscientious person that uses a reasonable mailer.

This is just more of the same spurious nonsense as the above. You’re punishing one group by making them type stuff so that another group doesn’t have to delete stuff — and, meanwhile, you’re punishing the second group by having to delete stuff anyway if they don’t want to be boorish jackasses that don’t care if someone’s list email traffic is doubled for no good reason. I thought all the old-school computer geeks were big on the light:heat ratio. The particular geek that created that document, along with his followers, apparently prefers heat over light — though I can’t imagine why.

Principle of Least Work

There’s a stupid table here. You can go read it yourself. The guy that created it was really reaching when he came up with this nonsense. It’s not even plausible enough to call it sophistry, unfortunately. If you really want to reply to everyone all the time, just hit the damned g key and don’t bother using the r key at all, on this munged-header mailing list. If you want to reply only to the author, you can always just use the g key and delete recipients you don’t want. Considering that with list traffic the vast majority of people will be replying to the list (only) the vast majority of the time, the work involved in the unmunged-header list is multiplied quite a bit by the number of times a day you may have to do the work, in addition to which there are more accidental missends that have to be dealt with and either a bunch of line noise in the form of duplicate emails or a bunch of extra effort involved in deleting a recipient basically every time you send a reply. The fact that the guy pretty much lied about the requirements for replies on a munged-header list to inflate the apparent effort cost is just icing on the cake after that.

Principle of Least Surprise

When I hit the “r” key in Elm, it sends a response to the author of a message. When you munge the Reply-To header you change this action so that it does something entirely different from what I expect. This creates specialized behavior for your mailing list, which increases the potential for surprise. I’m not schooled in the science of human factors, but I suspect surprise is not an element of a robust user interface. Private messages frequently are broadcast across lists that do Reply-To munging. That’s an empirical fact. It’s what happens when you violate the principle of least surprise.

That may have been more commonly true in 1985, but things have changed. Munging the Reply-To: header for mailing lists is the default by now. In the last decade, I might have run across one other mailing list that didn’t munge the Reply-To: header, and in that case it was an oversight that got corrected when people complained. There are apparently three Elm users in the world who share his opinion, and the other six billion of us (including a bunch of Elm users) find that his approach is the more surprising.

Saying something is unsurprising doesn’t make it so.

Principle of Least Damage

Consider the damage when things go awry. If you do not munge the Reply-To header and a list subscriber accidentally sends a response via private email instead of to the list, he or she has to follow up with a message that says, “Ooops! I meant to send that to the list. Could you please forward a copy for me.” That’s a hassle, and it happens from time to time. What happens, however, when a person mistakenly broadcasts a private message to the entire list?

This is the first genuinely relevant concern in the entire damned thing. It ignores the damage done by sucking up others’ time by making them process twice as much mail, though, and assumes a far greater incidence of accidentally list-sent private emails than seems to actually occur. I’ve seen about two such instances in a decade of mailing lists, sometimes with hundreds of emails crossing the lists every day, all the lists using a munged Reply-To: header. In one day on this unmunged-header list, there were six missent messages of which I was aware. I have no idea how many simply went unremarked.

Six. On a list that got about thirty messages that day. One in five. At a rate of, say, a hundred a day for a decade, that would be roughly 73,000 missent messages. Now, add to the frustration pool the fact that probably at minimum 50% of the messages would have been sent by lazy bastards like the guy that wrote that document, who don’t care about the fact that they’re halving the light:heat efficiency of the list, and that the other 50% of messages are sent by users who then have had to go through the contortions of deleting recipients when doing a group-reply. Tell me what the comparative damage tally is now.
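The projection can be checked with a short Perl sketch. It uses only the figures quoted in the paragraph above (six missent messages out of about thirty, and a hundred messages a day for a decade); the `projected_missent` helper is a hypothetical name, and treating a decade as ten years of 365 days is an assumption.

```perl
use strict;
use warnings;

# Scale an observed missend rate up to a longer period.
# Multiplying before dividing keeps the arithmetic in whole numbers here.
sub projected_missent {
    my ($missent, $sample_size, $per_day, $days) = @_;
    return $missent * $per_day * $days / $sample_size;
}

my $missent_seen  = 6;          # missent messages noticed in one day
my $messages_seen = 30;         # approximate list traffic that day
my $per_day       = 100;        # "say, a hundred a day"
my $days          = 10 * 365;   # a decade, ignoring leap days

printf "missend rate: 1 in %d\n", $messages_seen / $missent_seen;
printf "projected missent messages: %d\n",
    projected_missent($missent_seen, $messages_seen, $per_day, $days);
```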

And in the End…

If you are not convinced yet, then allow me one final plea. I contribute to the Elm mailer development team. I get to see a lot of the wants and requests from the user community. Guess what feature more and more people are asking for? A third reply command — one that ignores any existing Reply-To header!

Okay. Now tell us what it does reply to! Are you talking about list reply capability like Mutt has? I already mentioned I use Mutt, which has this functionality. My take is that it’s kind of a pain in the butt to have to remember a completely different set of behaviors for list traffic from what I use for individual emails. All it does is make the notion of a group reply less rude: I still have to remember to use it, which I don’t always do.

If this doesn’t refer to a list reply feature, I can only imagine it must be a means of replying with an empty recipient field. I could see that being useful maybe once every century, on average.

Addendum

In case you are wondering, yes, I once thought Reply-To munging was a nifty idea. I got better though. When I started running email lists, I munged ’em all. One day I accidentally sent a private, personal reply out over one of my own damn lists. If the list owner can’t remember how to use the list properly, no way will the subscribers be able to sort it out. I stopped munging the very next day. On the whole, it has worked out quite well. Yes, on occasion somebody mistakenly responds directly to the author of a message when they wanted to reply to the group.

That’s asinine. You simply need to realize that you’re responding to a list message. You should be thinking about your audience and the content of your messages all the time anyway. Don’t punish me because you’re careless and had one individual bad experience. Poor thing. I’m suffering (slightly less) bad experiences on a daily basis because people actually think your arguments hold water.

I love the way you define “on occasion” as “several times a day”, and you’re so terribly concerned about the plague of public slip-ups that consist of “several times a decade”. I think you wore your priorities backwards today.

26 April 2006

OpenDocument Format (ODF) vs. MS OpenXML Format (OpenXML)

Filed under: Geek,Liberty — apotheon @ 07:47

I’ve decided to cobble together a sort of Frankenstein’s Monster entry from a number of different comments I’ve made elsewhere in debates relating to ODF vs. OpenXML file formats. I’ve expanded upon the original phrasing a little, and I’ve tried to clarify my statements and suit them to this venue, but otherwise it’s pretty much just plagiarizing myself. Err, I mean it’s referencing my own work, as academics often do the works of their colleagues.

First off, a disclaimer to hopefully dissuade cries of mere anti-Microsoft bias, or “bashing”, or “zealotry”, or whatever: If Microsoft gets a standard approved and then lets it go so that it stays “standard”, I don’t much care that Microsoft developed it. In fact, I’ll applaud Microsoft. What I don’t want to see happening is Microsoft getting a document format approved as a “standard”, then playing silly buggers with it as it has with HTML, CSS, C++, JavaScript, and basically everything else on which it has gotten its grubby mitts.

Allow me to summarize my position, clarify the above, and repeat a bit:

I’d be happy to use a document format that was initially designed by anyone, Microsoft included, as long as it becomes a truly open, truly standardized format with clear and public documentation so that everyone can use it, and as long as Microsoft doesn’t sabotage standards compliant adoption of the format by producing software that misuses it.

Until a week or so ago, I didn’t know nearly enough about the technical aspects of Microsoft’s proposed format, or even of ODF, to comment meaningfully on which is technically better, and my technical knowledge of the formats still has big holes in it, so I try to avoid speaking outside the range of what I actually know. I’d prefer the technically better format, whichever it is, if that were the only concern. Unfortunately, it’s far from the only concern. Even better would be both formats supported by all major office suites. I won’t use a format that introduces significant security or stability issues, or that isn’t open and free, except when absolutely necessary (and even then, only under protest), but security and stability don’t seem to be even a tertiary concern here, let alone a primary one.

I’ve been doing some research. Here’s what I found:

  1. The ODF is necessarily a bit more resource hungry because it is more comprehensive than OpenXML. One can argue for either side of that — comprehensiveness and resource efficiency both have their positive points. I find it notable, however, that Microsoft’s sole point of argument here is in direct contradiction with other common Microsoft-sympathizer arguments. Specifically, I often hear the complaint that the reduced resource footprint (and I do mean dramatically reduced) of some piece of open source software as compared with its Microsoft-stack analog is functionally irrelevant because of the increased performance of hardware. See debates about Vista vs. Linux on the basis of resource-hungry operation and hardware requirements for examples. That being the case, one must wonder why there is suddenly such a distinct reversal of argument here, with the (marginally) reduced resource footprint of OpenXML as compared with ODF becoming Microsoft’s rallying cry. Yes, it’s marginal: the “100 times as much” resource footprint of ODF cited in some arguments is as compared with Microsoft’s binary formats, under very specific conditions, using very specific test conditions narrowly defined, which mixes application resource usage with document resource requirements liberally. It does not compare ODF with its XML-based text data formats, which show the above-mentioned marginal resource usage advantage.
  2. While Microsoft has signed a covenant of nonlitigation, this doesn’t actually open the format at all. It only opens the implementation of it. While this might at first glance appear to be nitpicking, it’s worth noting that Microsoft could easily pull the old bait-and-switch as it historically, and consistently, has with almost all its technologies. All it needs to do is get the standard approved, convince everyone that it’s “just as open as” (or even “more open than”) ODF, get it widely implemented to the extent that market dominance is maintained for its office software, then change the format specification for its next office suite release (or the next service pack, for that matter) without telling anyone until the day the new implementation hits the market. This artificially creates and reinforces a technical advantage by turning the document format upon which the industry standardizes into a moving target. The market dominance practices of Microsoft in this regard are clear and well demonstrated by Microsoft’s intention of supporting its own “open” format without also supporting the competing ODF, while its competition does everything in its power to support Microsoft’s formats alongside native and open formats. So long as Microsoft retains the ability to unilaterally and at its sole discretion alter the format specification (even if it must get “approval” of the new format each time, though the notion that it must do so is a dubious one at best), its format is not truly open, due to a conflict of interest for the sole specification-maintaining party.
  3. Aside from performance concerns, the sole technical benefit of OpenXML is the more inclusive ability of it to incorporate additional custom-designed schemas, both in loosely and tightly coupled manners. Despite propaganda to the contrary, ODF is capable of easily incorporating custom schemas, primarily by way of embedded ability to support W3C-standard XForms. XForms support is intentionally constrained in its ability to support custom schemas, as compared with OpenXML’s support for custom schemas, for the purpose of obviating the detrimental aspects of custom schema definition and inclusion that are endemic to OpenXML’s specification. The primary reason this less constrained implementation of custom schema inclusion is considered undesirable is the fact that it fosters creation of nonportable documents: while conforming to the OpenXML document specification, they would include nonportable scripting and data formatting. Of particular concern here is the fact that this would create increased opportunity for Microsoft Office to be designed by Microsoft to leverage an “open” format for the purposes of producing nonportable documents, again to promote and maintain market dominance.
  4. Microsoft’s substantive excuse for preferring OpenXML is centered around making a document format backward compatible, when backward compatibility with closed formats while designing a new, supposedly “open”, document format should be confined to making an application backward compatible. Making a document format backward compatible with other (primarily binary) proprietary document formats is actually counterproductive to the purposes of designing and adopting an open document format standard. Rather than making the documents backward compatible (specifically with previous Microsoft document formats, ignoring other older document formats), make your new application that supports the new document format backward compatible so that it can translate freely between the two document formats. This solves the problem for the user and it provides encouragement for the real purpose of the open document format: moving documents, both old and new, to a format that makes better sense in terms of both portability and accessibility. In any case, there’s certainly nothing preventing Microsoft from implementing both ODF and OpenXML, one for widespread compatibility and the other for backward document format compatibility, other than Microsoft’s own intention of freezing out competitors through anticompetitive practices. Additionally, Microsoft’s history of ignoring document format compatibility between versions of its own applications, and providing only rudimentary and temporary application support for earlier formats, strikes me as a pretty clear indicator of its true intent: to manufacture excuses for trying to ensure sole control by Microsoft of the widely adopted “open” document format of the future.

According to Wikipedia, the Danish government’s definition of an “open standard” (and the Danish definition is accepted EU-wide as the minimum set of requirements to qualify as an “open standard”) is as follows:

  • The costs for the use of the standard are low.
  • The standard has been published.
  • The standard is adopted on the basis of an open decision-making procedure.
  • The intellectual property rights to the standard are vested in a not-for-profit organisation, which operates a completely free access policy.
  • There are no constraints on the re-use of the standard.

OpenXML basically fails, at least in part, on all but two of those points, and it has the potential to eventually fail on one or both of those exceptions.

Also according to Wikipedia:

“The primary goal of open formats is to guarantee long-term access to data without current or future uncertainty with regard to legal rights or technical specification.”

. . . and . . .

“A common secondary goal of open formats is to enable competition, instead of allowing a vendor’s control over a proprietary format to inhibit use of competing products.”

For an example of the benefits of open formats:

HTML, and its successor XHTML, are open standards. You can readily see what damage has been done to interoperability by Microsoft’s domination of the web browser market in the fact that there are a lot of websites that have (thanks to MS’s anticompetitive practices, leveraging market dominance to increase market dominance) been coded specifically to Internet Explorer’s quirks so that other browsers are “shut out”. Imagine for a moment how much worse it would be if there were no (X)HTML standard. As things currently stand, Microsoft’s stubborn use of a bastardized markup implementation in IE is finally being challenged, in large part because it is in egregious violation of standards.

As a result, we are seeing increased competition in the browser niche of the application market years after IE had pretty much sewn up that niche by destroying Netscape’s ability to compete. That competition is not only resulting in the advancement of browser technology in Microsoft’s competitors, but is also forcing Microsoft to try to keep up with the Joneses by improving upon IE and related software after years of technological stagnation. OneCare, the Windows Firewall, inclusion of tabs and other advanced browser features in IE7, the ability to turn off ActiveX capabilities, addition of granular control over script execution in the browser, sandboxing, and many other security “improvements” (or at least attempts at the appearance thereof) can almost directly be attributed, at least in part, to competition in the web browser market niche.

If there were not an open markup standard from which Microsoft couldn’t just deviate completely without incurring some negative consequences, Microsoft’s “HTML” would be something else entirely (unrecognizable as HTML) that nobody else would be allowed to use, XHTML would never have been invented, and competition in that niche and other, related niches would be nothing more than a fond memory.

Additionally, a couple of terms to keep in mind as reasons to avoid proprietary formats:

  • “vendor lock-in”
  • “embrace, extend, extinguish”

Someone recently asked for a list of reasons for preferring open formats for documents over closed/proprietary formats. Part of the problem with answering that question is that it is asking for a list (by which I think is meant “a list of very short statements about advantages to an open standard”), when lengthy explanations are needed above and beyond mere bullet point items to get the point across. I took a whack at it anyway. Don’t blame me if the reasons for some of these advantages are not immediately obvious within the context of the list, though. In addition, some of the list items overlap others because of the fact that I tried to address much of the explanatory necessities of trying to get the point across, which required looking at different angles of the same issues.

For fun, I presented this list within a simple Perl script that can be run on a unix-like system to spit out randomly generated selections from the list when the program is called from the command line. I’m not entirely familiar with use of Perl on a Windows machine, but this script should be 100% portable by simply replacing the shebang line of the script with whatever Windows needs in its place.

  #!/usr/bin/perl
  use strict;
  use warnings;

  srand;

  my @reason = (
  'Open formats eliminate legal restrictions on implementation.',
  'Open formats ensure the full specification is available to implementors.',
  'Open formats are far more difficult to leverage for anticompetitive practices.',
  'Open formats do not lock organizations into reliance on a specific vendor.',
  'Open formats provide greater ease of access over wide distributions of data to varying populations.',
  'Open formats development tends to be pressured by common needs rather than marketability concerns.',
  'Open formats do not change for no reason other than driving new application version uptake.',
  'Open formats do not tend to involve sneaky ways to slip proprietary data formats into them.',
  'Open formats foster inter-application compatibility.',
  'Open formats do not allow imposition of royalty fees on implementors.',
  'Open formats are more conducive to third-party software innovation.'
  );

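  # Pick one reason at random: rand(@reason) gives a fractional index
  # below the list length, which Perl truncates to a whole number.
  print $reason[rand @reason], "\n";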

From what I’ve seen thus far, it looks like both OpenXML and ODF specify a system of interrelated, modularized XML files to define a single document. Both, to some extent, allow for these to be combined into a single XML file for an alternative document saving format, or at least a drastic reduction in that modularization. When saved as a collection of interrelated files, however, they are then compressed using Zip-compatible compression. This bothers me, not only because the Zip algorithm is proprietary (though apparently free of implementation encumbrances), but also because the saved document format is no longer human-readable. Both format specifications claim human readability by pointing out the fact that they’re XML documents (and complex XML’s human readability is suspicious anyway), but ignore the fact that by storing the files in Zip archives they are rendered in a binary compressed format that requires translation to a human-readable form. In this respect, both document formats fall down. I am, understandably, disappointed. What the hell were the ODF people thinking? Microsoft, of course, doesn’t really give a fig — they’d rather make it as human-unreadable as possible, to improve on the vendor lock-in characteristics of MS file formats — but this seems antithetical to the aims of ODF.
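The readability point can be illustrated with a small Perl sketch. It assumes the IO::Compress::Zip and IO::Uncompress::Unzip modules that ship with Perl are available, and the XML snippet is made up for the example; real ODF or OpenXML packages contain several member files, not just one. The sketch packs some markup into an in-memory Zip archive, checks whether the markup is still visible in the raw bytes, and then unpacks it again.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw(zip $ZipError);
use IO::Uncompress::Unzip qw(unzip $UnzipError);

# Made-up stand-in for one XML member of a document package.
my $xml = '<?xml version="1.0"?><text>' . ('Hello, world. ' x 20) . '</text>';

# Pack the XML into an in-memory Zip archive with a single member.
my $archive;
zip(\$xml => \$archive, Name => 'content.xml')
    or die "zip failed: $ZipError\n";

# Look for the markup in the raw archive bytes, as a text editor would see them.
my $visible = index($archive, '<text>') >= 0 ? 'yes' : 'no';
print "markup readable in raw archive: $visible\n";

# Unpack the first member again to get the readable XML back.
my $restored;
unzip(\$archive => \$restored)
    or die "unzip failed: $UnzipError\n";
print "round trip intact: ", ($restored eq $xml ? 'yes' : 'no'), "\n";
```

The XML is fully recoverable, but only by way of an unpacking step, which is the translation the paragraph above objects to.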

Anyhow, there it is.

a bit about the blog spam situation

Filed under: Metalog — apotheon @ 06:59

I started getting a lot of blog spam a little while ago. I started using the automatic post moderation feature that sends posts into moderation if they contain too many links. This worked for a little while, though I still had to delete the posts from the moderation queue myself. Better that than getting false positives and never knowing it, I reasoned.

I started getting hit by blog spammers who included only one link in the body of the comment, or no links at all, and only used the ability to use a URL to make a link of the poster’s name to create links to whatever they were spamming. I of course was somewhat troubled by this, and after a while of deleting several a day I decided to change things.
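The link-count rule, and the way these comments slipped past it, can be sketched in Perl as follows. This is not WordPress's actual implementation: the threshold of two links, the helper names, and the comment layout (a `body` plus an `author_url`) are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical limit: the post does not say how many links triggered moderation.
my $MAX_BODY_LINKS = 2;

# Count http:// and https:// URLs in a piece of text.
sub count_links {
    my ($text) = @_;
    my $count = () = $text =~ m{https?://}gi;
    return $count;
}

# Link-count rule: hold a comment only when its body has too many links.
# The author URL is never inspected, which is the gap described above.
sub held_for_moderation {
    my ($comment) = @_;
    return count_links($comment->{body}) > $MAX_BODY_LINKS ? 1 : 0;
}

my @comments = (
    { author_url => '', body => 'Nice post.' },
    { author_url => '', body => join ' ', map { "http://spam.example/$_" } 1 .. 5 },
    { author_url => 'http://spam.example/pills', body => 'Great article, thanks!' },
);

for my $comment (@comments) {
    printf "%-4s %s\n",
        held_for_moderation($comment) ? 'HOLD' : 'PASS',
        substr($comment->{body}, 0, 40);
}
```

The third comment carries its only link in the author URL, so a rule that counts links in the body lets it straight through.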

I instituted a policy here at SOB where anyone that wanted to post a comment needed to register. Thus far, I haven’t gotten any blog spam at all, and the amount of discussion that my posts generate doesn’t seem particularly reduced by this. Of course, a problem here is that potential legitimate discussion that doesn’t happen is never noticed: I don’t know if someone refuses to comment due to the mandatory registration. I’m considering removing the necessity of an email address when someone registers, because I know that might be a barrier to casual commenting even when it isn’t spam, but I’m also hesitant to open the door even that much to spammers.

It was suggested by a reader known here as Alex that I should use Akismet and the WordPress Spam Image plugin, in his comments to my entry about requiring registration. I’ve looked at Akismet, and it appears to be a heuristic spam filter based on spam example blacklists, which if well-executed would be an excellent approach to the matter. I’ve chosen to eschew it for now, however, for reasons not easily articulated. Perhaps I’ll revisit this later. The Spam Image plugin looks easy to use and probably reasonably effective, though as long as I require registration I’m not sure it’s actually necessary. I’ll have to think about it.

Speaking of the Spam Image plugin, there’s something similar being used over at Chip’s Quips, another weblog I make an effort to follow. He always has interesting stuff to say, and has said quite a bit about blog spam. In particular, he seems disappointed with the performance of WordPress in blocking spam based on my own reports in comments to his weblog, but he probably shouldn’t be: I’ve done almost nothing about stemming the tide until I started requiring user registration to post comments, and I’ve done nothing since. I haven’t actually used any of the more advanced anti-spam technologies available to WordPress users, and thus really don’t have anything to say about them, positive or negative, except as a visitor who compares what he sees on others’ weblogs. What I see is this:

The image plugin being used to keep spammers out of Chip’s Quips is awful. Half the time, the characters you have to enter aren’t even legible to the visitor, let alone to a spambot. I tried to comment on this a couple times, and couldn’t get through the spam filtering to post the commentary, so I gave up. Sorry, Sterling: I’d rather have told you in a less public way than this, but I can’t get through. It’s like those child-proof caps that some manufacturers use that are so effective they even keep the adults out. This is the flip side of the “false positives” coin, and something I really would like to avoid: I don’t want to make it difficult for people to post legitimate commentary. I’m going to try to alert him to this post with a comment to one of his, but I don’t know if it will get through. I recommend checking out his posts on the matter of blog spam, in any case, which are basically all linked-to through this entry about ham, and jam, and spam.


All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License