Chad Perrin: SOB

20 September 2007

PHP has given markup-embedded code a bad name.

Filed under: Geek,Profession — apotheon @ 11:03

PHP started life as a simple web templating system written in Perl, intended to be a way to more easily use Perl to generate web pages by combining markup templates with stored or dynamically generated content. Eventually, PHP struck out on its own, becoming a new language.

The key characteristics of PHP that helped it become a success on its own, separate from Perl, in the web development market were its web-templating-oriented core functions and the way it could be embedded in markup. The former made it very accessible to beginners in web programming, who didn’t have to load separate language libraries to perform common operations (the nl2br() function for formatting text is a handy example). The latter unified the presentation logic of a web template with the visual structure of the template for the simplest use cases, which made getting projects started and running smoothly an absolutely brainless task. In the absence of anything else with those particular characteristics, one must ask: How could it fail to catch on?

Fast-forward a few years. The people doing lightweight work with PHP and the people doing complex heavy-lifting coding for massively high-load “mission critical” systems have met in the middle. That sweet spot has given rise to newer ways to use languages like Perl, Python, and Ruby for web development, including ever-easier ways to write useful code quickly. Languages commonly associated with the enterprisey heavy-lifting coding employed by risk-averse multinational corporations have learned some lessons from those dynamic languages in the middle ground, adopting ever-more dynamic framework development capabilities. PHP has learned from them as well, eventually even spawning some kind of half-baked approximation of object-oriented programming capabilities (mostly useful only for making up for the lack of proper namespace support, but still an obvious attempt to make PHP more scalable).

PHP programmers and “real” programmers are overlapping a lot more these days. It has long since gotten to the point where a lot of PHP programmers are “real” programmers too — and PHP provides the capabilities you need to do “real” programming, even if it may not do it in quite the way you’d like. After all, if the Wikimedia Foundation can run one of the most popular websites in the world on a web application written in PHP (I’m talking about MediaWiki, of course), PHP isn’t quite the bush-league “Personal Home Page” templating tool that it once was.

That, though, is the problem. People keep pushing the boundaries of what’s sane to do with PHP. There are some really significant pieces of software being written in PHP out there — and because of that, PHP’s limitations have become all too clear to the programmers using it. I could speculate about why programmers are finding themselves in this position in the first place (and I have some pretty good ideas about that), but that’s a bit off-topic. The salient point is that some programmers are getting really frustrated with PHP. Whether they came from other languages to work on a growing project that just happened to be written in PHP, started with PHP and gradually grew more ambitious in terms of the software they wanted to write while learning enough about other languages to have an effective basis for comparison with PHP, or somehow ended up in that position by some other means, many programmers are becoming frustrated with the limitations of PHP — and they’re forming opinions about why PHP suffers those limitations.

In certain circles, it is very popular to hold the opinion that software architecture should be strictly divided into specific components based on easily defined parts of a sort of software workflow. With the growing popularity of MVC frameworks for web development, those circles are growing. In general, this is a good thing, but it can be taken too far. A lot of people — including those who have become enamored with probably the most hyped MVC framework of all time (Rails) — are looking at everything for ways to break things up into components.

Somewhere along the way, I think someone started looking at all the unmaintainable spaghetti code out there written in PHP and started looking for reasons that this might be the case: reasons specific to the language PHP. There are some reasons specific to the language, of course. There are technical reasons, like the fact that functions aren’t named according to any kind of well-structured plan and thus end up conforming to the naming convention of no naming conventions. There’s also an entrenched pathological aversion to breaking backward compatibility, to the extent that something that needs fixing simply won’t get fixed (but they’ll add a new, redundant version of a core function with a slightly different name so you can do it right while still avoiding breaking backward compatibility with the way people have been doing it wrong all these years). There are also social reasons, like the ease of picking up the language and making software that works (for some definition of the term “works”) without having any clue what you’re doing or why you shouldn’t be doing it that way. There’s even some evidence of a deeply rooted tradition of blaming the user when there’s something wrong with the language rather than fixing it, kinda like that whole Kingdom of Nouns thing with Java. None of these problems is exclusive to PHP, though, even if the particular combination PHP has collected is unusual.

People looking for technical problems with PHP that they can use as indicators of how to make things better in the future, how to derive principles of good programming from the mistakes around them, tend to want to find a single “You know what your problem is?” kind of issue with PHP to explain away all this unscalable code that someone tried to scale anyway. The single most visible characteristic of PHP that is, in practice, largely unique to PHP is the way it combines actual programming with markup. This provides plenty of opportunity for the Technique Du Jour — splitting everything up into components as much as possible, aka “modularizing” everything — to be applied as the panacea that PHP needs. Thus, the diagnosis is this: Program logic shouldn’t be combined with presentation structure.

The idea, then, is that your code should sit somewhere in the back, divorced entirely from the markup that the browser sees. You should have some library like Perl’s CGI.pm generating markup so you don’t have to. Make sure you separate logic from presentation such that never the twain shall meet again. To do otherwise is verboten and heretical. You don’t want to be a heretic, do you?

It’s all poppycock, though. The ability to embed code in markup is one of the things that made PHP strong in the first place. If you can’t embed code in markup, you can’t create intuitively clear templates that are easy to reason through and maintain. In the attempt to achieve that sort of maintainability of their templates while keeping the program logic and the markup separate, people are creating and using libraries designed specifically to allow web developers to write code that looks like markup. Seriously. I see it all the time, and it makes me cringe. The absurdity of seeing someone use a function that does nothing but output an opening and closing tag, with the content in the middle expressly fed to the function as an argument, drives me up the wall. What the hell good is it to pretend you’re “modularizing” your code when all you’re really doing is replacing <h1>Text Goes Here</h1> with something that looks like $foo->h1('Text Goes Here');? All that does is make your template harder to read.
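To make that complaint concrete, here is a minimal sketch in Ruby, using the ERB library that ships with Ruby’s standard library. The h1 helper and the two render method names are made up for illustration; they are not taken from any particular library. Both approaches produce the same markup, so the only thing the helper changes is how much the source looks like the page it generates.

```ruby
require 'erb'

# Hypothetical tag helper of the kind complained about above:
# it does nothing except wrap its argument in a tag.
def h1(text)
  "<h1>#{text}</h1>"
end

# Helper-call style: the page structure is hidden inside method calls.
def render_with_helper(heading)
  h1(heading)
end

# Markup-embedded style: the template reads like the page it produces.
def render_embedded(heading)
  ERB.new('<h1><%= heading %></h1>').result(binding)
end

puts render_with_helper('Text Goes Here')
puts render_embedded('Text Goes Here')
```

With a one-line heading the difference is small. It grows with every nested tag and attribute that has to be expressed as a method call instead of as markup.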

Markup-embedded coding is a good thing. Oh, sure, it can be overused or just misused and abused. Don’t do that. I trust the halfway-decent (or better) programmers I know to avoid doing that. Avoiding it like the plague because you take the concept of componentized coding too far is just making life difficult for yourself (and for the maintenance programmer who comes after you) for no good reason. The problem here is, I think, that saying “separate the presentation logic from the business logic” doesn’t guarantee someone listening to you won’t hear “separate the presentation from the program logic”. Someone who mishears it that way will then go off and try to make use of the Smarty templating system, making the whole process of web programming hopelessly “meta” in an attempt to divide all program logic from the presentation markup to “fix” the problems with writing huge, complex programs in PHP. They’ll utterly fail to realize the irony of inserting yet another templating language between PHP and the final product as part of an effort to “simplify” program design. Suddenly, you’ve got a templating language translating between markup and another templating language. The irony is that templating is exactly where PHP shines best (or at all, at any rate).
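The distinction between presentation logic and business logic can be sketched in a few lines of Ruby with the standard library’s ERB. Everything named here (recent_titles, render_list, LIST_TEMPLATE, and the layout of the post hashes) is hypothetical, invented only for this example. The business logic decides which posts to show and knows nothing about markup. The presentation logic is a loop embedded right in the markup, which is where it belongs.

```ruby
require 'erb'

# Business logic: pick the newest posts. No markup in sight.
# Assumes each post is a hash with :title and a sortable :date string.
def recent_titles(posts, limit)
  posts.sort_by { |post| post[:date] }
       .reverse
       .first(limit)
       .map { |post| post[:title] }
end

# Presentation logic: a loop embedded in the markup it controls.
# trim_mode '-' lets <%- and -%> swallow the whitespace around code lines.
LIST_TEMPLATE = ERB.new(<<~HTML, trim_mode: '-')
  <ul>
  <%- titles.each do |title| -%>
    <li><%= ERB::Util.html_escape(title) %></li>
  <%- end -%>
  </ul>
HTML

def render_list(titles)
  LIST_TEMPLATE.result_with_hash(titles: titles)
end

posts = [
  { title: 'Old post', date: '2007-01-01' },
  { title: 'New post', date: '2007-09-20' },
  { title: 'Mid post', date: '2007-05-05' }
]
puts render_list(recent_titles(posts, 2))
```

The template still contains code, but only code about how things look. Swapping the list for a table would not touch recent_titles, and changing the selection rule would not touch the template.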

So, in summary: It looks like, because of issues with PHP, there are a lot of people out there who are down on anything that looks like a Turing-complete markup-embedded templating language. Too many times, I’ve seen people blame the markup-embedded capabilities of PHP for the problems PHP has introduced into their lives, rather than realizing that the real problem is that they and others like them have tried to use PHP like C++ or Java.

This all occurred to me tonight because I was thinking about the sense of freedom I have as I become more familiar with eruby, a tiny little executable that gets deposited in the cgi-bin directory so that I can execute markup-embedded Ruby without having to do so via a framework like Rails if I don’t want to. I get that sense of freedom because, unlike a lot of people who think no program logic should ever directly touch presentation markup, I realize that what has really been frustrating me isn’t markup-embedded code — it’s the language I’ve had to use for markup-embedded code for all these years.

addendum: Just for the heck of it, let’s compare some ways to output a heading, with a variable containing the heading text.

  • Object-oriented Perl with CGI.pm

    use CGI;
    my $cgi = CGI->new;
    my $heading = 'Heading Text';
    print $cgi->h1($heading);

  • Embedded pseudo-Perl using Template Toolkit (added Fri Sep 21 10:54:26 thanks to Michael Peters)

    [% heading = 'Heading Text' %]
    <h1>[% heading %]</h1>

  • Embedded PHP

    <?php $heading = 'Heading Text' ?>
    <h1><?php echo $heading ?></h1>

  • Embedded Ruby using eRuby

    <% heading = 'Heading Text' %>
    <h1><%= heading %></h1>

I have my favorite. What’s yours?

All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License