Chad Perrin: SOB

20 September 2007

PHP has given markup-embedded code a bad name.

Filed under: Geek,Profession — apotheon @ 11:03

PHP started life as a simple web templating system written in Perl, intended to be a way to more easily use Perl to generate web pages by combining markup templates with stored or dynamically generated content. Eventually, PHP struck out on its own, becoming a new language.

The key characteristics of PHP that helped it succeed on its own, separate from Perl, in the web development market were its web-templating-oriented core functions and the way it could be embedded in markup. The former made it very accessible to web programming beginners, who didn’t have to use separate language libraries to perform common operations; the nl2br() function for formatting text is a commonly useful example. The latter unified the presentation logic of a web template with the visual structure of the template for the simplest use cases, which made getting projects started and running smoothly an absolutely brainless task. In the absence of anything else with those particular characteristics, one must ask: How could it fail to catch on?

Fast-forward a few years. The people doing lightweight work with PHP and the people doing complex heavy-lifting coding for massively high-load “mission critical” systems have met in the middle. There’s a sweet spot in the middle that has given rise to newer ways to use languages like Perl, Python, and Ruby for web development, including ever-easier ways to write useful code quickly. Languages commonly associated with that enterprisey heavy-lifting coding employed by multinational risk-averse corporations have learned some lessons from those dynamic languages in the middle ground, adopting ever-more dynamic framework development capabilities. PHP has learned from them as well, eventually even spawning some kind of half-baked approximation of object oriented programming capabilities — mostly useful only for making up for the lack of proper namespace support, but still an obvious attempt to make PHP more scalable.

PHP programmers and “real” programmers are overlapping a lot more these days. It has long since gotten to the point where a lot of PHP programmers are “real” programmers too — and PHP provides the capabilities you need to do “real” programming, even if it may not do it in quite the way you’d like. After all, if the Wikimedia Foundation can run one of the most popular websites in the world on a web application written in PHP (I’m talking about MediaWiki, of course), PHP isn’t quite the bush-league “Personal Home Page” templating tool that it once was.

That, though, is the problem. People keep pushing the boundaries of what’s sane to do with PHP. There are some really significant pieces of software being written in PHP out there — and because of that, PHP’s limitations have become all too clear to the programmers using it. I could speculate about why programmers are finding themselves in this position in the first place (and I have some pretty good ideas about that), but that’s a bit off-topic. The salient point is that some programmers are getting really frustrated with PHP. Whether they came from other languages to work on a growing project that just happened to be written in PHP, started with PHP and gradually grew more ambitious in terms of the software they wanted to write while learning enough about other languages to have an effective basis for comparison with PHP, or somehow ended up in that position by some other means, many programmers are becoming frustrated with the limitations of PHP — and they’re forming opinions about why PHP suffers those limitations.

In certain circles, it is very popular to hold the opinion that software architecture should be strictly divided into specific components based on easily defined parts of a sort of software workflow. With the growing popularity of MVC frameworks for web development, those circles are growing. In general, this is a good thing, but it can be taken too far. A lot of people — including those who have become enamored with probably the most hyped MVC framework of all time (Rails) — are looking at everything for ways to break things up into components.

Somewhere along the way, I think someone looked at all the unmaintainable spaghetti code out there written in PHP and started looking for reasons that this might be the case, reasons specific to the PHP language. There are some technical reasons specific to the language, of course. Functions aren’t named according to any kind of well-structured plan, and thus end up conforming to the naming convention of no naming conventions. There’s also an entrenched pathological aversion to breaking backward compatibility, to the extent that something that needs fixing simply won’t get fixed (but they’ll add a new, redundant version of a core function with a slightly different name so you can do it right while still avoiding breaking backward compatibility with the way people have been doing it wrong all these years). There are also social reasons, like the ease of picking up the language and making software that works (for some definition of the term “works”) without having any clue what you’re doing or why you shouldn’t be doing it that way. There’s even some evidence of a deeply rooted tradition of blaming the user when there’s something wrong with the language rather than fixing it, kinda like that whole Kingdom of Nouns thing with Java. None of these reasons is exclusive to PHP, though, even if the particular combination of problems is somewhat unique.

People looking for technical problems with PHP that they can use as indicators of how to make things better in the future, how to derive principles of good programming from the mistakes around them, tend to want to find a single “You know what your problem is?” kind of issue with PHP to explain away all this unscalable code that someone tried to scale anyway. The single most visible characteristic of PHP that is, in practice, largely unique to PHP is the way it combines actual programming with markup. This provides plenty of opportunity for the Technique Du Jour — splitting everything up into components as much as possible, aka “modularizing” everything — to be applied as the panacea that PHP needs. Thus, the diagnosis is this: Program logic shouldn’t be combined with presentation structure.

The idea, then, is that your code should sit somewhere in the back, divorced entirely from the markup that the browser sees. You should have some library like Perl’s CGI.pm generating markup so you don’t have to. Make sure you separate logic from presentation such that never the twain shall meet again. To do otherwise is verboten and heretical. You don’t want to be a heretic, do you?

It’s all poppycock, though. The ability to embed code in markup is one of the things that made PHP strong in the first place. If you can’t embed code in markup, you can’t create intuitively clear templates, easily reasoned through and maintained. In the attempt to achieve that sort of maintainability of your templates while keeping the program logic and the markup separate, people are creating and using libraries designed specifically to allow web developers to write code that looks like markup. Seriously. I see it all the time, and it makes me cringe. The absurdity of seeing someone use a function that does nothing but output an opening and closing tag, with content in the middle that is expressly fed to the function as an argument, drives me up the wall. What the hell good is it to pretend you’re “modularizing” your code when all you’re really doing is replacing <h1>Text Goes Here</h1> with something that looks like $foo->h1('Text Goes Here');? All that does is make your template harder to read.
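To make that comparison concrete, here is a minimal sketch in Ruby, assuming the standard library's ERB as a stand-in for markup-embedded code. The TagBuilder class and both render methods are hypothetical, made up for illustration rather than taken from any real library; TagBuilder mimics the kind of helper that does nothing but wrap its argument in tags.

```ruby
require 'erb'

# Hypothetical helper of the kind described above: it does nothing but
# wrap its argument in an opening and closing tag.
class TagBuilder
  def h1(text)
    "<h1>#{text}</h1>"
  end
end

# Style 1: the markup is produced by a method call.
def render_with_helper(heading)
  TagBuilder.new.h1(heading)
end

# Style 2: the markup is written as markup, with the value embedded in it.
def render_with_template(heading)
  ERB.new('<h1><%= heading %></h1>').result(binding)
end

puts render_with_helper('Text Goes Here')
puts render_with_template('Text Goes Here')
```

Both calls print the same heading element. The only difference is which version still looks like the page it produces.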

Markup-embedded coding is a good thing. Oh, sure, it can be overused or just misused and abused. Don’t do that. I trust the halfway-decent (or better) programmers I know to avoid doing that. Avoiding it like the plague because you take the concept of componentized coding too far is just making life difficult for yourself (and for the maintenance programmer who comes after you) for no good reason. The problem here is, I think, that saying “separate the presentation logic from the business logic” doesn’t guarantee someone listening to you won’t hear “separate the presentation from the program logic”. Someone who hears something different from what you’ve actually said will then go off and try to make use of the Smarty templating system, making the whole process of web programming hopelessly “meta” in an attempt to divide all program logic from the presentation markup to “fix” the problems with writing huge, complex programs in PHP. They’ll utterly fail to realize the irony of inserting yet another templating language between PHP and the final product as part of an effort to “simplify” program design. Suddenly, you’ve got a templating language translating between markup and another templating language, when templating is exactly where PHP shines best (or at all, at any rate).
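The distinction between presentation logic and business logic can be sketched roughly as follows, again in Ruby with the standard library's ERB. Everything here is an assumption made up for illustration: the invoice data, the overdue rule, and the names overdue_invoices, REPORT_TEMPLATE, and render_report. The business rule lives in plain Ruby that knows nothing about markup, while the loop and conditional that decide how the result is displayed stay embedded in the template.

```ruby
require 'erb'
require 'date'

# Business logic: plain Ruby with no knowledge of markup.
# (Hypothetical rule, for illustration: an invoice is overdue once its
# due date is earlier than today.)
def overdue_invoices(invoices, today)
  invoices.select { |invoice| invoice[:due] < today }
end

# Presentation logic: a loop and a conditional embedded in the markup,
# concerned only with how the result is displayed.
REPORT_TEMPLATE = ERB.new(<<~HTML, trim_mode: '-')
  <h1>Overdue invoices</h1>
  <%- if overdue.empty? -%>
  <p>Nothing is overdue.</p>
  <%- else -%>
  <ul>
  <%- overdue.each do |invoice| -%>
    <li><%= invoice[:customer] %></li>
  <%- end -%>
  </ul>
  <%- end -%>
HTML

# Glue: run the business logic, then hand its result to the template.
def render_report(invoices, today)
  overdue = overdue_invoices(invoices, today)
  REPORT_TEMPLATE.result(binding)
end

invoices = [
  { customer: 'Acme',   due: Date.new(2007, 9, 1) },
  { customer: 'Globex', due: Date.new(2007, 10, 1) }
]
puts render_report(invoices, Date.new(2007, 9, 20))
```

The template contains real program logic, but none of it decides which invoices count as overdue. That is the separation the slogan asks for, and it doesn't require banishing code from the markup.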

So, in summary: It looks like, because of issues with PHP, a lot of people out there are down on anything that looks like a Turing-complete markup-embedded templating language. Too many times, I’ve seen people blame the markup-embedded capabilities of PHP for the problems PHP has introduced into their lives, rather than realizing that the real problem is that they and others like them have tried to use PHP like C++ or Java.

This all occurred to me tonight because I was thinking about the sense of freedom I have as I become more familiar with eruby, a tiny little executable that gets deposited in the cgi-bin directory so that I can execute markup-embedded Ruby without having to do so via a framework like Rails if I don’t want to. I get that sense of freedom because, unlike a lot of people who think no program logic should ever directly touch presentation markup, I realize that what has really been frustrating me isn’t markup-embedded code — it’s the language I’ve had to use for markup-embedded code for all these years.

addendum: Just for the heck of it, let’s compare some ways to output a heading, with a variable containing the heading text.

  • Object-oriented Perl with CGI.pm

    use CGI;
    my $cgi = CGI->new;
    my $heading = 'Heading Text';
    print $cgi->h1($heading);

  • Embedded pseudo-Perl using Template Toolkit (added Fri Sep 21 10:54:26 thanks to Michael Peters)

    [% heading = 'Heading Text' %]
    <h1>[% heading %]</h1>

  • Embedded PHP

    <?php $heading = 'Heading Text' ?>
    <h1><?php echo $heading ?></h1>

  • Embedded Ruby using eRuby

    <% heading = 'Heading Text' %>
    <h1><%= heading %></h1>

I have my favorite. What’s yours?


  1. Your Ruby example at the bottom looks like JSP and ASP/ASP.Net. J2EE and ASP.Net (not ASP though) learned the PHP lesson quite well, as both of them allow you to divorce the logic from the markup, and put as much (or as little) code into the markup to hook into the codebehind. ASP.Net takes it even further, with the “master pages” system (which relies upon partial classes). Overall, J2EE and ASP.Net are doing a pretty good job at not being PHP in a different language (JSP and ASP tend to be like that, though).

    Where PHP does well is in a system (what I call a “dynamic Web site” as opposed to a “Web application”) in which the page model actually makes sense; no function appears in more than one page, functions can be run in complete isolation of each other, etc. etc. etc. In other words, if you could refactor your program to eliminate procedures (the code equivalent of a database denormalization) and you do not see any repetition of code, you have a page model “dynamic Web site”. MediaWiki, wikis in general, blog software, all are great examples of where the page model makes sense, and PHP apps dominate those markets for just that reason.

    But nowadays, J2EE and ASP.Net make it just as easy to write PHP-like code (if that’s what you like and/or want to be writing) with all of the logic in the markup as it is to be writing more “application” type code, with interlocking features and functionality.


    Comment by Justin James — 21 September 2007 @ 05:25

  2. Your Perl example is a little misleading. Yes, that’s how people did it in the 90’s, but most Perl web development uses templates. Your example using the Template Toolkit looks like this: [% heading = 'Heading Text' %] <h1>[% heading %]</h1>

    Pretty darn close to the eruby example. And I know that it’s a matter of preference, but ‘[% %]’ is easier to spot in an HTML template than ” so it’s easier to see where the important bits are.

    Comment by Michael Peters — 21 September 2007 @ 07:34

  3. And to comment on the main points of the article… I agree with most everything you say. There is application logic and there is display logic. If your application is simple, mixing the 2 doesn’t really hurt. But if you have to scale so that different people care about application logic than care about the display logic, or you need to have the same application logic output to different display formats, then you need to separate them. It’s nothing specific to the web, it’s how all software separation of concerns should work.

    Comment by Michael Peters — 21 September 2007 @ 07:36

  4. Justin James:

    It’s easy enough to employ embedded PHP to produce a back-end design that doesn’t repeat functions, et cetera. Judicious use of PHP functions like include_once() comes in handy for that sort of thing. Thus, embedded code is useful for more than just a “page-based” model. That’s really the key to the benefits of templating systems.

    “But nowadays, J2EE and ASP.Net make it just as easy to write PHP-like code (if that’s what you like and/or want to be writing) with all of the logic in the markup as it is to be writing more ‘application’ type code, with interlocking features and functionality.”

    I don’t think I’d say “just as easy”. The markup-embedded code itself is (almost) as easy to write, but a lot more has to be done “behind the scenes” to allow that sort of thing to work with systems like JSP and ASP.

    Michael Peters:

    “Your Perl example is a little misleading. Yes, that’s how people did it in the 90’s, but most Perl web development uses templates.”

    I think you misunderstood the intent of my Perl example. I didn’t mean to make it seem that Perl is somehow inferior to PHP and Ruby because of the syntax necessary to use to create a presentation template. I just used methods to show what sort of result you get from trying to separate all program logic from presentation structure, regardless of the language.

    By the way, you can use Markdown to format text in your comments. I suspect some of your code example vanished in the first of your two comments here. You might want to have a look at the syntax guide for Markdown, but to address the immediate problem of displaying source code properly, surrounding code examples with backticks will ensure that WordPress doesn’t filter the code out or try to apply it to your text when you just want it displayed.

    Comment by apotheon — 21 September 2007 @ 09:50

  5. […] Speaking of the deficiencies of PHP, another one is the complete lack of built-in debugging capabilities.  Sure, you can find debuggers for PHP, if you’ve got the time and ability to configure them on your server.  I guess if I was really serious about PHP I’d have to get one of those, but so far I only use PHP for WordPress customization, so it hardly seems worth the effort. […]

    Pingback by Wiping the mystery out of WordPress debugging -- Chip’s Tips for Developers — 24 September 2007 @ 05:55


All original content Copyright Chad Perrin: Distributed under the terms of the Open Works License