The problem with routing

Frameworks have been a fixture of PHP for some time now. There are numerous MVC frameworks, micro or not so micro. One thing they all seem to have in common - apart from the MVC part, that is - is a routing engine.

A routing engine is a mechanism that turns incoming request URLs into actionable information through a set of configurable rules. In effect, it makes it possible to change URLs like this:

http://www.example.com/myapplication.php?action=add&title=Something 

Into this:

http://www.example.com/myapplication.php/add/Something

Or with the help of e.g. Apache's mod_rewrite / mod_alias into this:

http://www.example.com/add/Something

You can do more with routing: you can create multilevel routes, subroutes and, I'm sure, much more. In effect it lets you create nice, readable URLs where once you would have put all the needed information in the query part of the URL.
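
To make the idea concrete, here is a minimal sketch of what such a routing engine does. It is written in Python purely for illustration and is not how any particular PHP framework implements it; the route table, the action names and the route() helper are all made up for this example.

```python
import re

# Hypothetical route table: each rule pairs a URL pattern with an action name.
# Named groups in the pattern become the parameters passed to the action.
ROUTES = [
    (re.compile(r"^/add/(?P<title>[^/]+)$"), "add"),
    (re.compile(r"^/articles/(?P<slug>[^/]+)$"), "show_article"),
]


def route(path):
    """Return (action, params) for the first matching rule, or None."""
    for pattern, action in ROUTES:
        match = pattern.match(path)
        if match:
            return action, match.groupdict()
    return None


print(route("/add/Something"))  # ('add', {'title': 'Something'})
```

Note that the first path segment names the action, which is exactly the application logic leaking into the URL that the rest of this post is about.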

The only beef I have with this is that routing (usually) makes the component parts or logic of your application visible in the URL, instead of the information or data. So instead of a URL like this:

http://www.example.com/a_main_subject/an_article_about_this

You get:

http://www.example.com/articles/an_article_about_main_subject

You are probably not interested in the fact that the page you are viewing is some kind of article. And you certainly won't be happy if you shorten the URL and end up at http://www.example.com/articles/. Even if the site returns some sensible page, there is nothing about the main subject specifically in there, since that information is lost.

The first URL is far more usable by humans. It creates a logical hierarchy of information, as humans see it. The URL does not specify what type of page you are looking at; only the subject of the page can be inferred. The type of the page - by which I mean how it should look and what 'widgets' are on the page - is something that should not have a place in the URL.

You could fix it a bit by allowing some kind of 'subject' part in the URL, e.g.:

http://www.example.com/articles/a_main_subject/an_article_about_this

But you would still expose the type of a page where no human has a need for that information. In addition, in this version you would have separate roots for a_main_subject based on the type of a page. A better fix would be:

http://www.example.com/a_main_subject/articles/an_article_about_this

This at least allows you to combine multiple types of pages about a single subject under a single root. It would actually make the ../articles/ URL somewhat useful. However, the URL is still shouting out type information instead of subject information.

There are tools to add completely independent human-readable URLs, but most of these try to fix the URLs retroactively. I think a much better approach is to create human-friendly URLs by default. This makes it impossible to forget and skips an unnecessary part of website maintenance.

How did we get here?

In the early days of PHP a website was usually just a set of HTML pages on a filesystem. Wherever some extra functionality was needed you just inserted a bit of PHP and things worked fine. The URLs were a natural extension of the hierarchy some editor made, based on human distinctions.

This system got torn to pieces when programmers arrived and said 'information should be in a database'. That is fine as a concept, but unfortunately the idea of a hierarchy of information based on subject went out the window as well. Instead the programmers defined the hierarchy, based on what is easier to program, not what is easier to use.

The solution is to make the URL a part of the user interface of a website again and let the editor of the website naturally decide the hierarchy. One way of doing this is to use a content repository instead of a 'flat' database. A content repository knows about content hierarchy - just like a filesystem - and can be easily changed by an editor - also just like the filesystem.
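
As a rough sketch of the difference, a content repository can resolve a URL by walking the content hierarchy one path segment at a time, with the page type stored on the node instead of in the URL. The example below is Python for illustration only; the TREE structure, the type names and the resolve() helper are assumptions made up for this post, not the API of any of the repositories mentioned here.

```python
# A toy content tree, maintained by an editor rather than a programmer.
# Every node carries its own page type, so the type never appears in the URL.
TREE = {
    "type": "home",
    "children": {
        "a_main_subject": {
            "type": "subject_overview",
            "children": {
                "an_article_about_this": {"type": "article", "children": {}},
            },
        },
    },
}


def resolve(path, root=TREE):
    """Walk the tree one URL segment at a time; return the node or None."""
    node = root
    for segment in path.strip("/").split("/"):
        if not segment:
            continue  # an empty segment (e.g. the path "/") stays on the current node
        node = node["children"].get(segment)
        if node is None:
            return None
    return node


print(resolve("/a_main_subject/an_article_about_this")["type"])  # article
print(resolve("/a_main_subject/")["type"])  # subject_overview
```

Shortening the URL by one segment now lands on the parent subject, which is still meaningful to a human reader.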

For PHP there isn't much choice in content repositories. There is Midgard2 ( http://www.midgard-project.org/ ) and PHPCR ( http://phpcr.github.com/ ), whose Jackalope implementation relies on Jackrabbit, a complex Java content repository. In addition I found simPCoRe ( http://www.simpcore.org/ ), which looks like a good match, but I haven't tried it yet. And finally, I've been working on Ariadne-CMS for the last 14-odd years, which uses a simple content repository internally - unfortunately it isn't yet usable as a standalone component.

PS. Routing engines can be a useful tool when you are building a web application rather than a website. In that case the content hierarchy will match naturally with your application structure.

July 16th 2012
Auke van Slooten
