Day 5 - Taming legacy websites with Catalyst and wget
Taming legacy websites with Catalyst and wget.
The problem
I was asked to update a rather old website which has traditionally been maintained with Dreamweaver, and uses nested tables to provide the layout. I needed to draw from a variety of sources - my own and outside databases, and data gathered from screen scrapers, RSS feeds, and so on. So my models were a mess, my employer was completely inflexible about the tools used to do the job, and in any case the same tools are due to be replaced with a new (expensive) content management system in a couple of months' time. To ease maintenance headaches, and to be able to leverage Catalyst's ability to use any model you like, I took the following approach:
- Create a catalyst appliation for running with the built in server.
- Make a static mirror of the application with
wget
.
Stage 1 - Deciphering the templating system.
AKA the nightmare of 17 nested tables.
With no flexibility about how to lay the site out, and with the graphics and styles already prescribed, I sat down and worked out the structure of the web site. It turned out that there were 17 nested tables, and all I was interested in modifying were the content pane and the menu pane. Everything else was fixed.
With my architecture, this meant that the application template would be
pretty much the same for every request, so in
MyApp::Controller::Root->auto
I set my Template Toolkit template:
$c->stash->{template} = 'index.tt';
which I serve out of MyApp/root/base
in this case.
For the menus and the content page for each controller that's accessed I do the following to set the dynamic content of the page:
$c->stash->{maincontent} = 'path_to/template.tt';
and somewhere in index.tt
, or in the horrible mess of INCLUDEs I put
together to make sense of the 17 nested tables, I have this:
[% INCLUDE $c.stash.maincontent %]
The principle is basically the same for any other parts of the template that I want to have dynamic as well. You can make calls to a model to populate data structures or whatever else you want to do here.
Authentication
I quite like being able to edit some pages in the browser for purposes
of instant gratification and rapid sanity checking. Because this
application is for internal use and is only publicly deployed as a
static site, authentication is particularly simple. In
MyApp::Controller::Root
I have the following:
$c->stash->{edit} = 1 if $ENV{'EDIT'};
and in the template, anywhere I have a link to an edit screen I have the following:
[% IF c.stash.edit %] <a href="[% c.uri_for(thing) %]">Edit</a> [% END %]
so when I run the server like this:
$ script/myapp_server.pl
There are no edit pages available through the application. However when I run it like this:
$ EDIT=1 script/myapp_server.pl
I can edit away, and it's still possible to delegate editing to a small trusted team on the network without having to worry too much about security.
Deployment
There are two parts to this process:
- In
MyApp::Controller::Root->auto
if ($c->req->path && $c->req->path !~ /\/$/ && ! $c->req->param) { $c->res->redirect('/' . $c->req->path . '/', 301); return 0; }
This forces requests without a trailing slash and with no POST or GET parameters to redirect to a url with a trailing slash. This makeswget
behave in the expected manner. - Run
wget
to make a static mirror$ wget -r -k http://localhost:3000
Andwget
creates a subdirectory calledlocalhost:3000
which contains your entire site with relative links (but be warned: if any of your links return a404 not found
status,wget
will not rewrite them, and you will have the-k
flag passed so wget will not rewrite that link as relative). After this it's a simple matter of copying these files to the desired location on your web server. In some cases after this you may want to change some things to suit your environment, in which case the below should be a very powerful tool (after appropriate and careful testing):$ find . -name '*html' | xargs perl -p -i -e 's/something/somethang/g'
Wrap up
And that's it: using Catalyst to make sense of a set of static web pages when you want to draw data for them from a range of sources. I'd particularly like to hear about extensions of this technique as I think it's an underrated approach to web application programming.
AUTHOR
Kieren Diment <diment@gmail.com>
COPYRIGHT.
Copyright 2006 Kieren Diment. This document can be freely redistributed and can be modified and re-distributed under the same conditions as Perl itself.