Catalyst and nginx

NOTE: This article has been moved to the Catalyst Wiki at http://dev.catalyst.perl.org/wiki/adventcalendararticles/2008/02-catalyst_and_nginx.

In the spirit of Perl, in that there is always more than one way to do it, there are also many ways to deploy your Catalyst application.

First, I will summarize the available options and then go into the details of my own choice for application deployment.

Available Deployment Options

The Built-in Server
Apache and mod_perl
FastCGI

External FastCGI
Apache mod_fastcgi

The Built-in Server

The first method is the Perl-based standalone server. This method is actually several methods, as it has several engines. This is the server that is used when the development server is initiated via:

 script/myapp_server.pl

The default development server (as it is commonly referred to) is a simple single threaded server. A secondary Perl-based engine that is more robust for production usage is the HTTP::Prefork. This engine is more similar to what mongrel is in the Rails world, and is a Perl implementation of a prefork server that can handle simultaneous connections. At the end of this article, I'll show how to use nginx to proxy to the Prefork server.

Apache and mod_perl

Apache is the stalwart of the web servers, and mod_perl is firmly attached to Apache. It has a tremendous number of merits to it, but it is very complex to simply get an application going properly. My general recommendation on this deployment methodology is to not use it unless you have distinct reasons to. There are numerous valid reasons, but it tends to be a heavyweight deployment mechanism that isn't worth the tradeoffs for simple web applications. If you know enough to use mod_perl, you won't be looking here!

FastCGI

Finally, we have FastCGI, a simple protocol that acts as a gateway between an application and a web server. FastCGI scripts are handled via socket communications (either Unix sockets or TCP sockets) between the webserver and the application. Most deployment mechanisms make usage of external scripts, with an external process manager that spawns the application FastCGI processes.

This may sound daunting, so let's break down FastCGI components into primitive concepts. But first, a simple definition as to what FastCGI is: FastCGI is simply a common protocol used for applications to talk to a webserver.

The first is the webserver. This is really an interchangeable piece of your deployment. For external FastCGI scripts, you can safely ignore the webserver.

The second piece is the FastCGI process manager. In the case of mod_fcgid and mod_fastcgi (both Apache modules), there is support for a simple FastCGI process manager. These modules handle spawning, reaping, and maintaining the child processes. In an external setup, there is some third party that does this task. Currently, Catalyst (by default) uses FCGI::ProcManager. This handles spawning the individual children.

The third and final piece is the individual application process. This connects to the FastCGI socket, and handles the incoming requests while sending the response upstream to the socket, which the webserver listens to and sends the response to the originating client.

The merits of using FastCGI are numerous, the most distinct of which is the capability to restart your application without downtime or without touching the webserver. The user running your application doesn't even need to be the same user as the webserver.

This zero downtime is achieved because multiple FastCGI processes can listen to the same socket. This means that you can start your next version on the socket, then shut down the old version. The upstream webserver doesn't even notice, or need to be notified.

I hope this article has been convincing enough to make use of FastCGI, because it is a harder sell to switch webservers. I don't have any good reason as to why I originally started playing with nginx (pronounced "engine-x"). It was simply idle curiosity that grew into appreciation of a piece of software with a sane and, comparatively speaking, beautiful configuration syntax as well as being rock solid for my tests and subsequent deployments.

Apache has never let me down, but it has frustrated me. Nginx won me over because it has a very non-intrusive and intuitive configuration format and has a great number of features, so it is hard to miss Apache for any modern projects. It certainly doesn't have all the features that Apache has (most importantly, no traditional CGI support) but it works very well, and is very, very fast.

While the front-end FastCGI proxy won't really affect an application's processing speed, it does affect how quickly static files and related resources are served, as well as the client communications. This is accomplished by reducing the time it takes for the client to receive the first byte, and other related metrics. While these efficiencies seem trivial, they do matter, and they add up, especially for high performance applications that require analysis of load order and blocking elements in the generated markup. Oftentimes the response from users is that things simply "feel" faster.

Nginx Configuration

Configuring a location in nginx to be handled by FastCGI is trivial. It's a simple one-liner, nestled into a location block, which points out the location to the socket:

    fastcgi_pass  unix:/var/www/MyApp/fastcgi.socket;

You can use Unix sockets or TCP sockets:

    fastcgi_pass  127.0.0.1:10003;

So, to put this together into a virtual host setting, your configuration will look something like this:

    server {
        listen       8080;
        server_name  www.myapp.com;
        location / {
            include fastcgi_params; # We'll discuss this later
            fastcgi_pass  unix:/tmp/myapp.socket;
        }
    }

This configuration block will send everything to your Catalyst application, which you don't want. You always want to have static files served directly from your webserver. To accomplish this, assuming you use a static directory, create another location block:

     location /static {
            root  /var/www/MyApp/root;
     }

Now, every request to a file in /static will be served directly from nginx. Speedy!

The fastcgi_params

For the most part, you can copy and paste this file if it isn't already available. Most packages include this, or an example of it, but in case you don't have it around just copy it verbatim into your nginx directory. If you do already have it, make sure to add the SCRIPT_NAME line so Catalyst can figure out the right path to your application.

    fastcgi_param  QUERY_STRING       $query_string;
    fastcgi_param  REQUEST_METHOD     $request_method;
    fastcgi_param  CONTENT_TYPE       $content_type;
    fastcgi_param  CONTENT_LENGTH     $content_length;

    # Catalyst requires SCRIPT_NAME to be passed in
    fastcgi_param  SCRIPT_NAME        $fastcgi_script_name;
    fastcgi_param  REQUEST_URI        $request_uri;
    fastcgi_param  DOCUMENT_URI       $document_uri;
    fastcgi_param  DOCUMENT_ROOT      $document_root;
    fastcgi_param  SERVER_PROTOCOL    $server_protocol;

    fastcgi_param  GATEWAY_INTERFACE  CGI/1.1;
    fastcgi_param  SERVER_SOFTWARE    nginx/$nginx_version;

    fastcgi_param  REMOTE_ADDR        $remote_addr;
    fastcgi_param  REMOTE_PORT        $remote_port;
    fastcgi_param  SERVER_ADDR        $server_addr;
    fastcgi_param  SERVER_PORT        $server_port;
    fastcgi_param  SERVER_NAME        $server_name;




Using Catalyst::Engine::HTTP::Prefork

This is one of main reasons why I love nginx over Apache. To switch from fastcgi to using prefork, you just have to change one line. Instead of fastcgi_pass, you simply have proxy_pass in its place and then set the proxy headers:

    location /myapp/ { # Or, simply "location /"
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        proxy_pass  http://localhost:3000/;
    }

Now, all connections for that location go to your application which is running on the built-in server (hopefully with Catalyst::Engine::HTTP::Prefork). If you use this method, you will have to enable the "using_frontend_proxy" option in your Catalyst application.

One more thing...

Deployment is remarkably painless with nginx, but there is a huge caveat when dealing with FastCGI. Your application must exist at "/" as far as nginx is concerned. If you have a front-end proxy ahead of your application (like using the configuration above with proxy_pass), you can mitigate this issue there. The core issue is the lack of standardization on how FastCGI applications parse and setup their environment. This limitation, via a patch just for nginx, will be fixed in Catalyst 5.8.

AUTHOR

Jay Shirley <jshirley@coldhardcode.com>