Using Gearman with Catalyst to Create a Simple Image Thumbnailer

SYNOPSIS

Gearman is a distributed job queue system that excels in doing things quickly, and asynchronously run job processes for things that need batch processesing, or those which you don't want your web application having to deal with. It's quick, and good for things like thumbnailing, which we will be talking about today.

REQUIREMENTS

Gearman::Server

This is the actual job server that keeps track of jobs, and what the worker processes connect to. Initialized with gearmand.

Gearman::Worker

This is your worker instance, which actually does the job you want done. HINT: We want to do thumbnailing, so this will contain the code for creating thumbnails from images.

Gearman::Client

This connects to the server(s) and communicates what jobs to execute etc.

Catalyst

Duh.

The Process

So, here's how this goes down:

The web browser uploads the image to your Catalyst app. Catalyst writes out the image to your given directory on the file system, maybe inserts a pointer to that in the database, and then at the same time, creates a job in the Gearman worker pool to create a thumbnail of said image. Gearman either queues it up because it's working on other images, or it takes care of it right away if it's just twiddling its thumbs. Then, presto, you have a thumbnail of your image(s)!

Again, a small ascii flowchart of this process should remove all questions you have of this process:



    Browser uploads image -> Catalyst writes this to the filesystem 
                      -> Catalyst creates a job in the Gearman worker pool -> Gearman queues job, then takes care of it -> your thumbnail appears wherever needed!
                                                                           |
                                                                         < -
                        Catalyst doesn't sit and wait for the thumbnail to be created, 
                        instead, sends confirmation page back to the browser 







Finally, Some Code

Okay, so we need:

A Server

Simply enough, we just run gearmand --daemonize and we have our server.

A Worker

Here's the code for the thumbnail job I came up with:

    package Worker;
    use Moose;
    use namespace::autoclean;
    use Gearman::Worker;
    use Deimos::ConfigContainer;
    use Imager;
    use IO::Scalar;
    use File::Basename;

    has worker => ( 
        is => 'ro', 
        isa => 'Gearman::Worker', 
        required => 1, 
        lazy => 1,
        default => sub { Gearman::Worker->new }
    );




    has 'imager' => (
        is         => 'ro',
        required   => 1,
        lazy => 1,
        default => sub { Imager->new }
    );

    has 'config' => (
       is => 'ro',
       required => 1,
       lazy => 1,
       default => sub { # your config object goes here },
    );

    sub BUILD {
        my $self = shift;
        $self->worker->job_servers('127.0.0.1'); 
        $self->worker->register_function(thumbinate => sub { $self->thumbinate($_[0]->arg) });
    }




    sub thumbinate {
        my ($self, $file) = @_;

        my $image = $self->imager;
        my $scaled;
        if ( $image->read( file => $file ) ) {
                 $scaled = $image->scale( ypixels => $self->config->{thumbnail_max_height} );

                 ## write our image to disk
                 binmode STDOUT;
                 $| = 1;
                 my $data;
                 $scaled->write( data => \$data, type => 'png'  )
                   or die $scaled->errstr;
                 return $file;

             } else {
                 die "file not read, " . $image->errstr;
             }
    }
    1;

Quick breakdown:

    has worker => ( 
        is => 'ro', 
        isa => 'Gearman::Worker', 
        required => 1, 
        lazy => 1,
        default => sub { Gearman::Worker->new }
    );

This creates our Gearman::Worker object.

    has 'imager' => (
        is         => 'ro',
        required   => 1,
        lazy => 1,
        default => sub { Imager->new }
    );

This creates our Imager object, which we use to create the thumbnails.

 sub BUILD {
        my $self = shift;
        $self->worker->job_servers('127.0.0.1'); 
        $self->worker->register_function(thumbinate => sub { $self->thumbinate($_[0]->arg) });
  }

This tells our Gearman::Worker object that our server (a list of server IPs can be passed here) resides at 127.0.0.1, and registers the job "thumbinate" with the associated job defined in this class, with the proper arguments passed.

    sub thumbinate {
        my ($self, $file) = @_;

        my $image = $self->imager;
        my $scaled;
        if ( $image->read( file => $file ) ) {
                 $scaled = $image->scale( ypixels => $self->config->{thumbnail_max_height} );

                 ## write our image to disk
                 binmode STDOUT;
                 $| = 1;
                 my $data;
                 $scaled->write( data => \$data, type => 'png'  )
                   or die $scaled->errstr;
                 return $file;

             } else {
                 die "file not read, " . $image->errstr;
             }
    }

This is where the thumbnailing takes place, and it is the job that we register with Gearman.

A Worker Initializer
    #!/usr/bin/env perl

    use strict;
    use Jobs::Worker;

    my $worker = Jobs::Worker->new;
    $worker->worker->work while 1;

This acts as the daemon for workers, that basically listens for jobs from the gearmand instance. Start with perl -Ilib script/jobs.pl 2>workerlog.log & or some such.

A Catalyst app and Model to Glue it all Together
    package MyApp::Model::Job;

    use parent 'Catalyst::Model';
    use Gearman::Client;
    use Moose;
    use namespace::autoclean;

    has 'gearman' => (
        is => 'ro',
        isa => 'Gearman::Client',
        required => 1,
        lazy => 1,
        default => sub { Gearman::Client->new }
    );

    has 'job_servers' => (
        is => 'ro',
        required => 1,
        lazy => 1,
        default => '127.0.0.1',
    );

    sub add {
        my ($self, @tasks) = @_;
        my $gm = $self->gearman;
        $gm->job_servers($self->job_servers);
        my $res = $gm->do_task($tasks[0] => $tasks[1]);
        return $res;
    }

    1;

Quick synopsis:

The important method here is add. Basically, we set our job servers, and call ->do_task with the taskname and the associated job name defined in our worker class. Not much going on here.

Finally, we need to have code that actually calls this when we upload an image.

    package MyApp::Controller::Media;
    # the usual Catalyst controller stuff goes here 

    # this is actually a method using ActionClass('REST'), so that's where the ->status_* stuff comes from
    sub add_asset : Local  {  # or Chained, if you prefer like I do
        my ( $self, $c ) = @_;
        my $data = $c->req->data || $c->req->params;
        $c->log->debug( "uploads: " . Dumper $c->req->uploads );
        return $self->status_bad_request( $c, message => "you must upload a file" )
          unless $c->req->upload('file') || $c->req->upload('qqfile');
        my $upload = $c->req->upload('file') || $c->req->upload('qqfile');
        my $filename = $upload->filename;
        my $media_type = ( by_suffix $data->{'file'} )[0];
        try {

            my $media = $c->model('Database::Attachment')->create(
                {
                    name => $data->{'name'} || $filename,
                    owner     => $c->user->get("userid"),
                    published => $data->{'published'},
                    mediatype => $media_type, 
                    file      => $upload->fh,
                }
            );
            $c->log->debug("Thumbnail: " . $media->thumbnail . "");
            my $thumb;
            if ( $media_type =~ /^image\/.+/i ) {
                $c->log->debug("matched image");
                $thumb = $c->model('Job')->add('thumbinate' => $media->file . "");
            }
            if ( $data->{entry} ) {
                $c->log->debug( "adding an attachment to " . $data->{entry} );
                $media->add_to_entries( { entry => $data->{entry} } );
                $c->model('CMS')->entry->publish($data->{entry}) or die "Can't publish entry; $!";
            }

            return $self->status_created(
                $c,
                location => $c->req->uri->as_string,
                entity   => { message => "file uploaded" }
            );
        }
        catch {
            return $self->status_bad_request( $c, message => "Failed to save '$filename': $_" );
        };

    }

There is some extra code in here that describes other things you might do, say, update a record with the thumbnail image's location, check to see if there IS a thumbnail, etc.

The real important bits are:

Get the filename


    my $upload = $c->req->upload('file') || $c->req->upload('qqfile');
    my $filename = $upload->filename;

Get the MIME type:

(Obviously you need to use MIME::Types qw/by_suffix/; for this)

     my $media_type = ( by_suffix $data->{'file'} )[0];

Create the Job
     if ( $media_type =~ /^image\/.+/i ) {
            $c->log->debug("matched image");
            $thumb = $c->model('Job')->add('thumbinate' => $media->file . "");
     }

And that's that. You now have the ability to create thumbnails cleanly and smoothly with Gearman and Catalyst.

Final Notes

Note that some of you may have done this with TheSchwartz. Which is all good and fine. However, it's concerned more about reliability than speed. Gearman doesn't check to make sure jobs got done successfully, but it is much more suited for creating thumbnails because of speed. Thumbnails can also be trivially recreated if something fails.

Author

Devin Austin <dhoss@cpan.org>