Nonblocking and Streaming - Part 1

Overview

Modern versions of Catalyst allow you to take charge of how your write operations work. This enables better support for working inside common event loops such as AnyEvent and IO::Async, and for using nonblocking writes to stream large files.

Let's explore how PSGI implements delayed and streaming responses, and how we can use a streaming response with an event loop to enable nonblocking IO. Then let's see how Catalyst exposes that API. This is part one of a four part series.

Introduction

When Catalyst was ported to run as a native PSGI application, great pains were taken to make sure we could properly support delayed and streaming responses. However, only recently was that ability properly exposed as a Catalyst API. In order to better understand how Catalyst works, let's step back and refresh on the PSGI specification.

PSGI Delayed Response and Streaming

You are likely familiar with the basic PSGI 'hello world' application which goes something like this:

    my $psgi_app = sub {
      my $env = shift;
      return [ 200,
        [ 'Content-Type' => 'text/plain' ],
        [ 'Hello World!'] ];
    };

If you put the above into a file hello-classic.psgi you could run it under a decent webserver very easily:

    $ plackup -s Starman bin/hello-classic.psgi

    2013/11/27-09:18:04 Starman::Server (type Net::Server::PreFork) starting! pid(69860)
    Resolved [*]:5000 to [0.0.0.0]:5000, IPv4
    Binding to TCP port 5000 on host 0.0.0.0 with IPv4
    Setting gid to "20 20 20 12 61 79 80 81 98 33 100 204 398 399"
    Starman: Accepting connections at http://*:5000/

And now you can use a commandline tool like telnet to hit that server and execute HTTP requests:

    $ telnet 127.0.0.1 5000
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    GET / HTTP/1.1
    Host: www.test.org

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Transfer-Encoding: chunked
    Date: Wed, 27 Nov 2013 22:49:39 GMT
    Connection: keep-alive

    c
    Hello World!
    0

    Connection closed by foreign host.

Take note of the response header 'Transfer-Encoding: chunked'. Since Starman understands HTTP/1.1, it can send a chunked response when you fail to provide the content length (or when you give an expected length but specify 'chunked' anyway). In this case the 'c' that you see is the chunk length in hexadecimal (12 bytes), and the final '0' means there are no more chunks to send.
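
To make the framing concrete, here is a rough sketch (illustrative only, not Starman's actual code) of how a server frames a single chunk: the payload length in hexadecimal, a CRLF, the data itself, another CRLF, and finally a lone '0' chunk to signal the end of the body.

    # Illustrative sketch of chunked framing; real servers do this for you.
    sub encode_chunk {
      my $data = shift;
      return sprintf("%x\r\n%s\r\n", length($data), $data);
    }

    my $chunk      = encode_chunk('Hello World!');  # "c\r\nHello World!\r\n"
    my $terminator = "0\r\n\r\n";                   # no more chunks to send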

Now, PSGI offers a few more tricks for the case when you don't want to fully pre-generate your response. In this case instead of returning a tuple (Status, Headers, Body) from your PSGI application, you return a code reference:

    my $psgi_app = sub {
      my $env = shift;
      return my $delayed_response = sub {
        my $responder = shift;
        my $writer = $responder->(
          [200, [ 'Content-Type' => 'text/plain' ]]);

        $writer->write('Hello');
        $writer->write(' ');
        $writer->write('World');
        $writer->close;
      };
    };

In this case the $delayed_response coderef gets run by the underlying server, which is required to pass it a second coderef, the $responder. So your application is 'delayed': it doesn't actually run until the server runs it, which allows you to take advantage of event loops like AnyEvent or IO::Async to schedule your response better. But it's still useful even under a blocking server like Starman.

Let's look a bit more at the code example. The $responder coderef can accept the full $Status, \@Headers, \@Body tuple, but a more interesting use is to give it just the first two elements of the tuple, as we do in this example application. In that case the $responder returns the server's $writer object, which is basically like a filehandle pointing at the client that originated the request (for example a web browser or the telnet commandline application). This object defines two methods, write and close. Each time you call write you stream some data out to the client. Generally the connection stays open until you call close, which means you can use this for long responses or for techniques like cometd where you keep a connection to the client open for a long time.
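
To make the event loop point concrete, here is a minimal sketch of the same idea under AnyEvent, where each write is scheduled on a timer instead of running straight through. It assumes an event-loop aware server such as Twiggy; under a blocking server the timers won't buy you anything.

    use AnyEvent;

    # Sketch only: write a line every second, then close after three writes.
    # Run under an event-loop aware server such as Twiggy.
    my $psgi_app = sub {
      my $env = shift;
      return sub {
        my $responder = shift;
        my $writer = $responder->(
          [200, [ 'Content-Type' => 'text/plain' ]]);

        my $count = 0;
        my $timer; $timer = AnyEvent->timer(
          after    => 0,
          interval => 1,
          cb       => sub {
            $writer->write("tick " . ++$count . "\n");
            if ($count >= 3) {
              $writer->close;
              undef $timer;  # cancel the timer
            }
          },
        );
      };
    };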

*Note: Even though I say the $writer object can be thought of as a filehandle to the client, the underlying architecture is more complex and yet lacks some of the features we expect from a filehandle. For example, in the current PSGI spec there's no clear and common way to detect when you can't write due to some issue with the client or the network. Also, there's nothing stopping the server from buffering your writes before sending them on to the client, if it needs to do so for scheduling purposes. For example, the server might buffer your write in memory, or to some other temporary store.

Hopefully someday the plack-cabal can get together on a specification to expose and clarify some of these important missing bits.
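
Until then, a defensive pattern is to wrap each write in an eval. In the sketch below $writer is the writer from the example above and $chunk stands in for whatever data you were about to send; whether a failed write actually throws is server-dependent, so treat this as a best effort, not a reliable way to detect a dropped client.

    # Sketch: some servers throw when the client has gone away, others won't.
    my $ok = eval { $writer->write($chunk); 1 };
    unless ($ok) {
      warn "write failed (client gone?): $@";
      eval { $writer->close };  # best-effort cleanup
    }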

Let's see what this application looks like under telnet. We start the PSGI application with plackup as in the previous case.

    $ plackup -s Starman bin/hello-delayed.psgi 
    2013/11/27-09:52:15 Starman::Server (type Net::Server::PreFork) starting! pid(69989)
    Resolved [*]:5000 to [0.0.0.0]:5000, IPv4
    Binding to TCP port 5000 on host 0.0.0.0 with IPv4
    Setting gid to "20 20 20 12 61 79 80 81 98 33 100 204 398 399"
    Starman: Accepting connections at http://*:5000/

And hit it with telnet:

    $ telnet 127.0.0.1 5000
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    GET / HTTP/1.1
    Host: www.test.org

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Transfer-Encoding: chunked
    Date: Wed, 27 Nov 2013 22:57:11 GMT
    Connection: keep-alive

    5
    Hello
    1

    5
    World
    0

    Connection closed by foreign host.

You'll note that this time each of the writes is a separate chunk, with its own length. Now, we used chunked encoding here to help illustrate how each call to ->write pushes a separate 'chunk', but to be clear, we could have used this streaming interface without chunked transfer encoding. For example, here is the same code with an explicit content length in the header:

    my $psgi_app = sub {
      my $env = shift;
      return sub {
        my $responder = shift;
        my $writer = $responder->(
          [200, [ 
            'Content-Type' => 'text/plain',
            'Content-Length' => '11', ]]);

        $writer->write('Hello');
        $writer->write(' ');
        $writer->write('World');
        $writer->close;
      };
    };

And the telnet trace:

    $ telnet 127.0.0.1 5000
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    GET / HTTP/1.1
    Host: www.test.org

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Content-Length: 11
    Date: Wed, 27 Nov 2013 23:06:33 GMT
    Connection: keep-alive

    Hello WorldConnection closed by foreign host.

So we still streamed the information; it's just not as easy to see from the telnet trace as it is when using chunked transfer encoding. But it's definitely still streaming, as you can see if you start Starman in debug mode and examine the debug output for that request:

    $ STARMAN_DEBUG=1 plackup -s Starman bin/hello-delayed-with-length.psgi
    2013/11/27-17:09:44 Starman::Server (type Net::Server::PreFork) starting! pid(71887)
    Resolved [*]:5000 to [0.0.0.0]:5000, IPv4
    Binding to TCP port 5000 on host 0.0.0.0 with IPv4
    Setting gid to "20 20 20 12 61 79 80 81 98 33 100 204 398 399"
    Using no serialization
    Starman: Accepting connections at http://*:5000/
    Beginning prefork (5 processes)
    Starting "5" children
    Child Preforked (71888)
    Child Preforked (71889)
    Child Preforked (71890)
    Parent ready for children.
    Child Preforked (71891)
    Child Preforked (71892)

    2013/11/27-17:09:51 CONNECT TCP Peer: "[127.0.0.1]:61580" Local: "[127.0.0.1]:5000"
    [71888] Read 16 bytes: "GET / HTTP/1.1\r\n"
    [71888] Read 20 bytes: "Host: www.test.org\r\n"
    [71888] Read 2 bytes: "\r\n"
    127.0.0.1 - - [27/Nov/2013:17:09:52 -0600] "GET / HTTP/1.1" 200 - "-" "-"
    [71888] Wrote 126 bytes
    [71888] Wrote 5 bytes
    [71888] Wrote 1 byte
    [71888] Wrote 5 bytes
    [71888] Request done
    [71888] Waiting on previous connection for keep-alive request...
    [71888] Closing connection

Here you can see the writes very clearly. Starman sends 126 bytes to the client (this is the HTTP header information), followed by three writes of 5, 1 and 5 bytes each, which together make up the 11 bytes we declared in the Content-Length header.

By the way, if you actually go and run the code you'll see the one second 'keep alive' delay at the end of the response. Basically, after you call ->close on the writer, the server spends a bit of time waiting to see if the client sends another request over that connection. For Starman this wait is 1 second, but you can configure it. Keep alive is a big part of HTTP 1.1, and all HTTP 1.1 connections are considered persistent unless declared otherwise, which is why it's important for Starman and other HTTP 1.1 servers to be able to time out when the client isn't doing anything with the persistent connection.
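
If one second is too tight for experimenting by hand, you can raise the timeout when you start the server. Something along these lines should work, though you may want to check starman --help for the exact option name on your version:

    $ plackup -s Starman --keepalive-timeout 30 bin/hello-delayed-with-length.psgi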

You can see this keep alive in action with STARMAN_DEBUG if you are speedy with the second request (or bump up the Starman keep alive timeout as shown above). For example:

    $ telnet 127.0.0.1 5000
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    GET / HTTP/1.1              (first request which initiates the connection)
    Host: www.test.org

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Content-Length: 11
    Date: Wed, 27 Nov 2013 23:20:13 GMT
    Connection: keep-alive

    Hello World

    GET / HTTP/1.1              (second request over the same connection)
    Host: www.test.org

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Content-Length: 11
    Date: Wed, 27 Nov 2013 23:20:14 GMT
    Connection: keep-alive

    Hello World

    Connection closed by foreign host.

Here's the Starman debugging output.

    2013/11/27-17:20:12 CONNECT TCP Peer: "[127.0.0.1]:61630" Local: "[127.0.0.1]:5000"
    [71889] Read 16 bytes: "GET / HTTP/1.1\r\n"
    [71889] Read 20 bytes: "Host: www.test.org\r\n"
    [71889] Read 2 bytes: "\r\n"
    127.0.0.1 - - [27/Nov/2013:17:20:13 -0600] "GET / HTTP/1.1" 200 - "-" "-"
    [71889] Wrote 126 bytes
    [71889] Wrote 5 bytes
    [71889] Wrote 1 byte
    [71889] Wrote 5 bytes
    [71889] Request done
    [71889] Waiting on previous connection for keep-alive request...
    [71889] Read 36 bytes: "GET / HTTP/1.1\r\nHost: www.test.org\r\n"
    [71889] Read 2 bytes: "\r\n"
    127.0.0.1 - - [27/Nov/2013:17:20:14 -0600] "GET / HTTP/1.1" 200 - "-" "-"
    [71889] Wrote 126 bytes
    [71889] Wrote 5 bytes
    [71889] Wrote 1 byte
    [71889] Wrote 5 bytes
    [71889] Request done
    [71889] Waiting on previous connection for keep-alive request...
    [71889] Closing connection

So you can see one connection, two requests. This keep alive forms the basis of many long polling and similar techniques (like cometd) for keeping a persistent connection between the client and the server, for the purpose of enabling speedy (semi-realtime) interfaces. Of course, since Starman is a forking, blocking server, the number of persistent connections you can hold is limited by the number of forked children (in Starman the default is 5). If you want to scale such a thing, you probably need to switch to a nonblocking server, which we will discuss later.

Again, keep alive and chunked responses are separate matters; chunked encoding just makes it a bit easier to see visually what is going on. To my mind, streaming is for when you have very large responses that you'd rather not buffer in memory or to a temp file, but would prefer to send in pieces (sort of like when you have a very large SQL query and use a cursor to iterate row by row rather than load all the rows into an array). Chunked encoding, on the other hand, is useful when you don't know up front the length you are sending (such as when computing the length is expensive, or it can't be known initially, or the intention is to send indefinitely). The two concepts overlap in use case, but the technologies themselves are separate.
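
As a concrete sketch of the large-response case, here is a delayed response that streams a big file in fixed-size pieces rather than slurping it into memory. The path and chunk size are placeholders, and error handling is kept minimal:

    my $psgi_app = sub {
      my $env = shift;
      return sub {
        my $responder = shift;

        # Placeholder path; in real code this might come from the request.
        my $path = '/path/to/some/large.file';
        open my $fh, '<:raw', $path
          or die "Cannot open $path: $!";

        my $writer = $responder->(
          [200, [
            'Content-Type'   => 'application/octet-stream',
            'Content-Length' => -s $fh, ]]);

        # Send the file 64KB at a time instead of buffering it all.
        while (read $fh, my $buffer, 64 * 1024) {
          $writer->write($buffer);
        }
        close $fh;
        $writer->close;
      };
    };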

Great, so that's how PSGI does delayed responses and streaming. In the following article, we'll take a look at how this works under Catalyst.

Summary

We've reviewed various parts of the PSGI specification and laid a good foundation of understanding to proceed with the remaining articles.

Author

John Napiorkowski jjnapiork@cpan.org