Tornado

A Whirlwind Presentation

Me

I am (not) Jeremy Kelley.
If I were actually Jeremy Kelley, I would work at Indeed and amaze you with my wit and knowledge.
I would also have the flu.

What is Tornado?

You know, non-meteorologically

Tornado is a simple web framework...

...around a fast HTTP server...

...around a low-level event system.

It can be used for...

...building web apps...

...HTTP REST APIs...

...authentication frameworks...

...even custom TCP-based services.

The Zen of Tornado

            
from tornado.web import Application, RequestHandler
from tornado.ioloop import IOLoop


class HomeHandler(RequestHandler):

    def get(self):
        self.set_header("Content-type", "text/plain")
        self.write("Hello, world!")


if __name__ == "__main__":
    app = Application([("/", HomeHandler)])
    app.listen(8080)
    IOLoop.instance().start()

Disclaimer

I'm going to talk through how Tornado works.

This means covering some libraries you might never touch directly.

Tornado is actually very simple.

The Mighty Stack

Event Libraries

Tornado selects the best library for your platform.

For Linux, this means epoll.

For Mac, this means kqueue.

For Windows, this means old / boring select.

These are not the codes you are looking for.
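
If you're curious what "select the best library" looks like, Python's standard library plays the same game. This is a rough sketch using the stdlib selectors module, not Tornado's own code; the socket pair and the "right side" label are made up for the demo.

```python
import selectors
import socket

# DefaultSelector resolves to the best mechanism the platform offers
# (epoll on Linux, kqueue on macOS / BSD, poll or select otherwise).
selector = selectors.DefaultSelector()
mechanism = type(selector).__name__

# A connected socket pair stands in for a real client connection.
left, right = socket.socketpair()
selector.register(right, selectors.EVENT_READ, data="right side")

left.sendall(b"ping")

# Wait (up to a second) until a registered socket is readable.
received = []
for key, events in selector.select(timeout=1):
    received.append((key.data, key.fileobj.recv(4)))

selector.unregister(right)
selector.close()
left.close()
right.close()

print(mechanism, received)
```

The mechanism name differs per platform; the rest of the code does not. That is the whole point.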

IOLoop Library

The IOLoop abstracts those polling libraries away:

            
import time

from tornado.ioloop import IOLoop
ioloop = IOLoop.instance()

# watch for events on a socket / file descriptor
ioloop.add_handler(sock.fileno(), callback, ioloop.READ)

# add a callback to run on the next "cycle"
ioloop.add_callback(callback)

# add a callback to run at (or after) a specific time
ioloop.add_timeout(time.time() + 60, callback)
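
A runnable sketch of the two scheduling calls, assuming a recent Tornado (6.x) on Python 3, where the IOLoop rides on asyncio. The callback names and the 50 ms delay are invented for the demo.

```python
import asyncio
import time

from tornado.ioloop import IOLoop

events = []


async def main():
    ioloop = IOLoop.current()
    done = asyncio.Event()

    def on_next_cycle():
        events.append("callback")

    def on_timeout():
        events.append("timeout")
        done.set()

    # Scheduled first, but it only fires once the deadline has passed.
    ioloop.add_timeout(time.time() + 0.05, on_timeout)
    # Scheduled second, but it runs on the very next loop iteration.
    ioloop.add_callback(on_next_cycle)

    await done.wait()


asyncio.run(main())
print(events)
```

The next-cycle callback wins the race even though it was registered last.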

IOLoop.instance()

Singletons -- great until they murder you

Use as default, but allow passing it in

            
class TCPServer(object):

    def __init__(self, ioloop=None):
        self._ioloop = ioloop or IOLoop.instance()

    def send(self, message):
        pass  # ...

ADD_CALLBACK / ADD_TIMEOUT

All you need unless you're writing a TCP client / server.

Use these to split up blocking operations, defer tasks, etc.

            
# blocking operation...
results = DB.slow_find()

def parse_results(results):
    for result in results:
        pass  # do something

# let other things work for a while
IOLoop.instance().add_callback(parse_results, results)
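
One way to take "split up" literally: process a slice, then reschedule the rest. A sketch assuming Tornado 6.x on Python 3; parse_in_chunks, other_work and the chunk size are hypothetical names chosen for the demo.

```python
import asyncio

from tornado.ioloop import IOLoop

log = []
processed = []


def parse_in_chunks(results, done, chunk_size=2):
    chunk, rest = results[:chunk_size], results[chunk_size:]
    processed.extend(result.upper() for result in chunk)  # "do something"
    log.append("chunk")
    if rest:
        # Hand control back to the loop before touching the next slice.
        IOLoop.current().add_callback(parse_in_chunks, rest, done, chunk_size)
    else:
        done.set()


def other_work():
    log.append("other work")


async def main():
    ioloop = IOLoop.current()
    done = asyncio.Event()
    ioloop.add_callback(parse_in_chunks, ["a", "b", "c", "d", "e"], done)
    ioloop.add_callback(other_work)
    await done.wait()


asyncio.run(main())
print(log)
```

The other callback gets its turn after the first chunk instead of waiting for the whole list.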

ADD_HANDLER, UPDATE_HANDLER, etc.

If you ARE writing a server, you need to:

            
# add a callback for a file descriptor on certain events
ioloop.add_handler(sock.fileno(), callback, ioloop.READ)

# update for new events
ioloop.update_handler(sock.fileno(), ioloop.WRITE)

# remove file descriptors when finished
ioloop.remove_handler(sock.fileno())
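
The same calls against a real file descriptor. A minimal sketch assuming Tornado 6.x on Python 3; the socket pair is a stand-in for an actual client connection.

```python
import asyncio
import socket

from tornado.ioloop import IOLoop

received = []


async def main():
    ioloop = IOLoop.current()
    done = asyncio.Event()

    # A connected socket pair stands in for a real client connection.
    left, right = socket.socketpair()
    right.setblocking(False)

    def on_readable(fd, events):
        # Called by the loop once the descriptor has data waiting.
        received.append(right.recv(1024))
        ioloop.remove_handler(fd)  # stop watching before closing the socket
        done.set()

    ioloop.add_handler(right.fileno(), on_readable, IOLoop.READ)
    left.sendall(b"hello")

    await done.wait()
    left.close()
    right.close()


asyncio.run(main())
print(received)
```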

...but stay tuned...

IOStream Library

IOStream wraps sockets:

            
import socket
from tornado.ioloop import IOLoop
from tornado.iostream import IOStream

ioloop = IOLoop.instance()
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
stream = IOStream(sock, io_loop=ioloop)

stream.connect(("google.com", 80), connect_callback)
stream.read_bytes(4096, read_callback)
# etc...

It provides plenty of helpers:

            
# read until exact string match
stream.read_until("\n\n", header_callback)
# read until a regular expression is matched
stream.read_until_regex(r"BYTES: \d+\s+", bytes_callback)
# read until the socket closes -- optional streaming
stream.read_until_close(finish_callback, streaming_callback)
# write out to the socket
stream.write("Foobar!", write_callback)
# if a socket is closed, call a callback
stream.set_close_callback(on_close)
# check what the stream is currently doing
stream.closed() or stream.reading() or stream.writing()
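
A small end-to-end sketch of those helpers. It assumes a recent Tornado (6.x), where reads and writes return awaitables instead of taking callbacks; the socket pair and the made-up "BYTES" header are only there for the demo.

```python
import asyncio
import socket

from tornado.iostream import IOStream


async def main():
    # A connected socket pair stands in for a real client / server link.
    left, right = socket.socketpair()
    writer, reader = IOStream(left), IOStream(right)

    await writer.write(b"BYTES: 4\r\n\r\nbody")

    # Read up to (and including) the blank line that ends the header...
    header = await reader.read_until(b"\r\n\r\n")
    # ...then read exactly as many bytes as the header promised.
    body = await reader.read_bytes(4)

    writer.close()
    reader.close()
    return header, body


header, body = asyncio.run(main())
print(header, body)
```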

Use IOStream when nobody else has written the client for you.

(For instance, your favorite new NoSQL database.)

Don't use IOStream for HTTP clients -- that's coming up.

HTTPServer Library

HTTPServer is (generally) invisible.

It leverages IOStream, etc. to support HTTP/1.1.

Runs implicitly when Application.listen() is called.

You probably won't need to touch it, but it's there.

Uses HTTPRequest.

HTTPRequest is used heavily in the request handlers.

It wraps information / helpers for client connections:

            
request.body # the request body string, if present
request.headers # dictionary of headers
request.cookies # dictionary of cookies
request.path # /foobar/path
request.query # key=value (no leading "?")
request.remote_ip # actual request ip
request.scheme # actual request scheme (http[s])
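
You can poke at one of these by hand. A sketch assuming a recent Tornado, where the class lives at tornado.httputil.HTTPServerRequest; normally the server builds it for you, so constructing it directly is just for illustration.

```python
from tornado.httputil import HTTPServerRequest

# Built by hand here; HTTPServer usually creates this from the connection.
request = HTTPServerRequest(method="GET", uri="/foobar/path?key=value")

path = request.path            # everything before the "?"
query = request.query          # everything after the "?"
arguments = request.arguments  # parsed query values, as lists of bytes

print(path, query, arguments)
```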

Web Libraries

tornado.web contains the real meat of Tornado.

Most work (in web apps) is done here.

Encapsulates the "V" and "C" of "MVC"

Request Handler

            
from tornado.web import RequestHandler, Application

class Controller(RequestHandler):
    # regex groups in route passed to method
    def get(self, username):
        ip = self.request.remote_ip # request on handler
        # tornado includes its own templating system
        self.render(
          "template.html", ip=ip, username=username)

app = Application([
  # routes are regex passed into Application
  (r"/(\w+)", Controller)
])
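
How a regex group turns into a method argument, roughly. This uses plain re, not Tornado's routing code; match_route is a hypothetical helper, and the trailing "$" is added here so the whole path has to match.

```python
import re

# Same pattern as the route above, anchored at the end of the path.
route = re.compile(r"/(\w+)$")


def match_route(path):
    """Return the captured groups for a matching path, or None."""
    match = route.match(path)
    return match.groups() if match else None


print(match_route("/josh"))
print(match_route("/josh/extra"))
```

The first call returns ("josh",), which is what would be handed to get() as username. The second returns None, which in a web app means a 404.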

Template System

            
{% extends "base.html" %}
{% block content %}
<h1>Hello, {{ username.upper() }}.</h1>
{# it's mostly just python. #}
{% if ip == "127.0.0.1" %}
<p>You must still be working on this.</p>
{% else %}
<p>You are accessing us from {{ ip }}.</p>
{% end if %}
{% end block %}

{# You can even... #}
{% import requests %}
{{ requests.get("http://www.google.com").text }}
{# ...but don't. That's really, really stupid. #}
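
Templates also work without a handler, which is handy for poking at them. A sketch using tornado.template.Template directly with a trimmed-down version of the markup above; the sample usernames and IPs are made up.

```python
from tornado.template import Template

template = Template(
    "<h1>Hello, {{ username.upper() }}.</h1>"
    '{% if ip == "127.0.0.1" %}'
    "<p>You must still be working on this.</p>"
    "{% else %}"
    "<p>You are accessing us from {{ ip }}.</p>"
    "{% end %}"
)

# generate() returns bytes; expressions are HTML-escaped by default.
local = template.generate(username="josh", ip="127.0.0.1")
remote = template.generate(username="<josh>", ip="10.0.0.5")

print(local)
print(remote)
```

Note the second render: the angle brackets in the username come out escaped, so at least that piece of rope is kept away from you.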

Template Module

It has just enough for what you need...

...with enough rope to hang yourself twice.

So be smart.

Application

            
import json
import pymongo
from tornado.web import Application, RequestHandler

class Users(RequestHandler):
    def post(self):
        content = json.loads(self.request.body)
        # application is available from the handler
        id = self.application.settings["database"].save(content)
        # a dictionary passed to write == JSON
        self.write({
          "id": "%s" % (id),
          "name": content["name"]
        })

app = Application([("/users", Users)],
    debug=True, # for automatic reloading
    database=pymongo.Connection()["mydatabase"])
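
The same idea, runnable without a database. A sketch assuming Tornado 6.x on Python 3: a plain list stands in for the database handle, and the server binds to whatever free local port the OS hands out.

```python
import asyncio
import json

from tornado.httpclient import AsyncHTTPClient
from tornado.httpserver import HTTPServer
from tornado.netutil import bind_sockets
from tornado.web import Application, RequestHandler


class Users(RequestHandler):
    def post(self):
        content = json.loads(self.request.body)
        # Extra keyword arguments to Application land in application.settings.
        store = self.application.settings["database"]
        store.append(content)
        # Writing a dict sends JSON and sets the Content-Type header.
        self.write({"id": str(len(store)), "name": content["name"]})


async def main():
    # A plain list is a hypothetical stand-in for a real database handle.
    app = Application([("/users", Users)], database=[])
    sockets = bind_sockets(0, "127.0.0.1")  # port 0: the OS picks a free port
    server = HTTPServer(app)
    server.add_sockets(sockets)
    port = sockets[0].getsockname()[1]

    response = await AsyncHTTPClient().fetch(
        "http://127.0.0.1:%d/users" % port,
        method="POST",
        body=json.dumps({"name": "josh"}),
    )
    server.stop()
    return response.headers["Content-Type"], json.loads(response.body)


content_type, payload = asyncio.run(main())
print(content_type, payload)
```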

WSGI-Mode

            
import gevent.pywsgi
from tornado.wsgi import WSGIApplication
app = WSGIApplication([("/", Handler)])
# for giggles...
server = gevent.pywsgi.WSGIServer(('', 8080), app)
server.serve_forever()

Can't do Tornado's async stuff, though.

Authentication

              
  class BaseHandler(RequestHandler):

      @property
      def db(self):
          return self.application.settings["db"]

      def get_current_user(self):
          token = self.get_secure_cookie("token")
          user = self.db.users.find_one({"token": token})
          return user or None
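
What makes the cookie "secure" is a signature, keyed on the application's cookie_secret setting. A sketch using the module-level helpers in tornado.web that the cookie methods rely on; the secret and token values are placeholders.

```python
from tornado.web import create_signed_value, decode_signed_value

# Hypothetical secret; a real app passes this as the cookie_secret setting.
COOKIE_SECRET = "not-a-real-secret"

# Roughly what set_secure_cookie("token", "abc123") stores in the browser.
signed = create_signed_value(COOKIE_SECRET, "token", "abc123")

# Roughly what get_secure_cookie("token") does with it on the way back in.
token = decode_signed_value(COOKIE_SECRET, "token", signed)

# A value signed with a different secret fails verification.
forged = decode_signed_value("some-other-secret", "token", signed)

print(token, forged)
```

The value is signed, not encrypted, so don't put anything in there you wouldn't want the user to read.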

Authentication

              
  class LoginHandler(BaseHandler):

      def post(self):
          if self.current_user:
              return self.redirect("/dashboard")
          username = self.get_argument("username")
          password = self.get_argument("password")
          user = self.db.users.find_one({"username": username})
          if not user or user["password"] != password:
              return self.redirect("/login")
          self.set_secure_cookie("token", user["token"])
          self.redirect("/dashboard")

Authentication

              
  class Dashboard(BaseHandler):

      @tornado.web.authenticated
      def get(self):
          self.render("dashboard.htm", user=self.current_user)

Asynchronous

            
class AsyncHandler(RequestHandler):

    @tornado.web.asynchronous
    def get(self):
        service.fetch("/other/thing", callback=self._callback)
        # request stays open...

    def _callback(self, response):
        if response:
            self.write(response)
        else:
            raise HTTPError(500, "We couldn't find it.")
        # without finish, the request never closes
        self.finish()
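
On recent Tornado releases (6.x) the same pattern is usually written as a coroutine rather than with the decorator and a callback. A sketch under that assumption; fetch_from_service is a hypothetical stand-in for the service call above.

```python
import asyncio

from tornado.httpclient import AsyncHTTPClient
from tornado.httpserver import HTTPServer
from tornado.netutil import bind_sockets
from tornado.web import Application, RequestHandler


async def fetch_from_service(path):
    # Hypothetical stand-in for service.fetch(): sleeping yields control
    # to the IOLoop, much like waiting on a real network call would.
    await asyncio.sleep(0.01)
    return "payload for %s" % path


class AsyncHandler(RequestHandler):
    async def get(self):
        # The request stays open while we wait; the loop stays free.
        response = await fetch_from_service("/other/thing")
        self.write(response)
        # No explicit finish(): the request closes when the coroutine returns.


async def main():
    server = HTTPServer(Application([("/", AsyncHandler)]))
    sockets = bind_sockets(0, "127.0.0.1")  # port 0: the OS picks a free port
    server.add_sockets(sockets)
    port = sockets[0].getsockname()[1]

    response = await AsyncHTTPClient().fetch("http://127.0.0.1:%d/" % port)
    server.stop()
    return response.body


body = asyncio.run(main())
print(body)
```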

And More

There are lots of other helpers in the web stack.

(Websockets, error helpers, etc.)

Dig around in the documentation.

HTTPClient

Simple Usage

            
from tornado.httpclient import AsyncHTTPClient

def callback(response):
    print(response.code)
    print(response.body)

client = AsyncHTTPClient()
client.fetch("http://foo.com/some/resource", callback)

More Involved

            
def fetch_callback(response):
    print(response.headers)
    print(response.body)

client = AsyncHTTPClient()
client.fetch("https://foo.com/some/resource",
    method="POST", body=json.dumps({"key": "value"}),
    headers={"Content-type": "application/json"},
    auth_username="foo", auth_password="bar",
    validate_cert=False, callback=fetch_callback)
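
If the keyword list gets long, the same options can be bundled into an HTTPRequest object and handed to fetch() instead of a URL. A sketch that only builds the request, so nothing touches the network; the URL and credentials are the placeholders from above.

```python
import json

from tornado.httpclient import HTTPRequest

request = HTTPRequest(
    "https://foo.com/some/resource",
    method="POST",
    body=json.dumps({"key": "value"}),
    headers={"Content-type": "application/json"},
    auth_username="foo",
    auth_password="bar",
    validate_cert=False,
)

# The body is stored as bytes, ready to go over the wire.
print(request.method, request.url, request.body)

# Later, on a recent Tornado and inside a coroutine:
#     response = await AsyncHTTPClient().fetch(request)
```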

You'll probably use this a lot.

It's especially useful when used with auth.

(seeggwwaaayyy...)

Auth Libraries

Tornado includes mixins for lots of third-party auth providers.

This fits nicely with the async nature of Tornado.

Also, you can be lazy about your auth system.

            
import tornado.web
from tornado.auth import TwitterMixin
from tornado.web import HTTPError, RequestHandler

class TwitterAuthHandler(RequestHandler, TwitterMixin):
    @tornado.web.asynchronous
    def get(self):
        if self.get_argument("oauth_token", None):
            return self.get_authenticated_user(self._on_auth)
        self.authorize_redirect()

    def _on_auth(self, user):
        if not user:
            raise HTTPError(500, "Couldn't auth.")
        user = self.db.create({"username": user["username"]})
        self.set_secure_cookie("token", user.token)
        self.redirect("/dashboard")

TCP-Based Libraries

Items of Interest

Deployment

To Async or Not?

Testing

Libraries of Import

Questions?

@joshmarshall on the Twitters.

@joshmarshall on the GitHub.