Open Source

Concurrency and Python

By Shannon -jj Behrens, February 03, 2008

Stackless Python, Erlang, and greenlets are interesting approaches to concurrency

Asynchronous Programming

What happens if you have a lot of sockets that are waiting to read or write data? Asynchronous programming lets you write code that basically says, "Call my callback when you actually have something for me." Although this approach is used all the time in C, it's even nicer in Python because Python has first-class functions.

These days, there are many servers written asynchronously. nginx, a lightweight Web server often used in place of Apache, is both very fast and highly concurrent. Squid, the popular open source Web proxy, is also written asynchronously. This makes a lot of sense if you think about what a Web proxy does: it spends all of its time managing a ton of sockets, funneling data between clients and servers.

Asynchronous programming starts with operating system APIs such as select, poll, kqueue, aio, and epoll. These APIs let you write code that basically says, "These are the file descriptors I'm working with. Which of them is ready for me to do some reading or writing?" In Python, libraries like the built-in asyncore module and the popular Twisted framework take these low-level APIs and orchestrate callback systems on top of them.
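To make the "which descriptors are ready?" question concrete, here is a minimal sketch written for Python 3, using the standard library's selectors module (a thin wrapper over select, poll, epoll, or kqueue, whichever the platform offers). The socket pair merely stands in for a real client and server, and the names on_readable and received are made up for illustration.

```python
import selectors
import socket

# A connected pair of sockets stands in for a client and a server.
left, right = socket.socketpair()
left.setblocking(False)
right.setblocking(False)

received = []

def on_readable(sock):
    # Called only after the selector reports that sock has data waiting.
    received.append(sock.recv(1024))

selector = selectors.DefaultSelector()
# Register interest in read events and attach the callback as user data.
selector.register(right, selectors.EVENT_READ, on_readable)

left.send(b"hello")

# Ask the OS which registered descriptors are ready, then dispatch callbacks.
for key, events in selector.select(timeout=1):
    key.data(key.fileobj)

selector.unregister(right)
selector.close()
left.close()
right.close()

print(received)
```

A real server would run the select-and-dispatch loop forever and register many sockets; frameworks such as Twisted hide that loop behind their callback systems.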

Let's look at an example of asynchronous code. First, the linear (non-asynchronous) code in Example 4.

def handle_request(request):
    # Blocks here until the database replies.
    data = talk_to_database()
    print("Processing request with data from database.")

Example 4: Non-asynchronous Code

Rewritten asynchronously, you end up with something like Example 5. (You could move use_data into a new top-level function after handle_request, but defining it inline is convenient because it keeps access to request via a closure.)

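def handle_request(request):
    def use_data(data):
        # Runs later, once the database reply arrives.
        print("Processing request with data from database.")
    # Returns immediately with a deferred instead of the data.
    deferred = talk_to_database()
    deferred.addCallback(use_data)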

Example 5: Asynchronous Code

Notice that the talk_to_database function no longer returns a value directly. Rather, it returns a deferred object to which you can attach callbacks.
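To see what "returns a deferred object" might mean mechanically, here is a toy sketch in Python 3. ToyDeferred is a hypothetical stand-in that only mimics the addCallback/callback idea; it is not Twisted's implementation. The pending list is also an assumption, playing the part of an event loop that fires the deferred once the reply arrives.

```python
class ToyDeferred:
    """A hypothetical stand-in for a deferred; not Twisted's implementation."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def addCallback(self, fn):
        if self._fired:
            # Result already known: run the callback immediately.
            self._result = fn(self._result)
        else:
            self._callbacks.append(fn)
        return self

    def callback(self, result):
        # Deliver the result, passing each callback's return value to the next.
        self._fired = True
        self._result = result
        for fn in self._callbacks:
            self._result = fn(self._result)
        self._callbacks = []


pending = []  # deferreds awaiting a (simulated) database reply
log = []

def talk_to_database():
    d = ToyDeferred()
    pending.append(d)
    return d  # returns immediately; no data yet

def handle_request(request):
    def use_data(data):
        log.append("request %s got %s" % (request, data))
    talk_to_database().addCallback(use_data)

handle_request("r1")
log.append("handle_request returned")
# Later, the event loop would notice the reply and fire the deferred.
pending.pop().callback("row 42")
print(log)
```

The log shows handle_request returning before use_data runs, which is the whole point: the caller is free to service other sockets in the meantime.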

This is called "continuation passing style". Rather than waiting for a function to return, you pass a callback that says how to continue once the data is obtained. Because you must use continuation passing style any time you call a function that might block, it soon permeates your codebase. This can be painful, and it rules out any library that does blocking I/O: to fit in, a library must itself be written in continuation passing style.
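A small sketch (Python 3) of how the style spreads: once the lowest layer takes a callback, so must every layer above it. All of the names here (read_row, load_user, render_greeting, scheduled) are hypothetical, and the scheduled list is only a pretend event loop.

```python
events = []
scheduled = []  # stand-in for an event loop's queue of ready callbacks

def read_row(key, on_done):
    # Pretend the I/O completes later: queue the continuation instead of returning.
    scheduled.append(lambda: on_done({"id": key, "name": "alice"}))

def load_user(key, on_done):
    # Cannot "return" the user; it must forward a continuation down to read_row.
    read_row(key, lambda row: on_done(row["name"]))

def render_greeting(key, on_done):
    # Same again one level up: every caller of a non-blocking function takes a callback.
    load_user(key, lambda name: on_done("Hello, %s" % name))

render_greeting(7, events.append)
events.append("render_greeting returned")

# Run the pretend event loop until no callbacks remain.
while scheduled:
    scheduled.pop(0)()

print(events)
```

None of the three functions can simply return its answer, so a library written in the ordinary blocking style has no place to plug in.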

On the other hand, living in the asynchronous ghetto has its benefits. Aside from the clear concurrency benefits, the Twisted codebase is widely regarded as well-written code, and it provides implementations for most popular protocols.

Subroutines Versus Coroutines

In the beginning, there was the GOTO. It didn't take any parameters, and it was a one-way trip.

A coroutine is like a subroutine, except it doesn't necessarily return. With subroutines, you can do things like:

f -> g -> h (return to g, return to f)

With coroutines, you can do things like:

f -> g -> h -> f

Coroutines can be used for simple cooperative multitasking. The Python Cookbook has a great recipe for coroutines based on generators. Example 6 is a simple version of it.

import itertools

def my_coro(name):
    count = 0
    while True:
        count += 1
        print("%s %s" % (name, count))
        yield  # Hand control back to the scheduler.

coros = [my_coro('coro1'), my_coro('coro2')]
for coro in itertools.cycle(coros):  # A round-robin scheduler :)
    next(coro)  # Resume the coroutine until its next yield.
# Produces (until interrupted):
#
# coro1 1
# coro2 1
# coro1 2
# coro2 2
# ...

Example 6: Generator-based Coroutines

Using generators to implement coroutines is definitely a cute hack. By the way, this same trick can be used in Twisted to alleviate some of the need to use callbacks everywhere.
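As a rough illustration of how generators can stand in for callbacks, the Python 3 sketch below has a generator yield a deferred-like object, and a small driver resume it with the result when that object fires. Twisted ships its own, far more complete take on this idea (inlineCallbacks); ToyDeferred, run, and pending here are hypothetical and only show the shape of the trick.

```python
class ToyDeferred:
    """Hypothetical minimal deferred: holds one callback, fired once."""

    def __init__(self):
        self._fn = None

    def addCallback(self, fn):
        self._fn = fn

    def callback(self, result):
        if self._fn is not None:
            self._fn(result)


pending = []  # deferreds awaiting a (simulated) database reply
log = []

def talk_to_database():
    d = ToyDeferred()
    pending.append(d)
    return d

def run(gen):
    """Drive gen: each yielded deferred resumes the generator when it fires."""
    def step(value=None):
        try:
            d = gen.send(value)
        except StopIteration:
            return
        d.addCallback(step)
    step()

def handle_request(request):
    # Reads like linear code; the yield marks where control returns to the loop.
    data = yield talk_to_database()
    log.append("request %s got %s" % (request, data))

run(handle_request("r1"))
log.append("run returned")
# Simulate the database reply arriving later.
pending.pop().callback("row 42")
print(log)
```

The body of handle_request looks almost like Example 4 again, yet it still gives up control while the database call is outstanding.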

On the other hand, there are some limitations to this technique. Specifically, yield only works directly in the body of the generator itself. What happens if my_coro calls some function f and f wants to yield? There are some workarounds, but the limitation is actually pretty core to Python. (Because Python isn't stackless, it can't support true continuations in the same way that Scheme can.) I've written about this topic in detail on my blog.
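The limitation is easy to reproduce. In the Python 3 sketch below, f contains a yield, so calling it from naive_coro merely creates a generator object and f's body never runs. The delegating version uses yield from, which arrived in Python 3.3, well after this article was written; it is shown as one possible workaround, not necessarily one of those alluded to above. All function names are made up for illustration.

```python
log = []

def f(name):
    # Containing yield makes f itself a generator function.
    log.append("%s: inside f" % name)
    yield

def naive_coro(name):
    log.append("%s: before f" % name)
    f(name)  # Only creates a generator object; f's body never runs.
    log.append("%s: after f" % name)
    yield

def delegating_coro(name):
    log.append("%s: before f" % name)
    yield from f(name)  # Python 3.3+: f's yield suspends this coroutine too.
    log.append("%s: after f" % name)
    yield

naive = naive_coro("naive")
next(naive)  # Runs straight past f to naive_coro's own yield.

deleg = delegating_coro("deleg")
next(deleg)  # Suspends at the yield inside f.
next(deleg)  # f finishes; delegating_coro continues to its own yield.

print(log)
```

Note that every intermediate function still has to opt in with yield from, so the suspension is explicit at each level rather than transparent as it would be with true continuations.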
