PEP: 342
Title: Coroutines via Enhanced Generators
Version: $Revision$
Last-Modified: $Date$
Author: Guido van Rossum, Phillip J. Eby
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 10-May-2005
Python-Version: 2.5
Post-History:

Introduction

This PEP proposes some enhancements to the API and syntax of generators,
to make them usable as simple coroutines. It is basically a combination
of ideas from these two PEPs, which may be considered redundant if this
PEP is accepted:

-   PEP 288, Generators Attributes and Exceptions. The current PEP
    covers its second half, generator exceptions (in fact the throw()
    method name was taken from PEP 288). PEP 342 replaces generator
    attributes, however, with a concept from an earlier revision of PEP
    288, the yield expression.
-   PEP 325, Resource-Release Support for Generators. PEP 342 ties up a
    few loose ends in the PEP 325 spec, to make it suitable for actual
    implementation.

Motivation

Coroutines are a natural way of expressing many algorithms, such as
simulations, games, asynchronous I/O, and other forms of event-driven
programming or co-operative multitasking. Python's generator functions
are almost coroutines -- but not quite -- in that they allow pausing
execution to produce a value, but do not provide for values or
exceptions to be passed in when execution resumes. They also do not
allow execution to be paused within the try portion of try/finally
blocks, and therefore make it difficult for an aborted coroutine to
clean up after itself.

Also, generators cannot yield control while other functions are
executing, unless those functions are themselves expressed as
generators, and the outer generator is written to yield in response to
values yielded by the inner generator. This complicates the
implementation of even relatively simple use cases like asynchronous
communications, because calling any functions either requires the
generator to block (i.e. be unable to yield control), or else a lot of
boilerplate looping code must be added around every needed function
call.

However, if it were possible to pass values or exceptions into a
generator at the point where it was suspended, a simple coroutine
scheduler or trampoline function would let coroutines call each other
without blocking -- a tremendous boon for asynchronous applications.
Such applications could then write coroutines to do non-blocking socket
I/O by yielding control to an I/O scheduler until data has been sent or
becomes available. Meanwhile, code that performs the I/O would simply do
something like this:

    data = (yield nonblocking_read(my_socket, nbytes))

in order to pause execution until the nonblocking_read() coroutine
produced a value.

In other words, with a few relatively minor enhancements to the language
and to the implementation of the generator-iterator type, Python will be
able to support performing asynchronous operations without needing to
write the entire application as a series of callbacks, and without
requiring the use of resource-intensive threads for programs that need
hundreds or even thousands of co-operatively multitasking pseudothreads.
Thus, these enhancements will give standard Python many of the benefits
of the Stackless Python fork, without requiring any significant
modification to the CPython core or its APIs. In addition, these
enhancements should be readily implementable by any Python
implementation (such as Jython) that already supports generators.

Specification Summary

By adding a few simple methods to the generator-iterator type, and with
two minor syntax adjustments, Python developers will be able to use
generator functions to implement coroutines and other forms of
co-operative multitasking. These methods and adjustments are:

1.  Redefine yield to be an expression, rather than a statement. The
    current yield statement would become a yield expression whose value
    is thrown away. A yield expression's value is None whenever the
    generator is resumed by a normal next() call.
2.  Add a new send() method for generator-iterators, which resumes the
    generator and sends a value that becomes the result of the current
    yield-expression. The send() method returns the next value yielded
    by the generator, or raises StopIteration if the generator exits
    without yielding another value.
3.  Add a new throw() method for generator-iterators, which raises an
    exception at the point where the generator was paused, and which
    returns the next value yielded by the generator, raising
    StopIteration if the generator exits without yielding another value.
    (If the generator does not catch the passed-in exception, or raises
    a different exception, then that exception propagates to the
    caller.)
4.  Add a close() method for generator-iterators, which raises
    GeneratorExit at the point where the generator was paused. If the
    generator then raises StopIteration (by exiting normally, or due to
    already being closed) or GeneratorExit (by not catching the
    exception), close() returns to its caller. If the generator yields a
    value, a RuntimeError is raised. If the generator raises any other
    exception, it is propagated to the caller. close() does nothing if
    the generator has already exited due to an exception or normal exit.
5.  Add support to ensure that close() is called when a generator
    iterator is garbage-collected.
6.  Allow yield to be used in try/finally blocks, since garbage
    collection or an explicit close() call would now allow the finally
    clause to execute.

A prototype patch implementing all of these changes against the current
Python CVS HEAD is available as SourceForge patch #1223381
(https://bugs.python.org/issue1223381).

Specification: Sending Values into Generators

New generator method: send(value)

A new method for generator-iterators is proposed, called send(). It
takes exactly one argument, which is the value that should be sent in to
the generator. Calling send(None) is exactly equivalent to calling a
generator's next() method. Calling send() with any other value is the
same, except that the value produced by the generator's current yield
expression will be different.

Because generator-iterators begin execution at the top of the
generator's function body, there is no yield expression to receive a
value when the generator has just been created. Therefore, calling
send() with a non-None argument is prohibited when the generator
iterator has just started, and a TypeError is raised if this occurs
(presumably due to a logic error of some kind). Thus, before you can
communicate with a coroutine you must first call next() or send(None) to
advance its execution to the first yield expression.

As with the next() method, the send() method returns the next value
yielded by the generator-iterator, or raises StopIteration if the
generator exits normally, or has already exited. If the generator raises
an uncaught exception, it is propagated to send()'s caller.
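
The rules above can be sketched with a small accumulator. This is an illustrative example only: the names running_total and one_shot are hypothetical, and the code uses the next(g) builtin spelling, which is assumed here in place of the g.next() method named in this PEP.

```python
def running_total():
    """Yield the sum of every value sent in so far."""
    total = 0
    while True:
        received = yield total   # value passed to send(), or None for next()
        if received is not None:
            total += received

acc = running_total()

# A just-started generator has no yield expression waiting for a value,
# so sending a non-None value is rejected with TypeError.
try:
    acc.send(5)
    started_early = True
except TypeError:
    started_early = False

first = next(acc)        # advance to the first yield; same as acc.send(None)
second = acc.send(10)    # resumes the generator, returns the next yielded value
third = acc.send(5)

def one_shot():
    yield "ready"

g = one_shot()
next(g)
try:
    g.send("ignored")    # the generator exits without yielding again
    finished = False
except StopIteration:
    finished = True
```

Here send() plays both roles described above: it delivers a value to the suspended yield expression and returns whatever the generator yields next.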

New syntax: Yield Expressions

The yield-statement will be allowed to be used on the right-hand side of
an assignment; in that case it is referred to as a yield-expression. The
value of this yield-expression is None unless send() was called with a
non-None argument; see below.

A yield-expression must always be parenthesized except when it occurs as
the top-level expression on the right-hand side of an assignment. So

    x = yield 42
    x = yield
    x = 12 + (yield 42)
    x = 12 + (yield)
    foo(yield 42)
    foo(yield)

are all legal, but

    x = 12 + yield 42
    x = 12 + yield
    foo(yield 42, 12)
    foo(yield, 12)

are all illegal. (Some of the edge cases are motivated by the current
legality of yield 12, 42.)
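
One way to check the parenthesization rule is to hand small function bodies to the built-in compile() and see which are rejected. This is a sketch under stated assumptions: the helper name compiles is hypothetical, and only the assignment forms are checked; the function-call forms are left out of this sketch.

```python
def compiles(statement):
    """Return True if `statement` is accepted inside a function body."""
    source = "def f():\n    " + statement + "\n"
    try:
        compile(source, "<yield-example>", "exec")
        return True
    except SyntaxError:
        return False

legal = [
    "x = yield 42",
    "x = yield",
    "x = 12 + (yield 42)",
    "x = 12 + (yield)",
]
illegal = [
    "x = 12 + yield 42",
    "x = 12 + yield",
]

legal_results = [compiles(s) for s in legal]
illegal_results = [compiles(s) for s in illegal]
```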

Note that a yield-statement or yield-expression without an expression is
now legal. This makes sense: when the information flow in the next()
call is reversed, it should be possible to yield without passing an
explicit value (yield is of course equivalent to yield None).

When send(value) is called, the yield-expression that it resumes will
return the passed-in value. When next() is called, the resumed
yield-expression will return None. If the yield-expression is a
yield-statement, this returned value is ignored, similar to ignoring the
value returned by a function call used as a statement.

In effect, a yield-expression is like an inverted function call; the
argument to yield is in fact returned (yielded) from the currently
executing function, and the return value of yield is the argument passed
in via send().
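
The "inverted function call" view can be illustrated by recording what each yield expression evaluates to. The names echo_log and bare are hypothetical, and next(g) is assumed as the spelling of the next() call.

```python
def echo_log(log):
    """Record the value that each yield expression returns."""
    while True:
        # "waiting" travels out to the caller; the argument of send()
        # (or None for next()) travels back in as the expression's value.
        received = yield "waiting"
        log.append(received)

log = []
g = echo_log(log)
out1 = next(g)          # runs to the first yield
out2 = next(g)          # resumes the yield expression with None
out3 = g.send("hello")  # resumes the yield expression with "hello"

def bare():
    got = yield          # a bare yield is equivalent to yield None
    yield got

b = bare()
bare_out = next(b)
bare_back = b.send(7)
```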

Note: the syntactic extensions to yield make its use very similar to
that in Ruby. This is intentional. Do note that in Python the block
passes a value to the generator using send(EXPR) rather than
return EXPR, and the underlying mechanism whereby control is passed
between the generator and the block is completely different. Blocks in
Python are not compiled into thunks; rather, yield suspends execution of
the generator's frame. Some edge cases work differently; in Python, you
cannot save the block for later use, and you cannot test whether there
is a block or not. (XXX - this stuff about blocks seems out of place
now, perhaps Guido can edit to clarify.)

Specification: Exceptions and Cleanup

Let a generator object be the iterator produced by calling a generator
function. Below, g always refers to a generator object.

New syntax: yield allowed inside try-finally

The syntax for generator functions is extended to allow a
yield-statement inside a try-finally statement.

New generator method: throw(type, value=None, traceback=None)

g.throw(type, value, traceback) causes the specified exception to be
thrown at the point where the generator g is currently suspended (i.e.
at a yield-statement, or at the start of its function body if next() has
not been called yet). If the generator catches the exception and yields
another value, that is the return value of g.throw(). If it doesn't
catch the exception, throw() appears to raise the same exception that
was passed to it (the exception falls through). If the generator raises
another exception
(this includes the StopIteration produced when it returns) that
exception is raised by the throw() call. In summary, throw() behaves
like next() or send(), except it raises an exception at the suspension
point. If the generator is already in the closed state, throw() just
raises the exception it was passed without executing any of the
generator's code.

The effect of raising the exception is exactly as if the statement:

    raise type, value, traceback

was executed at the suspension point. The type argument must not be
None, and the type and value must be compatible. If the value is not an
instance of the type, a new exception instance is created using the
value, following the same rules that the raise statement uses to create
an exception instance. The traceback, if supplied, must be a valid
Python traceback object, or a TypeError occurs.
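
The cases described for throw() can be walked through with one small generator. This is a sketch, not a reference: the generator name resilient is hypothetical, each exception is passed as a ready-made instance in the first argument, and next(g) is assumed as the spelling of the next() call.

```python
def resilient():
    """Recover from ValueError; let every other exception escape."""
    while True:
        try:
            yield "running"
        except ValueError as err:
            yield "recovered from %s" % err

# Thrown before the first next(): raised at the start of the function
# body, before the try block is entered, so it propagates.
fresh = resilient()
try:
    fresh.throw(ValueError("too early"))
    early = None
except ValueError as err:
    early = str(err)

g = resilient()
next(g)                                      # suspend at the first yield
caught = g.throw(ValueError("bad input"))    # handled; returns the next yield

# The generator is now suspended inside its except clause, where a
# KeyError is not handled, so the exception falls through to the caller.
try:
    g.throw(KeyError("missing"))
    escaped = None
except KeyError as err:
    escaped = err

# The generator has exited, so throw() just raises what it is given.
try:
    g.throw(RuntimeError("late"))
    late_raised = False
except RuntimeError:
    late_raised = True
```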

Note: The name of the throw() method was selected for several reasons.
Raise is a keyword and so cannot be used as a method name. Unlike raise
(which immediately raises an exception from the current execution
point), throw() first resumes the generator, and only then raises the
exception. The word throw is suggestive of putting the exception in
another location, and is already associated with exceptions in other
languages.

Alternative method names were considered: resolve(), signal(),
genraise(), raiseinto(), and flush(). None of these seem to fit as well
as throw().

New standard exception: GeneratorExit

A new standard exception is defined, GeneratorExit, inheriting from
Exception. A generator should handle this by re-raising it (or just not
catching it) or by raising StopIteration.
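
As an illustration of handling GeneratorExit, a generator may catch it, flush pending work, and then return, which ends the generator with StopIteration. The sketch below assumes hypothetical names (consumer, pager, collector) and uses next(gen) as the spelling of the priming next() call.

```python
def consumer(func):
    """Advance a new generator to its first yield before returning it."""
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        next(gen)
        return gen
    return wrapper

@consumer
def pager(page_size, destination):
    """Group received items into lists of page_size items."""
    while True:
        page = []
        try:
            while len(page) < page_size:
                page.append((yield))
        except GeneratorExit:
            # The generator is being closed: flush the partial page,
            # close the downstream consumer, then exit by returning.
            if page:
                destination.send(page)
            destination.close()
            return
        else:
            destination.send(page)

@consumer
def collector(pages):
    """Store every received page; does not catch GeneratorExit."""
    while True:
        pages.append((yield))

pages = []
pipeline = pager(2, collector(pages))
for item in ["a", "b", "c", "d", "e"]:
    pipeline.send(item)
pipeline.close()
```

The collector takes the other permitted route: it simply does not catch GeneratorExit, so the exception ends it.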

New generator method: close()

g.close() is defined by the following pseudo-code:

    def close(self):
        try:
            self.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            pass
        else:
            raise RuntimeError("generator ignored GeneratorExit")
        # Other exceptions are not caught
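
The behaviour spelled out by the pseudo-code can be observed from the outside. In this sketch the names worker and stubborn are hypothetical, and next(g) is assumed as the spelling of the next() call.

```python
events = []

def worker():
    try:
        while True:
            yield
    finally:
        # Runs because close() raises GeneratorExit at the paused yield.
        events.append("cleaned up")

w = worker()
next(w)
w.close()
w.close()     # already exited: the second call does nothing

def stubborn():
    try:
        yield
    except GeneratorExit:
        yield "still here"   # yielding instead of exiting is an error

s = stubborn()
next(s)
try:
    s.close()
    complaint = None
except RuntimeError as err:
    complaint = str(err)
```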

New generator method: __del__()

g.__del__() is a wrapper for g.close(). This will be called when the
generator object is garbage-collected (in CPython, this is when its
reference count goes to zero). If close() raises an exception, a
traceback for the exception is printed to sys.stderr and further
ignored; it is not propagated back to the place that triggered the
garbage collection. This is consistent with the handling of exceptions
in __del__() methods on class instances.
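
A minimal sketch of cleanup triggered by garbage collection follows. It assumes CPython's reference counting, under which dropping the last reference finalizes the generator at once; the gc.collect() call is only a nudge for other collectors. The name resource_user is hypothetical, and next(g) is assumed as the spelling of the next() call.

```python
import gc

cleanup_log = []

def resource_user():
    try:
        yield "in use"
    finally:
        # Reached when the abandoned generator is finalized.
        cleanup_log.append("released")

g = resource_user()
next(g)          # suspend inside the try block
del g            # drop the last reference without calling close()
gc.collect()     # no effect on the result under reference counting
```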

If the generator object participates in a cycle, g.__del__() may not be
called. This is the behavior of CPython's current garbage collector. The
reason for the restriction is that the GC code needs to break a cycle at
an arbitrary point in order to collect it, and from then on no Python
code should be allowed to see the objects that formed the cycle, as they
may be in an invalid state. Objects hanging off a cycle are not subject
to this restriction.

Note that a generator object is unlikely to participate in a cycle in
practice. However, storing a generator object in a global variable
creates a cycle via the generator frame's f_globals pointer. Another way
to create a cycle would be to store a reference to the generator object
in a data structure that is passed to the generator as an argument
(e.g., if an object has a method that's a generator, and keeps a
reference to a running iterator created by that method). Neither of
these cases is very likely given the typical patterns of generator use.

Also, in the CPython implementation of this PEP, the frame object used
by the generator should be released whenever its execution is terminated
due to an error or normal exit. This will ensure that generators that
cannot be resumed do not remain part of an uncollectable reference
cycle. This allows other code to potentially use close() in a
try/finally or with block (per PEP 343) to ensure that a given generator
is properly finalized.
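
For callers that do not want to rely on garbage collection, the with-block form might look like the sketch below. It uses contextlib.closing, which calls close() on its argument when the block exits; the generator name numbered_lines is hypothetical, and next(gen) is assumed as the spelling of the next() call.

```python
from contextlib import closing

released = []

def numbered_lines(lines):
    try:
        for number, line in enumerate(lines, 1):
            yield "%d: %s" % (number, line)
    finally:
        released.append(True)   # runs when the generator is closed

with closing(numbered_lines(["a", "b", "c"])) as gen:
    first = next(gen)
    # Leaving the block early: closing() calls gen.close() here, so the
    # finally clause runs even though the generator was not exhausted.
```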

Optional Extensions

The Extended continue Statement

An earlier draft of this PEP proposed a new continue EXPR syntax for use
in for-loops (carried over from PEP 340), that would pass the value of
EXPR into the iterator being looped over. This feature has been
withdrawn for the time being, because the scope of this PEP has been
narrowed to focus only on passing values into generator-iterators, and
not other kinds of iterators. It was also felt by some on the Python-Dev
list that adding new syntax for this particular feature would be
premature at best.

Open Issues

Discussion on python-dev has revealed some open issues. I list them
here, with my preferred resolutions and the motivation for each. The PEP
as currently written reflects these preferred resolutions.

1.  What exception should be raised by close() when the generator yields
    another value as a response to the GeneratorExit exception?

    I originally chose TypeError because it represents gross misbehavior
    of the generator function, which should be fixed by changing the
    code. But the with_template decorator class in PEP 343 uses
    RuntimeError for similar offenses. Arguably they should all use the
    same exception. I'd rather not introduce a new exception class just
    for this purpose, since it's not an exception that I want people to
    catch: I want it to turn into a traceback which is seen by the
    programmer who then fixes the code. So now I believe they should
    both raise RuntimeError. There are some precedents for that: it's
    raised by the core Python code in situations where endless recursion
    is detected, and for uninitialized objects (and for a variety of
    miscellaneous conditions).

2.  Oren Tirosh has proposed renaming the send() method to feed(), for
    compatibility with the consumer interface (see
    http://effbot.org/zone/consumer.htm for the specification.)

    However, looking more closely at the consumer interface, it seems
    that the desired semantics for feed() are different from those for
    send(), because send() can't be meaningfully called on a
    just-started generator. Also, the consumer interface as currently
    defined doesn't include handling for StopIteration.

    Therefore, it seems like it would probably be more useful to create
    a simple decorator that wraps a generator function to make it
    conform to the consumer interface. For example, it could warm up the
    generator with an initial next() call, trap StopIteration, and
    perhaps even provide reset() by re-invoking the generator function.
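
A decorator of the kind suggested for the feed() question might be sketched as follows. Everything here is hypothetical: the names GeneratorConsumer, as_consumer and take_two are invented for illustration, and the sketch assumes the consumer interface amounts to feed() and close() methods. It uses next(gen) as the spelling of the warm-up next() call.

```python
class GeneratorConsumer(object):
    """Hypothetical adapter offering feed(), close() and reset()."""

    def __init__(self, func, args, kwargs):
        self._func, self._args, self._kwargs = func, args, kwargs
        self.reset()

    def reset(self):
        # Re-invoke the generator function and warm it up.
        self._gen = self._func(*self._args, **self._kwargs)
        self.finished = False
        try:
            next(self._gen)
        except StopIteration:
            self.finished = True

    def feed(self, data):
        if self.finished:
            return                  # silently ignore data after the end
        try:
            self._gen.send(data)
        except StopIteration:
            self.finished = True    # trap normal exit

    def close(self):
        self._gen.close()
        self.finished = True


def as_consumer(func):
    def wrapper(*args, **kwargs):
        return GeneratorConsumer(func, args, kwargs)
    return wrapper


@as_consumer
def take_two(sink):
    sink.append((yield))
    sink.append((yield))

sink = []
c = take_two(sink)
c.feed("a")
c.feed("b")                 # the generator returns; StopIteration is trapped
c.feed("c")                 # ignored
done_after_two = c.finished
c.reset()
c.feed("x")
```

Whether feed() should ignore data after the generator ends, as it does here, or report an error is a design choice this sketch does not settle.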

Examples

1.  A simple consumer decorator that makes a generator function
    automatically advance to its first yield point when initially
    called:

        def consumer(func):
            def wrapper(*args,**kw):
                gen = func(*args, **kw)
                gen.next()
                return gen
            wrapper.__name__ = func.__name__
            wrapper.__dict__ = func.__dict__
            wrapper.__doc__  = func.__doc__
            return wrapper

2.  An example of using the consumer decorator to create a reverse
    generator that receives images and creates thumbnail pages, sending
    them on to another consumer. Functions like this can be chained
    together to form efficient processing pipelines of consumers that
    each can have complex internal state:

        import os

        @consumer
        def thumbnail_pager(pagesize, thumbsize, destination):
            while True:
                page = new_image(pagesize)
                rows, columns = pagesize / thumbsize
                pending = False
                try:
                    for row in xrange(rows):
                        for column in xrange(columns):
                            thumb = create_thumbnail((yield), thumbsize)
                            page.write(
                                thumb, column*thumbsize.x, row*thumbsize.y )
                            pending = True
                except GeneratorExit:
                    # close() was called, so flush any pending output
                    if pending:
                        destination.send(page)

                    # then close the downstream consumer, and exit
                    destination.close()
                    return
                else:
                    # we finished a page full of thumbnails, so send it
                    # downstream and keep on looping
                    destination.send(page)

        @consumer
        def jpeg_writer(dirname):
            fileno = 1
            while True:
                filename = os.path.join(dirname,"page%04d.jpg" % fileno)
                write_jpeg((yield), filename)
                fileno += 1


        # Put them together to make a function that makes thumbnail
        # pages from a list of images and other parameters.
        #
        def write_thumbnails(pagesize, thumbsize, images, output_dir):
            pipeline = thumbnail_pager(
                pagesize, thumbsize, jpeg_writer(output_dir)
            )

            for image in images:
                pipeline.send(image)

            pipeline.close()

3.  A simple coroutine scheduler or trampoline that lets coroutines
    call other coroutines by yielding the coroutine they wish to invoke.
    Any non-generator value yielded by a coroutine is returned to the
    coroutine that called the one yielding the value. Similarly, if a
    coroutine raises an exception, the exception is propagated to its
    caller. In effect, this example emulates simple tasklets as are used
    in Stackless Python, as long as you use a yield expression to invoke
    routines that would otherwise block. This is only a very simple
    example, and far more sophisticated schedulers are possible. (For
    example, the existing GTasklet framework for Python
    (http://www.gnome.org/~gjc/gtasklet/gtasklets.html) and the
    peak.events framework (http://peak.telecommunity.com/) already
    implement similar scheduling capabilities, but must currently use
    awkward workarounds for the inability to pass values or exceptions
    into generators.)

        import collections
        import sys
        import types

        class Trampoline:
            """Manage communications between coroutines"""

            running = False

            def __init__(self):
                self.queue = collections.deque()

            def add(self, coroutine):
                """Request that a coroutine be executed"""
                self.schedule(coroutine)

            def run(self):
                result = None
                self.running = True
                try:
                    while self.running and self.queue:
                       func = self.queue.popleft()
                       result = func()
                    return result
                finally:
                    self.running = False

            def stop(self):
                self.running = False

            def schedule(self, coroutine, stack=(), val=None, *exc):
                def resume():
                    value = val
                    try:
                        if exc:
                            value = coroutine.throw(value,*exc)
                        else:
                            value = coroutine.send(value)
                    except:
                        if stack:
                            # send the error back to the "caller"
                            self.schedule(
                                stack[0], stack[1], *sys.exc_info()
                            )
                            # the caller has been rescheduled with the
                            # error, so don't also send it a value
                            return
                        else:
                            # Nothing left in this pseudothread to
                            # handle it, let it propagate to the
                            # run loop
                            raise

                    if isinstance(value, types.GeneratorType):
                        # Yielded to a specific coroutine, push the
                        # current one on the stack, and call the new
                        # one with no args
                        self.schedule(value, (coroutine,stack))

                    elif stack:
                        # Yielded a result, pop the stack and send the
                        # value to the caller
                        self.schedule(stack[0], stack[1], value)

                    # else: this pseudothread has ended

                self.queue.append(resume)

4.  A simple echo server, and code to run it using a trampoline
    (presumes the existence of nonblocking_read, nonblocking_write, and
    other I/O coroutines, that e.g. raise ConnectionLost if the
    connection is closed):

        # coroutine function that echos data back on a connected
        # socket
        #
        def echo_handler(sock):
            while True:
                try:
                    data = yield nonblocking_read(sock)
                    yield nonblocking_write(sock, data)
                except ConnectionLost:
                    return  # exit normally if connection lost

        # coroutine function that listens for connections on a
        # socket, and then launches a service "handler" coroutine
        # to service the connection
        #
        def listen_on(trampoline, sock, handler):
            while True:
                # get the next incoming connection
                connected_socket = yield nonblocking_accept(sock)

                # start another coroutine to handle the connection
                trampoline.add( handler(connected_socket) )

        # Create a scheduler to manage all our coroutines
        t = Trampoline()

        # Create a coroutine instance to run the echo_handler on
        # incoming connections
        #
        server = listen_on(
            t, listening_socket("localhost","echo"), echo_handler
        )

        # Add the coroutine to the scheduler
        t.add(server)

        # loop forever, accepting connections and servicing them
        # "in parallel"
        #
        t.run()
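
Because the echo server depends on I/O coroutines that are not shown, the trampoline's calling convention may be easier to follow with a self-contained sketch. The code below is a condensed adaptation rather than the listing above: it passes an exception instance instead of the three-part exception info, and it assumes that a coroutine which simply ends hands None back to its caller. The coroutine names double, fail and main are hypothetical.

```python
import collections
import types

class Trampoline:
    """Condensed sketch of a coroutine scheduler."""

    def __init__(self):
        self.queue = collections.deque()

    def add(self, coroutine):
        self.schedule(coroutine)

    def run(self):
        while self.queue:
            self.queue.popleft()()

    def schedule(self, coroutine, stack=(), val=None, exc=None):
        def resume():
            try:
                if exc is not None:
                    value = coroutine.throw(exc)
                else:
                    value = coroutine.send(val)
            except StopIteration:
                # Assumption of this sketch: a coroutine that ends
                # returns None to its caller, if it has one.
                if stack:
                    self.schedule(stack[0], stack[1], None)
                return
            except Exception as err:
                if stack:
                    # send the error back to the "caller"
                    self.schedule(stack[0], stack[1], exc=err)
                    return
                raise
            if isinstance(value, types.GeneratorType):
                # call the yielded coroutine, remembering its caller
                self.schedule(value, (coroutine, stack))
            elif stack:
                # a plain value is a result for the caller
                self.schedule(stack[0], stack[1], value)
            # else: this pseudothread has ended
        self.queue.append(resume)

def double(x):
    yield x * 2              # "return" a value to the calling coroutine

def fail():
    raise ValueError("boom")
    yield                    # unreachable; makes this a generator function

def main(results):
    results.append((yield double(21)))
    try:
        yield fail()
    except ValueError as err:
        results.append("caught %s" % err)

results = []
t = Trampoline()
t.add(main(results))
t.run()
```

The main coroutine never calls double() or fail() directly to completion; it yields them, and the trampoline delivers their result or exception back at the same yield expression.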

Reference Implementation

A prototype patch implementing all of the features described in this PEP
is available as SourceForge patch #1223381
(https://bugs.python.org/issue1223381).

This patch was committed to CVS 01-02 August 2005.

Acknowledgements

Raymond Hettinger (PEP 288) and Samuele Pedroni (PEP 325) first formally
proposed the ideas of communicating values or exceptions into
generators, and the ability to close generators. Timothy Delaney
suggested the title of this PEP, and Steven Bethard helped edit a
previous version. See also the Acknowledgements section of PEP 340.

References

TBD.

Copyright

This document has been placed in the public domain.