PEP: 275
Title: Switching on Multiple Values
Author: Marc-André Lemburg <mal@lemburg.com>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Nov-2001
Python-Version: 2.6
Post-History:

Rejection Notice

A similar PEP for Python 3000, PEP 3103, was already rejected, so this
proposal has no chance of being accepted either.

Abstract

This PEP proposes strategies to enhance Python's performance when
switching on a single variable that can take one of multiple possible
values.

Problem

Up to Python 2.5, the typical way of writing multi-value switches has
been to use long if-elif-else chains of the following type:

    if x == 'first state':
        ...
    elif x == 'second state':
        ...
    elif x == 'third state':
        ...
    elif x == 'fourth state':
        ...
    else:
        # default handling
        ...

This works fine for short switch constructs, since the overhead of
repeatedly loading a local (the variable x in this case) and comparing
it to some constant is low (the lookup has a complexity of O(n) on
average, where n is the number of cases). However, when such a construct
is used to write a state machine, as is needed for writing parsers, the
number of possible states can easily reach 10 or more cases.

The current solution to this problem lies in using a dispatch table to
find the method implementing the case, depending on the value of
the switch variable (this can be tuned to have a complexity of O(1) on
average, e.g. by using perfect hash tables). This works well for state
machines which require complex and lengthy processing in the different
case methods. It does not perform well for ones which only process one
or two instructions per case, since the cost of the method call then
dominates, e.g.

    def handle_data(self, data):
        self.stack.append(data)

A nice example of this is the state machine implemented in pickle.py
which is used to serialize Python objects. Other prominent cases include
XML SAX parsers and Internet protocol handlers.
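The dispatch-table approach described above can be sketched as follows,
written in current Python 3 syntax so that it runs as-is. The class
StackMachine, the opcode names and the methods handle_pop and
handle_default are hypothetical and chosen only for this illustration;
handle_data is the method shown earlier.

```python
class StackMachine:
    """Toy state machine that dispatches on an opcode string."""

    def __init__(self):
        self.stack = []
        # Dispatch table: switch value -> bound method implementing the case.
        self.dispatch = {
            "push": self.handle_data,
            "pop": self.handle_pop,
        }

    def handle_data(self, data):
        self.stack.append(data)

    def handle_pop(self, data):
        self.stack.pop()

    def handle_default(self, data):
        raise ValueError("unknown opcode")

    def feed(self, opcode, data=None):
        # One dictionary lookup replaces the chain of comparisons, but
        # every case now pays for an additional method call.
        handler = self.dispatch.get(opcode, self.handle_default)
        handler(data)
```

Each case body here is a single statement, so most of the time spent in
feed goes into the lookup and the call rather than into the case itself.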

Proposed Solutions

This PEP proposes two different but not necessarily conflicting
solutions:

1.  Adding an optimization to the Python compiler and VM which detects
    the above if-elif-else construct and generates special opcodes for
    it which use a read-only dictionary for storing jump offsets.
2.  Adding new syntax to Python which mimics the C-style switch
    statement.

The first solution has the benefit of not relying on adding new keywords
to the language, while the second looks cleaner. Both involve some
run-time overhead to ensure that the switching variable is immutable and
hashable.

Both solutions use a dictionary lookup to find the right jump location,
so they share the same constraint: both the switch variable and the
constants need to be compatible with the dictionary implementation
(hashable, comparable, a==b => hash(a)==hash(b)).
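The following sketch, in Python 3 syntax, illustrates why the condition
a==b => hash(a)==hash(b) matters for a dictionary-based jump table. The
class BrokenKey and the variable names are hypothetical; they only serve
to show a type for which comparison and lookup disagree.

```python
# Equal values of different builtin types hash identically, so they
# select the same entry of a dictionary-based jump table.
table = {1: "one", "1": "string one"}

assert 1 == 1.0 and hash(1) == hash(1.0)
assert table[1.0] == "one"          # float key finds the int entry
assert table["1"] == "string one"   # "1" != 1, so it is a separate case


class BrokenKey:
    """Hypothetical type: comparable, but a==b does not imply equal hashes."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, BrokenKey) and self.value == other.value

    # Identity-based hash: distinct but equal instances hash differently.
    __hash__ = object.__hash__


cases = {BrokenKey("a"): "case a"}
probe = BrokenKey("a")

# An if-elif chain would match (== is true), a table lookup would not.
matches_by_comparison = any(probe == key for key in cases)
matches_by_lookup = probe in cases
```

For such a type, replacing the comparison chain with a lookup would
silently change which branch is taken.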

Solution 1: Optimizing if-elif-else

Implementation:

It should be possible for the compiler to detect an if-elif-else
construct which has the following signature:

    if x == 'first':...
    elif x == 'second':...
    else:...

i.e. the left-hand side always references the same variable, and the
right-hand side is a constant of a hashable immutable builtin type. The
right-hand sides need not all be of the same type, but they should be
comparable to the type of the left-hand switch variable.

The compiler could then set up a read-only (perfect) hash table, store it
in the constants and add an opcode SWITCH in front of the standard
if-elif-else byte code stream which triggers the following run-time
behaviour:

At runtime, SWITCH would check x for being one of the well-known
immutable types (strings, unicode, numbers) and use the hash table for
finding the right opcode snippet. If this condition is not met, the
interpreter should revert to the standard if-elif-else processing by
simply skipping the SWITCH opcode and proceeding with the usual
if-elif-else byte code stream.
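The run-time behaviour described above can be approximated in plain
Python. This is only a sketch of the idea, not the proposed opcode: the
names _SAFE_TYPES, _JUMP_TABLE, _BRANCHES and classify are hypothetical,
the type list uses Python 3 names, and the "branches" are reduced to
return values.

```python
# Exact types for which the table lookup is assumed to be safe.
_SAFE_TYPES = (str, bytes, int, float)

_JUMP_TABLE = {"first": 0, "second": 1}   # constant -> index of the branch
_BRANCHES = ("first branch", "second branch")
_DEFAULT = "else branch"


def classify(x):
    # Fast path (the SWITCH opcode): only taken for exact builtin types,
    # not for subtypes, so no user-defined __eq__ or __hash__ is bypassed.
    if type(x) in _SAFE_TYPES:
        index = _JUMP_TABLE.get(x)
        return _DEFAULT if index is None else _BRANCHES[index]

    # Slow path: the unchanged if-elif-else comparison chain.
    if x == "first":
        return _BRANCHES[0]
    elif x == "second":
        return _BRANCHES[1]
    else:
        return _DEFAULT
```

An instance of a str subclass, or of any class with its own __eq__,
skips the table and is compared case by case exactly as before.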

Issues:

The new optimization should not change the current Python semantics (by
reducing the number of __cmp__ calls and adding __hash__ calls in
if-elif-else constructs which are affected by the optimization). To
ensure this, switching can only safely be implemented if either a "from
__future__" style flag is used, or the switching variable is one of the
builtin immutable types: int, float, string, unicode, etc. (not
subtypes, since it's not clear whether these are still immutable or not).

To prevent post-modifications of the jump-table dictionary (which could
be used to reach protected code), the jump-table will have to be a
read-only type (e.g. a read-only dictionary).
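In current Python 3, a read-only view of a dictionary is available as
types.MappingProxyType, which postdates this PEP. The sketch below uses
it to show what a write-protected jump table could look like; the table
contents and the helper try_to_patch are hypothetical.

```python
from types import MappingProxyType

# Hypothetical jump table: case constant -> byte code offset.
_offsets = {"one": 14, "two": 22, "three": 30}

# Read-only view; lookups work, item assignment raises TypeError.
JUMP_TABLE = MappingProxyType(_offsets)


def try_to_patch(table):
    """Return True if the table could be modified, False otherwise."""
    try:
        table["one"] = 38
    except TypeError:
        return False
    return True
```

Note that the proxy only protects the view: code holding a reference to
the underlying _offsets dictionary could still change it, so an actual
implementation would have to keep that dictionary out of reach.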

The optimization should only be used for if-elif-else constructs which
have a minimum number of n cases (where n is a number which has yet to
be defined depending on performance tests).
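One way to look for a suitable n is to time both variants for growing
chain lengths. The following is a sketch of such a test under stated
assumptions: make_chain, make_table and compare are hypothetical
helpers, and a plain dictionary lookup stands in for the SWITCH opcode,
which this PEP only proposes.

```python
import timeit


def make_chain(n):
    """Build a function with an n-way if-elif-else chain (n >= 1)."""
    lines = ["def chain(x):"]
    for i in range(n):
        keyword = "if" if i == 0 else "elif"
        lines.append(f"    {keyword} x == 'state{i}':")
        lines.append(f"        return {i}")
    lines.append("    else:")
    lines.append("        return -1")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["chain"]


def make_table(n):
    """Build the equivalent dictionary-based lookup function."""
    table = {f"state{i}": i for i in range(n)}

    def lookup(x):
        return table.get(x, -1)

    return lookup


def compare(n, repeat=20000):
    """Return (chain_seconds, table_seconds) for the last of n cases."""
    chain, lookup = make_chain(n), make_table(n)
    worst = f"state{n - 1}"   # last case: the chain performs n comparisons
    assert chain(worst) == lookup(worst)
    t_chain = timeit.timeit(lambda: chain(worst), number=repeat)
    t_table = timeit.timeit(lambda: lookup(worst), number=repeat)
    return t_chain, t_table
```

Calling compare(n) for n = 2, 4, 8, ... returns a pair of timings in
seconds; the absolute numbers depend on the interpreter and the machine,
so only the point where the two curves cross is of interest.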

Solution 2: Adding a switch statement to Python

New Syntax

    switch EXPR:
        case CONSTANT:
            SUITE
        case CONSTANT:
            SUITE
        ...
        else:
            SUITE

(modulo indentation variations)

The "else" part is optional. If no else part is given and none of the
defined cases matches, no action is taken and the switch statement is
ignored. This is in line with the current if-behaviour. A user who wants
to signal this situation using an exception can define an else-branch
which then implements the intended action.

Note that the constants need not be all of the same type, but they
should be comparable to the type of the switch variable.

Implementation

The compiler would have to compile a function such as:

    def whatis(x):
        switch(x):
            case 'one':
                print '1'
            case 'two':
                print '2'
            case 'three':
                print '3'
            else:
                print "D'oh!"

into (omitting POP_TOP's and SET_LINENO's):

       6  LOAD_FAST         0 (x)
       9  LOAD_CONST        1 (switch-table-1)
      12  SWITCH            26 (to 38)

      14  LOAD_CONST        2 ('1')
      17  PRINT_ITEM
      18  PRINT_NEWLINE
      19  JUMP              43

      22  LOAD_CONST        3 ('2')
      25  PRINT_ITEM
      26  PRINT_NEWLINE
      27  JUMP              43

      30  LOAD_CONST        4 ('3')
      33  PRINT_ITEM
      34  PRINT_NEWLINE
      35  JUMP              43

      38  LOAD_CONST        5 ("D'oh!")
      41  PRINT_ITEM
      42  PRINT_NEWLINE

    >>43  LOAD_CONST        0 (None)
      46  RETURN_VALUE

Where the 'SWITCH' opcode would jump to 14, 22, 30 or 38 depending on
'x'.
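The effect of the SWITCH opcode in this listing can be mimicked in plain
Python 3. The sketch below is an illustration only: the offsets are
taken from the listing, while the names SWITCH_TABLE, ELSE_OFFSET,
SNIPPETS and switch_target are hypothetical, and each code snippet is
reduced to the string it would print.

```python
# Contents assumed for switch-table-1: case constant -> jump offset.
SWITCH_TABLE = {"one": 14, "two": 22, "three": 30}

# Offset of the else branch, i.e. the "(to 38)" argument of SWITCH.
ELSE_OFFSET = 38

# What the code starting at each offset prints.
SNIPPETS = {14: "1", 22: "2", 30: "3", 38: "D'oh!"}


def switch_target(x):
    """Return the offset SWITCH would jump to for the value x."""
    return SWITCH_TABLE.get(x, ELSE_OFFSET)


def whatis(x):
    """Return the text the compiled function would print for x."""
    return SNIPPETS[switch_target(x)]
```

A single dictionary lookup selects the branch, regardless of how many
cases the switch statement has.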

Thomas Wouters has written a patch which demonstrates the above. You can
download it from [1].

Issues

The switch statement should not implement fall-through behaviour (as
does the switch statement in C). Each case defines a complete and
independent suite, much like in an if-elif-else statement. Since break
is then not needed to end a case, this also enables using break in
switch statements inside loops.

If the interpreter finds that the switch variable x is not hashable, it
should raise a TypeError at run-time pointing out the problem.
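A dictionary lookup already behaves this way in today's Python: using an
unhashable object as a key raises TypeError. The helper below is a
hypothetical sketch (the name switch, its parameters and the message
text are assumptions) of how the check and a more specific error message
might look.

```python
def switch(value, cases, default=None):
    """Look up value in the case table, like the proposed SWITCH opcode."""
    try:
        hash(value)
    except TypeError:
        # Report the problem in terms of the switch statement rather
        # than in terms of the underlying dictionary.
        raise TypeError(
            "switch variable of type %r is not hashable"
            % type(value).__name__
        ) from None
    return cases.get(value, default)
```

For example, switch(["a"], {"a": 1}) raises TypeError, since a list
cannot be hashed, while switch("a", {"a": 1}) returns 1.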

There have been other proposals for the syntax which reuse existing
keywords and avoid adding two new ones ("switch" and "case"). Others
have argued that the keywords should use new terms to avoid confusion
with the C keywords of the same name but slightly different semantics
(e.g. fall-through without break). Some of the proposed variants:

    case EXPR:
        of CONSTANT:
            SUITE
        of CONSTANT:
            SUITE
        else:
            SUITE

    case EXPR:
        if CONSTANT:
            SUITE
        if CONSTANT:
            SUITE
        else:
            SUITE

    when EXPR:
        in CONSTANT_TUPLE:
            SUITE
        in CONSTANT_TUPLE:
            SUITE
        ...
    else:
        SUITE

The switch statement could be extended to allow multiple values for one
section (e.g. case 'a', 'b', 'c': ...). Another proposed extension would
allow ranges of values (e.g. case 10..14: ...). These should probably be
postponed, but kept in mind when designing and implementing a first
version.
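With a dictionary-based jump table, both extensions amount to entering
several keys for the same target. The following Python 3 sketch shows
one possible expansion; build_table is a hypothetical helper, the
targets are reduced to strings, and the range 10..14 is assumed to
include both end points.

```python
def build_table(cases):
    """Expand (labels, target) pairs into a flat label -> target dict.

    labels may be a single constant, a tuple of constants or a range.
    """
    table = {}
    for labels, target in cases:
        if not isinstance(labels, (tuple, range)):
            labels = (labels,)
        for label in labels:
            # Two sections must not claim the same constant.
            if label in table:
                raise ValueError("duplicate case label: %r" % (label,))
            table[label] = target
    return table


table = build_table([
    (("a", "b", "c"), "letters"),         # case 'a', 'b', 'c': ...
    (range(10, 15), "ten to fourteen"),   # case 10..14: ...
    ("z", "last letter"),                 # case 'z': ...
])
```

The lookup cost stays the same, but large ranges would make the table
grow with the number of values they cover.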

Examples

The following examples all use a new syntax as proposed by solution 2.
However, all of these examples would work with solution 1 as well.

    switch EXPR:                   switch x:
        case CONSTANT:                 case "first":
            SUITE                          print x
        case CONSTANT:                 case "second":
            SUITE                          x = x**2
        ...                                print x
        else:                          else:
            SUITE                          print "whoops!"


    case EXPR:                     case x:
        of CONSTANT:                   of "first":
            SUITE                          print x
        of CONSTANT:                   of "second":
            SUITE                          print x**2
        else:                          else:
            SUITE                          print "whoops!"


    case EXPR:                     case state:
        if CONSTANT:                   if "first":
            SUITE                          state = "second"
        if CONSTANT:                   if "second":
            SUITE                          state = "third"
        else:                          else:
            SUITE                          state = "first"


    when EXPR:                     when state:
        in CONSTANT_TUPLE:             in ("first", "second"):
            SUITE                          print state
        in CONSTANT_TUPLE:                 state = next_state(state)
            SUITE                      in ("seventh",):
        ...                                print "done"
    else:                                  break    # out of loop!
        SUITE                      else:
                                       print "middle state"
                                       state = next_state(state)

Here's another nice application found by Jack Jansen (switching on
argument types):

    switch type(x).__name__:
        case 'int':
            SUITE
        case 'str':
            SUITE
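The same kind of type switching can be written today with a dispatch
table keyed on the type name. This is a sketch in Python 3 syntax, where
the name of the builtin string type is 'str'; HANDLERS and describe are
hypothetical names, and the handlers stand in for the SUITEs.

```python
# Type name -> function implementing the case.
HANDLERS = {
    "int": lambda v: "integer %d" % v,
    "str": lambda v: "string of length %d" % len(v),
}


def describe(x):
    handler = HANDLERS.get(type(x).__name__)
    if handler is None:
        # No else branch: nothing happens when no case matches.
        return None
    return handler(x)
```

Since the switch is on the exact type name, instances of subtypes do not
match: describe(True) returns None even though bool is a subtype of int.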

Scope

XXX Explain "from __future__ import switch"

Credits

-   Martin von Löwis (issues with the optimization idea)
-   Thomas Wouters (switch statement + byte code compiler example)
-   Skip Montanaro (dispatching ideas, examples)
-   Donald Beaudry (switch syntax)
-   Greg Ewing (switch syntax)
-   Jack Jansen (type switching examples)

References

[1] https://sourceforge.net/tracker/index.php?func=detail&aid=481118&group_id=5470&atid=305470

Copyright

This document has been placed in the public domain.