PEP: 215 Title: String Interpolation Author: Ka-Ping Yee <ping@zesty.ca>
Status: Superseded Type: Standards Track Created: 24-Jul-2000
Python-Version: 2.1 Post-History: Superseded-By: 292

292

Abstract

This document proposes a string interpolation feature for Python to
allow easier string formatting. The suggested syntax change is the
introduction of a '$' prefix that triggers the special interpretation of
the '$' character within a string, in a manner reminiscent to the
variable interpolation found in Unix shells, awk, Perl, or Tcl.

Copyright

This document is in the public domain.

Specification

Strings may be preceded with a '$' prefix that comes before the leading
single or double quotation mark (or triplet) and before any of the other
string prefixes ('r' or 'u'). Such a string is processed for
interpolation after the normal interpretation of backslash-escapes in
its contents. The processing occurs just before the string is pushed
onto the value stack, each time the string is pushed. In short, Python
behaves exactly as if '$' were a unary operator applied to the string.
The operation performed is as follows:

The string is scanned from start to end for the '$' character (\x24 in
8-bit strings or \u0024 in Unicode strings). If there are no '$'
characters present, the string is returned unchanged.

Any '$' found in the string, followed by one of the two kinds of
expressions described below, is replaced with the value of the
expression as evaluated in the current namespaces. The value is
converted with str() if the containing string is an 8-bit string, or
with unicode() if it is a Unicode string.

1.  A Python identifier optionally followed by any number of trailers,
    where a trailer consists of:

    -   a dot and an identifier,
    -   an expression enclosed in square brackets, or

    - an argument list enclosed in parentheses (This is exactly the
    pattern expressed in the Python grammar by "NAME trailer*", using
    the definitions in Grammar/Grammar.)

2.  Any complete Python expression enclosed in curly braces.

Two dollar-signs ("$$") are replaced with a single "$".

Examples

Here is an example of an interactive session exhibiting the expected
behaviour of this feature. :

    >>> a, b = 5, 6
    >>> print $'a = $a, b = $b'
    a = 5, b = 6
    >>> $u'uni${a}ode'
    u'uni5ode'
    >>> print $'\$a'
    5
    >>> print $r'\$a'
    \5
    >>> print $'$$$a.$b'
    $5.6
    >>> print $'a + b = ${a + b}'
    a + b = 11
    >>> import sys
    >>> print $'References to $a: $sys.getrefcount(a)'
    References to 5: 15
    >>> print $"sys = $sys, sys = $sys.modules['sys']"
    sys = <module 'sys' (built-in)>, sys = <module 'sys' (built-in)>
    >>> print $'BDFL = $sys.copyright.split()[4].upper()'
    BDFL = GUIDO

Discussion

'$' is chosen as the interpolation character within the string for the
sake of familiarity, since it is already used for this purpose in many
other languages and contexts.

It is then natural to choose '$' as a prefix, since it is a mnemonic for
the interpolation character.

Trailers are permitted to give this interpolation mechanism even more
power than the interpolation available in most other languages, while
the expression to be interpolated remains clearly visible and free of
curly braces.

'$' works like an operator and could be implemented as an operator, but
that prevents the compile-time optimization and presents security
issues. So, it is only allowed as a string prefix.

Security Issues

"$" has the power to eval, but only to eval a literal. As described here
(a string prefix rather than an operator), it introduces no new security
issues since the expressions to be evaluated must be literally present
in the code.

Implementation

The Itpl module at[1] provides a prototype of this feature. It uses the
tokenize module to find the end of an expression to be interpolated,
then calls eval() on the expression each time a value is needed. In the
prototype, the expression is parsed and compiled again each time it is
evaluated.

As an optimization, interpolated strings could be compiled directly into
the corresponding bytecode; that is, :

    $'a = $a, b = $b'

could be compiled as though it were the expression :

    ('a = ' + str(a) + ', b = ' + str(b))

so that it only needs to be compiled once.

References

[1] http://www.lfw.org/python/Itpl.py