PEP: 498
Title: Literal String Interpolation
Version: $Revision$
Last-Modified: $Date$
Author: Eric V. Smith <eric@trueblade.com>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 01-Aug-2015
Python-Version: 3.6
Post-History: 07-Aug-2015, 30-Aug-2015, 04-Sep-2015, 19-Sep-2015, 06-Nov-2016
Resolution: https://mail.python.org/pipermail/python-dev/2015-September/141526.html

Abstract

Python supports multiple ways to format text strings. These include
%-formatting[1], str.format()[2], and string.Template[3]. Each of these
methods has its advantages, but each also has disadvantages that
make it cumbersome to use in practice. This PEP proposes to add a new
string formatting mechanism: Literal String Interpolation. In this PEP,
such strings will be referred to as "f-strings", taken from the leading
character used to denote such strings, and standing for "formatted
strings".

This PEP does not propose to remove or deprecate any of the existing
string formatting mechanisms.

F-strings provide a way to embed expressions inside string literals,
using a minimal syntax. It should be noted that an f-string is really an
expression evaluated at run time, not a constant value. In Python source
code, an f-string is a literal string, prefixed with 'f', which contains
expressions inside braces. The expressions are replaced with their
values. Some examples are:

    >>> import datetime
    >>> name = 'Fred'
    >>> age = 50
    >>> anniversary = datetime.date(1991, 10, 12)
    >>> f'My name is {name}, my age next year is {age+1}, my anniversary is {anniversary:%A, %B %d, %Y}.'
    'My name is Fred, my age next year is 51, my anniversary is Saturday, October 12, 1991.'
    >>> f'He said his name is {name!r}.'
    "He said his name is 'Fred'."

A similar feature was proposed in PEP 215. PEP 215 proposed to support a
subset of Python expressions, and did not support the type-specific
string formatting (the __format__() method) which was introduced with
PEP 3101.

Rationale

This PEP is driven by the desire to have a simpler way to format strings
in Python. The existing ways of formatting are either error prone,
inflexible, or cumbersome.

%-formatting is limited as to the types it supports. Only ints, strs,
and doubles can be formatted. All other types are either not supported,
or converted to one of these types before formatting. In addition,
there's a well-known trap where a single value is passed:

    >>> msg = 'disk failure'
    >>> 'error: %s' % msg
    'error: disk failure'

But if msg were ever to be a tuple, the same code would fail:

    >>> msg = ('disk failure', 32)
    >>> 'error: %s' % msg
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: not all arguments converted during string formatting

To be defensive, the following code should be used:

    >>> 'error: %s' % (msg,)
    "error: ('disk failure', 32)"

str.format() was added to address some of these problems with
%-formatting. In particular, it uses normal function call syntax (and
therefore supports multiple parameters) and it is extensible through the
__format__() method on the object being converted to a string. See PEP
3101 for a detailed rationale. This PEP reuses much of the str.format()
syntax and machinery, in order to provide continuity with an existing
Python string formatting mechanism.
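
As a minimal sketch of the difference, the helper below formats one
value three ways. The name render_error is hypothetical and not part of
this PEP. It relies only on the behaviour described above: %-formatting
treats a tuple operand as an argument list, while str.format() and
f-strings treat it as a single value.

```python
def render_error(msg):
    """Format msg with %-formatting, str.format(), and an f-string."""
    try:
        # A tuple operand is unpacked as the argument list, so a
        # two-item tuple does not fit the single %s placeholder.
        percent = 'error: %s' % msg
    except TypeError as exc:
        percent = 'TypeError: ' + str(exc)
    # str.format() and f-strings always treat msg as one value.
    return percent, 'error: {}'.format(msg), f'error: {msg}'


as_str = render_error('disk failure')
as_tuple = render_error(('disk failure', 32))
```

With a plain string all three results agree. With a tuple, only the
%-formatting result records a TypeError.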

However, str.format() is not without its issues. Chief among them is its
verbosity. For example, the text value is repeated here:

    >>> value = 4 * 20
    >>> 'The value is {value}.'.format(value=value)
    'The value is 80.'

Even in its simplest form there is a bit of boilerplate, and the value
that's inserted into the placeholder is sometimes far removed from where
the placeholder is situated:

    >>> 'The value is {}.'.format(value)
    'The value is 80.'

With an f-string, this becomes:

    >>> f'The value is {value}.'
    'The value is 80.'

F-strings provide a concise, readable way to include the value of Python
expressions inside strings.

In this sense, string.Template and %-formatting have similar
shortcomings to str.format(), but also support fewer formatting options.
In particular, they do not support the __format__ protocol, so that
there is no way to control how a specific object is converted to a
string, nor can it be extended to additional types that want to control
how they are converted to strings (such as Decimal and datetime). This
example is not possible with string.Template:

    >>> value = 1234
    >>> f'input={value:#06x}'
    'input=0x04d2'

And neither %-formatting nor string.Template can control formatting such
as:

    >>> date = datetime.date(1991, 10, 12)
    >>> f'{date} was on a {date:%A}'
    '1991-10-12 was on a Saturday'

No use of globals() or locals()

In the discussions on python-dev[4], a number of solutions were
presented that used locals() and globals() or their equivalents. All of
these have various problems. Among these are referencing variables that
are not otherwise used in a closure. Consider:

    >>> def outer(x):
    ...     def inner():
    ...         return 'x={x}'.format_map(locals())
    ...     return inner
    ...
    >>> outer(42)()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in inner
    KeyError: 'x'

This returns an error because the compiler has not added a reference to
x inside the closure. You need to manually add a reference to x in order
for this to work:

    >>> def outer(x):
    ...     def inner():
    ...         x
    ...         return 'x={x}'.format_map(locals())
    ...     return inner
    ...
    >>> outer(42)()
    'x=42'

In addition, using locals() or globals() introduces an information leak.
A called routine that has access to the caller's locals() or globals()
has access to far more information than needed to do the string
interpolation.

Guido stated[5] that any solution to better string interpolation would
not use locals() or globals() in its implementation. (This does not
forbid users from passing locals() or globals() in, it just doesn't
require it, nor does it allow using these functions under the hood.)
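
The closure problem described above does not arise with f-strings,
because the compiler sees the names used inside the braces. The
following sketch contrasts the two approaches. The function names are
illustrative only.

```python
def outer_with_locals(x):
    def inner():
        # x is never referenced here, so it is not part of inner's
        # closure and locals() does not contain it.
        return 'x={x}'.format_map(locals())
    return inner


def outer_with_fstring(x):
    def inner():
        # The compiler sees x inside the f-string and closes over it.
        return f'x={x}'
    return inner


try:
    outer_with_locals(42)()
    locals_result = 'no error'
except KeyError as exc:
    locals_result = f'KeyError: {exc}'

fstring_result = outer_with_fstring(42)()
```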

Specification

In source code, f-strings are string literals that are prefixed by the
letter 'f' or 'F'. Everywhere this PEP uses 'f', 'F' may also be used.
'f' may be combined with 'r' or 'R', in either order, to produce raw
f-string literals. 'f' may not be combined with 'b': this PEP does not
propose to add binary f-strings. 'f' may not be combined with 'u'.

When tokenizing source files, f-strings use the same rules as normal
strings, raw strings, binary strings, and triple quoted strings. That
is, the string must end with the same character that it started with: if
it starts with a single quote it must end with a single quote, etc. This
implies that any code that currently scans Python code looking for
strings should be trivially modifiable to recognize f-strings (parsing
within an f-string is another matter, of course).

Once tokenized, f-strings are parsed into literal strings and
expressions. Expressions appear within curly braces '{' and '}'. While
scanning the string for expressions, any doubled braces '{{' or '}}'
inside literal portions of an f-string are replaced by the corresponding
single brace. Doubled literal opening braces do not signify the start of
an expression. A single closing curly brace '}' in the literal portion
of a string is an error: literal closing curly braces must be doubled
'}}' in order to represent a single closing brace.

The parts of the f-string outside of braces are literal strings. These
literal portions are then decoded. For non-raw f-strings, this includes
converting backslash escapes such as '\n', '\"', "\'", '\xhh', '\uxxxx',
'\Uxxxxxxxx', and named unicode characters '\N{name}' into their
associated Unicode characters[6].

Backslashes may not appear anywhere within expressions. Comments, using
the '#' character, are not allowed inside an expression.

Following each expression, an optional type conversion may be specified.
The allowed conversions are '!s', '!r', or '!a'. These are treated the
same as in str.format(): '!s' calls str() on the expression, '!r' calls
repr() on the expression, and '!a' calls ascii() on the expression.
These conversions are applied before the call to format(). The only
reason to use '!s' is if you want to specify a format specifier that
applies to str, not to the type of the expression.
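
One case where '!s' makes a difference, shown here as a small sketch:
None does not accept a non-empty format specifier, but str does. The
choice of None is only an illustration. Any type whose __format__()
rejects or reinterprets string alignment specifiers would behave
similarly.

```python
value = None

# None does not accept a non-empty format specifier, so applying an
# alignment spec directly raises TypeError.
try:
    direct = f'{value:>6}'
except TypeError:
    direct = 'TypeError'

# With !s, str(value) is formatted instead, and str understands '>6'.
converted = f'{value!s:>6}'
```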

F-strings use the same format specifier mini-language as str.format.
Similar to str.format(), optional format specifiers may be included
inside the f-string, separated from the expression (or the type
conversion, if specified) by a colon. If a format specifier is not
provided, an empty string is used.

So, an f-string looks like:

    f ' <text> { <expression> <optional !s, !r, or !a> <optional : format specifier> } <text> ... '

The expression is then formatted using the __format__ protocol, using
the format specifier as an argument. The resulting value is used when
building the value of the f-string.

Note that __format__() is not called directly on each value. The actual
code uses the equivalent of type(value).__format__(value, format_spec),
or format(value, format_spec). See the documentation of the builtin
format() function for more details.
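
The lookup on the type rather than the instance can be observed with a
small experiment. The Celsius class below is hypothetical and exists
only for this sketch. A __format__ attribute assigned to the instance
is ignored by f-strings and by format(), and is used only when called
explicitly.

```python
class Celsius:
    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        # Delegate to float formatting, then append the unit.
        return format(self.degrees, format_spec) + ' C'


temperature = Celsius(21.5)
# An attribute set on the instance is not consulted by format().
temperature.__format__ = lambda format_spec: 'instance override'

via_fstring = f'{temperature:.1f}'
via_builtin = format(temperature, '.1f')
via_type = type(temperature).__format__(temperature, '.1f')
via_instance = temperature.__format__('.1f')
```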

Expressions cannot contain ':' or '!' outside of strings or parentheses,
brackets, or braces. The exception is that the '!=' operator is allowed
as a special case.
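
A few expressions that stay within this rule, given as an illustrative
sketch with made-up variable names:

```python
items = [10, 20, 30, 40]
limit = 3

# ':' inside brackets is part of the slice, not a format specifier.
sliced = f'{items[1:3]}'

# '!=' is the special case: it is not read as a conversion.
compared = f'{len(items) != limit}'

# ':' inside a dict display is protected by the inner braces. The
# space after the opening brace keeps it from being read as '{{'.
nested = f'{ {"k": limit}["k"] }'
```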

Escape sequences

Backslashes may not appear inside the expression portions of f-strings,
so you cannot use them, for example, to escape quotes inside f-strings:

    >>> f'{\'quoted string\'}'
      File "<stdin>", line 1
    SyntaxError: f-string expression part cannot include a backslash

You can use a different type of quote inside the expression:

    >>> f'{"quoted string"}'
    'quoted string'

Backslash escapes may appear inside the string portions of an f-string.

Note that the correct way to have a literal brace appear in the
resulting string value is to double the brace:

    >>> f'{{ {4*10} }}'
    '{ 40 }'
    >>> f'{{{4*10}}}'
    '{40}'

Like all raw strings in Python, no escape processing is done for raw
f-strings:

    >>> fr'x={4*10}\n'
    'x=40\\n'

Due to Python's string tokenizing rules, the f-string
f'abc {a['x']} def' is invalid. The tokenizer parses this as 3 tokens:
f'abc {a[', x, and ']} def'. Just like regular strings, this cannot be
fixed by using raw strings. There are a number of correct ways to write
this f-string: with a different quote character:

    f"abc {a['x']} def"

Or with triple quotes:

    f'''abc {a['x']} def'''

Code equivalence

The exact code used to implement f-strings is not specified. However, it
is guaranteed that any embedded value that is converted to a string will
use that value's __format__ method. This is the same mechanism that
str.format() uses to convert values to strings.

For example, this code:

    f'abc{expr1:spec1}{expr2!r:spec2}def{expr3}ghi'

Might be evaluated as:

    'abc' + format(expr1, spec1) + format(repr(expr2), spec2) + 'def' + format(expr3) + 'ghi'
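
To make this concrete, the sketch below substitutes sample values and
specifiers (chosen here only for illustration) and compares the
f-string with one possible expansion:

```python
expr1 = 3.14159
expr2 = 'hi'
expr3 = 42

interpolated = f'abc{expr1:.2f}{expr2!r:>6}def{expr3}ghi'

# One possible expansion: the !r conversion is applied first, then
# format() is called with the specifier.
expanded = ('abc' + format(expr1, '.2f') + format(repr(expr2), '>6')
            + 'def' + format(expr3) + 'ghi')
```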

Expression evaluation

The expressions that are extracted from the string are evaluated in the
context where the f-string appeared. This means the expression has full
access to local and global variables. Any valid Python expression can be
used, including function and method calls.

Because the f-strings are evaluated where the string appears in the
source code, there is no additional expressiveness available with
f-strings. There are also no additional security concerns: you could
have also just written the same expression, not inside of an f-string:

    >>> def foo():
    ...   return 20
    ...
    >>> f'result={foo()}'
    'result=20'

Is equivalent to:

    >>> 'result=' + str(foo())
    'result=20'

Expressions are parsed with the equivalent of
ast.parse('(' + expression + ')', '<fstring>', 'eval')[7].

Note that since the expression is enclosed by implicit parentheses
before evaluation, expressions can contain newlines. For example:

    >>> x = 0
    >>> f'''{x
    ... +1}'''
    '1'

    >>> d = {0: 'zero'}
    >>> f'''{d[0
    ... ]}'''
    'zero'
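
The effect of the implicit parentheses can be modelled with ast.parse()
directly. The helper below is a rough model of the described behaviour,
not the actual implementation. The name evaluate_expression and the use
of eval() with an explicit namespace are assumptions made for this
sketch.

```python
import ast


def evaluate_expression(expression, namespace):
    """Evaluate one f-string expression the way the PEP describes."""
    # The implicit parentheses let the expression span several lines
    # and make leading or trailing whitespace irrelevant.
    tree = ast.parse('(' + expression + ')', '<fstring>', 'eval')
    code = compile(tree, '<fstring>', 'eval')
    return eval(code, {}, namespace)


multi_line = evaluate_expression('x\n+1', {'x': 0})
padded = evaluate_expression('  d[0\n]  ', {'d': {0: 'zero'}})
```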

Format specifiers

Format specifiers may also contain evaluated expressions. This allows
code such as:

    >>> import decimal
    >>> width = 10
    >>> precision = 4
    >>> value = decimal.Decimal('12.34567')
    >>> f'result: {value:{width}.{precision}}'
    'result:      12.35'

Once expressions in a format specifier are evaluated (if necessary),
format specifiers are not interpreted by the f-string evaluator. Just as
in str.format(), they are merely passed in to the __format__() method of
the object being formatted.
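
A small sketch shows what __format__() actually receives. SpecEcho is a
hypothetical class written for this example. It simply returns the
specifier it was given.

```python
class SpecEcho:
    """Hypothetical type that reports the specifier it was given."""

    def __format__(self, format_spec):
        return f'<{format_spec}>'


width = 10
precision = 4

# The nested expressions are evaluated first, then the resulting
# string is passed to __format__() unchanged.
nested = f'{SpecEcho():{width}.{precision}}'
literal = f'{SpecEcho():10.4}'
```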

Concatenating strings

Adjacent f-strings and regular strings are concatenated. Regular strings
are concatenated at compile time, and f-strings are concatenated at run
time. For example, the expression:

    >>> x = 10
    >>> y = 'hi'
    >>> 'a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e'

yields the value:

    'ab10{c}str< hi >de'

While the exact method of this run time concatenation is unspecified,
the above code might evaluate to:

    'ab' + format(x) + '{c}' + 'str<' + format(y, '^4') + '>de'

Each f-string is entirely evaluated before being concatenated to
adjacent f-strings. That means that this:

    >>> f'{x' f'}'

Is a syntax error, because the first f-string does not contain a closing
brace.
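
The concatenation example above can be checked against its suggested
expansion. This is only a sketch, since the actual run time strategy is
unspecified.

```python
x = 10
y = 'hi'

# Adjacent literals: the regular string '{c}' keeps its braces
# because it is not an f-string.
implicit = 'a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e'

# One possible run-time expansion of the same expression.
explicit = 'ab' + format(x) + '{c}' + 'str<' + format(y, '^4') + '>de'
```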

Error handling

Either compile time or run time errors can occur when processing
f-strings. Compile time errors are limited to those errors that can be
detected when scanning an f-string. These errors all raise SyntaxError.

Unmatched braces:

    >>> f'x={x'
      File "<stdin>", line 1
    SyntaxError: f-string: expecting '}'

Invalid expressions:

    >>> f'x={!x}'
      File "<stdin>", line 1
    SyntaxError: f-string: empty expression not allowed

Run time errors occur when evaluating the expressions inside an
f-string. Note that an f-string can be evaluated multiple times, and
work sometimes and raise an error at other times:

    >>> d = {0:10, 1:20}
    >>> for i in range(3):
    ...     print(f'{i}:{d[i]}')
    ...
    0:10
    1:20
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    KeyError: 2

or:

    >>> for x in (32, 100, 'fifty'):
    ...   print(f'x = {x:+3}')
    ...
    x = +32
    x = +100
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    ValueError: Sign not allowed in string format specifier
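
The split between compile time and run time errors can be observed with
the builtin compile(), which parses source without evaluating it. The
helper name compiles is an assumption made for this sketch, and the
exact error messages may differ between Python versions.

```python
def compiles(source):
    """Return True if source compiles as an expression."""
    try:
        compile(source, '<demo>', 'eval')
    except SyntaxError:
        return False
    return True


# Compile-time errors: detected without evaluating anything.
unmatched_brace = compiles("f'x={x'")
empty_expression = compiles("f'x={!x}'")

# This compiles even though x is undefined; a NameError would only
# appear when the f-string is evaluated.
undefined_name = compiles("f'x={x}'")

try:
    eval("f'x={x}'", {})
    run_time = 'no error'
except NameError:
    run_time = 'NameError'
```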

Leading and trailing whitespace in expressions is ignored

For ease of readability, leading and trailing whitespace in expressions
is ignored. This is a by-product of enclosing the expression in
parentheses before evaluation.

Evaluation order of expressions

The expressions in an f-string are evaluated in left-to-right order.
This is detectable only if the expressions have side effects:

    >>> def fn(l, incr):
    ...    result = l[0]
    ...    l[0] += incr
    ...    return result
    ...
    >>> lst = [0]
    >>> f'{fn(lst,2)} {fn(lst,3)}'
    '0 2'
    >>> f'{fn(lst,2)} {fn(lst,3)}'
    '5 7'
    >>> lst
    [10]

Discussion

python-ideas discussion

Most of the discussions on python-ideas[8] focused on three issues:

-   How to denote f-strings,
-   How to specify the location of expressions in f-strings, and
-   Whether to allow full Python expressions.

How to denote f-strings

Because the compiler must be involved in evaluating the expressions
contained in the interpolated strings, there must be some way to denote
to the compiler which strings should be evaluated. This PEP chose a
leading 'f' character preceding the string literal. This is similar to
how 'b' and 'r' prefixes change the meaning of the string itself, at
compile time. Other prefixes were suggested, such as 'i'. No option
seemed better than the others, so 'f' was chosen.

Another option was to support special functions, known to the compiler,
such as Format(). This seems like too much magic for Python: not only is
there a chance for collision with existing identifiers, the PEP author
feels that it's better to signify the magic with a string prefix
character.

How to specify the location of expressions in f-strings

This PEP supports the same syntax as str.format() for distinguishing
replacement text inside strings: expressions are contained inside
braces. There were other options suggested, such as string.Template's
$identifier or ${expression}.

While $identifier is no doubt more familiar to shell scripters and users
of some other languages, in Python str.format() is heavily used. A quick
search of Python's standard library shows only a handful of uses of
string.Template, but hundreds of uses of str.format().

Another proposed alternative was to have the substituted text between \{
and } or between \{ and \}. While this syntax would probably be
desirable if all string literals were to support interpolation, this PEP
only supports strings that are already marked with the leading 'f'. As
such, the PEP is using unadorned braces to denote substituted text, in
order to leverage end user familiarity with str.format().

Supporting full Python expressions

Many people on the python-ideas discussion wanted support for either
only single identifiers, or a limited subset of Python expressions (such
as the subset supported by str.format()). This PEP supports full Python
expressions inside the braces. Without full expressions, some desirable
usage would be cumbersome. For example:

    >>> f'Column={col_idx+1}'
    >>> f'number of items: {len(items)}'

would become:

    >>> col_number = col_idx+1
    >>> f'Column={col_number}'
    >>> n_items = len(items)
    >>> f'number of items: {n_items}'

While it's true that very ugly expressions could be included in the
f-strings, this PEP takes the position that such uses should be
addressed in a linter or code review:

    >>> f'mapping is { {a:b for (a, b) in ((1, 2), (3, 4))} }'
    'mapping is {1: 2, 3: 4}'

Similar support in other languages

Wikipedia has a good discussion of string interpolation in other
programming languages[9]. This feature is implemented in many languages,
with a variety of syntaxes and restrictions.

Differences between f-string and str.format expressions

There is one small difference between the limited expressions allowed in
str.format() and the full expressions allowed inside f-strings. The
difference is in how index lookups are performed. In str.format(), index
values that do not look like numbers are converted to strings:

    >>> d = {'a': 10, 'b': 20}
    >>> 'a={d[a]}'.format(d=d)
    'a=10'

Notice that the index value is converted to the string 'a' when it is
looked up in the dict.

However, in f-strings, you would need to use a literal for the value of
'a':

    >>> f'a={d["a"]}'
    'a=10'

This difference is required because otherwise you would not be able to
use variables as index values:

    >>> a = 'b'
    >>> f'a={d[a]}'
    'a=20'

See [10] for a further discussion. It was this observation that led to
full Python expressions being supported in f-strings.

Furthermore, the limited expressions that str.format() understands need
not be valid Python expressions. For example:

    >>> '{i[";]}'.format(i={'";':4})
    '4'

For this reason, the str.format() "expression parser" is not suitable
for use when implementing f-strings.

Triple-quoted f-strings

Triple quoted f-strings are allowed. These strings are parsed just as
normal triple-quoted strings are. After parsing and decoding, the normal
f-string logic is applied, and __format__() is called on each value.

Raw f-strings

Raw and f-strings may be combined. For example, they could be used to
build up regular expressions:

    >>> header = 'Subject'
    >>> fr'{header}:\s+'
    'Subject:\\s+'

In addition, raw f-strings may be combined with triple-quoted strings.
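
As a slightly fuller sketch of the regular expression use case, the
pattern below is built with a raw f-string and then applied with
re.match(). The sample input line is made up for illustration.

```python
import re

header = 'Subject'

# Raw f-string: {header} is interpolated, but \s and \S reach the
# regular expression engine untouched.
pattern = fr'{header}:\s+(\S.*)'

match = re.match(pattern, 'Subject:   Literal String Interpolation')
subject = match.group(1)
```

If the interpolated value could contain regular expression
metacharacters, passing it through re.escape() first would be a
reasonable precaution.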

No binary f-strings

For the same reason that we don't support bytes.format(), you may not
combine 'f' with 'b' string literals. The primary problem is that an
object's __format__() method may return Unicode data that is not
compatible with a bytes string.

Binary f-strings would first require a solution for bytes.format(). This
idea has been proposed in the past, most recently in the "Proposed
variations" section of PEP 461. The discussions of such a feature usually
suggest either

-   adding a method such as __bformat__() so an object can control how
    it is converted to bytes, or
-   having bytes.format() not be as general purpose or extensible as
    str.format().

Both of these remain as options in the future, if such functionality is
desired.

!s, !r, and !a are redundant

The !s, !r, and !a conversions are not strictly required. Because
arbitrary expressions are allowed inside the f-strings, this code:

    >>> a = 'some string'
    >>> f'{a!r}'
    "'some string'"

Is identical to:

    >>> f'{repr(a)}'
    "'some string'"

Similarly, !s can be replaced by calls to str() and !a by calls to
ascii().

However, !s, !r, and !a are supported by this PEP in order to minimize
the differences with str.format(). !s, !r, and !a are required in
str.format() because it does not allow the execution of arbitrary
expressions.
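
The redundancy can be checked for all three conversions at once. The
sample string is arbitrary. A non-ASCII character is included so that
'!a' and '!r' give visibly different results.

```python
a = 'caf\u00e9'

# Each pair holds the conversion form and the explicit call form.
pairs = [
    (f'{a!s}', f'{str(a)}'),
    (f'{a!r}', f'{repr(a)}'),
    (f'{a!a}', f'{ascii(a)}'),
]
```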

Lambdas inside expressions

Because lambdas use the ':' character, they cannot appear outside of
parentheses in an expression. The colon is interpreted as the start of
the format specifier, which means the start of the lambda expression is
seen and is syntactically invalid. As there's no practical use for a
plain lambda in an f-string expression, this is not seen as much of a
limitation.

If you feel you must use lambdas, they may be used inside of
parentheses:

    >>> f'{(lambda x: x*2)(3)}'
    '6'

Can't combine with 'u'

The 'u' prefix was added to Python 3.3 in PEP 414 as a means to ease
source compatibility with Python 2.7. Because Python 2.7 will never
support f-strings, there is nothing to be gained by being able to
combine the 'f' prefix with 'u'.

Examples from Python's source code

Here are some examples from Python source code that currently use
str.format(), and how they would look with f-strings. This PEP does not
recommend wholesale conversion to f-strings; these are just examples of
real-world usages of str.format() and how they'd look if written from
scratch using f-strings.

Lib/asyncio/locks.py:

    extra = '{},waiters:{}'.format(extra, len(self._waiters))
    extra = f'{extra},waiters:{len(self._waiters)}'

Lib/configparser.py:

    message.append(" [line {0:2d}]".format(lineno))
    message.append(f" [line {lineno:2d}]")

Tools/clinic/clinic.py:

    methoddef_name = "{}_METHODDEF".format(c_basename.upper())
    methoddef_name = f"{c_basename.upper()}_METHODDEF"

python-config.py:

    print("Usage: {0} [{1}]".format(sys.argv[0], '|'.join('--'+opt for opt in valid_opts)), file=sys.stderr)
    print(f"Usage: {sys.argv[0]} [{'|'.join('--'+opt for opt in valid_opts)}]", file=sys.stderr)

References

Copyright

This document has been placed in the public domain.




[1] %-formatting
(https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting)

[2] str.format
(https://docs.python.org/3/library/string.html#formatstrings)

[3] string.Template documentation
(https://docs.python.org/3/library/string.html#template-strings)

[4] Formatting using locals() and globals()
(https://mail.python.org/pipermail/python-ideas/2015-July/034671.html)

[5] Avoid locals() and globals()
(https://mail.python.org/pipermail/python-ideas/2015-July/034701.html)

[6] String literal description
(https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals)

[7] ast.parse() documentation
(https://docs.python.org/3/library/ast.html#ast.parse)

[8] Start of python-ideas discussion
(https://mail.python.org/pipermail/python-ideas/2015-July/034657.html)

[9] Wikipedia article on string interpolation
(https://en.wikipedia.org/wiki/String_interpolation)

[10] Differences in str.format() and f-string expressions
(https://mail.python.org/pipermail/python-ideas/2015-July/034726.html)