PEP: 502 Title: String Interpolation - Extended Discussion Author: Mike
G. Miller Status: Rejected Type: Informational Content-Type: text/x-rst
Created: 10-Aug-2015 Python-Version: 3.6

Abstract

PEP 498: Literal String Interpolation, which proposed "formatted
strings" was accepted September 9th, 2015. Additional background and
rationale given during its design phase is detailed below.

To recap that PEP, a string prefix was introduced that marks the string
as a template to be rendered. These formatted strings may contain one or
more expressions built on the existing syntax of str.format().[1][2] The
formatted string expands at compile-time into a conventional string
format operation, with the given expressions from its text extracted and
passed instead as positional arguments.

At runtime, the resulting expressions are evaluated to render a string
to given specifications:

    >>> location = 'World'
    >>> f'Hello, {location} !'      # new prefix: f''
    'Hello, World !'                # interpolated result

Format-strings may be thought of as merely syntactic sugar to simplify
traditional calls to str.format().

PEP Status

This PEP was rejected based on its using an opinion-based tone rather
than a factual one. This PEP was also deemed not critical as PEP 498 was
already written and should be the place to house design decision
details.

Motivation

Though string formatting and manipulation features are plentiful in
Python, one area where it falls short is the lack of a convenient string
interpolation syntax. In comparison to other dynamic scripting languages
with similar use cases, the amount of code necessary to build similar
strings is substantially higher, while at times offering lower
readability due to verbosity, dense syntax, or identifier duplication.

These difficulties are described at moderate length in the original post
to python-ideas that started the snowball (that became PEP 498)
rolling.[3]

Furthermore, replacement of the print statement with the more consistent
print function of Python 3 (PEP 3105) has added one additional minor
burden, an additional set of parentheses to type and read. Combined with
the verbosity of current string formatting solutions, this puts an
otherwise simple language at an unfortunate disadvantage to its peers:

    echo "Hello, user: $user, id: $id, on host: $hostname"              # bash
    say  "Hello, user: $user, id: $id, on host: $hostname";             # perl
    puts "Hello, user: #{user}, id: #{id}, on host: #{hostname}\n"      # ruby
                                                                        # 80 ch -->|
    # Python 3, str.format with named parameters
    print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(**locals()))

    # Python 3, worst case
    print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(user=user,
                                                                      id=id,
                                                                      hostname=
                                                                        hostname))

In Python, the formatting and printing of a string with multiple
variables in a single line of code of standard width is noticeably
harder and more verbose, with indentation exacerbating the issue.

For use cases such as smaller projects, systems programming, shell
script replacements, and even one-liners, where message formatting
complexity has yet to be encapsulated, this verbosity has likely lead a
significant number of developers and administrators to choose other
languages over the years.

Rationale

Goals

The design goals of format strings are as follows:

1.  Eliminate need to pass variables manually.
2.  Eliminate repetition of identifiers and redundant parentheses.
3.  Reduce awkward syntax, punctuation characters, and visual noise.
4.  Improve readability and eliminate mismatch errors, by preferring
    named parameters to positional arguments.
5.  Avoid need for locals() and globals() usage, instead parsing the
    given string for named parameters, then passing them
    automatically.[4][5]

Limitations

In contrast to other languages that take design cues from Unix and its
shells, and in common with Javascript, Python specified both single (')
and double (") ASCII quote characters to enclose strings. It is not
reasonable to choose one of them now to enable interpolation, while
leaving the other for uninterpolated strings. Other characters, such as
the "Backtick" (or grave accent ) are also `constrained by history_ as a
shortcut for repr().

This leaves a few remaining options for the design of such a feature:

-   An operator, as in printf-style string formatting via %.
-   A class, such as string.Template().
-   A method or function, such as str.format().
-   New syntax, or
-   A new string prefix marker, such as the well-known r'' or u''.

The first three options above are mature. Each has specific use cases
and drawbacks, yet also suffer from the verbosity and visual noise
mentioned previously. All options are discussed in the next sections.

Background

Formatted strings build on several existing techniques and proposals and
what we've collectively learned from them. In keeping with the design
goals of readability and error-prevention, the following examples
therefore use named, not positional arguments.

Let's assume we have the following dictionary, and would like to print
out its items as an informative string for end users:

    >>> params = {'user': 'nobody', 'id': 9, 'hostname': 'darkstar'}

Printf-style formatting, via operator

This venerable technique continues to have its uses, such as with
byte-based protocols, simplicity in simple cases, and familiarity to
many programmers:

    >>> 'Hello, user: %(user)s, id: %(id)s, on host: %(hostname)s' % params
    'Hello, user: nobody, id: 9, on host: darkstar'

In this form, considering the prerequisite dictionary creation, the
technique is verbose, a tad noisy, yet relatively readable. Additional
issues are that an operator can only take one argument besides the
original string, meaning multiple parameters must be passed in a tuple
or dictionary. Also, it is relatively easy to make an error in the
number of arguments passed, the expected type, have a missing key, or
forget the trailing type, e.g. (s or d).

string.Template Class

The string.Template class from PEP 292 (Simpler String Substitutions) is
a purposely simplified design, using familiar shell interpolation
syntax, with safe-substitution feature, that finds its main use cases in
shell and internationalization tools:

    Template('Hello, user: $user, id: ${id}, on host: $hostname').substitute(params)

While also verbose, the string itself is readable. Though functionality
is limited, it meets its requirements well. It isn't powerful enough for
many cases, and that helps keep inexperienced users out of trouble, as
well as avoiding issues with moderately-trusted input (i18n) from
third-parties. It unfortunately takes enough code to discourage its use
for ad-hoc string interpolation, unless encapsulated in a convenience
library such as flufl.i18n.

PEP 215 - String Interpolation

PEP 215 was a former proposal of which this one shares a lot in common.
Apparently, the world was not ready for it at the time, but considering
recent support in a number of other languages, its day may have come.

The large number of dollar sign ($) characters it included may have led
it to resemble Python's arch-nemesis Perl, and likely contributed to the
PEP's lack of acceptance. It was superseded by the following proposal.

str.format() Method

The str.format() syntax of PEP 3101 is the most recent and modern of the
existing options. It is also more powerful and usually easier to read
than the others. It avoids many of the drawbacks and limits of the
previous techniques.

However, due to its necessary function call and parameter passing, it
runs from verbose to very verbose in various situations with string
literals:

    >>> 'Hello, user: {user}, id: {id}, on host: {hostname}'.format(**params)
    'Hello, user: nobody, id: 9, on host: darkstar'

    # when using keyword args, var name shortening sometimes needed to fit :/
    >>> 'Hello, user: {user}, id: {id}, on host: {host}'.format(user=user,
                                                                id=id,
                                                                host=hostname)
    'Hello, user: nobody, id: 9, on host: darkstar'

The verbosity of the method-based approach is illustrated here.

PEP 498 -- Literal String Formatting

PEP 498 defines and discusses format strings, as also described in the
Abstract above.

It also, somewhat controversially to those first exposed, introduces the
idea that format-strings shall be augmented with support for arbitrary
expressions. This is discussed further in the Restricting Syntax section
under Rejected Ideas.

PEP 501 -- Translation ready string interpolation

The complimentary PEP 501 brings internationalization into the
discussion as a first-class concern, with its proposal of the i-prefix,
string.Template syntax integration compatible with ES6 (Javascript),
deferred rendering, and an object return value.

Implementations in Other Languages

String interpolation is now well supported by various programming
languages used in multiple industries, and is converging into a standard
of sorts. It is centered around str.format() style syntax in minor
variations, with the addition of arbitrary expressions to expand
utility.

In the Motivation section it was shown how convenient interpolation
syntax existed in Bash, Perl, and Ruby. Let's take a look at their
expression support.

Bash

Bash supports a number of arbitrary, even recursive constructs inside
strings:

    > echo "user: $USER, id: $((id + 6)) on host: $(echo is $(hostname))"
    user: nobody, id: 15 on host: is darkstar

-   Explicit interpolation within double quotes.
-   Direct environment variable access supported.
-   Arbitrary expressions are supported.[6]
-   External process execution and output capture supported.[7]
-   Recursive expressions are supported.

Perl

Perl also has arbitrary expression constructs, perhaps not as well
known:

    say "I have @{[$id + 6]} guanacos.";                # lists
    say "I have ${\($id + 6)} guanacos.";               # scalars
    say "Hello { @names.join(', ') } how are you?";     # Perl 6 version

-   Explicit interpolation within double quotes.
-   Arbitrary expressions are supported.[8][9]

Ruby

Ruby allows arbitrary expressions in its interpolated strings:

    puts "One plus one is two: #{1 + 1}\n"

-   Explicit interpolation within double quotes.
-   Arbitrary expressions are supported.[10][11]
-   Possible to change delimiter chars with %.
-   See the Reference Implementation(s) section for an implementation in
    Python.

Others

Let's look at some less-similar modern languages recently implementing
string interpolation.

Scala

Scala interpolation is directed through string prefixes. Each prefix has
a different result:

    s"Hello, $name ${1 + 1}"                    # arbitrary
    f"$name%s is $height%2.2f meters tall"      # printf-style
    raw"a\nb"                                   # raw, like r''

These prefixes may also be implemented by the user, by extending Scala's
StringContext class.

-   Explicit interpolation within double quotes with literal prefix.
-   User implemented prefixes supported.
-   Arbitrary expressions are supported.

ES6 (Javascript)

Designers of Template strings faced the same issue as Python where
single and double quotes were taken. Unlike Python however, "backticks"
were not. Despite their issues, they were chosen as part of the
ECMAScript 2015 (ES6) standard:

    console.log(`Fifteen is ${a + b} and\nnot ${2 * a + b}.`);

Custom prefixes are also supported by implementing a function the same
name as the tag:

    function tag(strings, ...values) {
        console.log(strings.raw[0]);    // raw string is also available
        return "Bazinga!";
    }
    tag`Hello ${ a + b } world ${ a * b}`;

-   Explicit interpolation within backticks.
-   User implemented prefixes supported.
-   Arbitrary expressions are supported.

C#, Version 6

C# has a useful new interpolation feature as well, with some ability to
customize interpolation via the IFormattable interface:

    $"{person.Name, 20} is {person.Age:D3} year{(p.Age == 1 ? "" : "s")} old.";

-   Explicit interpolation with double quotes and $ prefix.
-   Custom interpolations are available.
-   Arbitrary expressions are supported.

Apple's Swift

Arbitrary interpolation under Swift is available on all strings:

    let multiplier = 3
    let message = "\(multiplier) times 2.5 is \(Double(multiplier) * 2.5)"
    // message is "3 times 2.5 is 7.5"

-   Implicit interpolation with double quotes.
-   Arbitrary expressions are supported.
-   Cannot contain CR/LF.

Additional examples

A number of additional examples of string interpolation may be found at
Wikipedia.

Now that background and history have been covered, let's continue on for
a solution.

New Syntax

This should be an option of last resort, as every new syntax feature has
a cost in terms of real-estate in a brain it inhabits. There is however
one alternative left on our list of possibilities, which follows.

New String Prefix

Given the history of string formatting in Python and
backwards-compatibility, implementations in other languages, avoidance
of new syntax unless necessary, an acceptable design is reached through
elimination rather than unique insight. Therefore, marking interpolated
string literals with a string prefix is chosen.

We also choose an expression syntax that reuses and builds on the
strongest of the existing choices, str.format() to avoid further
duplication of functionality:

    >>> location = 'World'
    >>> f'Hello, {location} !'      # new prefix: f''
    'Hello, World !'                # interpolated result

PEP 498 -- Literal String Formatting, delves into the mechanics and
implementation of this design.

Additional Topics

Safety

In this section we will describe the safety situation and precautions
taken in support of format-strings.

1.  Only string literals have been considered for format-strings, not
    variables to be taken as input or passed around, making external
    attacks difficult to accomplish.

    str.format() and alternatives already handle this use-case.

2.  Neither locals() nor globals() are necessary nor used during the
    transformation, avoiding leakage of information.

3.  To eliminate complexity as well as RuntimeError (s) due to recursion
    depth, recursive interpolation is not supported.

However, mistakes or malicious code could be missed inside string
literals. Though that can be said of code in general, that these
expressions are inside strings means they are a bit more likely to be
obscured.

Mitigation via Tools

The idea is that tools or linters such as pyflakes, pylint, or Pycharm,
may check inside strings with expressions and mark them up
appropriately. As this is a common task with programming languages
today, multi-language tools won't have to implement this feature solely
for Python, significantly shortening time to implementation.

Farther in the future, strings might also be checked for constructs that
exceed the safety policy of a project.

Style Guide/Precautions

As arbitrary expressions may accomplish anything a Python expression is
able to, it is highly recommended to avoid constructs inside
format-strings that could cause side effects.

Further guidelines may be written once usage patterns and true problems
are known.

Reference Implementation(s)

The say module on PyPI implements string interpolation as described here
with the small burden of a callable interface:

    ＞ pip install say

    from say import say
    nums = list(range(4))
    say("Nums has {len(nums)} items: {nums}")

A Python implementation of Ruby interpolation is also available. It uses
the codecs module to do its work:

    ＞ pip install interpy

    # coding: interpy
    location = 'World'
    print("Hello #{location}.")

Backwards Compatibility

By using existing syntax and avoiding current or historical features,
format strings were designed so as to not interfere with existing code
and are not expected to cause any issues.

Postponed Ideas

Internationalization

Though it was highly desired to integrate internationalization support,
(see PEP 501), the finer details diverge at almost every point, making a
common solution unlikely:[12]

-   Use-cases differ
-   Compile vs. run-time tasks
-   Interpolation syntax needs
-   Intended audience
-   Security policy

Rejected Ideas

Restricting Syntax to str.format() Only

The common arguments against support of arbitrary expressions were:

1.  YAGNI, "You aren't gonna need it."
2.  The feature is not congruent with historical Python conservatism.
3.  Postpone - can implement in a future version if need is
    demonstrated.

Support of only str.format() syntax however, was deemed not enough of a
solution to the problem. Often a simple length or increment of an
object, for example, is desired before printing.

It can be seen in the Implementations in Other Languages section that
the developer community at large tends to agree. String interpolation
with arbitrary expressions is becoming an industry standard in modern
languages due to its utility.

Additional/Custom String-Prefixes

As seen in the Implementations in Other Languages section, many modern
languages have extensible string prefixes with a common interface. This
could be a way to generalize and reduce lines of code in common
situations. Examples are found in ES6 (Javascript), Scala, Nim, and C#
(to a lesser extent). This was rejected by the BDFL.[13]

Automated Escaping of Input Variables

While helpful in some cases, this was thought to create too much
uncertainty of when and where string expressions could be used safely or
not. The concept was also difficult to describe to others.[14]

Always consider format string variables to be unescaped, unless the
developer has explicitly escaped them.

Environment Access and Command Substitution

For systems programming and shell-script replacements, it would be
useful to handle environment variables and capture output of commands
directly in an expression string. This was rejected as not important
enough, and looking too much like bash/perl, which could encourage bad
habits.[15]

Acknowledgements

-   Eric V. Smith for the authoring and implementation of PEP 498.
-   Everyone on the python-ideas mailing list for rejecting the various
    crazy ideas that came up, helping to keep the final design in focus.

References

Copyright

This document has been placed in the public domain.

[1] Python Str.Format Syntax
(https://docs.python.org/3.6/library/string.html#format-string-syntax)

[2] Python Format-Spec Mini Language
(https://docs.python.org/3.6/library/string.html#format-specification-mini-language)

[3] Briefer String Format
(https://mail.python.org/pipermail/python-ideas/2015-July/034659.html)

[4] Briefer String Format
(https://mail.python.org/pipermail/python-ideas/2015-July/034669.html)

[5] Briefer String Format
(https://mail.python.org/pipermail/python-ideas/2015-July/034701.html)

[6] Bash Docs (https://tldp.org/LDP/abs/html/arithexp.html)

[7] Bash Docs (https://tldp.org/LDP/abs/html/commandsub.html)

[8] Perl Cookbook
(https://docstore.mik.ua/orelly/perl/cookbook/ch01_11.htm)

[9] Perl Docs
(https://web.archive.org/web/20121025185907/https://perl6maven.com/perl6-scalar-array-and-hash-interpolation)

[10] Ruby Docs
(http://ruby-doc.org/core-2.1.1/doc/syntax/literals_rdoc.html#label-Strings)

[11] Ruby Docs
(https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#Interpolation)

[12] Literal String Formatting
(https://mail.python.org/pipermail/python-dev/2015-August/141289.html)

[13] Extensible String Prefixes
(https://mail.python.org/pipermail/python-ideas/2015-August/035336.html)

[14] Escaping of Input Variables
(https://mail.python.org/pipermail/python-ideas/2015-August/035532.html)

[15] Environment Access and Command Substitution
(https://mail.python.org/pipermail/python-ideas/2015-August/035554.html)