Following system colour scheme Selected dark colour scheme Selected light colour scheme

Python Enhancement Proposals

PEP 798 – Unpacking in Comprehensions

Author:
Adam Hartz <hz at mit.edu>, Erik Demaine <edemaine at mit.edu>
Sponsor:
Jelle Zijlstra <jelle.zijlstra at gmail.com>
Discussions-To:
Discourse thread
Status:
Draft
Type:
Standards Track
Created:
19-Jul-2025
Python-Version:
3.15
Post-History:
16-Oct-2021, 22-Jun-2025

Table of Contents

Abstract

This PEP proposes extending list, set, and dictionary comprehensions, as well as generator expressions, to allow unpacking notation (* and **) at the start of the expression, providing a concise way of combining an arbitrary number of iterables into one list or set or generator, or an arbitrary number of dictionaries into one dictionary, for example:

[*it for it in its]  # list with the concatenation of iterables in 'its'
{*it for it in its}  # set with the union of iterables in 'its'
{**d for d in dicts} # dict with the combination of dicts in 'dicts'
(*it for it in its)  # generator of the concatenation of iterables in 'its'

Motivation

Extended unpacking notation (* and **) from PEP 448 makes it easy to combine a few iterables or dictionaries:

[*it1, *it2, *it3]  # list with the concatenation of three iterables
{*it1, *it2, *it3}  # set with the union of three iterables
{**dict1, **dict2, **dict3}  # dict with the combination of three dicts

But if we want to similarly combine an arbitrary number of iterables, we cannot use unpacking in this same way.

That said, we do have a few options for combining multiple iterables. We could, for example, use explicit looping structures and built-in means of combination:

new_list = []
for it in its:
    new_list.extend(it)

new_set = set()
for it in its:
    new_set.update(it)

new_dict = {}
for d in dicts:
    new_dict.update(d)

def new_generator():
    for it in its:
        yield from it

Or, we could be more concise by using a comprehension with two loops:

[x for it in its for x in it]
{x for it in its for x in it}
{key: value for d in dicts for key, value in d.items()}
(x for it in its for x in it)

Or, we could use itertools.chain or itertools.chain.from_iterable:

list(itertools.chain(*its))
set(itertools.chain(*its))
dict(itertools.chain(*(d.items() for d in dicts)))
itertools.chain(*its)

list(itertools.chain.from_iterable(its))
set(itertools.chain.from_iterable(its))
dict(itertools.chain.from_iterable(d.items() for d in dicts))
itertools.chain.from_iterable(its)

Or, for all but the generator, we could use functools.reduce:

functools.reduce(operator.iconcat, its, (new_list := []))
functools.reduce(operator.ior, its, (new_set := set()))
functools.reduce(operator.ior, its, (new_dict := {}))

This PEP proposes allowing unpacking operations to be used in comprehensions as an additional alternative:

[*it for it in its]  # list with the concatenation of iterables in 'its'
{*it for it in its}  # set with the union of iterables in 'its'
{**d for d in dicts} # dict with the combination of dicts in 'dicts'
(*it for it in its)  # generator of the concatenation of iterables in 'its'   (*it for it in its)

This proposal also extends to asynchronous comprehensions and generator expressions, such that, for example, (*ait async for ait in aits()) is equivalent to (x async for ait in aits() for x in ait).

Rationale

Combining iterable objects together into a single larger object is a common task. One StackOverflow post asking about flattening a list of lists, for example, has been viewed 4.6 million times. Despite this being a common operation, the options currently available for performing it concisely require levels of indirection that can make the resulting code difficult to read and understand.

The proposed notation is concise (avoiding the use and repetition of auxiliary variables) and, we expect, intuitive and familiar to programmers familiar with both comprehensions and unpacking notation (see Code Examples for examples of code from the standard library that could be rewritten more clearly and concisely using the proposed syntax).

This proposal was motivated in part by a written exam in a Python programming class, where several students used the notation (specifically the set version) in their solutions, assuming that it already existed in Python. This suggests that the notation is intuitive, even to beginners. By contrast, the existing syntax [x for it in its for x in it] is one that students often get wrong, the natural impulse for many students being to reverse the order of the for clauses.

Specification

Syntax

The necessary grammatical changes are allowing the expression in list/set comprehensions and generator expressions to be preceded by a *, and allowing an alternative form of dictionary comprehension in which a double-starred expression can be used in place of a key: value pair.

This can be accomplished by updating the listcomp and setcomp rules to use star_named_expression instead of named_expression:

listcomp[expr_ty]:
    | '[' a=star_named_expression b=for_if_clauses ']'

setcomp[expr_ty]:
    | '{' a=star_named_expression b=for_if_clauses '}'

The rule for genexp would similarly need to be modified to allow a starred_expression:

genexp[expr_ty]:
    | '(' a=(assignment_expression | expression !':=' | starred_expression) b=for_if_clauses ')'

The rule for dictionary comprehensions would need to be adjusted as well, to allow for this new form:

dictcomp[expr_ty]:
    | '{' a=double_starred_kvpair b=for_if_clauses '}'

No change should be made to the way that argument unpacking is handled in function calls, i.e., the general rule that generator expressions provided as the sole argument to functions do not require additional redundant parentheses should be retained. Note that this implies that, for example, f(*x for x in it) is equivalent to f((*x for x in it)) (see Starred Generators as Function Arguments for more discussion).

* and ** should only be allowed at the top-most level of the expression in the comprehension (see Further Generalizing Unpacking Operators for more discussion).

Semantics: List/Set/Dict Comprehensions

The meaning of a starred expression in a list comprehension [*expr for x in it] is to treat each expression as an iterable, and concatenate them, in the same way as if they were explicitly listed via [*expr1, *expr2, ...]. Similarly, {*expr for x in it} forms a set union, as if the expressions were explicitly listed via {*expr1, *expr2, ...}; and {**expr for x in it} combines dictionaries, as if the expressions were explicitly listed via {**expr1, **expr2, ...}. These operations should retain all of the equivalent semantics for combining collections in this way (including, for example, later values replacing earlier ones in the case of a duplicated key when combining dictionaries).

Said another way, the objects created by the following comprehensions:

new_list = [*expr for x in its]
new_set = {*expr for x in its}
new_dict = {**expr for d in dicts}

should be equivalent to the objects created by the following pieces of code, respectively:

new_list = []
for x in its:
    new_list.extend(expr)

new_set = set()
for x in its:
    new_set.update(expr)

new_dict = {}
for x in dicts:
    new_dict.update(expr)

Semantics: Generator Expressions

A generator expression (*expr for x in it) forms a generator producing values from the concatenation of the iterables given by the expressions. Specifically, the behavior is defined to be equivalent to the following generator:

def generator():
    for x in it:
        yield from expr

Since yield from is not allowed inside of async generators, the equivalent for (*expr async for x in ait()), is more like the following (though of course this new form should not define or reference the looping variable i):

async def generator():
    async for x in ait():
        for i in expr:
            yield i

Interaction with Assignment Expressions

Note that this proposal does not suggest changing the order of evaluation of the various pieces of the comprehension, nor any rules about scoping. This is particularly relevant for generator expressions that make use of the “walrus operator” := from PEP 572, which, when used in a comprehension or a generator expression, performs its variable binding in the containing scope rather than locally to the comprehension.

As an example, consider the generator that results from evaluating the expression (*(y := [i, i+1]) for i in (0, 2, 4)). This is approximately equivalent to the following generator, except that in its generator expression form, y will be bound in the containing scope instead of locally:

def generator():
    for i in (0, 2, 4):
        yield from (y := [i, i+1])

In this example, the subexpression (y := [i, i+1]) is evaluated exactly three times before the generator is exhausted: just after assigning i in the comprehension to 0, 2, and 4, respectively. Thus, y (in the containing scope) will be modified at those points in time:

>>> g = (*(y := [i, i+1]) for i in (0, 2, 4))
>>> y
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    y
NameError: name 'y' is not defined
>>> next(g)
0
>>> y
[0, 1]
>>> next(g)
1
>>> y
[0, 1]
>>> next(g)
2
>>> y
[2, 3]

Error Reporting

Currently, the proposed syntax generates a SyntaxError. Allowing these forms to be recognized as syntactically valid requires adjusting the grammar rules for invalid_comprehension and invalid_dict_comprehension to allow the use of * and **, respectively.

Additional specific error messages should be provided in at least the following cases:

  • Attempting to use ** in a list comprehension or generator expression should report that dictionary unpacking cannot be used in those structures, for example:
    >>> [**x for x in y]
      File "<stdin>", line 1
        [**x for x in y]
         ^^^
    SyntaxError: cannot use dict unpacking in list comprehension
    
    >>> (**x for x in y)
      File "<stdin>", line 1
        (**x for x in y)
         ^^^
    SyntaxError: cannot use dict unpacking in generator expression
    
  • The existing error message for attempting to use * in a dictionary key/value should be retained, but similar messages should be reported when attempting to use ** unpacking on a dictionary key or value, for example:
    >>> {*k: v for k,v in items}
      File "<stdin>", line 1
        {*k: v for k,v in items}
         ^^
    SyntaxError: cannot use a starred expression in a dictionary key
    
    >>> {k: *v for k,v in items}
      File "<stdin>", line 1
        {k: *v for k,v in items}
            ^^
    SyntaxError: cannot use a starred expression in a dictionary value
    
    >>> {**k: v for k,v in items}
      File "<stdin>", line 1
        {**k: v for k,v in items}
         ^^^
    SyntaxError: cannot use dict unpacking in a dictionary key
    
    >>> {k: **v for k,v in items}
      File "<stdin>", line 1
        {k: **v for k,v in items}
            ^^^
    SyntaxError: cannot use dict unpacking in a dictionary value
    
  • The phrasing of some other existing error messages should similarly be adjusted to account for the presence of the new syntax, and/or to clarify ambiguous or confusing cases relating to unpacking more generally (particularly those mentioned in Further Generalizing Unpacking Operators), for example:
    >>> [*x if x else y]
      File "<stdin>", line 1
        [*x if x else y]
         ^^^^^^^^^^^^^^
    SyntaxError: invalid starred expression. Did you forget to wrap the conditional expression in parentheses?
    
     >>> {**x if x else y}
      File "<stdin>", line 1
        {**x if x else y}
         ^^^^^^^^^^^^^^^
    SyntaxError: invalid double starred expression. Did you forget to wrap the conditional expression in parentheses?
    
    >>> [x if x else *y]
      File "<stdin>", line 1
        [x if x else *y]
                     ^
    SyntaxError: cannot unpack only part of a conditional expression
    
    >>> {x if x else **y}
      File "<stdin>", line 1
        {x if x else **y}
                     ^^
    SyntaxError: cannot use dict unpacking on only part of a conditional expression
    

Reference Implementation

A reference implementation is available, which implements this functionality, including draft documentation and additional test cases.

Backwards Compatibility

The behavior of all comprehensions that are currently syntactically valid would be unaffected by this change, so we do not anticipate much in the way of backwards-incompatibility concerns. In principle, this change would only affect code that relied on the fact that attempting to use unpacking operations in comprehensions would raise a SyntaxError, or that relied on the particular phrasing of any of the old error messages being replaced, which we expect to be rare.

Code Examples

This section shows some illustrative examples of how small pieces of code from the standard library could be rewritten to make use of this new syntax to improve concision and readability. The Reference Implementation continues to pass all tests with these replacements made.

Replacing Explicit Loops

Replacing explicit loops compresses multiple lines into one, and avoids the need for defining and referencing an auxiliary variable.

  • From email/_header_value_parser.py:
    # current:
    comments = []
    for token in self:
        comments.extend(token.comments)
    return comments
    
    # improved:
    return [*token.comments for token in self]
    
  • From shutil.py:
    # current:
    ignored_names = []
    for pattern in patterns:
        ignored_names.extend(fnmatch.filter(names, pattern))
    return set(ignored_names)
    
    # improved:
    return {*fnmatch.filter(names, pattern) for pattern in patterns}
    
  • From http/cookiejar.py:
    # current:
    cookies = []
    for domain in self._cookies.keys():
        cookies.extend(self._cookies_for_domain(domain, request))
    return cookies
    
    # improved:
    return [
        *self._cookies_for_domain(domain, request)
        for domain in self._cookies.keys()
    ]
    

Replacing from_iterable and Friends

While not always the right choice, replacing itertools.chain.from_iterable and map can avoid an extra level of redirection, resulting in code that follows conventional wisdom that comprehensions are more readable than map/filter.

  • From dataclasses.py:
    # current:
    inherited_slots = set(
        itertools.chain.from_iterable(map(_get_slots, cls.__mro__[1:-1]))
    )
    
    # improved:
    inherited_slots = {*_get_slots(c) for c in cls.__mro__[1:-1]}
    
  • From importlib/metadata/__init__.py:
    # current:
    return itertools.chain.from_iterable(
        path.search(prepared) for path in map(FastPath, paths)
    )
    
    # improved:
    return (*FastPath(path).search(prepared) for path in paths)
    
  • From collections/__init__.py (Counter class):
    # current:
    return _chain.from_iterable(_starmap(_repeat, self.items()))
    
    # improved:
    return (*_repeat(elt, num) for elt, num in self.items())
    
  • From zipfile/_path/__init__.py:
    # current:
    parents = itertools.chain.from_iterable(map(_parents, names))
    
    # improved:
    parents = (*_parents(name) for name in names)
    
  • From _pyrepl/_module_completer.py:
    # current:
    search_locations = set(chain.from_iterable(
        getattr(spec, 'submodule_search_locations', [])
        for spec in specs if spec
    ))
    
    # improved:
    search_locations = {
        *getattr(spec, 'submodule_search_locations', [])
        for spec in specs if spec
    }
    

Replacing Double Loops in Comprehensions

Replacing double loops in comprehensions avoids the need for defining and referencing an auxiliary variable, reducing clutter.

  • From importlib/resources/readers.py:
    # current:
    children = (child for path in self._paths for child in path.iterdir())
    
    # improved:
    children = (*path.iterdir() for path in self._paths)
    
  • From asyncio/base_events.py:
    # current:
    exceptions = [exc for sub in exceptions for exc in sub]
    
    # improved:
    exceptions = [*sub for sub in exceptions]
    
  • From _weakrefset.py:
    # current:
    return self.__class__(e for s in (self, other) for e in s)
    
    # improved:
    return self.__class__(*s for s in (self, other))
    

How to Teach This

Currently, a common way to introduce the notion of comprehensions (which is employed by the Python Tutorial) is to demonstrate equivalent code. For example, this method would say that, for example, out = [expr for x in it] is equivalent to the following code:

out = []
for x in it:
    out.append(expr)

Taking this approach, we can introduce out = [*expr for x in it] as instead being equivalent to the following (which uses extend instead of append):

out = []
for x in it:
    out.extend(expr)

Set and dict comprehensions that make use of unpacking can also be introduced by a similar analogy:

# equivalent to out = {expr for x in it}
out = set()
for x in it:
    out.add(expr)

# equivalent to out = {*expr for x in it}
out = set()
for x in it:
    out.update(expr)

# equivalent to out = {k_expr: v_expr for x in it}
out = {}
for x in it:
    out[k_expr] = v_expr

# equivalent to out = {**expr for x in it}
out = {}
for x in it:
    out.update(expr)

And we can take a similar approach to illustrate the behavior of generator expressions that involve unpacking:

# equivalent to g = (expr for x in it)
def generator():
    for x in it:
        yield expr
g = generator()

# equivalent to g = (*expr for x in it)
def generator():
    for x in it:
        yield from expr
g = generator()

We can then generalize from these specific examples to the idea that, wherever a non-starred comprehension/genexp would use an operator that adds a single element to a collection, the starred would instead use an operator that adds multiple elements to that collection.

Alternatively, we don’t need to think of the two ideas as separate; instead, with the new syntax, we can think of out = [...x... for x in it] as equivalent to the following code [1], regardless of whether or not ...x... uses *:

out = []
for x in it:
    out.extend([...x...])

Similarly, we can think of out = {...x... for x in it} as equivalent to the following code, regardless of whether or not ...x... uses * or ** or ::

out = set()
for x in it:
    out.update({...x...})

These examples are equivalent in the sense that the output they produce would be the same in both the version with the comprehension and the version without it, but note that the non-comprehension version is slightly less efficient due to making new lists/sets/dictionaries before each extend or update, which is unnecessary in the version that uses comprehensions.

Rejected Alternative Proposals

The primary goal when thinking through the specification above was consistency with existing norms around unpacking and comprehensions / generator expressions. One way to interpret this is that the goal was to write the specification so as to require the smallest possible change(s) to the existing grammar and code generation, letting the existing code inform the surrounding semantics.

Below we discuss some of the common concerns/alternative proposals that came up in discussions but that are not included in this proposal.

Starred Generators as Function Arguments

One common concern that has arisen multiple times (not only in the discussion threads linked above but also in previous discussions around this same idea) is a possible syntactical ambiguity when passing a starred generator as the sole argument to f(*x for x in y). In the original PEP 448, this ambiguity was cited as a reason for not including a similar generalization as part of the proposal.

This proposal suggests that f(*x for x in y) should be interpreted as f((*x for x in y)) and should not attempt further unpacking of the resulting generator, but several alternatives were suggested in our discussion (and/or have been suggested in the past), including:

  • interpreting f(*x for x in y) as f(*(x for x in y),
  • interpreting f(*x for x in y) as f(*(*x for x in y)), or
  • continuing to raise a SyntaxError for f(*x for x in y) even if the other aspects of this proposal are accepted.

The reason to prefer this proposal over these alternatives is the preservation of existent conventions for punctuation around generator expressions. Currently, the general rule is that generator expressions must be wrapped in parentheses except when provided as the sole argument to a function, and this proposal suggests maintaining that rule even as we allow more kinds of generator expressions. This option maintains a full symmetry between comprehensions and generator expressions that use unpacking and those that don’t.

Currently, we have the following conventions:

f([x for x in y])  # pass in a single list
f({x for x in y})  # pass in a single set
f(x for x in y)  # pass in a single generator (no additional parentheses required around genexp)

f(*[x for x in y])  # pass in elements from the list separately
f(*{x for x in y})  # pass in elements from the set separately
f(*(x for x in y))  # pass in elements from the generator separately (parentheses required)

This proposal opts to maintain those conventions even when the comprehensions make use of unpacking:

f([*x for x in y])  # pass in a single list
f({*x for x in y})  # pass in a single set
f(*x for x in y)  # pass in a single generator (no additional parentheses required around genexp)

f(*[*x for x in y])  # pass in elements from the list separately
f(*{*x for x in y})  # pass in elements from the set separately
f(*(*x for x in y))  # pass in elements from the generator separately (parentheses required)

Further Generalizing Unpacking Operators

Another suggestion that came out of the discussion involved further generalizing the * beyond simply allowing it to be used to unpack the expression in a comprehension. Two main flavors of this extension were considered:

  • making * and ** true unary operators that create a new kind of Unpackable object (or similar), which comprehensions could treat by unpacking it but which could also be used in other contexts; or
  • continuing to allow * and ** only in the places they are allowed elsewhere in this proposal (expression lists, comprehensions, generator expressions, and argument lists), but also allow them to be used in subexpressions within a comprehension, allowing, for example, the following as a way to flatten a list that contains some iterables but some non-iterable objects:
    [*x if isinstance(x, Iterable) else x for x in [[1,2,3], 4]]
    

These variants were considered substantially more complex (both to understand and to implement) and of only marginal utility, so neither is included in this PEP. As such, these forms should continue to raise a SyntaxError, but with a new error message as described above, though it should not be ruled out as a consideration for future proposals.

Concerns and Disadvantages

Although the general consensus from the discussion thread seemed to be that this syntax was clear and intuitive, several concerns and potential downsides were raised as well. This section aims to summarize those concerns.

  • Overlap with existing alternatives: While the proposed syntax is arguably clearer and more concise, there are already several ways to accomplish this same thing in Python.
  • Function call ambiguity: Expressions like f(*x for x in y) may initially appear ambiguous, as it’s not obvious whether the intent is to unpack the generator or to pass it as a single argument. Although this proposal retains existing conventions by treating that form as equivalent to f((*x for x in y)), that equivalence may not be immediately obvious.
  • Potential for overuse or abuse: Complex uses of unpacking in comprehensions could obscure logic that would be clearer in an explicit loop. While this is already a concern with comprehensions more generally, the addition of * and ** may make particularly-complex uses even more difficult to read and understand at a glance. For example, while these situations are likely rare, comprehensions that use unpacking in multiple ways can make it difficult to know what’s being unpacked and when: f(*(*x for *x, _ in list_of_lists)).
  • Unclear limitation of scope: This proposal restricts unpacking to the top level of the comprehension expression, but some users may expect that the unpacking operator is being further generalized as discussed in Further Generalizing Unpacking Operators.
  • Effect on External Tools: As with any change to Python’s syntax, making this change would create work for maintainers of code formatters, linters, type checkers, etc., to make sure that the new syntax is supported.

Other Languages

Quite a few other languages support this kind of flattening with syntax similar to what is already available in Python, but support for using unpacking syntax within comprehensions is rare. This section provides a brief summary of support for similar syntax in a few other languages.

Many languages that support comprehensions support double loops:

# python
[x for xs in [[1,2,3], [], [4,5]] for x in xs * 2]

-- haskell
[x | xs <- [[1,2,3], [], [4,5]], x <- xs ++ xs]

# julia
[x for xs in [[1,2,3], [], [4,5]] for x in [xs; xs]]

; clojure
(for [xs [[1 2 3] [] [4 5]] x (concat xs xs)] x)

Several other languages (even those without comprehensions) support these operations via a built-in function/method to support flattening of nested structures:

# python
list(itertools.chain(*(xs*2 for xs in [[1,2,3], [], [4,5]])))

// Javascript
[[1,2,3], [], [4,5]].flatMap(xs => [...xs, ...xs])

-- haskell
concat (map (\x -> x ++ x) [[1,2,3], [], [4,5]])

# ruby
[[1, 2, 3], [], [4, 5]].flat_map {|e| e * 2}

However, languages that support both comprehension and unpacking do not tend to allow unpacking within a comprehension. For example, the following expression in Julia currently leads to a syntax error:

[xs... for xs in [[1,2,3], [], [4,5]]]

As one counterexample, support for a similar syntax was recently added to Civet. For example, the following is a valid comprehension in Civet, making use of Javascript’s ... syntax for unpacking:

for xs of [[1,2,3], [], [4,5]] then ...(xs++xs)

References


Source: https://github.com/python/peps/blob/main/peps/pep-0798.rst

Last modified: 2025-07-20 03:07:04 GMT