PEP 798 – Unpacking in Comprehensions
- Author: Adam Hartz <hz at mit.edu>, Erik Demaine <edemaine at mit.edu>
- Sponsor: Jelle Zijlstra <jelle.zijlstra at gmail.com>
- Discussions-To: Discourse thread
- Status: Draft
- Type: Standards Track
- Created: 19-Jul-2025
- Python-Version: 3.15
- Post-History: 16-Oct-2021, 22-Jun-2025
Abstract
This PEP proposes extending list, set, and dictionary comprehensions, as well
as generator expressions, to allow unpacking notation (* and **) at the
start of the expression, providing a concise way of combining an arbitrary
number of iterables into one list or set or generator, or an arbitrary number
of dictionaries into one dictionary, for example:
[*it for it in its] # list with the concatenation of iterables in 'its'
{*it for it in its} # set with the union of iterables in 'its'
{**d for d in dicts} # dict with the combination of dicts in 'dicts'
(*it for it in its) # generator of the concatenation of iterables in 'its'
Motivation
Extended unpacking notation (* and **) from PEP 448 makes it
easy to combine a few iterables or dictionaries:
[*it1, *it2, *it3] # list with the concatenation of three iterables
{*it1, *it2, *it3} # set with the union of three iterables
{**dict1, **dict2, **dict3} # dict with the combination of three dicts
But if we want to similarly combine an arbitrary number of iterables, we cannot use unpacking in this same way.
That said, we do have a few options for combining multiple iterables. We could, for example, use explicit looping structures and built-in means of combination:
new_list = []
for it in its:
    new_list.extend(it)
new_set = set()
for it in its:
    new_set.update(it)
new_dict = {}
for d in dicts:
    new_dict.update(d)
def new_generator():
    for it in its:
        yield from it
Or, we could be more concise by using a comprehension with two loops:
[x for it in its for x in it]
{x for it in its for x in it}
{key: value for d in dicts for key, value in d.items()}
(x for it in its for x in it)
Or, we could use itertools.chain or itertools.chain.from_iterable:
list(itertools.chain(*its))
set(itertools.chain(*its))
dict(itertools.chain(*(d.items() for d in dicts)))
itertools.chain(*its)
list(itertools.chain.from_iterable(its))
set(itertools.chain.from_iterable(its))
dict(itertools.chain.from_iterable(d.items() for d in dicts))
itertools.chain.from_iterable(its)
Or, for all but the generator, we could use functools.reduce:
functools.reduce(operator.iconcat, its, (new_list := []))
functools.reduce(operator.ior, its, (new_set := set()))
functools.reduce(operator.ior, its, (new_dict := {}))
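As a quick sanity check, the following sketch confirms that the existing list-building approaches above all produce the same result. It uses only syntax that is valid today; the sample input its is made up for illustration.

```python
import functools
import itertools
import operator

# Hypothetical sample input: a list of lists.
its = [[1, 2], [3], [], [4, 5]]

# 1. Explicit loop with list.extend.
via_loop = []
for it in its:
    via_loop.extend(it)

# 2. Comprehension with two loops.
via_double_loop = [x for it in its for x in it]

# 3. itertools.chain.from_iterable.
via_chain = list(itertools.chain.from_iterable(its))

# 4. functools.reduce with in-place concatenation into a fresh list.
via_reduce = functools.reduce(operator.iconcat, its, [])

print(via_loop)  # [1, 2, 3, 4, 5]
```

All four spellings agree; they differ only in how much scaffolding surrounds the core idea of "concatenate everything in its".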
This PEP proposes allowing unpacking operations to be used in comprehensions as an additional alternative:
[*it for it in its] # list with the concatenation of iterables in 'its'
{*it for it in its} # set with the union of iterables in 'its'
{**d for d in dicts} # dict with the combination of dicts in 'dicts'
(*it for it in its) # generator of the concatenation of iterables in 'its'
This proposal also extends to asynchronous comprehensions and generator
expressions, such that, for example, (*ait async for ait in aits()) is
equivalent to (x async for ait in aits() for x in ait).
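The double-loop asynchronous form is already valid today, so its behavior can be checked directly. In the sketch below, aits is a hypothetical asynchronous generator that yields ordinary iterables, and flatten_async is a helper name invented for this example.

```python
import asyncio

async def aits():
    # Hypothetical async source that yields ordinary (synchronous) iterables.
    for chunk in ([1, 2], [], [3]):
        await asyncio.sleep(0)
        yield chunk

async def flatten_async():
    # Today's spelling of what the PEP would write as
    # (*ait async for ait in aits()).
    agen = (x async for ait in aits() for x in ait)
    return [x async for x in agen]

result = asyncio.run(flatten_async())
print(result)  # [1, 2, 3]
```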
Rationale
Combining iterable objects together into a single larger object is a common task. One StackOverflow post asking about flattening a list of lists, for example, has been viewed 4.6 million times. Despite this being a common operation, the options currently available for performing it concisely require levels of indirection that can make the resulting code difficult to read and understand.
The proposed notation is concise (avoiding the use and repetition of auxiliary variables) and, we expect, intuitive and familiar to programmers familiar with both comprehensions and unpacking notation (see Code Examples for examples of code from the standard library that could be rewritten more clearly and concisely using the proposed syntax).
This proposal was motivated in part by a written exam in a Python programming
class, where several students used the notation (specifically the set
version) in their solutions, assuming that it already existed in Python. This
suggests that the notation is intuitive, even to beginners. By contrast, the
existing syntax [x for it in its for x in it] is one that students often
get wrong, the natural impulse for many students being to reverse the order of
the for clauses.
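To make the pitfall concrete, here is a small sketch of the correct clause order next to the reversed one. The function names and the sample rows are invented for this illustration.

```python
def flatten_correct(rows):
    # Clauses read left to right, in the same order as nested for statements.
    return [x for row in rows for x in row]

def flatten_reversed(rows):
    # A common mistake: the clauses are swapped, so `row` is looked up
    # before any loop has bound it.
    return [x for x in row for row in rows]

rows = [[1, 2], [3]]
print(flatten_correct(rows))  # [1, 2, 3]

try:
    flatten_reversed(rows)
except NameError as exc:
    print("NameError:", exc)
```

When a variable with the same name happens to exist in an enclosing scope, the reversed form may instead run and silently produce the wrong result, which is arguably worse.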
Specification
Syntax
The necessary grammatical changes are allowing the expression in list/set
comprehensions and generator expressions to be preceded by a *, and
allowing an alternative form of dictionary comprehension in which a
double-starred expression can be used in place of a key: value pair.
This can be accomplished by updating the listcomp and setcomp rules to
use star_named_expression instead of named_expression:
listcomp[expr_ty]:
    | '[' a=star_named_expression b=for_if_clauses ']'
setcomp[expr_ty]:
    | '{' a=star_named_expression b=for_if_clauses '}'
The rule for genexp would similarly need to be modified to allow a
starred_expression:
genexp[expr_ty]:
    | '(' a=(assignment_expression | expression !':=' | starred_expression) b=for_if_clauses ')'
The rule for dictionary comprehensions would need to be adjusted as well, to allow for this new form:
dictcomp[expr_ty]:
    | '{' a=double_starred_kvpair b=for_if_clauses '}'
No change should be made to the way that argument unpacking is handled in
function calls, i.e., the general rule that generator expressions provided as
the sole argument to functions do not require additional redundant parentheses
should be retained. Note that this implies that, for example, f(*x for x in it)
is equivalent to f((*x for x in it)) (see Starred Generators as Function Arguments
for more discussion).
* and ** should only be allowed at the top-most level of the expression
in the comprehension (see Further Generalizing Unpacking Operators for more discussion).
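The existing rule about parentheses around generator expressions in calls can be observed with compile(), which checks syntax without evaluating any names. This is a sketch using only non-starred forms that behave the same on every current Python 3 version; compiles is a helper invented here.

```python
def compiles(source):
    """Return True if `source` is accepted by this interpreter's compiler."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

# A generator expression as the sole argument needs no extra parentheses...
print(compiles("f(x for x in it)"))       # True
print(compiles("f((x for x in it))"))     # True
# ...but it must be parenthesized once there is another argument.
print(compiles("f(x for x in it, 1)"))    # False
print(compiles("f((x for x in it), 1)"))  # True
```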
Semantics: List/Set/Dict Comprehensions
The meaning of a starred expression in a list comprehension [*expr for x in it]
is to treat each expression as an iterable, and concatenate them, in the
same way as if they were explicitly listed via [*expr1, *expr2, ...].
Similarly, {*expr for x in it} forms a set union, as if the expressions
were explicitly listed via {*expr1, *expr2, ...}; and {**expr for x in it}
combines dictionaries, as if the expressions were explicitly listed via
{**expr1, **expr2, ...}. These operations should retain all of the
equivalent semantics for combining collections in this way (including, for
example, later values replacing earlier ones in the case of a duplicated key
when combining dictionaries).
Said another way, the objects created by the following comprehensions:
new_list = [*expr for x in its]
new_set = {*expr for x in its}
new_dict = {**expr for d in dicts}
should be equivalent to the objects created by the following pieces of code, respectively:
new_list = []
for x in its:
    new_list.extend(expr)
new_set = set()
for x in its:
    new_set.update(expr)
new_dict = {}
for d in dicts:
    new_dict.update(expr)
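The correspondence with explicit PEP 448 displays can be checked today for a fixed number of inputs. The sketch below wraps the loop-based equivalents in helper functions (concat_lists, union_sets, and merge_dicts are names invented here) and compares them against hand-written displays, including the rule that later dictionary values win.

```python
def concat_lists(exprs):
    # Models [*expr for ...]: extend with each iterable in turn.
    out = []
    for expr in exprs:
        out.extend(expr)
    return out

def union_sets(exprs):
    # Models {*expr for ...}: update the set with each iterable in turn.
    out = set()
    for expr in exprs:
        out.update(expr)
    return out

def merge_dicts(exprs):
    # Models {**expr for ...}: later values replace earlier ones.
    out = {}
    for expr in exprs:
        out.update(expr)
    return out

a, b, c = [1, 2], (2, 3), range(3, 5)
assert concat_lists([a, b, c]) == [*a, *b, *c] == [1, 2, 2, 3, 3, 4]
assert union_sets([a, b, c]) == {*a, *b, *c} == {1, 2, 3, 4}

d1, d2 = {"x": 1, "y": 1}, {"y": 2, "z": 2}
merged = merge_dicts([d1, d2])
assert merged == {**d1, **d2} == {"x": 1, "y": 2, "z": 2}
print(merged)  # {'x': 1, 'y': 2, 'z': 2}
```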
Semantics: Generator Expressions
A generator expression (*expr for x in it) forms a generator producing
values from the concatenation of the iterables given by the expressions.
Specifically, the behavior is defined to be equivalent to the following
generator:
def generator():
    for x in it:
        yield from expr
Since yield from is not allowed inside of async generators, the equivalent
for (*expr async for x in ait()) is more like the following (though of
course this new form should not define or reference the looping variable i):
async def generator():
    async for x in ait():
        for i in expr:
            yield i
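One consequence of defining the behavior via yield from is that evaluation stays lazy: each inner iterable is requested only when the previous one is exhausted. The sketch below illustrates this with the generator-function model, taking expr to be x itself; flatten_lazily, chunks, and log are names invented for the example.

```python
def flatten_lazily(it):
    # Generator-function model of (*x for x in it).
    for x in it:
        yield from x

log = []

def chunks():
    # Hypothetical source that records when each inner iterable is requested.
    for chunk in ([1, 2], [3]):
        log.append(f"produce {chunk}")
        yield chunk

g = flatten_lazily(chunks())
assert log == []                      # nothing runs until the first next()
assert next(g) == 1
assert log == ["produce [1, 2]"]      # only the first chunk has been requested
assert list(g) == [2, 3]
assert log == ["produce [1, 2]", "produce [3]"]
```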
Interaction with Assignment Expressions
Note that this proposal does not suggest changing the order of evaluation of
the various pieces of the comprehension, nor any rules about scoping. This is
particularly relevant for generator expressions that make use of the “walrus
operator” := from PEP 572, which, when used in a comprehension or a
generator expression, performs its variable binding in the containing scope
rather than locally to the comprehension.
As an example, consider the generator that results from evaluating the
expression (*(y := [i, i+1]) for i in (0, 2, 4)). This is approximately
equivalent to the following generator, except that in its generator expression
form, y will be bound in the containing scope instead of locally:
def generator():
    for i in (0, 2, 4):
        yield from (y := [i, i+1])
In this example, the subexpression (y := [i, i+1]) is evaluated exactly
three times before the generator is exhausted: just after assigning i in
the comprehension to 0, 2, and 4, respectively. Thus, y (in
the containing scope) will be modified at those points in time:
>>> g = (*(y := [i, i+1]) for i in (0, 2, 4))
>>> y
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    y
NameError: name 'y' is not defined
>>> next(g)
0
>>> y
[0, 1]
>>> next(g)
1
>>> y
[0, 1]
>>> next(g)
2
>>> y
[2, 3]
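The same timeline can be reproduced with syntax that is valid today by emulating the containing-scope binding with nonlocal. This is only a sketch: trace_walrus and the "<unbound>" placeholder (standing in for "y is not defined yet") are inventions of this example, not part of the proposal.

```python
def trace_walrus():
    y = "<unbound>"  # stand-in for "not defined yet" in the containing scope

    def generator():
        nonlocal y  # emulate the walrus binding in the containing scope
        for i in (0, 2, 4):
            yield from (y := [i, i + 1])

    g = generator()
    trace = [("start", y)]
    for _ in range(3):
        value = next(g)
        # Record the value produced and the state of y right afterwards.
        trace.append((value, list(y)))
    return trace

for step in trace_walrus():
    print(step)
# ('start', '<unbound>')
# (0, [0, 1])
# (1, [0, 1])
# (2, [2, 3])
```

Note that y changes only when the generator advances to a new value of i, not on every call to next().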
Error Reporting
Currently, the proposed syntax generates a SyntaxError. Allowing these
forms to be recognized as syntactically valid requires adjusting the grammar
rules for invalid_comprehension and invalid_dict_comprehension to allow
the use of * and **, respectively.
Additional specific error messages should be provided in at least the following cases:
- Attempting to use ** in a list comprehension or generator expression should report that dictionary unpacking cannot be used in those structures, for example:
    >>> [**x for x in y]
      File "<stdin>", line 1
        [**x for x in y]
         ^^^
    SyntaxError: cannot use dict unpacking in list comprehension
    >>> (**x for x in y)
      File "<stdin>", line 1
        (**x for x in y)
         ^^^
    SyntaxError: cannot use dict unpacking in generator expression
- The existing error message for attempting to use * in a dictionary key/value should be retained, but similar messages should be reported when attempting to use ** unpacking on a dictionary key or value, for example:
    >>> {*k: v for k,v in items}
      File "<stdin>", line 1
        {*k: v for k,v in items}
         ^^
    SyntaxError: cannot use a starred expression in a dictionary key
    >>> {k: *v for k,v in items}
      File "<stdin>", line 1
        {k: *v for k,v in items}
            ^^
    SyntaxError: cannot use a starred expression in a dictionary value
    >>> {**k: v for k,v in items}
      File "<stdin>", line 1
        {**k: v for k,v in items}
         ^^^
    SyntaxError: cannot use dict unpacking in a dictionary key
    >>> {k: **v for k,v in items}
      File "<stdin>", line 1
        {k: **v for k,v in items}
            ^^^
    SyntaxError: cannot use dict unpacking in a dictionary value
- The phrasing of some other existing error messages should similarly be adjusted to account for the presence of the new syntax, and/or to clarify ambiguous or confusing cases relating to unpacking more generally (particularly those mentioned in Further Generalizing Unpacking Operators), for example:
    >>> [*x if x else y]
      File "<stdin>", line 1
        [*x if x else y]
         ^^^^^^^^^^^^^^
    SyntaxError: invalid starred expression. Did you forget to wrap the conditional expression in parentheses?
    >>> {**x if x else y}
      File "<stdin>", line 1
        {**x if x else y}
         ^^^^^^^^^^^^^^^
    SyntaxError: invalid double starred expression. Did you forget to wrap the conditional expression in parentheses?
    >>> [x if x else *y]
      File "<stdin>", line 1
        [x if x else *y]
                     ^
    SyntaxError: cannot unpack only part of a conditional expression
    >>> {x if x else **y}
      File "<stdin>", line 1
        {x if x else **y}
                     ^^
    SyntaxError: cannot use dict unpacking on only part of a conditional expression
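Several of the forms above are already rejected today and would remain errors under this proposal, with only the message changing. The sketch below checks acceptance or rejection, not message wording, using compile(); is_valid is a helper invented for this example.

```python
def is_valid(source):
    """Return True if `source` compiles as an expression on this interpreter."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

# Starring only part of a conditional expression is rejected...
assert not is_valid("[*x if x else y]")
assert not is_valid("[x if x else *y]")
# ...while starring the parenthesized conditional is ordinary PEP 448 unpacking.
assert is_valid("[*(x if x else y)]")
assert is_valid("{**(x if x else y)}")
# A starred dictionary value is likewise a syntax error.
assert not is_valid("{k: *v for k, v in items}")
```

The suggested "Did you forget to wrap the conditional expression in parentheses?" hint points users toward exactly the parenthesized forms that are accepted here.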
Reference Implementation
A reference implementation is available, which implements this functionality, including draft documentation and additional test cases.
Backwards Compatibility
The behavior of all comprehensions that are currently syntactically valid would
be unaffected by this change, so we do not anticipate much in the way of
backwards-incompatibility concerns. In principle, this change would only
affect code that relied on the fact that attempting to use unpacking operations
in comprehensions would raise a SyntaxError, or that relied on the
particular phrasing of any of the old error messages being replaced, which we
expect to be rare.
Code Examples
This section shows some illustrative examples of how small pieces of code from the standard library could be rewritten to make use of this new syntax to improve concision and readability. The Reference Implementation continues to pass all tests with these replacements made.
Replacing Explicit Loops
Replacing explicit loops compresses multiple lines into one, and avoids the need for defining and referencing an auxiliary variable.
- From email/_header_value_parser.py:
    # current:
    comments = []
    for token in self:
        comments.extend(token.comments)
    return comments

    # improved:
    return [*token.comments for token in self]
- From shutil.py:
    # current:
    ignored_names = []
    for pattern in patterns:
        ignored_names.extend(fnmatch.filter(names, pattern))
    return set(ignored_names)

    # improved:
    return {*fnmatch.filter(names, pattern) for pattern in patterns}
- From http/cookiejar.py:
    # current:
    cookies = []
    for domain in self._cookies.keys():
        cookies.extend(self._cookies_for_domain(domain, request))
    return cookies

    # improved:
    return [
        *self._cookies_for_domain(domain, request)
        for domain in self._cookies.keys()
    ]
Replacing from_iterable and Friends
While not always the right choice, replacing itertools.chain.from_iterable
and map can avoid an extra level of indirection, resulting in code that
follows conventional wisdom that comprehensions are more readable than
map/filter.
- From dataclasses.py:
    # current:
    inherited_slots = set(
        itertools.chain.from_iterable(map(_get_slots, cls.__mro__[1:-1]))
    )

    # improved:
    inherited_slots = {*_get_slots(c) for c in cls.__mro__[1:-1]}
- From importlib/metadata/__init__.py:
    # current:
    return itertools.chain.from_iterable(
        path.search(prepared) for path in map(FastPath, paths)
    )

    # improved:
    return (*FastPath(path).search(prepared) for path in paths)
- From collections/__init__.py (Counter class):
    # current:
    return _chain.from_iterable(_starmap(_repeat, self.items()))

    # improved:
    return (*_repeat(elt, num) for elt, num in self.items())
- From zipfile/_path/__init__.py:
    # current:
    parents = itertools.chain.from_iterable(map(_parents, names))

    # improved:
    parents = (*_parents(name) for name in names)
- From _pyrepl/_module_completer.py:
    # current:
    search_locations = set(chain.from_iterable(
        getattr(spec, 'submodule_search_locations', [])
        for spec in specs if spec
    ))

    # improved:
    search_locations = {
        *getattr(spec, 'submodule_search_locations', [])
        for spec in specs if spec
    }
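To see that such rewrites preserve behavior, the Counter example can be checked today by substituting the double-loop spelling for the proposed one. This is a sketch with a made-up Counter; the starred form appears only in a comment because it is not yet valid syntax.

```python
from collections import Counter
from itertools import chain, repeat, starmap

# Hypothetical sample data, including a zero count.
counts = Counter(a=2, b=1, c=0)

# Current spelling, mirroring the body of Counter.elements().
current = chain.from_iterable(starmap(repeat, counts.items()))

# Double-loop spelling available today; the PEP would shorten it to
# (*repeat(elt, num) for elt, num in counts.items()).
double_loop = (x for elt, num in counts.items() for x in repeat(elt, num))

assert list(current) == list(double_loop) == list(counts.elements())
print(list(counts.elements()))  # ['a', 'a', 'b']
```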
Replacing Double Loops in Comprehensions
Replacing double loops in comprehensions avoids the need for defining and referencing an auxiliary variable, reducing clutter.
- From importlib/resources/readers.py:
    # current:
    children = (child for path in self._paths for child in path.iterdir())

    # improved:
    children = (*path.iterdir() for path in self._paths)
- From asyncio/base_events.py:
    # current:
    exceptions = [exc for sub in exceptions for exc in sub]

    # improved:
    exceptions = [*sub for sub in exceptions]
- From _weakrefset.py:
    # current:
    return self.__class__(e for s in (self, other) for e in s)

    # improved:
    return self.__class__(*s for s in (self, other))
How to Teach This
Currently, a common way to introduce the notion of comprehensions (which is
employed by the Python Tutorial) is to demonstrate equivalent code. For
example, this method would say that out = [expr for x in it]
is equivalent to the following code:
out = []
for x in it:
    out.append(expr)
Taking this approach, we can introduce out = [*expr for x in it] as instead
being equivalent to the following (which uses extend instead of append):
out = []
for x in it:
    out.extend(expr)
Set and dict comprehensions that make use of unpacking can also be introduced by a similar analogy:
# equivalent to out = {expr for x in it}
out = set()
for x in it:
    out.add(expr)
# equivalent to out = {*expr for x in it}
out = set()
for x in it:
    out.update(expr)
# equivalent to out = {k_expr: v_expr for x in it}
out = {}
for x in it:
    out[k_expr] = v_expr
# equivalent to out = {**expr for x in it}
out = {}
for x in it:
    out.update(expr)
And we can take a similar approach to illustrate the behavior of generator expressions that involve unpacking:
# equivalent to g = (expr for x in it)
def generator():
    for x in it:
        yield expr
g = generator()
# equivalent to g = (*expr for x in it)
def generator():
    for x in it:
        yield from expr
g = generator()
We can then generalize from these specific examples to the idea that, wherever a non-starred comprehension/genexp would use an operator that adds a single element to a collection, the starred version would instead use an operator that adds multiple elements to that collection.
Alternatively, we don’t need to think of the two ideas as separate; instead,
with the new syntax, we can think of out = [...x... for x in it] as
equivalent to the following code [1], regardless of whether or not
...x... uses *:
out = []
for x in it:
    out.extend([...x...])
Similarly, we can think of out = {...x... for x in it} as equivalent to the
following code, regardless of whether or not ...x... uses * or ** or ::
out = set()
for x in it:
    out.update({...x...})
These examples are equivalent in the sense that the output they produce would
be the same in both the version with the comprehension and the version without
it, but note that the non-comprehension version is slightly less efficient due
to making new lists/sets/dictionaries before each extend or update,
which is unnecessary in the version that uses comprehensions.
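Because list and dictionary displays such as [*x] and {**d} are already valid (PEP 448), this unified view can be tried out today. The following sketch uses made-up inputs its and dicts, and the starred comprehensions appear only in comments.

```python
its = [[1, 2], [3]]

# Non-starred element expression: [x for x in its]
out = []
for x in its:
    out.extend([x])        # the display [x] holds exactly one element
assert out == [x for x in its] == [[1, 2], [3]]

# Starred element expression: what the PEP would write as [*x for x in its]
out = []
for x in its:
    out.extend([*x])       # the display [*x] is already valid today
assert out == [1, 2, 3]

# Dict flavor: key: value pairs and ** both fit inside a {...} display.
dicts = [{"a": 1}, {"a": 2, "b": 3}]
pairs, merged = {}, {}
for d in dicts:
    pairs.update({len(d): d})   # models {len(d): d for d in dicts}
    merged.update({**d})        # models {**d for d in dicts}
assert pairs == {len(d): d for d in dicts}
assert merged == {"a": 2, "b": 3}
```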
Rejected Alternative Proposals
The primary goal when thinking through the specification above was consistency with existing norms around unpacking and comprehensions / generator expressions. One way to interpret this is that the goal was to write the specification so as to require the smallest possible change(s) to the existing grammar and code generation, letting the existing code inform the surrounding semantics.
Below we discuss some of the common concerns/alternative proposals that came up in discussions but that are not included in this proposal.
Starred Generators as Function Arguments
One common concern that has arisen multiple times (not only in the discussion
threads linked above but also in previous discussions around this same idea) is
a possible syntactical ambiguity when passing a starred generator as the sole
argument to a function, as in f(*x for x in y). In the original PEP 448, this ambiguity
was cited as a reason for not including a similar generalization as part of the
proposal.
This proposal suggests that f(*x for x in y) should be interpreted as
f((*x for x in y)) and should not attempt further unpacking of the
resulting generator, but several alternatives were suggested in our discussion
(and/or have been suggested in the past), including:
- interpreting f(*x for x in y) as f(*(x for x in y)),
- interpreting f(*x for x in y) as f(*(*x for x in y)), or
- continuing to raise a SyntaxError for f(*x for x in y) even if the other aspects of this proposal are accepted.
The reason to prefer this proposal over these alternatives is the preservation of existing conventions for punctuation around generator expressions. Currently, the general rule is that generator expressions must be wrapped in parentheses except when provided as the sole argument to a function, and this proposal suggests maintaining that rule even as we allow more kinds of generator expressions. This option maintains a full symmetry between comprehensions and generator expressions that use unpacking and those that don’t.
Currently, we have the following conventions:
f([x for x in y]) # pass in a single list
f({x for x in y}) # pass in a single set
f(x for x in y) # pass in a single generator (no additional parentheses required around genexp)
f(*[x for x in y]) # pass in elements from the list separately
f(*{x for x in y}) # pass in elements from the set separately
f(*(x for x in y)) # pass in elements from the generator separately (parentheses required)
This proposal opts to maintain those conventions even when the comprehensions make use of unpacking:
f([*x for x in y]) # pass in a single list
f({*x for x in y}) # pass in a single set
f(*x for x in y) # pass in a single generator (no additional parentheses required around genexp)
f(*[*x for x in y]) # pass in elements from the list separately
f(*{*x for x in y}) # pass in elements from the set separately
f(*(*x for x in y)) # pass in elements from the generator separately (parentheses required)
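The difference between these interpretations is observable in the number of arguments the function receives. The sketch below demonstrates the existing conventions directly and uses itertools.chain.from_iterable as a stand-in for the proposed starred generator, since the new syntax cannot be run yet; f and y are made up for illustration.

```python
from itertools import chain

def f(*args):
    # Report how many positional arguments were received.
    return len(args)

y = [[1, 2], [3, 4, 5]]

# Existing conventions:
assert f(x for x in y) == 1          # one generator object
assert f(*(x for x in y)) == 2       # the two inner lists, separately

# Stand-in for the proposed f(*x for x in y), i.e. f((*x for x in y)):
flattened = chain.from_iterable(y)
assert f(flattened) == 1             # still a single (flattening) iterator

# Stand-in for the proposed f(*(*x for x in y)):
assert f(*chain.from_iterable(y)) == 5
```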
Further Generalizing Unpacking Operators
Another suggestion that came out of the discussion involved further
generalizing the * beyond simply allowing it to be used to unpack the
expression in a comprehension. Two main flavors of this extension were
considered:
- making * and ** true unary operators that create a new kind of Unpackable object (or similar), which comprehensions could treat by unpacking it but which could also be used in other contexts; or
- continuing to allow * and ** only in the places they are allowed elsewhere in this proposal (expression lists, comprehensions, generator expressions, and argument lists), but also allow them to be used in subexpressions within a comprehension, allowing, for example, the following as a way to flatten a list that contains some iterables but some non-iterable objects:
    [*x if isinstance(x, Iterable) else x for x in [[1,2,3], 4]]
These variants were considered substantially more complex (both to understand
and to implement) and of only marginal utility, so neither is included in this
PEP. As such, these forms should continue to raise a SyntaxError, but with
a new error message as described above, though they should not be ruled out as
considerations for future proposals.
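For reference, the mixed-flattening use case does not require the rejected generalization. A sketch of one way to write it today wraps non-iterable items in a one-element list; under this proposal the same idea should be expressible with a top-level star applied to a parenthesized conditional, shown here only as a comment.

```python
from collections.abc import Iterable

data = [[1, 2, 3], 4]

# Valid today: wrap non-iterables in a one-element list inside a double loop.
flat = [y for x in data for y in (x if isinstance(x, Iterable) else [x])]
assert flat == [1, 2, 3, 4]

# With this PEP, the analogous top-level-star spelling would be:
#   [*(x if isinstance(x, Iterable) else [x]) for x in data]
print(flat)  # [1, 2, 3, 4]
```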
Concerns and Disadvantages
Although the general consensus from the discussion thread seemed to be that this syntax was clear and intuitive, several concerns and potential downsides were raised as well. This section aims to summarize those concerns.
- Overlap with existing alternatives: While the proposed syntax is arguably clearer and more concise, there are already several ways to accomplish this same thing in Python.
- Function call ambiguity: Expressions like f(*x for x in y) may initially appear ambiguous, as it’s not obvious whether the intent is to unpack the generator or to pass it as a single argument. Although this proposal retains existing conventions by treating that form as equivalent to f((*x for x in y)), that equivalence may not be immediately obvious.
- Potential for overuse or abuse: Complex uses of unpacking in comprehensions could obscure logic that would be clearer in an explicit loop. While this is already a concern with comprehensions more generally, the addition of * and ** may make particularly-complex uses even more difficult to read and understand at a glance. For example, while these situations are likely rare, comprehensions that use unpacking in multiple ways can make it difficult to know what’s being unpacked and when: f(*(*x for *x, _ in list_of_lists)).
- Unclear limitation of scope: This proposal restricts unpacking to the top level of the comprehension expression, but some users may expect that the unpacking operator is being further generalized as discussed in Further Generalizing Unpacking Operators.
- Effect on External Tools: As with any change to Python’s syntax, making this change would create work for maintainers of code formatters, linters, type checkers, etc., to make sure that the new syntax is supported.
Other Languages
Quite a few other languages support this kind of flattening with syntax similar to what is already available in Python, but support for using unpacking syntax within comprehensions is rare. This section provides a brief summary of support for similar syntax in a few other languages.
Many languages that support comprehensions support double loops:
# python
[x for xs in [[1,2,3], [], [4,5]] for x in xs * 2]
-- haskell
[x | xs <- [[1,2,3], [], [4,5]], x <- xs ++ xs]
# julia
[x for xs in [[1,2,3], [], [4,5]] for x in [xs; xs]]
; clojure
(for [xs [[1 2 3] [] [4 5]] x (concat xs xs)] x)
Several other languages (even those without comprehensions) support these operations via a built-in function/method to support flattening of nested structures:
# python
list(itertools.chain(*(xs*2 for xs in [[1,2,3], [], [4,5]])))
// JavaScript
[[1,2,3], [], [4,5]].flatMap(xs => [...xs, ...xs])
-- haskell
concat (map (\x -> x ++ x) [[1,2,3], [], [4,5]])
# ruby
[[1, 2, 3], [], [4, 5]].flat_map {|e| e * 2}
However, languages that support both comprehensions and unpacking do not tend to allow unpacking within a comprehension. For example, the following expression in Julia currently leads to a syntax error:
[xs... for xs in [[1,2,3], [], [4,5]]]
As one counterexample, support for a similar syntax was recently added to Civet. For example, the following is a valid comprehension in
Civet, making use of JavaScript’s ... syntax for unpacking:
for xs of [[1,2,3], [], [4,5]] then ...(xs++xs)
References
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0798.rst
Last modified: 2025-07-20 03:07:04 GMT