PEP: 692 Title: Using TypedDict for more precise **kwargs typing Author:
Franek Magiera <framagie@gmail.com> Sponsor: Jelle Zijlstra
<jelle.zijlstra@gmail.com> Discussions-To:
https://discuss.python.org/t/pep-692-using-typeddict-for-more-precise-kwargs-typing/17314
Status: Final Type: Standards Track Topic: Typing Created: 29-May-2022
Python-Version: 3.12 Post-History: 29-May-2022, 12-Jul-2022,
12-Jul-2022, Resolution:
https://discuss.python.org/t/pep-692-using-typeddict-for-more-precise-kwargs-typing/17314/81

typing:unpack-kwargs

Abstract

Currently **kwargs can be type hinted as long as all of the keyword
arguments specified by them are of the same type. However, that
behaviour can be very limiting. Therefore, in this PEP we propose a new
way to enable more precise **kwargs typing. The new approach revolves
around using TypedDict to type **kwargs that comprise keyword arguments
of different types.

Motivation

Currently annotating **kwargs with a type T means that the kwargs type
is in fact dict[str, T]. For example:

    def foo(**kwargs: str) -> None: ...

means that all keyword arguments in foo are strings (i.e., kwargs is of
type dict[str, str]). This behaviour limits the ability to type annotate
**kwargs only to the cases where all of them are of the same type.
However, it is often the case that keyword arguments conveyed by
**kwargs have different types that are dependent on the keyword's name.
In those cases type annotating **kwargs is not possible. This is
especially a problem for already existing codebases where the need of
refactoring the code in order to introduce proper type annotations may
be considered not worth the effort. This in turn prevents the project
from getting all of the benefits that type hinting can provide.

Moreover, **kwargs can be used to reduce the amount of code needed in
cases when there is a top-level function that is a part of a public API
and it calls a bunch of helper functions, all of which expect the same
keyword arguments. Unfortunately, if those helper functions were to use
**kwargs, there is no way to properly type hint them if the keyword
arguments they expect are of different types. In addition, even if the
keyword arguments are of the same type, there is no way to check whether
the function is being called with keyword names that it actually
expects.

As described in the Intended Usage section, using **kwargs is not always
the best tool for the job. Despite that, it is still a widely used
pattern. As a consequence, there has been a lot of discussion around
supporting more precise **kwargs typing and it became a feature that
would be valuable for a large part of the Python community. This is best
illustrated by the mypy GitHub issue 4441 which contains a lot of real
world cases that could benefit from this propsal.

One more use case worth mentioning for which **kwargs are also
convenient, is when a function should accommodate optional keyword-only
arguments that don't have default values. A need for a pattern like that
can arise when values that are usually used as defaults to indicate no
user input, such as None, can be passed in by a user and should result
in a valid, non-default behavior. For example, this issue came up in the
popular httpx library.

Rationale

PEP 589 introduced the TypedDict type constructor that supports
dictionary types consisting of string keys and values of potentially
different types. A function's keyword arguments represented by a formal
parameter that begins with double asterisk, such as **kwargs, are
received as a dictionary. Additionally, such functions are often called
using unpacked dictionaries to provide keyword arguments. This makes
TypedDict a perfect candidate to be used for more precise **kwargs
typing. In addition, with TypedDict keyword names can be taken into
account during static type analysis. However, specifying **kwargs type
with a TypedDict means, as mentioned earlier, that each keyword argument
specified by **kwargs is a TypedDict itself. For instance:

    class Movie(TypedDict):
        name: str
        year: int

    def foo(**kwargs: Movie) -> None: ...

means that each keyword argument in foo is itself a Movie dictionary
that has a name key with a string type value and a year key with an
integer type value. Therefore, in order to support specifying kwargs
type as a TypedDict without breaking current behaviour, a new construct
has to be introduced.

To support this use case, we propose reusing Unpack which was initially
introduced in PEP 646. There are several reasons for doing so:

-   Its name is quite suitable and intuitive for the **kwargs typing use
    case as our intention is to "unpack" the keywords arguments from the
    supplied TypedDict.
-   The current way of typing *args would be extended to **kwargs and
    those are supposed to behave similarly.
-   There would be no need to introduce any new special forms.
-   The use of Unpack for the purposes described in this PEP does not
    interfere with the use cases described in PEP 646.

Specification

With Unpack we introduce a new way of annotating **kwargs. Continuing
the previous example:

    def foo(**kwargs: Unpack[Movie]) -> None: ...

would mean that the **kwargs comprise two keyword arguments specified by
Movie (i.e. a name keyword of type str and a year keyword of type int).
This indicates that the function should be called as follows:

    kwargs: Movie = {"name": "Life of Brian", "year": 1979}

    foo(**kwargs)                               # OK!
    foo(name="The Meaning of Life", year=1983)  # OK!

When Unpack is used, type checkers treat kwargs inside the function body
as a TypedDict:

    def foo(**kwargs: Unpack[Movie]) -> None:
        assert_type(kwargs, Movie)  # OK!

Using the new annotation will not have any runtime effect - it should
only be taken into account by type checkers. Any mention of errors in
the following sections relates to type checker errors.

Function calls with standard dictionaries

Passing a dictionary of type dict[str, object] as a **kwargs argument to
a function that has **kwargs annotated with Unpack must generate a type
checker error. On the other hand, the behaviour for functions using
standard, untyped dictionaries can depend on the type checker. For
example:

    def foo(**kwargs: Unpack[Movie]) -> None: ...

    movie: dict[str, object] = {"name": "Life of Brian", "year": 1979}
    foo(**movie)  # WRONG! Movie is of type dict[str, object]

    typed_movie: Movie = {"name": "The Meaning of Life", "year": 1983}
    foo(**typed_movie)  # OK!

    another_movie = {"name": "Life of Brian", "year": 1979}
    foo(**another_movie)  # Depends on the type checker.

Keyword collisions

A TypedDict that is used to type **kwargs could potentially contain keys
that are already defined in the function's signature. If the duplicate
name is a standard parameter, an error should be reported by type
checkers. If the duplicate name is a positional-only parameter, no
errors should be generated. For example:

    def foo(name, **kwargs: Unpack[Movie]) -> None: ...     # WRONG! "name" will
                                                            # always bind to the
                                                            # first parameter.

    def foo(name, /, **kwargs: Unpack[Movie]) -> None: ...  # OK! "name" is a
                                                            # positional-only parameter,
                                                            # so **kwargs can contain
                                                            # a "name" keyword.

Required and non-required keys

By default all keys in a TypedDict are required. This behaviour can be
overridden by setting the dictionary's total parameter as False.
Moreover, PEP 655 introduced new type qualifiers - typing.Required and
typing.NotRequired - that enable specifying whether a particular key is
required or not:

    class Movie(TypedDict):
        title: str
        year: NotRequired[int]

When using a TypedDict to type **kwargs all of the required and
non-required keys should correspond to required and non-required
function keyword parameters. Therefore, if a required key is not
supported by the caller, then an error must be reported by type
checkers.

Assignment

Assignments of a function typed with **kwargs: Unpack[Movie] and another
callable type should pass type checking only if they are compatible.
This can happen for the scenarios described below.

Source and destination contain **kwargs

Both destination and source functions have a **kwargs: Unpack[TypedDict]
parameter and the destination function's TypedDict is assignable to the
source function's TypedDict and the rest of the parameters are
compatible:

    class Animal(TypedDict):
        name: str

    class Dog(Animal):
        breed: str

    def accept_animal(**kwargs: Unpack[Animal]): ...
    def accept_dog(**kwargs: Unpack[Dog]): ...

    accept_dog = accept_animal  # OK! Expression of type Dog can be
                                # assigned to a variable of type Animal.

    accept_animal = accept_dog  # WRONG! Expression of type Animal
                                # cannot be assigned to a variable of type Dog.

Source contains **kwargs and destination doesn't

The destination callable doesn't contain **kwargs, the source callable
contains **kwargs: Unpack[TypedDict] and the destination function's
keyword arguments are assignable to the corresponding keys in source
function's TypedDict. Moreover, not required keys should correspond to
optional function arguments, whereas required keys should correspond to
required function arguments. Again, the rest of the parameters have to
be compatible. Continuing the previous example:

    class Example(TypedDict):
        animal: Animal 
        string: str
        number: NotRequired[int]

    def src(**kwargs: Unpack[Example]): ...
    def dest(*, animal: Dog, string: str, number: int = ...): ...

    dest = src  # OK!

It is worth pointing out that the destination function's parameters that
are to be compatible with the keys and values from the TypedDict must be
keyword only:

    def dest(dog: Dog, string: str, number: int = ...): ...

    dog: Dog = {"name": "Daisy", "breed": "labrador"}

    dest(dog, "some string")  # OK!

    dest = src                # Type checker error!
    dest(dog, "some string")  # The same call fails at
                              # runtime now because 'src' expects
                              # keyword arguments.

The reverse situation where the destination callable contains
**kwargs: Unpack[TypedDict] and the source callable doesn't contain
**kwargs should be disallowed. This is because, we cannot be sure that
additional keyword arguments are not being passed in when an instance of
a subclass had been assigned to a variable with a base class type and
then unpacked in the destination callable invocation:

    def dest(**kwargs: Unpack[Animal]): ...
    def src(name: str): ...

    dog: Dog = {"name": "Daisy", "breed": "Labrador"}
    animal: Animal = dog

    dest = src      # WRONG!
    dest(**animal)  # Fails at runtime.

Similar situation can happen even without inheritance as compatibility
between TypedDicts is based on structural subtyping.

Source contains untyped **kwargs

The destination callable contains **kwargs: Unpack[TypedDict] and the
source callable contains untyped **kwargs:

    def src(**kwargs): ...
    def dest(**kwargs: Unpack[Movie]): ...

    dest = src  # OK!

Source contains traditionally typed **kwargs: T

The destination callable contains **kwargs: Unpack[TypedDict], the
source callable contains traditionally typed **kwargs: T and each of the
destination function TypedDict's fields is assignable to a variable of
type T:

    class Vehicle:
        ...

    class Car(Vehicle):
        ...

    class Motorcycle(Vehicle):
        ...

    class Vehicles(TypedDict):
        car: Car
        moto: Motorcycle

    def dest(**kwargs: Unpack[Vehicles]): ...
    def src(**kwargs: Vehicle): ...

    dest = src  # OK!

On the other hand, if the destination callable contains either untyped
or traditionally typed **kwargs: T and the source callable is typed
using **kwargs: Unpack[TypedDict] then an error should be generated,
because traditionally typed **kwargs aren't checked for keyword names.

To summarize, function parameters should behave contravariantly and
function return types should behave covariantly.

Passing kwargs inside a function to another function

A previous point mentions the problem of possibly passing additional
keyword arguments by assigning a subclass instance to a variable that
has a base class type. Let's consider the following example:

    class Animal(TypedDict):
        name: str

    class Dog(Animal):
        breed: str

    def takes_name(name: str): ...

    dog: Dog = {"name": "Daisy", "breed": "Labrador"}
    animal: Animal = dog

    def foo(**kwargs: Unpack[Animal]):
        print(kwargs["name"].capitalize())

    def bar(**kwargs: Unpack[Animal]):
        takes_name(**kwargs)

    def baz(animal: Animal):
        takes_name(**animal)

    def spam(**kwargs: Unpack[Animal]):
        baz(kwargs)

    foo(**animal)   # OK! foo only expects and uses keywords of 'Animal'.

    bar(**animal)   # WRONG! This will fail at runtime because 'breed' keyword
                    # will be passed to 'takes_name' as well.

    spam(**animal)  # WRONG! Again, 'breed' keyword will be eventually passed
                    # to 'takes_name'.

In the example above, the call to foo will not cause any issues at
runtime. Even though foo expects kwargs of type Animal it doesn't matter
if it receives additional arguments because it only reads and uses what
it needs completely ignoring any additional values.

The calls to bar and spam will fail because an unexpected keyword
argument will be passed to the takes_name function.

Therefore, kwargs hinted with an unpacked TypedDict can only be passed
to another function if the function to which unpacked kwargs are being
passed to has **kwargs in its signature as well, because then additional
keywords would not cause errors at runtime during function invocation.
Otherwise, the type checker should generate an error.

In cases similar to the bar function above the problem could be worked
around by explicitly dereferencing desired fields and using them as
arguments to perform the function call:

    def bar(**kwargs: Unpack[Animal]):
        name = kwargs["name"]
        takes_name(name)

Using Unpack with types other than TypedDict

As described in the Rationale section, TypedDict is the most natural
candidate for typing **kwargs. Therefore, in the context of typing
**kwargs, using Unpack with types other than TypedDict should not be
allowed and type checkers should generate errors in such cases.

Changes to Unpack

Currently using Unpack in the context of typing is interchangeable with
using the asterisk syntax:

    >>> Unpack[Movie]
    *<class '__main__.Movie'>

Therefore, in order to be compatible with the new use case, Unpack's
repr should be changed to simply Unpack[T].

Intended Usage

The intended use cases for this proposal are described in the Motivation
section. In summary, more precise **kwargs typing can bring benefits to
already existing codebases that decided to use **kwargs initially, but
now are mature enough to use a stricter contract via type hints. Using
**kwargs can also help in reducing code duplication and the amount of
copy-pasting needed when there is a bunch of functions that require the
same set of keyword arguments. Finally, **kwargs are useful for cases
when a function needs to facilitate optional keyword arguments that
don't have obvious default values.

However, it has to be pointed out that in some cases there are better
tools for the job than using TypedDict to type **kwargs as proposed in
this PEP. For example, when writing new code if all the keyword
arguments are required or have default values then writing everything
explicitly is better than using **kwargs and a TypedDict:

    def foo(name: str, year: int): ...     # Preferred way.
    def foo(**kwargs: Unpack[Movie]): ...

Similarly, when type hinting third party libraries via stubs it is again
better to state the function signature explicitly - this is the only way
to type such a function if it has default arguments. Another issue that
may arise in this case when trying to type hint the function with a
TypedDict is that some standard function parameters may be treated as
keyword only:

    def foo(name, year): ...              # Function in a third party library.

    def foo(Unpack[Movie]): ...           # Function signature in a stub file.

    foo("Life of Brian", 1979)            # This would be now failing type
                                          # checking but is fine.

    foo(name="Life of Brian", year=1979)  # This would be the only way to call
                                          # the function now that passes type
                                          # checking.

Therefore, in this case it is again preferred to type hint such function
explicitly as:

    def foo(name: str, year: int): ...

Also, for the benefit of IDEs and documentation pages, functions that
are part of the public API should prefer explicit keyword parameters
whenever possible.

How to Teach This

This PEP could be linked in the typing module's documentation. Moreover,
a new section on using Unpack could be added to the aforementioned docs.
Similar sections could be also added to the mypy documentation and the
typing RTD documentation.

Reference Implementation

The mypy type checker already supports more precise **kwargs typing
using Unpack.

Pyright type checker also provides provisional support for this feature.

Rejected Ideas

TypedDict unions

It is possible to create unions of typed dictionaries. However,
supporting typing **kwargs with a union of typed dicts would greatly
increase the complexity of the implementation of this PEP and there
seems to be no compelling use case to justify the support for this.
Therefore, using unions of typed dictionaries to type **kwargs as
described in the context of this PEP can result in an error:

    class Book(TypedDict):
        genre: str
        pages: int

    TypedDictUnion = Movie | Book

    def foo(**kwargs: Unpack[TypedDictUnion]) -> None: ...  # WRONG! Unsupported use
                                                            # of a union of
                                                            # TypedDicts to type
                                                            # **kwargs

Instead, a function that expects a union of TypedDicts can be
overloaded:

    @overload
    def foo(**kwargs: Unpack[Movie]): ...

    @overload
    def foo(**kwargs: Unpack[Book]): ...

Changing the meaning of **kwargs annotations

One way to achieve the purpose of this PEP would be to change the
meaning of **kwargs annotations, so that the annotations would apply to
the entire **kwargs dict, not to individual elements. For consistency,
we would have to make an analogous change to *args annotations.

This idea was discussed in a meeting of the typing community, and the
consensus was that the change would not be worth the cost. There is no
clear migration path, the current meaning of *args and **kwargs
annotations is well-established in the ecosystem, and type checkers
would have to introduce new errors for code that is currently legal.

Introducing a new syntax

In the previous versions of this PEP, using a double asterisk syntax was
proposed to support more precise **kwargs typing. Using this syntax,
functions could be annotated as follows:

    def foo(**kwargs: **Movie): ...

Which would have the same meaning as:

    def foo(**kwargs: Unpack[Movie]): ...

This greatly increased the scope of the PEP, as it would require a
grammar change and adding a new dunder for the Unpack special form. At
the same the justification for introducing a new syntax was not strong
enough and became a blocker for the whole PEP. Therefore, we decided to
abandon the idea of introducing a new syntax as a part of this PEP and
may propose it again in a separate one.

References

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.