PEP 794 – Import Name Metadata
- Author:
- Brett Cannon <brett at python.org>
- Discussions-To:
- Discourse thread
- Status:
- Draft
- Type:
- Standards Track
- Topic:
- Packaging
- Created:
- 05-Jun-2025
- Post-History:
- 02-May-2025 05-Jun-2025
Abstract
This PEP proposes extending the core metadata specification for Python
packaging to include a new, repeatable field named Import-Name
to record
the import names that a project owns/provides once installed. A new key named
import-names
will be added to the [project]
table in
pyproject.toml
for providing the values for the new core metadata field.
This also leads to the introduction of core metadata version 2.5.
Motivation
In Python packaging there is no requirement that a project name match the name(s) that you can import for that project. As such, there is no clean, easy, accurate way to go from import name to project name and vice-versa. This can make it difficult for tools that try to help people in discovering the right project to install when they know the import name or knowing what import names a project will provide once installed.
As an example, a code editor may detect a user has an unsatisfied import in a
selected virtual environment. But with no way to reliably know what import
names various projects provide, the code editor cannot accurately
provide a user with a list of potential projects to install to satisfy that
import requirement (e.g. it is not obvious that import PIL
very likely
implies the user wants the Pillow project installed). This also applies to when a
user vaguely remembers the project name but does not remember the import
name(s) and would have their memory jogged when seeing a list of import names
a package provides. Finally, tools would be able to notify users what import
names will become available once they install a project.
It may also help with spam detection. If a project specifies the same import names as a very popular project it can act as a signal to take a closer look at the validity of the less popular project. A project found to be lying about what import names it provides would be another signal.
Rationale
This PEP proposes extending the packaging Core metadata specifications so that project owners can specify the highest-level import names that a project provides and owns if installed on some platform.
Putting this metadata in the core metadata means the data is (potentially) served independently of any sdist or wheel by an index server. That negates needing to come up with another way to expose the metadata to tools to avoid having to download an entire e.g. wheel.
Having this metadata be the same across all release artifacts would allow for
projects to only have to check a single file’s core metadata to get all
possible import names instead of checking all the released files. This also
means one does not need to worry if a file is missing when reading the core
metadata or one can work solely from an sdist if the metadata is provided. As
well, it simplifies having a project.import-names
key in
pyproject.toml
by having it be consistent for the entire project version
and not unique per released file for the same version.
This PEP is not overly strict on what to (not) list in the proposed metadata on purpose. Having build back-ends verify that a project is accurately following a specification that is somehow strict about what can be listed would be near impossible to get right due to how flexible Python’s import system is. As such, this PEP only requires that valid import names be used and that projects don’t lie (and it is acknowledged the latter requirements cannot be validated programmatically).
Various other attempts have been made to solve this, but they all have to make various trade-offs. For instance, one could download every wheel for every project release and look at what files are provided via the Binary distribution format, but that’s a lot of CPU and bandwidth for something that is static information (although tricks can be used to lessen the data requests such as using HTTP range requests to only read the table of contents of the zip file). This sort of calculation is also currently repeated by everyone independently instead of having the metadata hosted by a central index server like PyPI. It also doesn’t work for sdists as the structure of the wheel isn’t known yet, and so inferring the structure of the code installed isn’t known yet. As well, these solutions are not necessarily accurate as it is based on inference instead of being explicitly provided by the project owners. All of these accuracy issues affect even having an index hosting the information to avoid the compute costs of gathering the information.
Specification
Because this PEP introduces a new field to the core metadata, it bumps the latest core metadata version to 2.5.
The Import-Name
field is a “multiple uses” field. Each entry of
Import-Name
MUST be a valid import name. The names specified in
Import-Name
MUST be importable when the project is installed on some
platform for the same version of the project (e.g. the metadata MUST be
consistent across all sdists and wheels for a project release). This does
imply that the information isn’t specific to the distribution artifact it is
found in, but for the release version the distribution artifact belongs to.
Projects are not required to list every single import name that is provided.
Instead, projects SHOULD list the highest-level/shortest import name that the
project would “own” when installed (this includes “private” names). For
example, if you install a project that has a single package named myproj
which itself has multiple submodules, the expectation is only myproj
would be listed in Import-Name
and not every submodule. If a project is
part of a namespace package named ns
and it provides a subpackage called
ns.myproj
(i.e. ns.myproj.__init__
exists), then ns.myproj
should
be listed in Import-Name
, but NOT ns
alone as that is not “owned” by
the project upon installation (i.e. other projects can be installed which
also contribute to ns
).
If a project chooses not to provide any Import-Name
entries, tools MAY
assume the import name matches the project name (including de-normalization of
the project name, e.g. my-proj
as my_proj
).
The pyproject.toml specification will gain an import-names
key. It
will be an array of strings that stores what will be written out to
Import-Name
. Build back-ends MAY support dynamically calculating the
value on the user’s behalf if desired, if the user declares the key in
project.dynamic
.
Examples
In scikit-learn 1.7.0 ,
an entry for the sklearn
package would be used.
In pytest 8.3.5 there would be 3 expected entries:
_pytest
py
pytest
In azure-mgmt-search 9.1.0,
there should be a single entry for azure.mgmt.search
.
Backwards Compatibility
As this is a new field for the core metadata and a new core metadata version, there should be no backwards compatibility concerns.
Security Implications
Tools should treat the metadata as potentially inaccurate. As such, any decisions made based on the provided metadata should be assumed to be malicious in some way.
How to Teach This
Project owners should be taught that they can now record what namespaces their project provides. They should be told that if their project has a non-obvious namespace from the file structure of the project that they should specify the appropriate information. They should have it explained to them that they should use the shortest name possible that appropriately explains what the project provides (i.e. what the specification requires to be recorded).
Users of projects don’t necessarily need to know about this new metadata.
While they may be exposed to it via tooling, the details of where that data
came from isn’t critical. It’s possible they may come across it if an index
server exposed it (e.g., listed the values from Import-Name
and marked
whether the file structure backed up the claims the metadata makes), but that
still wouldn’t require users to know the technical details of this PEP.
Reference Implementation
https://github.com/brettcannon/packaging/tree/pep-794 is a branch to update ‘packaging’ to support this PEP.
Rejected Ideas
Re-purpose the Provides
field
Introduced in metadata version 1.1 and deprecated in 1.2, the Provides
field was meant to provide similar information, except for all names
provided by a project instead of the distinguishing namespaces as this PEP
proposes. Based on that difference and the fact that Provides
is
deprecated and thus could be ignored by preexisting code, the decision was
made to go with a new field.
Name the field Namespace
While the term “namespace” name is technically accurate from an import perspective, it could be confused with implicit namespace packages.
Serving the RECORD
file
During discussions about a pre-PEP version of this
PEP, it was suggested that the RECORD
file from wheels be served from
index servers instead of this new metadata. That would have the benefit of
being implementable immediately. But in order to provide the equivalent
information there would be necessary inference based on the file structure of
what would be installed by the wheel. That could lead to inaccurate
information. It also doesn’t support sdists.
In the end a poll was held and the approach this PEP takes won out.
Be more prescriptive in what projects specify
An earlier version of this PEP was much more strict in what could be put into
Import-Name
. This included turning some “SHOULD” guidelines into “MUST”
requirements and being specific about how to calculate what a project “owned”.
In the end it was decided that was too restrictive and risked being implemented
incorrectly or the spec being unexpectedy too strict.
Since the metadata was never expected to be exhaustive as it can’t be verified to be, the looser spec that is currently in this PEP was chosen instead.
Open Issues
N/A
Acknowledgments
Thanks to HeeJae Chang for ~~complaining about~~ bringing up regularly the usefulness that this metadata would provide. Thanks to Josh Cannon (no relation) for reviewing drafts of this PEP and providing feedback. Also, thanks to everyone who participated in a previous discussion on this topic.
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0794.rst
Last modified: 2025-06-11 22:35:17 GMT