PEP 643 – Metadata for Package Source Distributions
- Paul Moore <p.f.moore at gmail.com>
- Paul Ganssle <paul at ganssle.io>
- Discourse thread
- Standards Track
- 24-Oct-2020, 01-Nov-2020, 02-Nov-2020, 14-Nov-2020
- Discourse post
Python package metadata is stored in the distribution file in a standard format, defined in the Core Metadata Specification. However, for source distributions, while the format of the data is defined, there has traditionally been a lot of inconsistency in what data is recorded in the source distribution. See here for a discussion of this issue.
As a result, metadata consumers are unable to rely on the data available from source distributions, and need to use the (costly) PEP 517 build mechanisms to extract metadata.
This PEP defines a standard that allows build backends to reliably store package metadata in the source distribution, while still retaining the necessary flexibility to handle metadata fields that have to be calculated at build time.
There are a number of issues with the way that metadata is currently stored in source distributions:
- The details of how to store metadata, while standardised, are not easy to find.
- The specification requires an old metadata version, and has not been updated in line with changes to the core metadata spec.
- There is no way in the spec to distinguish between “this field has been omitted because its value will not be known until build time” and “this field does not have a value”.
- The core metadata specification allows most fields to be optional, meaning that the previous issue affects nearly every metadata field.
This PEP proposes an update to the metadata specification to allow recording of fields which are expected to be “filled in later”, and updates the source distribution specification to clarify that backends should record sdist metadata using that version of the spec (or later).
This PEP allows projects to define source distribution metadata values as being “dynamic”. In this context, saying that a field is “dynamic” means that the value has not been fixed at the time that the source distribution was generated. Dynamic values will be supplied by the build backend at the time when the wheel is generated, and could depend on details of the build environment.
PEP 621 has a similar concept, of “dynamic” values that will be “filled in later”, and so we choose to use the same term here by analogy.
This PEP defines the relationship between metadata values specified in a source distribution, and the corresponding values in wheels built from it. It requires build backends to clearly mark any fields which will not simply be copied unchanged from the sdist to the wheel.
In addition, this PEP makes the PyPA Specifications document the canonical location for the specification of the source distribution format (collecting the information in PEP 517 and in this PEP).
A new field, Dynamic, will be added to the Core Metadata Specification. This field will be multiple use, and will be allowed to contain the name of another core metadata field.
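As an illustration, the following is a minimal sketch of how a consumer might read the Dynamic field from sdist metadata. The sample PKG-INFO content and the helper name dynamic_fields are hypothetical. The sketch assumes that the metadata is parsed with the standard library email package and that field names are compared case-insensitively.

```python
from email.parser import HeaderParser

# Hypothetical PKG-INFO content for an sdist using metadata version 2.2.
PKG_INFO = """\
Metadata-Version: 2.2
Name: example-project
Version: 1.0
Summary: An illustrative project
Dynamic: Requires-Dist
Dynamic: License
"""


def dynamic_fields(metadata_text):
    """Return the lower-cased names of the fields marked as Dynamic."""
    msg = HeaderParser().parsestr(metadata_text)
    # Dynamic is a multiple-use field, so collect every occurrence.
    # Lower-casing the names is an assumption made by this sketch.
    return {name.strip().lower() for name in msg.get_all("Dynamic", [])}


print(sorted(dynamic_fields(PKG_INFO)))
```

Metadata that has no Dynamic entries yields an empty set, which in an sdist means that every field is static.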
When found in the metadata of a source distribution, the following rules apply:
- If a field is not marked as Dynamic, then the value of the field in any wheel built from the sdist MUST match the value in the sdist. If the field is not in the sdist, and not marked as Dynamic, then it MUST NOT be present in the wheel.
- If a field is marked as Dynamic, it may contain any valid value in a wheel built from the sdist (including not being present at all).
- Backends MUST NOT mark a field as Dynamic if they can determine that it was generated from data that will not change at build time.
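The first two rules above can be sketched as a consistency check between sdist and wheel metadata. This is only an illustrative sketch, not part of the specification: the function name check_wheel_against_sdist and the sample metadata are hypothetical. Skipping Metadata-Version and Dynamic themselves, and comparing multiple-use fields without regard to order, are assumptions made here.

```python
from email.parser import HeaderParser


def check_wheel_against_sdist(sdist_text, wheel_text):
    """Return the lower-cased names of static fields that differ."""
    sdist = HeaderParser().parsestr(sdist_text)
    wheel = HeaderParser().parsestr(wheel_text)
    dynamic = {n.strip().lower() for n in sdist.get_all("Dynamic", [])}
    # Assumption: Metadata-Version and Dynamic describe the file itself,
    # so this sketch does not compare them.
    skip = {"metadata-version", "dynamic"} | dynamic
    names = {k.lower() for k in sdist.keys()} | {k.lower() for k in wheel.keys()}
    problems = []
    for name in sorted(names - skip):
        # A static field must match exactly; a field absent from the sdist
        # (an empty list here) must also be absent from the wheel.
        if sorted(sdist.get_all(name, [])) != sorted(wheel.get_all(name, [])):
            problems.append(name)
    return problems


# Hypothetical metadata used only to exercise the check.
SDIST = (
    "Metadata-Version: 2.2\nName: example-project\nVersion: 1.0\n"
    "Summary: Demo\nDynamic: Requires-Dist\n"
)
WHEEL_OK = (
    "Metadata-Version: 2.2\nName: example-project\nVersion: 1.0\n"
    "Summary: Demo\nRequires-Dist: example-dependency\n"
)
WHEEL_BAD = (
    "Metadata-Version: 2.2\nName: example-project\nVersion: 1.0\n"
    "Summary: Changed\nLicense: MIT\n"
)

print(check_wheel_against_sdist(SDIST, WHEEL_OK))
print(check_wheel_against_sdist(SDIST, WHEEL_BAD))
```

In the second call, Summary differs from the sdist and License appears only in the wheel. Neither field is marked as Dynamic, so both are reported.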
Backends MAY record the value they calculated for a field they mark as Dynamic in a source distribution. Consumers, however, MUST NOT treat this value as canonical, but MAY use it as a hint about what the final value in a wheel could be.
In any context other than a source distribution, if a field is marked as Dynamic, that indicates that the value was generated at wheel build time and may not match the value in the sdist (or in other builds of the project). Backends are not required to record this information, though, and consumers MUST NOT assume that the lack of a Dynamic marking has any significance, except in a source distribution.
Version MUST NOT be marked as Dynamic.
As it adds a new metadata field, this PEP updates the core metadata format to version 2.2.
Source distributions SHOULD use the latest version of the core metadata specification that was available when they were created.
Build backends are strongly encouraged to only mark fields as Dynamic when absolutely necessary, and to encourage projects to avoid backend features that require the use of Dynamic. Projects should prefer to use environment markers on static values to adapt to details of the install location.
As this proposal increments the core metadata version, it is compatible with existing source distributions, which will use an older metadata version. Tools can determine whether a source distribution conforms to this PEP by checking the metadata version.
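A tool might perform this version check along the following lines. This is a minimal sketch: the helper name supports_dynamic is hypothetical, and it assumes that the Metadata-Version value has the form MAJOR.MINOR, treating anything else as not conforming.

```python
from email.parser import HeaderParser


def supports_dynamic(metadata_text):
    """Return True if the metadata declares version 2.2 or later."""
    version = HeaderParser().parsestr(metadata_text).get("Metadata-Version")
    if version is None:
        return False
    try:
        # Assumption: the version is written as MAJOR.MINOR.
        major, minor = (int(part) for part in version.strip().split("."))
    except ValueError:
        return False
    # Compare numerically, so that for example 2.10 is later than 2.2.
    return (major, minor) >= (2, 2)


print(supports_dynamic("Metadata-Version: 2.2\nName: example-project\nVersion: 1.0\n"))
print(supports_dynamic("Metadata-Version: 2.1\nName: example-project\nVersion: 1.0\n"))
```

For sdists with an older metadata version, a consumer cannot rely on the absence of a Dynamic marking and would still need to fall back to the build backend.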
As this specification is purely for the storage of data that is intended to be publicly available, there are no security implications.
How to Teach This
This is a data storage format for project metadata, and so will not typically be visible to end users. There is therefore no need to teach users how to use the format. Developers wanting to reference the metadata will be able to find the details in the PyPA Specifications.
- Rather than marking fields as Dynamic, fields should be assumed to be dynamic unless explicitly marked as static.
This is logically equivalent to the current proposal, but it implies that fields being dynamic is the norm. Packaging tools can be much more efficient in the presence of metadata that is known to be static, so the PEP chooses to make dynamic fields the exception, and require backends to “opt in” to making a field dynamic.
In addition, if dynamic is the default, then in future, as more and more metadata becomes static, metadata files will include an increasing number of static markings.
- Rather than having a Dynamic field, add a special value that indicates that a field is “not yet defined”.
Again, this is logically equivalent to the current proposal. It makes “being dynamic” an explicit choice, but requires a special value. As some fields can contain arbitrary text, choosing such a value is somewhat awkward (although likely not a problem in practice). There does not seem to be enough benefit to this approach to make it worth using instead of the proposed mechanism.
- Special handling of Requires-Python.
Early drafts of the PEP needed special discussion of Requires-Python, because the lack of environment markers for this field meant that it might be difficult to require it to be static. The final form of the PEP no longer needs this, as the idea of a whitelist of fields allowed to be dynamic was dropped.
- Restrict the use of Dynamic to a minimal “white list” of permitted fields.
This approach was likely to prove extremely difficult for setuptools to implement in a backward compatible way, due to the dynamic nature of the setuptools interface. Instead, the proposal now allows most fields to be dynamic, but encourages backends to avoid dynamic values unless essential.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Last modified: 2021-12-13 19:17:45 GMT