PEP 777 – How to Re-invent the Wheel
- Author:
- Ethan Smith <ethan at ethanhs.me>
- Sponsor:
- Barry Warsaw <barry at python.org>
- PEP-Delegate:
- Paul Moore <p.f.moore at gmail.com>
- Discussions-To:
- Discourse thread
- Status:
- Draft
- Type:
- Standards Track
- Topic:
- Packaging
- Created:
- 09-Oct-2024
- Post-History:
- 10-Oct-2024
Table of Contents
- Abstract
- Rationale
- Specification
- Backwards Compatibility
- Rejected Ideas
- The Wheel Format is Perfect and Does not Need to be Changed
- Wheel Format Changes Should be Tied to CPython Releases
- Keep Using
.whl
as the File Extension - Store the Wheel Major Version in the File Extension (
.whl2
) - Wheel 2.0 Should Change the Outer Container Format
- Why not Specify Wheel 2.0 In This PEP?
- Discussion Topics
- Acknowledgements
- Copyright
Abstract
The current wheel 1.0 specification was written over a decade ago, and has been extremely robust to changes in the Python packaging ecosystem. Previous efforts to improve the wheel specification were deferred to focus on other packaging specifications. Meanwhile, the use of wheels has changed dramatically in the last decade. There have been many requests for new wheel features over the years; however, a fundamental obstacle to evolving the wheel specification has been that there is no defined process for how to handle adding backwards-incompatible features to wheels. Therefore, to enable other PEPs to describe new enhancements to the wheel specification, this PEP prescribes compatibility requirements on future wheel revisions. This PEP does not specify a new wheel revision. The specification of a new wheel format (“Wheel 2.0”) is left to a future PEP.
Rationale
Currently, wheel specification changes that require new installer behavior are backwards incompatible and require a major version increase in the wheel metadata format. An increase of the wheel major version has yet to happen, partially because such a change has the potential to be catastrophically disruptive. Per the wheel specification, any installer that does not support the new major version must abort at install time. This means that if the major version were to be incremented without further planning, many users would see installation failures as older installers reject new wheels uploaded to public package indices like the Python Package Index (PyPI). It is critically important to carefully plan the interactions between build tools, package indices, and package installers to avoid incompatibility issues, especially considering the long tail of users who are slow to update their installers.
The backward compatibility concerns have prevented valuable improvements to the wheel file format, such as better compression, wheel data format improvements, better information about what is included in a wheel, and JSON formatted metadata in the “.dist-info” folder.
This PEP describes constraints and behavior for new wheel revisions to preserve stability for existing tools that do not support a new major version of the wheel format. This ensures that backwards incompatible changes to the wheel specification will only affect users and tools that are properly set up to use the newer wheels. With a clear path for evolving the wheel specification, future PEPs will be able to improve the wheel format without needing to re-define a completely new compatibility story.
Specification
Add Wheel-Version Metadata Field to Core Metadata
Currently, the wheel 1.0 PEP, PEP 427, specifies that wheel files
must contain a WHEEL
metadata file that contains the version of the wheel
specification that the file conforms to. PEP 427 stipulates that installers
MUST warn on installation of a wheel with a minor version greater than supported,
and MUST abort on installation of wheels with a major version that is greater than
what the installer supports. This ensures that users do not get invalid
installations from wheels that installers cannot properly install.
However, resolvers do not currently exclude wheels with an incompatible wheel
version. There is also currently no way for a resolver to check a wheel’s
version without downloading the wheel directly. To make wheel version filtering
easy for resolvers, the wheel version MUST be included in the relevant
metadata file (currently METADATA
). This will allow resolvers to
efficiently check the wheel version using the PEP 658 metadata API without
needing to download and inspect the .dist-info/WHEEL
file.
To accomplish this, a new field, Wheel-Version
, will be added to the
Core Metadata Specification.
This field is single use, and must contain the exact same version specified as
the Wheel-Version
entry in the WHEEL
file, or any future replacement
file defining metadata about the wheel file. If Wheel-Version
is absent
from the metadata file, then tools MUST infer the wheel file major
version as 1.
Wheel-Version
MUST NOT be included in source distribution metadata
(PKG-INFO
) files. If a tool encounters Wheel-Version
inside of a source
distribution metadata file, it SHOULD raise an error.
Wheel-Version
MAY be included in the metadata file for wheels of
version 1, but for wheels of version 2 or higher, the metadata file MUST
include Wheel-Version
. This enforces that future revisions of the wheel
specification can rely on resolvers skipping incompatible wheels by checking
the Wheel-Version
field. Build backends are encouraged to include
Wheel-Version
in all wheels that they generate, regardless of version.
Installers SHOULD copy the metadata file in a wheel unmodified during
installation. This prevents the need to update the RECORD
file, which is
an error prone process. Tools reading installed core metadata SHOULD NOT
assume that the field is present, as other installation formats may omit it.
When installing a wheel, installers MUST take the following steps:
- Check that the values of
Wheel-Version
in both the core metadata file and wheel metadata file match. If they do not match, installers MUST abort installation. Neither value takes precedence. - Check that the installer is compatible with
Wheel-Version
. IfWheel-Version
is absent, assume the version is 1.0. Warn if minor version is greater, abort if major version is greater. This procedure is identital to that in PEP 427. - Proceed with installation as specified in the Binary Distribution Format specification.
Resolver Behavior Regarding Wheel-Version
Resolvers, in the process of selecting a wheel to install, MUST check a
candidate wheel’s Wheel-Version
, and ignore incompatible wheel files.
Without ignoring these files, older installers might select a wheel that uses
an unsupported wheel version for that installer, and force the installer to
abort per PEP 427. By skipping incompatible wheel files, users will not see
installation errors when a project adopts a new wheel major version. As already
specified in PEP 427, installers MUST abort if a user tries to directly
install a wheel that is incompatible. If, in the process of resolving packages
found in multiple indices, a resolver comes across two wheels of the same
distribution and version, resolvers should prioritize the wheel of the highest
compatible version.
While the above protects users from unexpected breakages, users may miss a new release of a distribution if their installer does not support the wheel version used in the release. Imagine in the future that a package publishes 3.0 wheel files. Downstream users won’t see that there is a new release available if their installers only support 2.x wheels. Therefore, installers SHOULD emit a warning if, in the process of resolving packages, they come across an incompatible wheel and skip it.
First Major Version Bump Must Change File Extension
Unfortunately, existing resolvers do not check the compatibility of wheels
before selecting them as installation candidates. Until a majority of users
update to installers that properly check for wheel compatibility, it is unsafe
to allow publishing wheels of a new major version that existing resolvers might
select. It could take upwards of four years before the majority of users are on
updated resolvers, based on current data about PyPI installer usage (See the
Appendix: Analysis of Installer Usage on PyPI, for
details). To allow for experimentation and faster adoption of 2.0 wheels,
this PEP proposes a change to the file extension of the
wheel file format, from .whl
to .whlx
for all future wheel versions.
Note that x
in whlx
is the letter “x” and does not specify the wheel
major version. The change to extension name resolves the initial transition
issue of 2.0 wheels breaking users on existing installers that do not implement
Wheel-Version
checks. By using a different file extension, 2.0 wheels can
immediately be uploaded to PyPI, and users will be able to experiment with the
new features right away. Users on older installers will simply ignore these new
files.
One rejected alternative would be to keep the .whl
extension, but delay the
publishing of wheel 2.0 to PyPI. For more on that, please see Rejected Ideas.
Recommended Build Backend Behavior with New Wheel Formats
Build backends are recommended to generate the most compatible wheel based on features a project uses. For example, if a wheel does not use symbolic links, and such a feature was introduced in wheel 5.0, the build backend could generate a wheel of version 4.0. On the other hand, some features will want to be adopted by default. For example, if wheel 3.0 introduces better compression, the build backend may wish to enable this feature by default to improve the wheel size and download performance.
Limitations on Future Wheel Revisions
While it is difficult to know what future features may be planned for the wheel format, it is important that certain compatibility promises are maintained.
Wheel files, when installed, MUST stay compatible with the Python standard
library’s importlib.metadata
for all supported CPython versions. For
example, replacing .dist-info/METADATA
with a JSON formatted metadata file
MUST be a multi-major version migration with one version introducing the new
JSON file alongside the existing email header format, and another future
version removing the email header format metadata file. The version to remove
.dist-info/METADATA
also MUST be adopted only after the last CPython
release that lacked support for the new file reaches end of life. This ensures
that code using importlib.metadata
will not break with wheel major version
revisions.
Wheel files MUST remain ZIP format files as the outer container format.
Additionally, the .dist-info
metadata directory MUST be placed at the
root of the archive without any compression, so that unpacking the wheel file
produces a normal .dist-info
directory holding any metadata for the wheel.
Future wheel revisions MAY modify the layout, compression, and other
attributes about non-metadata components of a wheel such as data and code. This
assures that future wheel revisions remain compatible with tools operating on
package metadata, while allowing for improvements to code storage in the wheel,
such as adopting compression.
Package tooling MUST NOT assume that the contents and format of the wheel
file will remain the same for future wheel major versions beyond the
limitations above about metadata folder contents and outer container format.
For example, newer wheel major versions may add or remove filename components,
such as the build tag or the platform tag. Therefore it is incumbent upon
tooling to check the metadata for the Wheel-Version
before attempting to
install a wheel.
Finally, future wheel revisions MUST NOT use any compression formats not in the CPython standard library of at least the latest release. Wheels generated using any new compression format should be tagged as requiring at least the first released version of CPython to support the new compression format, regardless of the Python API compatibility of the code within the wheel.
Backwards Compatibility
Backwards compatibility is an incredibly important issue for evolving the wheel format. If adopting a new wheel revision is painful for downstream users, package creators will hesitate to adopt the new standards, and users will be stuck with failed CI pipelines and other installation woes.
Several choices in the above specification are made so that the adoption of a new feature is less painful. For example, today wheels of an incompatible major version are still selected by pip as installation candidates, which causes installer failures if a project starts publishing 2.0 wheels. To avoid this issue, this PEP requires resolvers to filter out wheels with major versions or features incompatible with the installer.
This PEP also defines constraints on future wheel revisions, with the goal of maintaining compatibility with CPython, but allowing evolution of wheel contents. Wheel revisions shouldn’t cause package installations to break on older CPython revisions, as not only would it be frustrating, it would be incredibly hard to debug for users.
This PEP relies on resolvers being able to efficiently acquire package metadata, usually through PEP 658. This might present a problem for users of package indices that do not serve PEP 658 metadata. However, today most installers fall back on using HTTP range requests to efficiently acquire only the part of a wheel needed to read the metadata, a feature most storage providers and servers include. Furthermore, future improvements to wheels such as compression will make up performance losses due to inspecting files in the wheel.
The main compatibility limitation of this PEP is for projects that start publishing solely new wheels alongside a source distribution. If a user on an older installer tries to install the package, it will fall back to the source distribution, because the resolver will skip all newer wheels. Users are often poorly set up to build projects from source, so this could lead to some failed builds users would not see otherwise. There are several approaches to resolving this issue, such as allowing dual-publishing for the initial migration, or marking source distributions as not intended to be built.
Rejected Ideas
The Wheel Format is Perfect and Does not Need to be Changed
The wheel format has been around for over 10 years, and in that time, Python packages have changed a lot. It is much more common for packages to include Rust or C extension modules, increasing the size of packages. Better compression, such as lzma or zstd, could save a lot of time and bandwidth for PyPI and its users. Compatibility tags cannot express the wide variety of hardware used to accelerate Python code today, nor encode shared library compatibility information. In order to address these issues, evolution of the wheel package format is necessary.
Wheel Format Changes Should be Tied to CPython Releases
I do not believe that tying wheel revisions to CPython releases is beneficial. The main benefit of doing so is to make adoption of new wheels predictable - users with the latest CPython get the latest package format! This choice has several issues however. First, tying the new format to the latest CPython makes adoption much slower. Users on LTS versions of Linux with older Python installations are free to update their pip in a virtual environment, but cannot update the version of Python as easily. While some changes to the wheel format must be tied to CPython changes necessarily, such as adding new compression formats or changing the metadata format, many changes do not need to be tied to the Python version, such as symlinks, enhanced compatibility tags, and new formats that use existing compression formats in the standard library. Additionally, wheels are used across multiple different language implementations, which lag behind the CPython version. It seems unfair to prevent their users from using a feature due to the Python version. Lastly, while this PEP does not suggest tying the wheel version to CPython releases, a future PEP may still do so at any time, so this choice does not need to be made in this PEP.
Keep Using .whl
as the File Extension
While keeping the extension .whl
is appealing for many reasons, it presents
several problems that are difficult to surmount. First, current installers
would still pick a new wheel and fail to install the package. Furthermore,
the file name of a wheel would not be able to change without breaking existing
installers that expect a set wheel file name format. While the current filename
specification for wheels is sufficient for current usage, the optional
build tag in the middle of the file name makes any extensions ambiguous (i.e.
foo-0.3-py3-none-any-fancy_new_tag.whl
would parse as the build tag being
py3
). This limits changes to information stored in the wheel file name.
Store the Wheel Major Version in the File Extension (.whl2
)
Storing the wheel major version in the file extension has several nice
advantages. For one, there is no need to introduce the Wheel-Version
metadata field, since installers could simply filter based on file extension.
This would also allow future side-by-side packages. However, changing the
extension for wheels each major version has some downsides. First, the version
stored in the WHEEL
file must match the file extension, and this would need
to be verified by installers. Additionally, many systems associate file type by
file extension (e.g. executable associations, various web caching software),
and these would need to be updated every version that is released. Furthermore,
part of the brittleness of the current wheel specification is that so much
metadata is stored in the filename. Filenames are not well suited to store
structured data. Moving away from encoding information in the filename should
be a goal of future wheel revisions.
Another possibility is to use the file extension to encode the outer container
format (i.e. a ZIP file containing .dist-info
) separate from the inner
wheel version. However, this could lead to confusion if the file extension and
inner Wheel-Version
diverge. If an installer raises an error due to an
incompatible wheel 3.0 as obtained from the wheel metadata, some users will
be confused by the difference from the file extension .whl2
.
Wheel 2.0 Should Change the Outer Container Format
Since wheel 2.0 will change the extension of wheel files, it is the best
opportunity to modify the outer container format. Compatibility does not need
to be kept with a different file extension that tools will need to opt-in to
reading. The main use-case for a different exterior compression format would
be better compression. For example, the outer container could be changed into
a Zstandard tarfile, .tar.zst
, which
would decompress faster and produce smaller wheels. However, there are several
practical issues with this. First, Zstandard is not part of the Python standard
library, so pure-Python packaging tools would need to ship an extension to
unpack these wheels. This could cause some compatibility issues for several
platforms where extension modules are not easy to install. Furthermore, a
future wheel revision could always introduce a new layout of non-metadata files
that uses a .tar.zst
inside the existing ZIP-based format.
Finally, it is not a good idea to change the wheel file format too much at once. The goal of this PEP is to make evolving the specification easier, and part of the rationale behind making wheel evolution easier is to avoid “all at once” changes. Changing the outer file format for wheels would require re-writing how package metadata is not only discovered, but also installed.
Why not Specify Wheel 2.0 In This PEP?
There are many features that could be included as part of wheel 2.0, but this PEP does not cover them. The goal of this PEP is to define a compatibility story for the wheel file format. Changes that do not pertain to compatibility for wheel versions do not need to be in this PEP, and should be introducted in follow-up PEPs defining new wheel features.
Discussion Topics
Should Indices Support Dual-publishing for the First Migration?
Since .whl
and .whlx
will look different in file name, they could be
uploaded side-by-side to package indices like PyPI. This has some nice
benefits, like dual-support for older and newer installers, so users who can
get the latest features, while users who don’t upgrade still can install the
latest version of a package.
There are many complications however. Should we allow wheel 2 uploads to existing wheel 1-only releases? Should we put any requirements on the side-by-side wheels, such as:
Constraints on dual-published wheels
A given index may contain identical-content wheels with different wheel versions, and installers should prefer the newest-available wheel format, with all other factors held constant.
Should we only allow uploading both with PEP 694 allowing “atomic” dual-publishing?
Acknowledgements
The author of this PEP is greatly indebted to the incredibly valuable review, advice, and feedback of Barry Warsaw and Michael Sarahan.
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0777.rst
Last modified: 2024-10-15 22:48:36 GMT