PEP: 777 Title: How to Re-invent the Wheel Author: Ethan Smith
<ethan@ethanhs.me> Sponsor: Barry Warsaw <barry@python.org>
PEP-Delegate: Paul Moore <p.f.moore@gmail.com> Discussions-To:
https://discuss.python.org/t/pep-777-how-to-re-invent-the-wheel/67484
Status: Draft Type: Standards Track Topic: Packaging Created:
09-Oct-2024 Post-History: 10-Oct-2024

Abstract

The current wheel 1.0 specification <427> was written over a decade ago,
and has been extremely robust to changes in the Python packaging
ecosystem. Previous efforts to improve the wheel specification
were deferred <491#pep-deferral> to focus on other packaging
specifications. Meanwhile, the use of wheels has changed dramatically in
the last decade. There have been many requests for new wheel features
over the years; however, a fundamental obstacle to evolving the wheel
specification has been that there is no defined process for how to
handle adding backwards-incompatible features to wheels. Therefore, to
enable other PEPs to describe new enhancements to the wheel
specification, this PEP prescribes compatibility requirements on future
wheel revisions. This PEP does not specify a new wheel revision. The
specification of a new wheel format (“Wheel 2.0”) is left to a future
PEP.

Rationale

Currently, wheel specification changes that require new installer
behavior are backwards incompatible and require a major version increase
in the wheel metadata format. An increase of the wheel major version has
yet to happen, partially because such a change has the potential to be
catastrophically disruptive. Per the wheel specification, any installer
that does not support the new major version must abort at install time.
This means that if the major version were to be incremented without
further planning, many users would see installation failures as older
installers reject new wheels uploaded to public package indices like the
Python Package Index (PyPI). It is critically important to carefully
plan the interactions between build tools, package indices, and package
installers to avoid incompatibility issues, especially considering the
long tail of users who are slow to update their installers.

The backward compatibility concerns have prevented valuable improvements
to the wheel file format, such as better compression, wheel data format
improvements, better information about what is included in a wheel, and
JSON formatted metadata in the ".dist-info" folder.

This PEP describes constraints and behavior for new wheel revisions to
preserve stability for existing tools that do not support a new major
version of the wheel format. This ensures that backwards incompatible
changes to the wheel specification will only affect users and tools that
are properly set up to use the newer wheels. With a clear path for
evolving the wheel specification, future PEPs will be able to improve
the wheel format without needing to re-define a completely new
compatibility story.

Specification

Add Wheel-Version Metadata Field to Core Metadata

Currently, the wheel 1.0 PEP <427>, PEP 427, specifies that wheel files
must contain a WHEEL metadata file that contains the version of the
wheel specification that the file conforms to. PEP 427 stipulates that
installers MUST warn on installation of a wheel with a minor version
greater than supported, and MUST abort on installation of wheels with a
major version that is greater than what the installer supports. This
ensures that users do not get invalid installations from wheels that
installers cannot properly install.

However, resolvers do not currently exclude wheels with an incompatible
wheel version. There is also currently no way for a resolver to check a
wheel's version without downloading the wheel directly. To make wheel
version filtering easy for resolvers, the wheel version MUST be included
in the relevant metadata file (currently METADATA). This will allow
resolvers to efficiently check the wheel version using the PEP 658
metadata API without needing to download and inspect the
.dist-info/WHEEL file.

To accomplish this, a new field, Wheel-Version, will be added to the
Core Metadata Specification. This field is single use, and must contain
the exact same version specified as the Wheel-Version entry in the WHEEL
file, or any future replacement file defining metadata about the wheel
file. If Wheel-Version is absent from the metadata file, then tools MUST
infer the wheel file major version as 1.

Wheel-Version MUST NOT be included in source distribution metadata
(PKG-INFO) files. If a tool encounters Wheel-Version inside of a source
distribution metadata file, it SHOULD raise an error.

Wheel-Version MAY be included in the metadata file for wheels of version
1, but for wheels of version 2 or higher, the metadata file MUST include
Wheel-Version. This enforces that future revisions of the wheel
specification can rely on resolvers skipping incompatible wheels by
checking the Wheel-Version field. Build backends are encouraged to
include Wheel-Version in all wheels that they generate, regardless of
version.

Installers SHOULD copy the metadata file in a wheel unmodified during
installation. This prevents the need to update the RECORD file, which is
an error prone process. Tools reading installed core metadata SHOULD NOT
assume that the field is present, as other installation formats may omit
it.

When installing a wheel, installers MUST take the following steps:

1.  Check that the values of Wheel-Version in both the core metadata
    file and wheel metadata file match. If they do not match, installers
    MUST abort installation. Neither value takes precedence.
2.  Check that the installer is compatible with Wheel-Version. If
    Wheel-Version is absent, assume the version is 1.0. Warn if minor
    version is greater, abort if major version is greater. This
    procedure is identital to that in PEP 427.
3.  Proceed with installation as specified in the Binary Distribution
    Format specification.

Resolver Behavior Regarding Wheel-Version

Resolvers, in the process of selecting a wheel to install, MUST check a
candidate wheel's Wheel-Version, and ignore incompatible wheel files.
Without ignoring these files, older installers might select a wheel that
uses an unsupported wheel version for that installer, and force the
installer to abort per PEP 427. By skipping incompatible wheel files,
users will not see installation errors when a project adopts a new wheel
major version. As already specified in PEP 427, installers MUST abort if
a user tries to directly install a wheel that is incompatible. If, in
the process of resolving packages found in multiple indices, a resolver
comes across two wheels of the same distribution and version, resolvers
should prioritize the wheel of the highest compatible version.

While the above protects users from unexpected breakages, users may miss
a new release of a distribution if their installer does not support the
wheel version used in the release. Imagine in the future that a package
publishes 3.0 wheel files. Downstream users won't see that there is a
new release available if their installers only support 2.x wheels.
Therefore, installers SHOULD emit a warning if, in the process of
resolving packages, they come across an incompatible wheel and skip it.

First Major Version Bump Must Change File Extension

Unfortunately, existing resolvers do not check the compatibility of
wheels before selecting them as installation candidates. Until a
majority of users update to installers that properly check for wheel
compatibility, it is unsafe to allow publishing wheels of a new major
version that existing resolvers might select. It could take upwards of
four years before the majority of users are on updated resolvers, based
on current data about PyPI installer usage (See the
777-pypi-download-analysis, for details). To allow for experimentation
and faster adoption of 2.0 wheels, this PEP proposes a change to the
file extension of the wheel file format, from .whl to .whlx for all
future wheel versions. Note that x in whlx is the letter "x" and does
not specify the wheel major version. The change to extension name
resolves the initial transition issue of 2.0 wheels breaking users on
existing installers that do not implement Wheel-Version checks. By using
a different file extension, 2.0 wheels can immediately be uploaded to
PyPI, and users will be able to experiment with the new features right
away. Users on older installers will simply ignore these new files.

One rejected alternative would be to keep the .whl extension, but delay
the publishing of wheel 2.0 to PyPI. For more on that, please see
Rejected Ideas.

Recommended Build Backend Behavior with New Wheel Formats

Build backends are recommended to generate the most compatible wheel
based on features a project uses. For example, if a wheel does not use
symbolic links, and such a feature was introduced in wheel 5.0, the
build backend could generate a wheel of version 4.0. On the other hand,
some features will want to be adopted by default. For example, if wheel
3.0 introduces better compression, the build backend may wish to enable
this feature by default to improve the wheel size and download
performance.

Limitations on Future Wheel Revisions

While it is difficult to know what future features may be planned for
the wheel format, it is important that certain compatibility promises
are maintained.

Wheel files, when installed, MUST stay compatible with the Python
standard library's importlib.metadata for all supported CPython
versions. For example, replacing .dist-info/METADATA with a JSON
formatted metadata file MUST be a multi-major version migration with one
version introducing the new JSON file alongside the existing email
header format, and another future version removing the email header
format metadata file. The version to remove .dist-info/METADATA also
MUST be adopted only after the last CPython release that lacked support
for the new file reaches end of life. This ensures that code using
importlib.metadata will not break with wheel major version revisions.

Wheel files MUST remain ZIP format files as the outer container format.
Additionally, the .dist-info metadata directory MUST be placed at the
root of the archive without any compression, so that unpacking the wheel
file produces a normal .dist-info directory holding any metadata for the
wheel. Future wheel revisions MAY modify the layout, compression, and
other attributes about non-metadata components of a wheel such as data
and code. This assures that future wheel revisions remain compatible
with tools operating on package metadata, while allowing for
improvements to code storage in the wheel, such as adopting compression.

Package tooling MUST NOT assume that the contents and format of the
wheel file will remain the same for future wheel major versions beyond
the limitations above about metadata folder contents and outer container
format. For example, newer wheel major versions may add or remove
filename components, such as the build tag or the platform tag.
Therefore it is incumbent upon tooling to check the metadata for the
Wheel-Version before attempting to install a wheel.

Finally, future wheel revisions MUST NOT use any compression formats not
in the CPython standard library of at least the latest release. Wheels
generated using any new compression format should be tagged as requiring
at least the first released version of CPython to support the new
compression format, regardless of the Python API compatibility of the
code within the wheel.

Backwards Compatibility

Backwards compatibility is an incredibly important issue for evolving
the wheel format. If adopting a new wheel revision is painful for
downstream users, package creators will hesitate to adopt the new
standards, and users will be stuck with failed CI pipelines and other
installation woes.

Several choices in the above specification are made so that the adoption
of a new feature is less painful. For example, today wheels of an
incompatible major version are still selected by pip as installation
candidates, which causes installer failures if a project starts
publishing 2.0 wheels. To avoid this issue, this PEP requires resolvers
to filter out wheels with major versions or features incompatible with
the installer.

This PEP also defines constraints on future wheel revisions, with the
goal of maintaining compatibility with CPython, but allowing evolution
of wheel contents. Wheel revisions shouldn't cause package installations
to break on older CPython revisions, as not only would it be
frustrating, it would be incredibly hard to debug for users.

This PEP relies on resolvers being able to efficiently acquire package
metadata, usually through PEP 658. This might present a problem for
users of package indices that do not serve PEP 658 metadata. However,
today most installers fall back on using HTTP range requests to
efficiently acquire only the part of a wheel needed to read the
metadata, a feature most storage providers and servers include.
Furthermore, future improvements to wheels such as compression will make
up performance losses due to inspecting files in the wheel.

The main compatibility limitation of this PEP is for projects that start
publishing solely new wheels alongside a source distribution. If a user
on an older installer tries to install the package, it will fall back to
the source distribution, because the resolver will skip all newer
wheels. Users are often poorly set up to build projects from source, so
this could lead to some failed builds users would not see otherwise.
There are several approaches to resolving this issue, such as allowing
dual-publishing for the initial migration, or marking source
distributions as not intended to be built.

Rejected Ideas

The Wheel Format is Perfect and Does not Need to be Changed

The wheel format has been around for over 10 years, and in that time,
Python packages have changed a lot. It is much more common for packages
to include Rust or C extension modules, increasing the size of packages.
Better compression, such as lzma or zstd, could save a lot of time and
bandwidth for PyPI and its users. Compatibility tags cannot express the
wide variety of hardware used to accelerate Python code today, nor
encode shared library compatibility information. In order to address
these issues, evolution of the wheel package format is necessary.

Wheel Format Changes Should be Tied to CPython Releases

I do not believe that tying wheel revisions to CPython releases is
beneficial. The main benefit of doing so is to make adoption of new
wheels predictable - users with the latest CPython get the latest
package format! This choice has several issues however. First, tying the
new format to the latest CPython makes adoption much slower. Users on
LTS versions of Linux with older Python installations are free to update
their pip in a virtual environment, but cannot update the version of
Python as easily. While some changes to the wheel format must be tied to
CPython changes necessarily, such as adding new compression formats or
changing the metadata format, many changes do not need to be tied to the
Python version, such as symlinks, enhanced compatibility tags, and new
formats that use existing compression formats in the standard library.
Additionally, wheels are used across multiple different language
implementations, which lag behind the CPython version. It seems unfair
to prevent their users from using a feature due to the Python version.
Lastly, while this PEP does not suggest tying the wheel version to
CPython releases, a future PEP may still do so at any time, so this
choice does not need to be made in this PEP.

Keep Using .whl as the File Extension

While keeping the extension .whl is appealing for many reasons, it
presents several problems that are difficult to surmount. First, current
installers would still pick a new wheel and fail to install the package.
Furthermore, the file name of a wheel would not be able to change
without breaking existing installers that expect a set wheel file name
format. While the current filename specification for wheels is
sufficient for current usage, the optional build tag in the middle of
the file name makes any extensions ambiguous (i.e.
foo-0.3-py3-none-any-fancy_new_tag.whl would parse as the build tag
being py3). This limits changes to information stored in the wheel file
name.

Store the Wheel Major Version in the File Extension (.whl2)

Storing the wheel major version in the file extension has several nice
advantages. For one, there is no need to introduce the Wheel-Version
metadata field, since installers could simply filter based on file
extension. This would also allow future side-by-side packages. However,
changing the extension for wheels each major version has some downsides.
First, the version stored in the WHEEL file must match the file
extension, and this would need to be verified by installers.
Additionally, many systems associate file type by file extension (e.g.
executable associations, various web caching software), and these would
need to be updated every version that is released. Furthermore, part of
the brittleness of the current wheel specification is that so much
metadata is stored in the filename. Filenames are not well suited to
store structured data. Moving away from encoding information in the
filename should be a goal of future wheel revisions.

Another possibility is to use the file extension to encode the outer
container format (i.e. a ZIP file containing .dist-info) separate from
the inner wheel version. However, this could lead to confusion if the
file extension and inner Wheel-Version diverge. If an installer raises
an error due to an incompatible wheel 3.0 as obtained from the wheel
metadata, some users will be confused by the difference from the file
extension .whl2.

Wheel 2.0 Should Change the Outer Container Format

Since wheel 2.0 will change the extension of wheel files, it is the best
opportunity to modify the outer container format. Compatibility does not
need to be kept with a different file extension that tools will need to
opt-in to reading. The main use-case for a different exterior
compression format would be better compression. For example, the outer
container could be changed into a Zstandard tarfile, .tar.zst, which
would decompress faster and produce smaller wheels. However, there are
several practical issues with this. First, Zstandard is not part of the
Python standard library, so pure-Python packaging tools would need to
ship an extension to unpack these wheels. This could cause some
compatibility issues for several platforms where extension modules are
not easy to install. Furthermore, a future wheel revision could always
introduce a new layout of non-metadata files that uses a .tar.zst inside
the existing ZIP-based format.

Finally, it is not a good idea to change the wheel file format too much
at once. The goal of this PEP is to make evolving the specification
easier, and part of the rationale behind making wheel evolution easier
is to avoid "all at once" changes. Changing the outer file format for
wheels would require re-writing how package metadata is not only
discovered, but also installed.

Why not Specify Wheel 2.0 In This PEP?

There are many features that could be included as part of wheel 2.0, but
this PEP does not cover them. The goal of this PEP is to define a
compatibility story for the wheel file format. Changes that do not
pertain to compatibility for wheel versions do not need to be in this
PEP, and should be introducted in follow-up PEPs defining new wheel
features.

Discussion Topics

Should Indices Support Dual-publishing for the First Migration?

Since .whl and .whlx will look different in file name, they could be
uploaded side-by-side to package indices like PyPI. This has some nice
benefits, like dual-support for older and newer installers, so users who
can get the latest features, while users who don't upgrade still can
install the latest version of a package.

There are many complications however. Should we allow wheel 2 uploads to
existing wheel 1-only releases? Should we put any requirements on the
side-by-side wheels, such as:

Constraints on dual-published wheels

A given index may contain identical-content wheels with different wheel
versions, and installers should prefer the newest-available wheel
format, with all other factors held constant.

Should we only allow uploading both with PEP 694 allowing "atomic"
dual-publishing?

Acknowledgements

The author of this PEP is greatly indebted to the incredibly valuable
review, advice, and feedback of Barry Warsaw and Michael Sarahan.

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.