PEP: 714 Title: Rename dist-info-metadata in the Simple API Author:
Donald Stufft <donald@stufft.io> PEP-Delegate: Paul Moore
<p.f.moore@gmail.com> Discussions-To: https://discuss.python.org/t/27471
Status: Accepted Type: Standards Track Topic: Packaging Content-Type:
text/x-rst Created: 06-Jun-2023 Post-History: 06-Jun-2023 Resolution:
27-Jun-2023

Abstract

This PEP renames the metadata provided by PEP 658 in both HTML and JSON
formats of the Simple API and provides guidelines for both clients and
servers in how to handle the renaming.

Motivation

PEP 658 specified a mechanism to host the core metadata files from an
artifact available through the Simple API such that a client could fetch
the metadata and use it without having to download the entire artifact.
Later PEP 691 was written to add the ability to use JSON rather than
HTML on the Simple API, which included support for the PEP 658 metadata.

Unfortunately, PyPI did not support PEP 658 until just recently, which
released with a bug where the dist-info-metadata key from PEP 658 was
incorrectly named in the JSON representation, to be
data-dist-info-metadata. However, when attempting to fix that bug, it
was discovered that pip also had a bug, where any use of
dist-info-metadata in the JSON representation would cause pip to hard
fail with an exception.

The bug in pip has existed since at least v22.3, which means that it has
been released for approximately 8 months, long enough to have been
pulled into Python releases, downstream Linux releases, baked into
containers, virtual environments, etc.

This puts us in an awkward position of having a bug on PyPI that cannot
be fixed without breaking pip, due to a bug in pip, but that version of
pip is old enough to have been widely deployed. To make matters worse, a
version of pip that is broken in this way cannot install anything from
PyPI once it fixes its bug, including installing a new, fixed version of
pip.

Rationale

There are 3 main options for a path forward for fixing these bugs:

1.  Do not change the spec, fix the bug in pip, wait some amount of
    time, then fix the bug in PyPI, breaking anyone using an unfixed pip
    such that they cannot even install a new pip from PyPI.
2.  Do the same as (1), but special case PyPI so it does not emit the
    PEP 658 metadata for pip, even if it is available. This allows
    people to upgrade pip if they're on a broken version, but nothing
    else.
3.  Change the spec to avoid the key that pip can't handle currently,
    allowing PyPI to emit that key and a new version of pip to be
    released to take advantage of that key.

This PEP chooses (3), but goes a little further and also renames the key
in the HTML representation.

Typically we do not change specs because of bugs that only affect one
particular implementation, unless the spec itself is at fault, which
isn't the case here: the spec is fine and these are just genuine bugs in
pip and PyPI.

However, we choose to do this for 4 reasons:

1.  Bugs that affect pip and PyPI together represent an outsized amount
    of impact compared to any other client or repository combination.
2.  The impact of being broken is that installs do not function, at all,
    rather than degrading gracefully in some way.
3.  The feature that is being blocked by these bugs is of large
    importance to the ability to quickly and efficiently resolve
    dependencies from PyPI with pip, and having to delay it for a long
    period of time while we wait for the broken versions of pip to fall
    out of use would be of detriment to the entire ecosystem.
4.  The downsides of changing the spec are fairly limited, given that we
    do not believe that support for this is widespread, so it affects
    only a limited number of projects.

Specification

The keywords "**MUST**", "**MUST NOT**", "**REQUIRED**", "**SHALL**",
"**SHALL NOT**", "**SHOULD**", "**SHOULD NOT**", "**RECOMMENDED**",
"**MAY**", and "**OPTIONAL**"" in this document are to be interpreted as
described in RFC 2119 <2119>.

Servers

The PEP 658 metadata, when used in the HTML representation of the Simple
API, MUST be emitted using the attribute name data-core-metadata, with
the supported values remaining the same.

The PEP 658 metadata, when used in the PEP 691 JSON representation of
the Simple API, MUST be emitted using the key core-metadata, with the
supported values remaining the same.

To support clients that used the previous key names, the HTML
representation MAY also be emitted using the data-dist-info-metadata,
and if it does so it MUST match the value of data-core-metadata.

Clients

Clients consuming any of the HTML representations of the Simple API MUST
read the PEP 658 metadata from the key data-core-metadata if it is
present. They MAY optionally use the legacy data-dist-info-metadata if
it is present but data-core-metadata is not.

Clients consuming the JSON representation of the Simple API MUST read
the PEP 658 metadata from the key core-metadata if it is present. They
MAY optionally use the legacy dist-info-metadata key if it is present
but core-metadata is not.

Backwards Compatibility

There is a minor compatibility break in this PEP, in that clients that
currently correctly handle the existing metadata keys will not
automatically understand the newer metadata keys, but they should
degrade gracefully, and simply act as if the PEP 658 metadata does not
exist.

Otherwise there should be no compatibility problems with this PEP.

Rejected Ideas

Leave the spec unchanged, and cope with fixing in PyPI and/or pip

We believe that the improvements brought by PEP 658 are very important
to improving the performance of resolving dependencies from PyPI, and
would like to be able to deploy it as quickly as we can.

Unfortunately the nature of these bugs is that we cannot deploy them as
is without breaking widely deployed and used versions of pip. The
breakages in this case would be bad enough that affected users would not
even be able to directly upgrade their version of pip to fix it, but
would have to manually fetch pip another way first (e.g. get-pip.py).

This is something that PyPI would be unwilling to do without some way to
mitigate those breakages for those users. Without some reasonable
mitigation strategy, we would have to wait until those versions of pip
are no longer in use on PyPI, which would likely be 5+ years from now.

There are a few possible mitigation strategies that we could use, but
we've rejected them as well.

Mitigation: Special Case pip

The breakages are particularly bad in that it prevents users from even
upgrading pip to get an unbroken version of pip, so a command like
pip install --upgrade pip would fail. We could mitigate this by having
PyPI special case pip itself, so that the JSON endpoint never returns
the PEP 658 metadata and the above still works.

This PEP rejects this idea because while the simple command that only
upgrades pip would work, if the user included anything else in that
command to upgrade then the command would go back to failing, which we
consider to be still too large of a breakage.

Additionally, while this bug happens to be getting exposed right now
with PyPI, it is really a bug that would happen with any PEP 691
repository that correctly exposed the PEP 658 metadata. This would mean
that every repository would have to carry this special case for pip.

Mitigation: Have the server use User-Agent Detection

pip puts its version number into its User-Agent, which means that the
server could detect the version number and serve different responses
based on that version number so that we don't serve the PEP 658 metadata
to versions of pip that are broken.

This PEP rejects this idea because supporting User-Agent detection is
too difficult to implement in a reasonable way.

1.  On PyPI we rely heavily on caching the Simple API in our CDN. If we
    varied the responses based on User-Agent, then our CDN cache would
    have an explosion of cache keys for the same content, which would
    make it more likely that any particular request would not be cached
    and fall back to hitting our backend servers, which would have to
    scale much higher to support the load.
2.  PyPI could support the User-Agent detection idea by mutating the
    Accept header of the request so that those versions appear to only
    accept the HTML version, allowing us to maintain the CDNs cache
    keys. This doesn't affect any downstream caches of PyPI though,
    including pip's HTTP cache which would possibly have JSON versions
    cached for those requests and we wouldn't emit a Vary on User-Agent
    for them to know that it isn't acceptable to share those caches, and
    adding a Vary: User-Agent for downstream caches would have the same
    problem as (1), but for downstream caches instead of our CDN cache.
3.  The pip bug ultimately isn't PyPI specific, it affects any
    repository that implements PEP 691 and PEP 658 together. This would
    mean that workarounds that rely on implementation specific fixes
    have to be replicated for each repository that implements both,
    which may not be easy or possible in all cases (static mirrors may
    not be able to do this User-Agent detection for instance).

Only change the JSON key

The bug in pip only affects the JSON representation of the Simple API,
so we only need to actually change the key in the JSON, and we could
leave the existing HTML keys alone.

This PEP rejects doing that because we believe that in the long term,
having the HTML and JSON key names diverge would make mistakes like this
more likely and make implementing and understanding the spec more
confusing.

The main reason that we would want to not change the HTML keys is to not
lose PEP 658 support in any HTML only clients or repositories that might
already support it. This PEP mitigates that breakage by allowing both
clients and servers to continue to support both keys, with a
recommendation of when and how to do that.

Recommendations

The recommendations in this section, other than this notice itself, are
non-normative, and represent what the PEP authors believe to be the best
default implementation decisions for something implementing this PEP,
but it does not represent any sort of requirement to match these
decisions.

Servers

We recommend that servers only emit the newer keys, particularly for the
JSON representation of the Simple API since the bug itself only affected
JSON.

Servers that wish to support PEP 658 in clients that use HTML and have
it implemented, can safely emit both keys only in HTML.

Servers should not emit the old keys in JSON unless they know that no
broken versions of pip will be used to access their server.

Clients

We recommend that clients support both keys, for both HTML and JSON,
preferring the newer key as this PEP requires. This will allow clients
to support repositories that already have correctly implemented PEP 658
and PEP 691 but have not implemented this PEP.

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.