PEP 763 – Limiting deletions on PyPI
- Author:
- William Woodruff <william at yossarian.net>, Alexis Challande <alexis.challande at trailofbits.com>
- Sponsor:
- Donald Stufft <donald at stufft.io>
- PEP-Delegate:
- Donald Stufft <donald at stufft.io>
- Discussions-To:
- Discourse thread
- Status:
- Draft
- Type:
- Standards Track
- Topic:
- Packaging
- Created:
- 24-Oct-2024
- Post-History:
- 09-Jul-2022, 01-Oct-2024, 28-Oct-2024
Abstract
We propose limiting when users can delete files, releases, and projects from PyPI. A project, release, or file may only be deleted within 72 hours of when it is uploaded to the index. After this window, users may only use the “yank” mechanism specified by PEP 592.
An exception to this restriction is made for releases and files that are marked with pre-release specifiers, which will remain deletable at any time. The PyPI administrators will retain the ability to delete files, releases, and projects at any time, for example for moderation or security purposes.
Rationale and Motivation
As observed in PEP 592, user-level deletion of projects on PyPI enables a catch-22 situation of dependency breakage:
Whenever a project detects that a particular release on PyPI might be broken, they oftentimes will want to prevent further users from inadvertently using that version. However, the obvious solution of deleting the existing file from a repository will break users who have pinned to a specific version of the project. This leaves projects in a catch-22 situation where new projects may be pulling down this known broken version, but if they do anything to prevent that they’ll break projects that are already using it.
On a technical level, the problem of deletion is mitigated by “yanking,” also specified in PEP 592. However, deletions continue to be allowed on PyPI, and have caused multiple notable disruptions to the Python ecosystem over the intervening years:
- July 2022: atomicwrites was deleted by its maintainer in an attempt to remove the project’s “critical” designation, without the maintainer realizing that project deletion would also delete all previously uploaded releases.
The project was subsequently restored with the maintainer’s consent, but at the cost of manual administrator action and extensive downstream breakage to projects like pytest. As of October 2024, atomicwrites is archived but still has around 4.5 million monthly downloads from PyPI.
- April 2023: codecov was deleted by
its maintainers after a long deprecation period. This caused extensive
breakage for many of Codecov’s CI/CD users, who were unaware of the
deprecation period due to limited observability of deprecation warnings
within CI/CD logs.
The project was subsequently re-created by its maintainers, with a new release published to compensate for the deleted releases (which were not restored), meaning that any pinned installations remained broken. As of October 2024, this single release remains the only release on PyPI and has around 1.5 million monthly downloads.
- June 2023: python-sonarqube-api deleted all releases prior to 2.0.2.
The project’s maintainer subsequently deleted conversations and force-pushed over the tag history for python-sonarqube-api’s source repository, impeding efforts by users to compare changes between releases.
- June 2024: PySimpleGUI changed licenses and deleted nearly all previous releases. This resulted in widespread disruption for users, who (prior to the relicensing) were downloading PySimpleGUI approximately 25,000 times a day.
In addition to their disruptive effect on downstreams, deletions also have detrimental effects on PyPI’s sustainability as well as the overall security of the ecosystem:
- Deletions increase support workload for PyPI’s administrators and moderators, as users mistakenly file support requests believing that PyPI is broken, or that the administrators themselves have removed the project.
- Deletions impair external (meaning end-user) incident response and analysis, making it difficult to distinguish “good faith” maintainer behavior from a malicious actor attempting to cover their tracks.
The Python ecosystem is continuing to grow, meaning that future deletions of projects can be reasonably assumed to be just as disruptive as, if not more disruptive than, the deletions sampled above.
Given all of the above, this PEP concludes that deletions now present a greater risk and detriment to the Python ecosystem than a benefit.
In addition to these technical arguments, there is also precedent from other packaging ecosystems for limiting the ability of users to delete projects and their constituent releases. This precedent is documented in Appendix A.
Specification
There are three different types of deletable objects:
- Files, which are individual project distributions (such as source
distributions or wheels).
Example: requests-2.32.3-py3-none-any.whl.
- Releases, which contain one or more files that share the same version number.
Example: requests v2.32.3.
- Projects, which contain one or more releases.
Example: requests.
Deletion eligibility rules
This PEP proposes the following deletion eligibility rules:
- A file is deletable if and only if it was uploaded to PyPI less than 72 hours before the current time, or if it has a pre-release specifier.
- A release is deletable if and only if all of its contained files are deletable.
- A project is deletable if and only if all of its releases are deletable.
These rules allow new projects to be deleted entirely, and allow old projects to delete new files or releases, but do not allow old projects to delete old files or releases.
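The three rules compose naturally, since each level defers to the one below it. The following is a minimal sketch, not PyPI's actual implementation: the `File`, `Release`, and `Project` classes, the `is_prerelease` flag (assumed to be derived from the version elsewhere), and the function names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# The 72-hour window from the eligibility rules.
DELETION_WINDOW = timedelta(hours=72)


@dataclass
class File:
    filename: str
    uploaded_at: datetime
    # Assumption: computed elsewhere from the file's version specifier.
    is_prerelease: bool = False


@dataclass
class Release:
    version: str
    files: list[File] = field(default_factory=list)


@dataclass
class Project:
    name: str
    releases: list[Release] = field(default_factory=list)


def file_deletable(f: File, now: datetime) -> bool:
    """A file is deletable if it is a pre-release or younger than 72 hours."""
    return f.is_prerelease or (now - f.uploaded_at) < DELETION_WINDOW


def release_deletable(r: Release, now: datetime) -> bool:
    """A release is deletable iff every file it contains is deletable."""
    return all(file_deletable(f, now) for f in r.files)


def project_deletable(p: Project, now: datetime) -> bool:
    """A project is deletable iff every release it contains is deletable."""
    return all(release_deletable(r, now) for r in p.releases)


# Illustrative data: an old file, a fresh file, and an old pre-release.
now = datetime(2024, 10, 28, tzinfo=timezone.utc)
old = File("demo-1.0-py3-none-any.whl", now - timedelta(days=30))
new = File("demo-1.1-py3-none-any.whl", now - timedelta(hours=1))
pre = File("demo-2.0rc1-py3-none-any.whl", now - timedelta(days=30), is_prerelease=True)
project = Project(
    "demo",
    [Release("1.0", [old]), Release("1.1", [new]), Release("2.0rc1", [pre])],
)
```

In this example the 1.1 and 2.0rc1 releases remain deletable, but the project as a whole does not, because release 1.0 contains a file older than 72 hours. Note that `all()` over an empty collection is true, so this sketch treats an empty release or project as deletable; a real implementation may choose otherwise.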
Implementation
This PEP’s implementation primarily concerns aspects of PyPI that are not standardized or subject to standardization, such as the web interface and signed-in user operations. As a result, this section describes its implementation in behavioral terms.
Changes
- Per the eligibility rules above, PyPI will reject web interface requests (using an appropriate HTTP response code of its choosing) for file, release, or project deletion if the respective object is not eligible for deletion.
- PyPI will amend its web interface to indicate a file/release/project’s deletion ineligibility, e.g. by styling the relevant UI elements as “inactive” and making relevant buttons/forms unclickable.
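The rejection behavior in the first change can be sketched as a small handler. This is a hypothetical illustration rather than PyPI's code: the names `handle_delete_request` and `DeletionResponse` are invented here, and 403 Forbidden is only one plausible choice, since this PEP leaves the response code to PyPI.

```python
from dataclasses import dataclass
from http import HTTPStatus


@dataclass(frozen=True)
class DeletionResponse:
    status: HTTPStatus
    message: str


def handle_delete_request(kind: str, name: str, eligible: bool) -> DeletionResponse:
    """Reject a user-initiated deletion when the object is not eligible.

    `eligible` is assumed to have been computed from the eligibility rules.
    403 is an illustrative choice; the PEP does not mandate a specific code.
    """
    if not eligible:
        return DeletionResponse(
            HTTPStatus.FORBIDDEN,
            f"{kind} {name!r} is not eligible for deletion; consider yanking instead",
        )
    return DeletionResponse(HTTPStatus.OK, f"{kind} {name!r} deleted")


rejected = handle_delete_request("release", "requests 2.32.3", eligible=False)
accepted = handle_delete_request("file", "demo-0.1rc1.tar.gz", eligible=True)
```

The same `eligible` value could also drive the web interface change, for instance by disabling the delete control whenever it is false.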
Security Implications
This PEP does not identify negative security implications associated with the proposed approach.
This PEP identifies one minor positive security implication: by restricting user-controlled deletions, this PEP makes it more difficult for a malicious actor to cover their tracks by deleting malware from the index. This is particularly useful for external (i.e. non-PyPI administrator) triage and incident response, where the defending party needs easy access to malware samples to develop indicators of compromise.
How To Teach This
This PEP suggests at least three pieces of public-facing material to help the larger Python packaging community (and its downstream consumers) understand its changes:
- An announcement post on the PyPI blog explaining the nature of the PEP, its motivations, and its behavioral implications for PyPI.
- An announcement banner on PyPI itself, linking to the above.
- Updates to the PyPI user documentation explaining the difference between deletion and yanking and the limited conditions under which the former can still be initiated by package owners.
Rejected Ideas
Conditioning deletion on dependency relationships
An alternative to time-based deletion windows is deletion eligibility based on downstream dependents. For example, a release could be considered deletable if and only if it has fewer than N downstream dependents on PyPI, where N could be as low as 1.
This idea is appealing since it directly links deletion eligibility to disruptiveness. npm uses it and conditions project removal on the absence of any downstream dependencies known to the index.
Despite its appeal, this PEP identifies several disadvantages and technical limitations that make dependency-conditioned deletion not appropriate for PyPI:
- PyPI is not aware of dependency relationships. In Python packaging, both project builds and metadata generation are frequently dynamic operations, involving arbitrary project-specified code. This is typified by source distributions containing setup.py scripts, where the execution of setup.py is responsible for computing the set of dependencies encoded in the project’s metadata.
This is in marked contrast to ecosystems like npm and Rust’s crates, where project builds can be dynamic but the project’s metadata itself is static.
As a result of this, PyPI doesn’t know your project’s dependencies, and is architecturally incapable of knowing them without either running arbitrary code (a significant security risk) or performing a long-tail deprecation of setup.py-based builds in favor of PEP 517 and PEP 621-style static metadata.
- Results in an unintuitive permissions model. Dependency-conditioned deletion results in a “reversed” power relationship, where anybody who introduces a dependency on a project can prevent that project from being deleted.
This is reasonable at face value, but can be abused to produce unexpected and undesirable (in the context of enabling some deletions) outcomes. A notable example of this is npm’s everything package, which depends on every public package on npm (as of 30 Dec 2023) and thereby prevents their deletion.
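The “reversed” power relationship can be illustrated with a toy model of the rejected rule. Everything here is hypothetical: the `deletable_by_dependents` function, the project names, and the assumption that the index has a dependents map at all (which, as noted above, PyPI does not).

```python
def deletable_by_dependents(
    project: str, dependents: dict[str, set[str]], n: int = 1
) -> bool:
    """Rejected rule: deletable iff the index knows of fewer than n dependents."""
    return len(dependents.get(project, set())) < n


# Hypothetical index state: project name -> names of projects depending on it.
dependents: dict[str, set[str]] = {"alpha": set(), "beta": set()}
before = {name: deletable_by_dependents(name, dependents) for name in dependents}

# An unrelated third party publishes a package that depends on every project.
for deps in dependents.values():
    deps.add("everything")
after = {name: deletable_by_dependents(name, dependents) for name in dependents}
```

With N = 1, a single third-party upload flips every project from deletable to non-deletable without any action by the affected maintainers.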
Conditioning deletion on download count
Another alternative to time-based deletion windows is to condition deletion on the number of downloads. For example, a release could be considered deletable if and only if it has fewer than N downloads during a trailing period of fixed length.
While this approach has the advantage of tying a project’s deletion eligibility to its usage, this PEP identifies several limitations:
- Ecosystem diversity. The Python ecosystem includes projects with widely varying usage patterns. A fixed download threshold would not adequately account for niche but critical projects with naturally low download counts.
- Time sensitivity. Download counts do not necessarily reflect a project’s current status or importance. A previously popular project might have low recent downloads but still be crucial for maintaining older systems.
- Technical complexity. Accessing the download count of a project within PyPI is not straightforward, and there is limited ability to gather a project’s download statistics from mirrors or other distribution systems.
Appendix A: Precedent in other ecosystems
The following is a table of support for deletion in different packaging ecosystems. An ecosystem is considered to not support deletion if it restricts a user’s ability to perform deletions in a manner similar to this PEP.
An earlier version of this table, showing only deletion, was compiled by Donald Stufft and others on the Python discussion forum in July 2022.
Ecosystem (Index) | Deletion | Yanking | Notes |
---|---|---|---|
Python (PyPI) | ✅ [1] | ✅ [2] | Deletion currently completely unrestricted. |
Rust (crates.io) | ❌ | ✅ [3] | Deletion by users not allowed at all. |
JavaScript (npm) | ❌ [4] | ✅ [5] | Deletion is limited by criteria similar to this PEP. |
Ruby (RubyGems) | ✅ [6] | ❌ | RubyGems calls deletion “yanking.” Yanking in PyPI’s terms is not supported at all. |
Java (Maven Central) | ❌ [7] | ❌ | Deletion by users not allowed at all. |
PHP (Packagist) | ❌ [8] | ❌ | Deletion restricted after an undocumented number of installs. |
.NET (NuGet) | ❌ [9] | ✅ [10] | NuGet calls yanking “unlisting.” |
Elixir (Hex) | ❌ [11] | ✅ [11] | Hex calls yanking “retiring.” |
R (CRAN) | ❌ [12] | ✅ [12] | Deletion is limited to within 24 hours of initial release or 60 minutes for subsequent versions. CRAN calls yanking “archiving.” |
Perl (CPAN) | ✅ | ❌ | Yanking is not supported at all. Deletion seemingly encouraged, at least as of 2021 [13]. |
Lua (LuaRocks) | ✅ [14] | ✅ [14] | LuaRocks calls yanking “archiving.” |
Haskell (Hackage) | ❌ [15] | ✅ [16] | Hackage calls yanking “deprecating.” |
OCaml (OPAM) | ❌ [17] | ✅ [17] | Deletion is allowed if it occurs “reasonably soon” after inclusion. Yanking is de facto supported by the available: false marker, which effectively disables resolution. |
The following trends are present:
- A strong majority of indices do not support deletion (9 vs. 4)
- A strong majority of indices do support yanking (9 vs. 4)
- An overwhelming majority of indices support one or the other or neither,
but not both (11 vs. 2)
- PyPI and LuaRocks are notable outliers in supporting both deletion and yanking.
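The counts in these trends can be re-derived mechanically from the table. The snippet below is a small sanity check, with the `SUPPORT` mapping transcribed by hand from the Deletion and Yanking columns above; the variable names are illustrative only.

```python
# (deletion supported, yanking supported), transcribed from the table above.
SUPPORT = {
    "PyPI": (True, True),
    "crates.io": (False, True),
    "npm": (False, True),
    "RubyGems": (True, False),
    "Maven Central": (False, False),
    "Packagist": (False, False),
    "NuGet": (False, True),
    "Hex": (False, True),
    "CRAN": (False, True),
    "CPAN": (True, False),
    "LuaRocks": (True, True),
    "Hackage": (False, True),
    "OPAM": (False, True),
}

total = len(SUPPORT)
no_deletion = sum(not d for d, _ in SUPPORT.values())
yanking = sum(y for _, y in SUPPORT.values())
both = sorted(name for name, (d, y) in SUPPORT.items() if d and y)

print(no_deletion, total - no_deletion)  # 9 4
print(yanking, total - yanking)          # 9 4
print(total - len(both), both)           # 11 ['LuaRocks', 'PyPI']
```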
Footnotes
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0763.rst
Last modified: 2024-10-28 23:59:04 GMT