PEP: 627 Title: Recording installed projects Author: Petr Viktorin
<encukou@gmail.com> BDFL-Delegate: Paul Moore <p.f.moore@gmail.com>
Discussions-To: https://discuss.python.org/t/pep-627/4126 Status: Final
Type: Standards Track Topic: Packaging Content-Type: text/x-rst Created:
15-Jul-2020 Resolution: https://discuss.python.org/t/pep-627/4126/42

packaging:recording-installed-packages

Abstract

This PEP clarifies and updates PEP 376 (Database of Installed Python
Distributions), rewriting it as an interoperability standard. It moves
the canonical location of the standard to the Python Packaging Authority
(PyPA) standards repository, and sets up guidelines for changing it.

Two files in installed .dist-info directories are made optional: RECORD
(which PEP 376 lists as mandatory, but suggests it can be left out for
"system packages"), and INSTALLER.

Motivation

Python packaging is moving from relying on specific tools (Setuptools
and pip) toward an ecosystem of tools and tool-agnostic interoperability
standards.

PEP 376 is not written as an interoperability standard. It describes
implementation details of specific tools and libraries, and is
underspecified, leaving much room for implementation-defined behavior.

This is a proposal to “distill” the standard from PEP 376, clarify it,
and rewrite it to be tool-agnostic.

The aim of this PEP is to have a better standard, not necessarily a
perfect one. Some issues are left to later clarification.

Rationale Change

PEP 376's rationale focuses on two problems:

-   There are too many ways to install projects and this makes
    interoperation difficult.
-   There is no API to get information on installed distributions.

The new document focuses only the on-disk format of information about
installed projects. Providing API to install, uninstall or query this
information is left to be implemented by tools.

Standard and Changes Process

The canonical standard for Recording installed projects (previously
known as Database of Installed Python Distributions) is the
documentation at packaging.python.org. Any changes to the document
(except trivial language or typography fixes) must be made through the
PEP process.

The document is normative (with examples to aid understanding). PEPs
that change it, such as this one, contain additional information that is
expected to get out of date, such as rationales and compatibility
considerations.

The proposed standard is submitted together with this PEP as a pull
request to packaging.python.org.

Changes and their Rationale

Renaming to "Recording installed projects"

The standard is renamed from Database of Installed Python Distributions
to Recording installed projects.

While putting files in known locations on disk may be thought of as a
“database”, it's not what most people think about when they hear the
term. The PyPA links to PEP 376 under the heading Recording installed
distributions.

The PyPA glossary defines “Distribution” (or, “Distribution Package” to
prevent confusion with e.g. Linux distributions) as “A versioned archive
file […]”. Since there may be other ways to install Python code than
from archive files, the document uses “installed project” rather than
“installed distribution”.

Removal of Implementation Details

All tool- and library-specific details are removed. The mechanisms of
how a project is installed are also left out: the document focuses on
the end state. One exception is a sketch of an uninstallation algorithm,
which is given to better explain the purpose of the RECORD file.

References to .egg-info and .egg, formats specific to setuptools and
distutils, are left out.

Explicitly Allowing Additional Files

The .dist-info directory is allowed to contain files not specified in
the spec. The current tools already do this.

A note in the specification mentions files in the .dist-info directory
of wheels. Current tools copy these files to the installed
.dist-info—something to keep in mind for further standardization
efforts.

Clarifications in the RECORD File

The CSV dialect is specified to be the default of Python's csv module.
This resolves edge cases around handling double-quotes and line
terminators in file names.

The “base” of relative paths in RECORD is specified relative to the
.dist-info directory, rather than tool-specific --install-lib and
--prefix options.

Both hash and size fields are now optional (for any file, not just .pyc,
.pyo and RECORD). Leavng them out is discouraged, except for *.pyc and
RECORD itself. (Note that PEP 376 is unclear on what was optional; when
taken literally, its text and examples contradict. Despite that, “both
fields are optional“ is a reasonable interpretation of PEP 376. The
alternative would be to mandate—rather than recommend—which files can be
recorded without hash and size, and to update that list over time as new
use cases come up.)

The new spec explicitly says that the RECORD file must now include all
files of the installed project (the exception for .pyc files remains).
Since tools use RECORD for uninstallation, incomplete file lists could
introduce orphaned files to users' environments. On the other hand, this
means that there is no way to record hashes of some any files if the
full list of files is unknown.

A sketch of an uninstallation algorithm is included to clarify the
file's primary purpose and contents.

Tools must not uninstall/remove projects that lack a RECORD file (unless
they have external information, such as in system package managers of
Linux distros).

On Windows, files in RECORD may be separated by either / or \. PEP 376
was unclear on this: it mandates forward slashes in one place, but shows
backslackes in a Windows-specific example.

Optional RECORD File

The RECORD file is made optional. Not all tools can easily generate a
list of installed files in a Python-specific format.

Specifically, the RECORD file is unnecessary when projects are installed
by a Linux system packaging system, which has its own ways to keep track
of files, uninstall them or check their integrity. Having to keep a
RECORD file in sync with the disk and the system package database would
be unreasonably fragile, and no RECORD file is better than one that does
not correspond to reality.

(Full disclosure: The author of this PEP is an RPM packager active in
the Fedora Linux distro.)

Optional INSTALLER File

The INSTALLER file is also made optional, and specified to be used for
informational purposes only. It is still a single-line text file
containing the name of the installer.

This file was originally added to distinguish projects installed by the
Python installer (pip) from ones installed by other package managers
(e.g. dnf). There were attempts to use this file to prevent pip from
updating or uninstalling packages it didn't install.

Our goal is supporting interoperating tools, and basing any action on
which tool happened to install a package runs counter to that goal.

Instead of relying on the installer name, tools should use feature
detection. The current document offers a crude way of making a project
untouchable by Python tooling: omitting RECORD file.

On the other hand, the installer name may be useful in hints to the
user.

To align with this new purpose of the file, the new specification allows
any ASCII string in INSTALLER, rather than a lowercase identifier. It
also suggests using the command-line command, if available.

The REQUESTED File: Removed from Spec

The REQUESTED file is now considered a tool-specific extension.

Per PEP 376, REQUESTED was to be written when a project was installed by
direct user request, as opposed to automatically to satisfy dependencies
of another project. Projects without this marker file could be
uninstalled when no longer needed.

Despite the standard, many existing installers (including older versions
of pip) never write this file. There is no distinction between projects
that are “OK to remove when no longer needed” and ones simply installed
by a tool that ignores REQUESTED. So, the file is currently not usable
for its intended purpose (unless a tool can use additional, non-standard
information).

Clarifications

When possible, terms (such as name and version) are qualified by
references to existing specs.

Deferred Ideas

To limit the scope of this PEP, some improvements are explicitly left to
future PEPs:

-   Encoding of the RECORD file
-   Limiting or namespacing files that can appear in .dist-info
-   Marking the difference between projects installed directly by user
    request versus those installed to satisfy dependencies, so that the
    latter can be removed when no longer needed.

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.