PEP: 301 Title: Package Index and Metadata for Distutils Version:
$Revision$ Last-Modified: $Date$ Author: Richard Jones
<richard@python.org> Status: Final Type: Standards Track Topic:
Packaging Content-Type: text/x-rst Created: 24-Oct-2002 Python-Version:
2.3 Post-History: 08-Nov-2002

Abstract

This PEP proposes several extensions to the Distutils packaging system
[1]. These enhancements include a central package index server, tools
for submitting package information to the index and extensions to the
package metadata to include Trove[2] information.

This PEP does not address issues of package dependency. It also does not
address storage and download of packages as described in PEP 243. Nor is
it proposing a local database of packages as described in PEP 262.

Existing package repositories such as the Vaults of Parnassus[3],
CPAN[4] and PAUSE[5] will be investigated as prior art in this field.

Rationale

Python programmers have long needed a simple method of discovering
existing modules and systems available for their use. It is arguable
that the existence of these systems for other languages have been a
significant contribution to their popularity. The existence of the
Catalog-SIG, and the many discussions there indicate that there is a
large population of users who recognise this need.

The introduction of the Distutils packaging system to Python simplified
the process of distributing shareable code, and included mechanisms for
the capture of package metadata, but did little with the metadata save
ship it with the package.

An interface to the index should be hosted in the python.org domain,
giving it an air of legitimacy that existing catalog efforts do not
have.

The interface for submitting information to the catalog should be as
simple as possible - hopefully just a one-line command for most users.

Issues of package dependency are not addressed due to the complexity of
such a system. PEP 262 proposes such a system, but as of this writing
the PEP is still unfinished.

Issues of package dissemination (storage on a central server) are not
addressed because they require assumptions about availability of storage
and bandwidth that I am not in a position to make. PEP 243, which is
still being developed, is tackling these issues and many more. This
proposal is considered compatible with, and adjunct to the proposal in
PEP 243.

Specification

The specification takes three parts, the web interface, the Distutils
register command and the Distutils Trove classification.

Web Interface

A web interface is implemented over a simple store. The interface is
available through the python.org domain, either directly or as
packages.python.org.

The store has columns for all metadata fields. The (name, version)
double is used as a uniqueness key. Additional submissions for an
existing (name, version) will result in an update operation.

The web interface implements the following commands/interfaces:

index

    Lists known packages, optionally filtered. An additional HTML page,
    search, presents a form to the user which is used to customise the
    index view. The index will include a browsing interface like that
    presented in the Trove interface design section 4.3. The results
    will be paginated, sorted alphabetically and only showing the most
    recent version. The most recent version information will be
    determined using the Distutils LooseVersion class.

display

    Displays information about the package. All fields are displayed as
    plain text. The "url" (or "home_page") field is hyperlinked.

submit

    Accepts a POST submission of metadata about a package. The "name"
    and "version" fields are mandatory, as they uniquely identify an
    entry in the index. Submit will automatically determine whether to
    create a new entry or update an existing entry. The metadata is
    checked for correctness where appropriate - specifically the Trove
    discriminators are compared with the allowed set. An update will
    update all information about the package based on the new submitted
    information.

    There will also be a submit/edit form that will allow manual
    submission and updating for those who do not use Distutils.

submit_pkg_info

    Accepts a POST submission of a PKG-INFO file and performs the same
    function as the submit interface.

user

    Registers a new user with the index. Requires username, password and
    email address. Passwords will be stored in the index database as SHA
    hashes. If the username already exists in the database:

    1.  If valid HTTP Basic authentication is provided, the password and
        email address are updated with the submission information, or
    2.  If no valid authentication is provided, the user is informed
        that the login is already taken.

    Registration will be a three-step process, involving:

    1.  User submission of details via the Distutils register command or
        through the web,
    2.  Index server sending email to the user's email address with a
        URL to visit to confirm registration with a random one-time key,
        and
    3.  User visits URL with the key and confirms registration.

roles

    An interface for changing user Role assignments.

password_reset

    Using a supplied email address as the key, this resets a user's
    password and sends an email with the new password to the user.

The submit command will require HTTP Basic authentication, preferably
over an HTTPS connection.

The server interface will indicate success or failure of the commands
through a subset of the standard HTTP response codes:

  Code   Meaning        Register command implications
  ------ -------------- -------------------------------------------------------------------------------------------
  200    OK             Everything worked just fine
  400    Bad request    Data provided for submission was malformed
  401    Unauthorised   The username or password supplied were incorrect
  403    Forbidden      User does not have permission to update the package information (not Owner or Maintainer)

User Roles

Three user Roles will be assignable to users:

Owner

    Owns a package name, may assign Maintainer Role for that name. The
    first user to register information about a package is deemed Owner
    of the package name. The Admin user may change this if necessary.
    May submit updates for the package name.

Maintainer

    Can submit and update info for a particular package name.

Admin

    Can assign Owner Role and edit user details. Not specific to a
    package name.

Index Storage (Schema)

The index is stored in a set of relational database tables:

packages

    Lists package names and holds package-level metadata (currently just
    the stable release version)

releases

    Each package has an entry in releases for each version of the
    package that is released. A row holds the bulk of the information
    given in the package's PKG-INFO file. There is one row for each
    package (name, version).

trove_discriminators

    Lists the Trove discriminator text and assigns each one a unique ID.

release_discriminators

    Each entry maps a package (name, version) to a discriminator_id. We
    map to releases instead of packages because the set of
    discriminators may change between releases.

journals

    Holds information about changes to package information in the index.
    Changes to the packages, releases, roles, and release_discriminators
    tables are listed here by package name and version if the change is
    release-specific.

users

    Holds our user database - user name, email address and password.

roles

    Maps user_name and role_name to a package_name.

An additional table, rego_otk holds the One Time Keys generated during
registration and is not interesting in the scope of the index itself.

Distutils register Command

An additional Distutils command, register, is implemented which posts
the package metadata to the central index. The register command
automatically handles user registration; the user is presented with
three options:

1.  login and submit package information
2.  register as a new packager
3.  send password reminder email

On systems where the $HOME environment variable is set, the user will be
prompted at exit to save their username/password to a file in their
$HOME directory in the file .pypirc.

Notification of changes to a package entry will be sent to all users who
have submitted information about the package. That is, the original
submitter and any subsequent updaters.

The register command will include a --verify option which performs a
test submission to the index without actually committing the data. The
index will perform its submission verification checks as usual and
report any errors it would have reported during a normal submission.
This is useful for verifying correctness of Trove discriminators.

Distutils Trove Classification

The Trove concept of discrimination will be added to the metadata set
available to package authors through the new attribute "classifiers".
The list of classifiers will be available through the web, and added to
the package like so:

    setup(
        name = "roundup",
        version = __version__,
        classifiers = [
            'Development Status :: 4 - Beta',
            'Environment :: Console',
            'Environment :: Web Environment',
            'Intended Audience :: End Users/Desktop',
            'Intended Audience :: Developers',
            'Intended Audience :: System Administrators',
            'License :: OSI Approved :: Python Software Foundation License',
            'Operating System :: MacOS :: MacOS X',
            'Operating System :: Microsoft :: Windows',
            'Operating System :: POSIX',
            'Programming Language :: Python',
            'Topic :: Communications :: Email',
            'Topic :: Office/Business',
            'Topic :: Software Development :: Bug Tracking',
        ],
        url = 'http://sourceforge.net/projects/roundup/',
        ...
    )

It was decided that strings would be used for the classification entries
due to the deep nesting that would be involved in a more formal Python
structure.

The original Trove specification that classification namespaces be
separated by slashes ("/") unfortunately collides with many of the names
having slashes in them (e.g. "OS/2"). The double-colon solution (" :: ")
implemented by SourceForge and FreshMeat gets around this limitation.

The list of classification values on the module index has been merged
from FreshMeat and SourceForge (with their permission). This list will
be made available both through the web interface and through the
register command's --list-classifiers option as a text list which may
then be copied to the setup.py file. The register command's --verify
option will check classifiers values against the server's list.

Unfortunately, the addition of the "classifiers" property is not
backwards-compatible. A setup.py file using it will not work under
Python 2.1.3. It is hoped that a bug-fix release of Python 2.2 (most
likely 2.2.3) will relax the argument checking of the setup() command to
allow new keywords, even if they're not actually used. It is preferable
that a warning be produced, rather than a show-stopping error. The use
of the new keyword should be discouraged in situations where the package
is advertised as being compatible with python versions earlier than
2.2.3 or 2.3.

In the PKG-INFO, the classifiers list items will appear as individual
Classifier: entries:

    Name: roundup
    Version: 0.5.2
    Classifier: Development Status :: 4 - Beta
    Classifier: Environment :: Console (Text Based)
                .
                .
    Classifier: Topic :: Software Development :: Bug Tracking
    Url: http://sourceforge.net/projects/roundup/

Implementation

The server is available at:

  http://www.python.org/pypi

The code is available from the SourceForge project:

  http://sourceforge.net/projects/pypi/

The register command has been integrated into Python 2.3.

Rejected Proposals

Originally, the index server was to return custom headers (inspired by
PEP 243):

X-Pypi-Status

    Either "success" or "fail".

X-Pypi-Reason

    A description of the reason for failure, or additional information
    in the case of a success.

However, it has been pointed out[6] that this is a bad scheme to use.

References

Copyright

This document has been placed in the public domain.

Acknowledgements

Anthony Baxter, Martin v. Loewis and David Goodger for encouragement and
feedback during initial drafting.

A.M. Kuchling for support including hosting the second prototype.

Greg Stein for recommending that the register command interpret the HTTP
response codes rather than custom X-PyPI-* headers.

The many participants of the Distutils and Catalog SIGs for their ideas
over the years.



  Local Variables: mode: indented-text indent-tabs-mode: nil
  sentence-end-double-space: t fill-column: 70 End:

[1] Distutils packaging system
(http://docs.python.org/library/distutils.html)

[2] Trove (http://www.catb.org/~esr/trove/)

[3] Vaults of Parnassus (http://www.vex.net/parnassus/)

[4] CPAN (http://www.cpan.org/)

[5] PAUSE (http://pause.cpan.org/)

[6] [PEP243] upload status is bogus
(https://mail.python.org/pipermail/distutils-sig/2001-March/002262.html)