PEP: 364 Title: Transitioning to the Py3K Standard Library Version:
$Revision$ Last-Modified: $Date$ Author: Barry Warsaw <barry@python.org>
Status: Withdrawn Type: Standards Track Content-Type: text/x-rst
Created: 01-Mar-2007 Python-Version: 2.6 Post-History:

Abstract

PEP 3108 describes the reorganization of the Python standard library for
the Python 3.0 release. This PEP describes a mechanism for transitioning
from the Python 2.x standard library to the Python 3.0 standard library.
This transition will allow and encourage Python programmers to use the
new Python 3.0 library names starting with Python 2.6, while maintaining
the old names for backward compatibility. In this way, a Python
programmer will be able to write forward compatible code without
sacrificing interoperability with existing Python programs.

Rationale

PEP 3108 presents a rationale for Python standard library (stdlib)
reorganization. The reader is encouraged to consult that PEP for details
about why and how the library will be reorganized. Should PEP 3108 be
accepted in part or in whole, then it is advantageous to allow Python
programmers to begin the transition to the new stdlib module names in
Python 2.x, so that they can write forward compatible code starting with
Python 2.6.

Note that PEP 3108 proposes to remove some "silly old stuff", i.e.
modules that are no longer useful or necessary. The PEP you are reading
does not address this because there are no forward compatibility issues
for modules that are to be removed, except to stop using such modules.

This PEP concerns only the mechanism by which mappings from old stdlib
names to new stdlib names are maintained. Please consult PEP 3108 for
all specific module renaming proposals. Specifically see the section
titled Modules to Rename for guidelines on the old name to new name
mappings. The few examples in this PEP are given for illustrative
purposes only and should not be used for specific renaming
recommendations.

Supported Renamings

There are at least 4 use cases explicitly supported by this PEP:

-   Simple top-level package name renamings, such as StringIO to
    stringio;
-   Sub-package renamings where the package name may or may not be
    renamed, such as email.MIMEText to email.mime.text;
-   Extension module renaming, such as cStringIO to cstringio;
-   Third party renaming of any of the above.

Two use cases supported by this PEP include renaming simple top-level
modules, such as StringIO, as well as modules within packages, such as
email.MIMEText.

In the former case, PEP 3108 currently recommends StringIO be renamed to
stringio, following PEP 8 recommendations.

In the latter case, the email 4.0 package distributed with Python 2.5
already renamed email.MIMEText to email.mime.text, although it did so in
a one-off, uniquely hackish way inside the email package. The mechanism
described in this PEP is general enough to handle all module renamings,
obviating the need for the Python 2.5 hack (except for backward
compatibility with earlier Python versions).

An additional use case is to support the renaming of C extension
modules. As long as the new name for the C module is importable, it can
be remapped to the new name. E.g. cStringIO renamed to cstringio.

Third party package renaming is also supported, via several public
interfaces accessible by any Python module.

Remappings are not performed recursively.

.mv files

Remapping files are called .mv files; the suffix was chosen to be
evocative of the Unix mv(1) command. An .mv file is a simple
line-oriented text file. All blank lines and lines that start with a #
are ignored. All other lines must contain two whitespace separated
fields. The first field is the old module name, and the second field is
the new module name. Both module names must be specified using their
full dotted-path names. Here is an example .mv file from Python 2.6:

    # Map the various string i/o libraries to their new names
    StringIO    stringio
    cStringIO   cstringio

.mv files can appear anywhere in the file system, and there is a
programmatic interface provided to parse them, and register the
remappings inside them. By default, when Python starts up, all the .mv
files in the oldlib package are read, and their remappings are
automatically registered. This is where all the module remappings should
be specified for top-level Python 2.x standard library modules.

Implementation Specification

This section provides the full specification for how module renamings in
Python 2.x are implemented. The central mechanism relies on various
import hooks as described in PEP 302. Specifically
sys.path_importer_cache, sys.path, and sys.meta_path are all employed to
provide the necessary functionality.

When Python's import machinery is initialized, the oldlib package is
imported. Inside oldlib there is a class called OldStdlibLoader. This
class implements the PEP 302 interface and is automatically
instantiated, with zero arguments. The constructor reads all the .mv
files from the oldlib package directory, automatically registering all
the remappings found in those .mv files. This is how the Python 2.x
standard library is remapped.

The OldStdlibLoader class should not be instantiated by other Python
modules. Instead, you can access the global OldStdlibLoader instance via
the sys.stdlib_remapper instance. Use this instance if you want
programmatic access to the remapping machinery.

One important implementation detail: as needed by the PEP 302 API, a
magic string is added to sys.path, and module __path__ attributes in
order to hook in our remapping loader. This magic string is currently
<oldlib> and some changes were necessary to Python's site.py file in
order to treat all sys.path entries starting with < as special.
Specifically, no attempt is made to make them absolute file names (since
they aren't file names at all).

In order for the remapping import hooks to work, the module or package
must be physically located under its new name. This is because the
import hooks catch only modules that are not already imported, and
cannot be imported by Python's built-in import rules. Thus, if a module
has been moved, say from Lib/StringIO.py to Lib/stringio.py, and the
former's .pyc file has been removed, then without the remapper, this
would fail:

    import StringIO

Instead, with the remapper, this failing import will be caught, the old
name will be looked up in the registered remappings, and in this case,
the new name stringio will be found. The remapper then attempts to
import the new name, and if that succeeds, it binds the resulting module
into sys.modules, under both the old and new names. Thus, the above
import will result in entries in sys.modules for 'StringIO' and
'stringio', and both will point to the exact same module object.

Note that no way to disable the remapping machinery is proposed, short
of moving all the .mv files away or programmatically removing them in
some custom start up code. In Python 3.0, the remappings will be
eliminated, leaving only the "new" names.

Programmatic Interface

Several methods are added to the sys.stdlib_remapper object, which third
party packages can use to register their own remappings. Note however
that in all cases, there is one and only one mapping from an old name to
a new name. If two .mv files contain different mappings for an old name,
or if a programmatic call is made with an old name that is already
remapped, the previous mapping is lost. This will not affect any already
imported modules.

The following methods are available on the sys.stdlib_remapper object:

-   read_mv_file(filename) -- Read the given file and register all
    remappings found in the file.
-   read_directory_mv_files(dirname, suffix='.mv') -- List the given
    directory, reading all files in that directory that have the
    matching suffix (.mv by default). For each parsed file, register all
    the remappings found in that file.
-   set_mapping(oldname, newname) -- Register a new mapping from an old
    module name to a new module name. Both must be the full dotted-path
    name to the module. newname may be None in which case any existing
    mapping for oldname will be removed (it is not an error if there is
    no existing mapping).
-   get_mapping(oldname, default=None) -- Return any registered newname
    for the given oldname. If there is no registered remapping, default
    is returned.

Open Issues

-   Should there be a command line switch and/or environment variable to
    disable all remappings?

-   Should remappings occur recursively?

-   Should we automatically parse package directories for .mv files when
    the package's __init__.py is loaded? This would allow packages to
    easily include .mv files for their own remappings. Compare what the
    email package currently has to do if we place its .mv file in the
    email package instead of in the oldlib package:

        # Expose old names
        import os, sys
        sys.stdlib_remapper.read_directory_mv_files(os.path.dirname(__file__))

    I think we should automatically read a package's directory for any
    .mv files it might contain.

Reference Implementation

A reference implementation, in the form of a patch against the current
(as of this writing) state of the Python 2.6 svn trunk, is available as
SourceForge patch #1675334[1]. Note that this patch includes a rename of
cStringIO to cstringio, but this is primarily for illustrative and unit
testing purposes. Should the patch be accepted, we might want to split
this change off into other PEP 3108 changes.

References

Copyright

This document has been placed in the public domain.



  Local Variables: mode: indented-text indent-tabs-mode: nil
  sentence-end-double-space: t fill-column: 70 coding: utf-8 End:

[1] Reference implementation (http://bugs.python.org/issue1675334)