PEP: 3149 Title: ABI version tagged .so files Version: $Revision$
Last-Modified: $Date$ Author: Barry Warsaw <barry@python.org> Status:
Final Type: Standards Track Content-Type: text/x-rst Created:
09-Jul-2010 Python-Version: 3.2 Post-History: 14-Jul-2010, 22-Jul-2010
Resolution:
https://mail.python.org/pipermail/python-dev/2010-September/103408.html

Abstract

PEP 3147 described an extension to Python's import machinery that
improved the sharing of Python source code, by allowing more than one
byte compilation file (.pyc) to be co-located with each source file.

This PEP defines an adjunct feature which allows the co-location of
extension module files (.so) in a similar manner. This optional,
build-time feature will enable downstream distributions of Python to
more easily provide more than one Python major version at a time.

Background

PEP 3147 defined the file system layout for a pure-Python package, where
multiple versions of Python are available on the system. For example,
where the alpha package containing source modules one.py and two.py
exist on a system with Python 3.2 and 3.3, the post-byte compilation
file system layout would be:

    alpha/
        __pycache__/
            __init__.cpython-32.pyc
            __init__.cpython-33.pyc
            one.cpython-32.pyc
            one.cpython-33.pyc
            two.cpython-32.pyc
            two.cpython-33.pyc
        __init__.py
        one.py
        two.py

For packages with extension modules, a similar differentiation is needed
for the module's .so files. Extension modules compiled for different
Python major versions are incompatible with each other due to changes in
the ABI. Different configuration/compilation options for the same Python
version can result in different ABIs (e.g. --with-wide-unicode).

While PEP 384 defines a stable ABI, it will minimize, but not eliminate
extension module incompatibilities between Python builds or major
versions. Thus a mechanism for discriminating extension module file
names is proposed.

Rationale

Linux distributions such as Ubuntu[1] and Debian[2] provide more than
one Python version at the same time to their users. For example, Ubuntu
9.10 Karmic Koala users can install Python 2.5, 2.6, and 3.1, with
Python 2.6 being the default.

In order to share as much as possible between the available Python
versions, these distributions install third party package modules (.pyc
and .so files) into /usr/share/pyshared and symlink to them from
/usr/lib/pythonX.Y/dist-packages. The symlinks exist because in a
pre-PEP 3147 world (i.e < Python 3.2), the .pyc files resulting from
byte compilation by the various installed Pythons will name collide with
each other. For Python versions >= 3.2, all pure-Python packages can be
shared, because the .pyc files will no longer cause file system naming
conflicts. Eliminating these symlinks makes for a simpler, more robust
Python distribution.

A similar situation arises with shared library extensions. Because
extension modules are typically named foo.so for a foo extension module,
these would also name collide if foo was provided for more than one
Python version.

In addition, because different configuration/compilation options for the
same Python version can cause different ABIs to be presented to
extension modules. On POSIX systems for example, the configure options
--with-pydebug, --with-pymalloc, and --with-wide-unicode all change the
ABI. This PEP proposes to encode build-time options in the file name of
the .so extension module files.

PyPy[3] can also benefit from this PEP, allowing it to avoid name
collisions in extension modules built for its API, but with a different
.so tag.

Proposal

The configure/compilation options chosen at Python interpreter
build-time will be encoded in the shared library file name for extension
modules. This "tag" will appear between the module base name and the
operation file system extension for shared libraries.

The following information MUST be included in the shared library file
name:

-   The Python implementation (e.g. cpython, pypy, jython, etc.)
-   The interpreter's major and minor version numbers

These two fields are separated by a hyphen and no dots are to appear
between the major and minor version numbers. E.g. cpython-32.

Python implementations MAY include additional flags in the file name tag
as appropriate. For example, on POSIX systems these flags will also
contribute to the file name:

-   --with-pydebug (flag: d)
-   --with-pymalloc (flag: m)
-   --with-wide-unicode (flag: u)

By default in Python 3.2, configure enables --with-pymalloc so shared
library file names would appear as foo.cpython-32m.so. When the other
two flags are also enabled, the file names would be
foo.cpython-32dmu.so.

The shared library file name tag is used unconditionally; it cannot be
changed. The tag and extension module suffix are available through the
sysconfig modules via the following variables:

    >>> sysconfig.get_config_var('EXT_SUFFIX')
    '.cpython-32mu.so'
    >>> sysconfig.get_config_var('SOABI')
    'cpython-32mu'

Note that $SOABI contains just the tag, while $EXT_SUFFIX includes the
platform extension for shared library files, and is the exact suffix
added to the extension module name.

For an arbitrary package foo, you might see these files when the
distribution package was installed:

    /usr/lib/python/foo.cpython-32m.so
    /usr/lib/python/foo.cpython-33m.so

(These paths are for example purposes only. Distributions are free to
use whatever filesystem layout they choose, and nothing in this PEP
changes the locations where from-source builds of Python are installed.)

Python's dynamic module loader will recognize and import shared library
extension modules with a tag that matches its build-time options. For
backward compatibility, Python will also continue to import untagged
extension modules, e.g. foo.so.

This shared library tag would be used globally for all distutils-based
extension modules, regardless of where on the file system they are
built. Extension modules built by means other than distutils would
either have to calculate the tag manually, or fallback to the non-tagged
.so file name.

Proven approach

The approach described here is already proven, in a sense, on Debian and
Ubuntu system where different extensions are used for debug builds of
Python and extension modules. Debug builds on Windows also already use a
different file extension for dynamic libraries, and in fact encoded (in
a different way than proposed in this PEP) the Python major and minor
version in the .dll file name.

Windows

This PEP only addresses build issues on POSIX systems that use the
configure script. While Windows or other platform support is not
explicitly disallowed under this PEP, platform expertise is needed in
order to evaluate, describe, and implement support on such platforms. It
is not currently clear that the facilities in this PEP are even useful
for Windows.

PEP 384

PEP 384 defines a stable ABI for extension modules. In theory, universal
adoption of PEP 384 would eliminate the need for this PEP because all
extension modules could be compatible with any Python version. In
practice of course, it will be impossible to achieve universal adoption,
and as described above, different build-time flags still affect the ABI.
Thus even with a stable ABI, this PEP may still be necessary. While a
complete specification is reserved for PEP 384, here is a discussion of
the relevant issues.

PEP 384 describes a change to PyModule_Create() where 3 is passed as the
API version if the extension was compiled with Py_LIMITED_API. This
should be formalized into an official macro called PYTHON_ABI_VERSION to
mirror PYTHON_API_VERSION. If and when the ABI changes in an
incompatible way, this version number would be bumped. To facilitate
sharing, Python would be extended to search for extension modules with
the PYTHON_ABI_VERSION number in its name. The prefix abi is reserved
for Python's use.

Thus, an initial implementation of PEP 384, when Python is configured
with the default set of flags, would search for the following file names
when extension module foo is imported (in this order):

    foo.cpython-XYm.so
    foo.abi3.so
    foo.so

The distutils[4] build_ext command would also have to be extended to
compile to shared library files with the abi3 tag, when the module
author indicates that their extension supports that version of the ABI.
This could be done in a backward compatible way by adding a keyword
argument to the Extension class, such as:

    Extension('foo', ['foo.c'], abi=3)

Martin v. Löwis describes his thoughts[5] about the applicability of
this PEP to PEP 384. In summary:

-   --with-pydebug would not be supported by the stable ABI because this
    changes the layout of PyObject, which is an exposed structure.
-   --with-pymalloc has no bearing on the issue.
-   --with-wide-unicode is trickier, though Martin's inclination is to
    force the stable ABI to use a Py_UNICODE that matches the platform's
    wchar_t.

Alternatives

In the initial python-dev thread[6] where this idea was first
introduced, several alternatives were suggested. For completeness they
are listed here, along with the reasons for not adopting them.

Independent directories or symlinks

Debian and Ubuntu could simply add a version-specific directory to
sys.path that would contain just the extension modules for that version
of Python. Or the symlink trick eliminated in PEP 3147 could be retained
for just shared libraries. This approach is rejected because it
propagates the essential complexity that PEP 3147 tries to avoid, and
adds potentially several additional directories to search for all
modules, even when the number of extension modules is much fewer than
the total number of Python packages. For example, builds were made
available both with and without wide unicode, with and without pydebug,
and with and without pymalloc, the total number of directories search
increases substantially.

Don't share packages with extension modules

It has been suggested that Python packages with extension modules not be
shared among all supported Python versions on a distribution. Even with
adoption of PEP 3149, extension modules will have to be compiled for
every supported Python version, so perhaps sharing of such packages
isn't useful anyway. Not sharing packages with extensions though is
infeasible for several reasons.

If a pure-Python package is shared in one version, should it suddenly be
not-shared if the next release adds an extension module for speed? Also,
even though all extension shared libraries will be compiled and
distributed once for every supported Python, there's a big difference
between duplicating the .so files and duplicating all .py files. The
extra size increases the download time for such packages, and more
immediately, increases the space pressures on already constrained
distribution CD-ROMs.

Reference implementation

Work on this code is tracked in a Bazaar branch on Launchpad[7] until
it's ready for merge into Python 3.2. The work-in-progress diff can also
be viewed[8] and is updated automatically as new changes are uploaded.

References

Copyright

This document has been placed in the public domain.



  Local Variables: mode: indented-text indent-tabs-mode: nil
  sentence-end-double-space: t fill-column: 70 coding: utf-8 End:

[1] Ubuntu: <http://www.ubuntu.com>

[2] Debian: <http://www.debian.org>

[3] http://codespeak.net/pypy/dist/pypy/doc/

[4] http://docs.python.org/py3k/distutils/index.html

[5] https://mail.python.org/pipermail/python-dev/2010-August/103330.html

[6] https://mail.python.org/pipermail/python-dev/2010-June/100998.html

[7] https://code.edge.launchpad.net/~barry/python/sovers

[8] https://code.edge.launchpad.net/~barry/python/sovers/+merge/29411