PEP: 384 Title: Defining a Stable ABI Author: Martin von Löwis
<martin@v.loewis.de> Status: Final Type: Standards Track Content-Type:
text/x-rst Created: 17-May-2009 Python-Version: 3.2 Post-History:

python:stable (user docs) and devguide:c-api (development docs)

Abstract

Currently, each feature release introduces a new name for the Python DLL
on Windows, and may cause incompatibilities for extension modules on
Unix. This PEP proposes to define a stable set of API functions which
are guaranteed to be available for the lifetime of Python 3, and which
will also remain binary-compatible across versions. Extension modules
and applications embedding Python can work with different feature
releases as long as they restrict themselves to this stable ABI.

Rationale

The primary source of ABI incompatibility are changes to the lay-out of
in-memory structures. For example, the way in which string interning
works, or the data type used to represent the size of an object, have
changed during the life of Python 2.x. As a consequence, extension
modules making direct access to fields of strings, lists, or tuples,
would break if their code is loaded into a newer version of the
interpreter without recompilation: offsets of other fields may have
changed, making the extension modules access the wrong data.

In some cases, the incompatibilities only affect internal objects of the
interpreter, such as frame or code objects. For example, the way line
numbers are represented has changed in the 2.x lifetime, as has the way
in which local variables are stored (due to the introduction of
closures). Even though most applications probably never used these
objects, changing them had required to change the PYTHON_API_VERSION.

On Linux, changes to the ABI are often not much of a problem: the system
will provide a default Python installation, and many extension modules
are already provided pre-compiled for that version. If additional
modules are needed, or additional Python versions, users can typically
compile them themselves on the system, resulting in modules that use the
right ABI.

On Windows, multiple simultaneous installations of different Python
versions are common, and extension modules are compiled by their
authors, not by end users. To reduce the risk of ABI incompatibilities,
Python currently introduces a new DLL name pythonXY.dll for each feature
release, whether or not ABI incompatibilities actually exist.

With this PEP, it will be possible to reduce the dependency of binary
extension modules on a specific Python feature release, and applications
embedding Python can be made work with different releases.

Specification

The ABI specification falls into two parts: an API specification,
specifying what function (groups) are available for use with the ABI,
and a linkage specification specifying what libraries to link with. The
actual ABI (layout of structures in memory, function calling
conventions) is not specified, but implied by the compiler. As a
recommendation, a specific ABI is recommended for selected platforms.

During evolution of Python, new ABI functions will be added.
Applications using them will then have a requirement on a minimum
version of Python; this PEP provides no mechanism for such applications
to fall back when the Python library is too old.

Terminology

Applications and extension modules that want to use this ABI are
collectively referred to as "applications" from here on.

Header Files and Preprocessor Definitions

Applications shall only include the header file Python.h (before
including any system headers), or, optionally, include pyconfig.h, and
then Python.h.

During the compilation of applications, the preprocessor macro
Py_LIMITED_API must be defined. Doing so will hide all definitions that
are not part of the ABI.

Structures

Only the following structures and structure fields are accessible to
applications:

-   PyObject (ob_refcnt, ob_type)
-   PyVarObject (ob_base, ob_size)
-   PyMethodDef (ml_name, ml_meth, ml_flags, ml_doc)
-   PyMemberDef (name, type, offset, flags, doc)
-   PyGetSetDef (name, get, set, doc, closure)
-   PyModuleDefBase (ob_base, m_init, m_index, m_copy)
-   PyModuleDef (m_base, m_name, m_doc, m_size, m_methods, m_traverse,
    m_clear, m_free)
-   PyStructSequence_Field (name, doc)
-   PyStructSequence_Desc (name, doc, fields, sequence)
-   PyType_Slot (see below)
-   PyType_Spec (see below)

The accessor macros to these fields (Py_REFCNT, Py_TYPE, Py_SIZE) are
also available to applications.

The following types are available, but opaque (i.e. incomplete):

-   PyThreadState
-   PyInterpreterState
-   struct _frame
-   struct symtable
-   struct _node
-   PyWeakReference
-   PyLongObject
-   PyTypeObject

Type Objects

The structure of type objects is not available to applications;
declaration of "static" type objects is not possible anymore (for
applications using this ABI). Instead, type objects get created
dynamically. To allow an easy creation of types (in particular, to be
able to fill out function pointers easily), the following structures and
functions are available:

    typedef struct{
      int slot;    /* slot id, see below */
      void *pfunc; /* function pointer */
    } PyType_Slot;

    typedef struct{
      const char* name;
      int basicsize;
      int itemsize;
      unsigned int flags;
      PyType_Slot *slots; /* terminated by slot==0. */
    } PyType_Spec;

    PyObject* PyType_FromSpec(PyType_Spec*);

To specify a slot, a unique slot id must be provided. New Python
versions may introduce new slot ids, but slot ids will never be
recycled. Slots may get deprecated, but continue to be supported
throughout Python 3.x.

The slot ids are named like the field names of the structures that hold
the pointers in Python 3.1, with an added Py_ prefix (i.e. Py_tp_dealloc
instead of just tp_dealloc):

-   tp_dealloc, tp_getattr, tp_setattr, tp_repr, tp_hash, tp_call,
    tp_str, tp_getattro, tp_setattro, tp_doc, tp_traverse, tp_clear,
    tp_richcompare, tp_iter, tp_iternext, tp_methods, tp_base,
    tp_descr_get, tp_descr_set, tp_init, tp_alloc, tp_new, tp_is_gc,
    tp_bases, tp_del
-   nb_add nb_subtract nb_multiply nb_remainder nb_divmod nb_power
    nb_negative nb_positive nb_absolute nb_bool nb_invert nb_lshift
    nb_rshift nb_and nb_xor nb_or nb_int nb_float nb_inplace_add
    nb_inplace_subtract nb_inplace_multiply nb_inplace_remainder
    nb_inplace_power nb_inplace_lshift nb_inplace_rshift nb_inplace_and
    nb_inplace_xor nb_inplace_or nb_floor_divide nb_true_divide
    nb_inplace_floor_divide nb_inplace_true_divide nb_index
-   sq_length sq_concat sq_repeat sq_item sq_ass_item sq_contains
    sq_inplace_concat sq_inplace_repeat
-   mp_length mp_subscript mp_ass_subscript

The following fields cannot be set during type definition: - tp_dict
tp_mro tp_cache tp_subclasses tp_weaklist tp_print - tp_weaklistoffset
tp_dictoffset

typedefs

In addition to the typedefs for structs listed above, the following
typedefs are available. Their inclusion in the ABI means that the
underlying type must not change on a platform (even though it may differ
across platforms).

-   Py_uintptr_t Py_intptr_t Py_ssize_t
-   unaryfunc binaryfunc ternaryfunc inquiry lenfunc ssizeargfunc
    ssizessizeargfunc ssizeobjargproc ssizessizeobjargproc objobjargproc
    objobjproc visitproc traverseproc destructor getattrfunc
    getattrofunc setattrfunc setattrofunc reprfunc hashfunc richcmpfunc
    getiterfunc iternextfunc descrgetfunc descrsetfunc initproc newfunc
    allocfunc
-   PyCFunction PyCFunctionWithKeywords PyNoArgsFunction
    PyCapsule_Destructor
-   getter setter
-   PyOS_sighandler_t
-   PyGILState_STATE
-   Py_UCS4

Most notably, Py_UNICODE is not available as a typedef, since the same
Python version may use different definitions of it on the same platform
(depending on whether it uses narrow or wide code units). Applications
that need to access the contents of a Unicode string can convert it to
wchar_t.

Functions and function-like Macros

By default, all functions are available, unless they are excluded below.
Whether a function is documented or not does not matter.

Function-like macros (in particular, field access macros) remain
available to applications, but get replaced by function calls (unless
their definition only refers to features of the ABI, such as the various
_Check macros)

ABI function declarations will not change their parameters or return
types. If a change to the signature becomes necessary, a new function
will be introduced. If the new function is source-compatible (e.g. if
just the return type changes), an alias macro may get added to redirect
calls to the new function when the applications is recompiled.

If continued provision of the old function is not possible, it may get
deprecated, then removed, causing applications that use that function to
break.

Excluded Functions

All functions starting with _Py are not available to applications. Also,
all functions that expect parameter types that are unavailable to
applications are excluded from the ABI, such as PyAST_FromNode (which
expects a node*).

Functions declared in the following header files are not part of the
ABI:

-   bytes_methods.h
-   cellobject.h
-   classobject.h
-   code.h
-   compile.h
-   datetime.h
-   dtoa.h
-   frameobject.h
-   funcobject.h
-   genobject.h
-   longintrepr.h
-   parsetok.h
-   pyarena.h
-   pyatomic.h
-   pyctype.h
-   pydebug.h
-   pytime.h
-   symtable.h
-   token.h
-   ucnhash.h

In addition, functions expecting FILE* are not part of the ABI, to avoid
depending on a specific version of the Microsoft C runtime DLL on
Windows.

Module and type initializer and finalizer functions are not available
(PyByteArray_Init, PyOS_FiniInterrupts and all functions ending in _Fini
or _ClearFreeList).

Several functions dealing with interpreter implementation details are
not available:

-   PyInterpreterState_Head, PyInterpreterState_Next,
    PyInterpreterState_ThreadHead, PyThreadState_Next
-   Py_SubversionRevision, Py_SubversionShortBranch

PyStructSequence_InitType is not available, as it requires the caller to
provide a static type object.

Py_FatalError will be moved from pydebug.h into some other header file
(e.g. pyerrors.h).

The exact list of functions being available is given in the Windows
module definition file for python3.dll[1].

Global Variables

Global variables representing types and exceptions are available to
applications. In addition, selected global variables referenced in
macros (such as Py_True and Py_False) are available.

A complete list of global variable definitions is given in the
python3.def file[2]; those declared DATA denote variables.

Other Macros

All macros defining symbolic constants are available to applications;
the numeric values will not change.

In addition, the following macros are available:

-   Py_BEGIN_ALLOW_THREADS, Py_BLOCK_THREADS, Py_UNBLOCK_THREADS,
    Py_END_ALLOW_THREADS

The Buffer Interface

The buffer interface (type Py_buffer, type slots bf_getbuffer and
bf_releasebuffer, etc) has been omitted from the ABI, since the
stability of the Py_buffer structure is not clear at this time.
Inclusion in the ABI can be considered in future releases.

Signature Changes

A number of functions currently expect a specific struct, even though
callers typically have PyObject* available. These have been changed to
expect PyObject* as the parameter; this will cause warnings in
applications that currently explicitly cast to the parameter type. These
functions are PySlice_GetIndices, PySlice_GetIndicesEx,
PyUnicode_AsWideChar, and PyEval_EvalCode.

Linkage

On Windows, applications shall link with python3.dll; an import library
python3.lib will be available. This DLL will redirect all of its API
functions through /export linker options to the full interpreter DLL,
i.e. python3y.dll.

On Unix systems, the ABI is typically provided by the python executable
itself. PyModule_Create is changed to pass 3 as the API version if the
extension module was compiled with Py_LIMITED_API; the version check for
the API version will accept either 3 or the current PYTHON_API_VERSION
as conforming. If Python is compiled as a shared library, it is
installed as both libpython3.so, and libpython3.y.so; applications
conforming to this PEP should then link to the former (extension modules
can continue to link with no libpython shared object, but rather rely on
runtime linking). The ABI version is symbolically available as
PYTHON_ABI_VERSION.

Also on Unix, the PEP 3149 tag abi<PYTHON_ABI_VERSION> is accepted in
file names of extension modules. No checking is performed that files
named in this way are actually restricted to the limited API, and no
support for building such files will be added to distutils due to the
distutils code freeze.

Implementation Strategy

This PEP will be implemented in a branch[3], allowing users to check
whether their modules conform to the ABI. To avoid users having to
rewrite their type definitions, a script to convert C source code
containing type definitions will be provided[4].

References

Copyright

This document has been placed in the public domain.

[1] "python3 module definition file":
http://svn.python.org/projects/python/branches/pep-0384/PC/python3.def

[2] "python3 module definition file":
http://svn.python.org/projects/python/branches/pep-0384/PC/python3.def

[3] "PEP 384 branch":
http://svn.python.org/projects/python/branches/pep-0384/

[4] "ABI type conversion script":
http://svn.python.org/projects/python/branches/pep-0384/Tools/scripts/abitype.py