PEP: 674 Title: Disallow using macros as l-values Author: Victor Stinner
<vstinner@python.org> Status: Deferred Type: Standards Track
Content-Type: text/x-rst Created: 30-Nov-2021 Python-Version: 3.12

Abstract

Disallow using macros as l-values. For example, Py_TYPE(obj) = new_type
now fails with a compiler error.

In practice, the majority of affected projects only have to make two
changes:

-   Replace Py_TYPE(obj) = new_type with Py_SET_TYPE(obj, new_type).
-   Replace Py_SIZE(obj) = new_size with Py_SET_SIZE(obj, new_size).

PEP Deferral

See SC reply to PEP 674 -- Disallow using macros as l-values (February
2022).

Rationale

Using a macro as a an l-value

In the Python C API, some functions are implemented as macro because
writing a macro is simpler than writing a regular function. If a macro
exposes directly a structure member, it is technically possible to use
this macro to not only get the structure member but also set it.

Example with the Python 3.10 Py_TYPE() macro:

    #define Py_TYPE(ob) (((PyObject *)(ob))->ob_type)

This macro can be used as a r-value to get an object type:

    type = Py_TYPE(object);

It can also be used as an l-value to set an object type:

    Py_TYPE(object) = new_type;

It is also possible to set an object reference count and an object size
using Py_REFCNT() and Py_SIZE() macros.

Setting directly an object attribute relies on the current exact CPython
implementation. Implementing this feature in other Python
implementations can make their C API implementation less efficient.

CPython nogil fork

Sam Gross forked Python 3.9 to remove the GIL: the nogil branch. This
fork has no PyObject.ob_refcnt member, but a more elaborated
implementation for reference counting, and so the
Py_REFCNT(obj) = new_refcnt; code fails with a compiler error.

Merging the nogil fork into the upstream CPython main branch requires
first to fix this C API compatibility issue. It is a concrete example of
a Python optimization blocked indirectly by the C API.

This issue was already fixed in Python 3.10: the Py_REFCNT() macro has
been already modified to disallow using it as an l-value.

These statements are endorsed by Sam Gross (nogil developer).

HPy project

The HPy project is a brand new C API for Python using only handles and
function calls: handles are opaque, structure members cannot be accessed
directly, and pointers cannot be dereferenced.

Searching and replacing Py_SET_SIZE() is easier and safer than searching
and replacing some strange macro uses of Py_SIZE(). Py_SIZE() can be
semi-mechanically replaced by HPy_Length(), whereas seeing Py_SET_SIZE()
would immediately make clear that the code needs bigger changes in order
to be ported to HPy (for example by using HPyTupleBuilder or
HPyListBuilder).

The fewer internal details exposed via macros, the easier it will be for
HPy to provide direct equivalents. Any macro that references
"non-public" interfaces effectively exposes those interfaces publicly.

These statements are endorsed by Antonio Cuni (HPy developer).

GraalVM Python

In GraalVM, when a Python object is accessed by the Python C API, the C
API emulation layer has to wrap the GraalVM objects into wrappers that
expose the internal structure of the CPython structures (PyObject,
PyLongObject, PyTypeObject, etc). This is because when the C code
accesses it directly or via macros, all GraalVM can intercept is a read
at the struct offset, which has to be mapped back to the representation
in GraalVM. The smaller the "effective" number of exposed struct members
(by replacing macros with functions), the simpler GraalVM wrappers can
be.

This PEP alone is not enough to get rid of the wrappers in GraalVM, but
it is a step towards this long term goal. GraalVM already supports HPy
which is a better solution in the long term.

These statements are endorsed by Tim Felgentreff (GraalVM Python
developer).

Specification

Disallow using macros as l-values

The following 65 macros are modified to disallow using them as l-values.

PyObject and PyVarObject macros

-   Py_TYPE(): Py_SET_TYPE() must be used instead
-   Py_SIZE(): Py_SET_SIZE() must be used instead

GET macros

-   PyByteArray_GET_SIZE()
-   PyBytes_GET_SIZE()
-   PyCFunction_GET_CLASS()
-   PyCFunction_GET_FLAGS()
-   PyCFunction_GET_FUNCTION()
-   PyCFunction_GET_SELF()
-   PyCell_GET()
-   PyCode_GetNumFree()
-   PyDict_GET_SIZE()
-   PyFunction_GET_ANNOTATIONS()
-   PyFunction_GET_CLOSURE()
-   PyFunction_GET_CODE()
-   PyFunction_GET_DEFAULTS()
-   PyFunction_GET_GLOBALS()
-   PyFunction_GET_KW_DEFAULTS()
-   PyFunction_GET_MODULE()
-   PyHeapType_GET_MEMBERS()
-   PyInstanceMethod_GET_FUNCTION()
-   PyList_GET_SIZE()
-   PyMemoryView_GET_BASE()
-   PyMemoryView_GET_BUFFER()
-   PyMethod_GET_FUNCTION()
-   PyMethod_GET_SELF()
-   PySet_GET_SIZE()
-   PyTuple_GET_SIZE()
-   PyUnicode_GET_DATA_SIZE()
-   PyUnicode_GET_LENGTH()
-   PyUnicode_GET_LENGTH()
-   PyUnicode_GET_SIZE()
-   PyWeakref_GET_OBJECT()

AS macros

-   PyByteArray_AS_STRING()
-   PyBytes_AS_STRING()
-   PyFloat_AS_DOUBLE()
-   PyUnicode_AS_DATA()
-   PyUnicode_AS_UNICODE()

PyUnicode macros

-   PyUnicode_1BYTE_DATA()
-   PyUnicode_2BYTE_DATA()
-   PyUnicode_4BYTE_DATA()
-   PyUnicode_DATA()
-   PyUnicode_IS_ASCII()
-   PyUnicode_IS_COMPACT()
-   PyUnicode_IS_READY()
-   PyUnicode_KIND()
-   PyUnicode_READ()
-   PyUnicode_READ_CHAR()

PyDateTime GET macros

-   PyDateTime_DATE_GET_FOLD()
-   PyDateTime_DATE_GET_HOUR()
-   PyDateTime_DATE_GET_MICROSECOND()
-   PyDateTime_DATE_GET_MINUTE()
-   PyDateTime_DATE_GET_SECOND()
-   PyDateTime_DATE_GET_TZINFO()
-   PyDateTime_DELTA_GET_DAYS()
-   PyDateTime_DELTA_GET_MICROSECONDS()
-   PyDateTime_DELTA_GET_SECONDS()
-   PyDateTime_GET_DAY()
-   PyDateTime_GET_MONTH()
-   PyDateTime_GET_YEAR()
-   PyDateTime_TIME_GET_FOLD()
-   PyDateTime_TIME_GET_HOUR()
-   PyDateTime_TIME_GET_MICROSECOND()
-   PyDateTime_TIME_GET_MINUTE()
-   PyDateTime_TIME_GET_SECOND()
-   PyDateTime_TIME_GET_TZINFO()

Port C extensions to Python 3.11

In practice, the majority of projects affected by these PEP only have to
make two changes:

-   Replace Py_TYPE(obj) = new_type with Py_SET_TYPE(obj, new_type).
-   Replace Py_SIZE(obj) = new_size with Py_SET_SIZE(obj, new_size).

The pythoncapi_compat project can be used to update automatically C
extensions: add Python 3.11 support without losing support with older
Python versions. The project provides a header file which provides
Py_SET_REFCNT(), Py_SET_TYPE() and Py_SET_SIZE() functions to Python 3.8
and older.

PyTuple_GET_ITEM() and PyList_GET_ITEM() are left unchanged

The PyTuple_GET_ITEM() and PyList_GET_ITEM() macros are left unchanged.

The code patterns &PyTuple_GET_ITEM(tuple, 0) and
&PyList_GET_ITEM(list, 0) are still commonly used to get access to the
inner PyObject** array.

Changing these macros is out of the scope of this PEP.

PyDescr_NAME() and PyDescr_TYPE() are left unchanged

The PyDescr_NAME() and PyDescr_TYPE() macros are left unchanged.

These macros give access to PyDescrObject.d_name and
PyDescrObject.d_type members. They can be used as l-values to set these
members.

The SWIG project uses these macros as l-values to set these members. It
would be possible to modify SWIG to prevent setting PyDescrObject
structure members directly, but it is not really worth it since the
PyDescrObject structure is not performance critical and is unlikely to
change soon.

See the bpo-46538 "[C API] Make the PyDescrObject structure opaque:
PyDescr_NAME() and PyDescr_TYPE()" issue for more details.

Implementation

The implementation is tracked by bpo-45476: [C API] PEP 674: Disallow
using macros as l-values.

Py_TYPE() and Py_SIZE() macros

In May 2020, the Py_TYPE() and Py_SIZE() macros have been modified to
disallow using them as l-values (Py_TYPE, Py_SIZE).

In November 2020, the change was reverted, since it broke too many third
party projects.

In June 2021, once most third party projects were updated, a second
attempt was done, but had to be reverted again , since it broke
test_exceptions on Windows.

In September 2021, once test_exceptions has been fixed, Py_TYPE() and
Py_SIZE() were finally changed.

In November 2021, this backward incompatible change got a Steering
Council exception.

In October 2022, Python 3.11 got released with Py_TYPE() and Py_SIZE()
incompatible changes.

Backwards Compatibility

The proposed C API changes are backward incompatible on purpose.

In practice, only Py_TYPE() and Py_SIZE() macros are used as l-values.

This change does not follow the PEP 387 deprecation process. There is no
known way to emit a deprecation warning only when a macro is used as an
l-value, but not when it's used differently (ex: as a r-value).

The following 4 macros are left unchanged to reduce the number of
affected projects: PyDescr_NAME(), PyDescr_TYPE(), PyList_GET_ITEM() and
PyTuple_GET_ITEM().

Statistics

In total (projects on PyPI and not on PyPI), 34 projects are known to be
affected by this PEP:

-   16 projects (47%) are already fixed
-   18 projects (53%) are not fixed yet (pending fix or have to
    regenerate their Cython code)

On September 1, 2022, the PEP affects 18 projects (0.4%) of the top 5000
PyPI projects:

-   15 projects (0.3%) have to regenerate their Cython code
-   3 projects (0.1%) have a pending fix

Top 5000 PyPI

Projects with a pending fix (3):

-   datatable (1.0.0): fixed
-   guppy3 (3.1.2): fixed
-   scipy (1.9.3): need to update boost python

Moreover, 15 projects have to regenerate their Cython code.

Projects released with a fix (12):

-   bitarray (1.6.2): commit
-   Cython (0.29.20): commit
-   immutables (0.15): commit
-   mercurial (5.7): commit, bug report
-   mypy (v0.930): commit
-   numpy (1.22.1): commit, commit 2
-   pycurl (7.44.1): commit
-   PyGObject (3.42.0)
-   pyside2 (5.15.1): bug report
-   python-snappy (0.6.1): fixed
-   recordclass (0.17.2): fixed
-   zstd (1.5.0.3): commit

There are also two backport projects which are affected by this PEP:

-   pickle5 (0.0.12): backport for Python <= 3.7
-   pysha3 (1.0.2): backport for Python <= 3.5

They must not be used and cannot be used on Python 3.11.

Other affected projects

Other projects released with a fix (4):

-   boost (1.78.0): commit
-   breezy (3.2.1): bug report
-   duplicity (0.8.18): commit
-   gobject-introspection (1.70.0): MR

Relationship with the HPy project

The HPy project

The hope with the HPy project is to provide a C API that is close to the
original API—to make porting easy—and have it perform as close to the
existing API as possible. At the same time, HPy is sufficiently removed
to be a good "C extension API" (as opposed to a stable subset of the
CPython implementation API) that does not leak implementation details.
To ensure this latter property, the HPy project tries to develop
everything in parallel for CPython, PyPy, and GraalVM Python.

HPy is still evolving very fast. Issues are still being solved while
migrating NumPy, and work has begun on adding support for HPy to Cython.
Work on pybind11 is starting soon. Tim Felgentreff believes by the time
HPy has these users of the existing C API working, HPy should be in a
state where it is generally useful and can be deemed stable enough that
further development can follow a more stable process.

In the long run the HPy project would like to become a promoted API to
write Python C extensions.

The HPy project is a good solution for the long term. It has the
advantage of being developed outside Python and it doesn't require any C
API change.

The C API is here is stay for a few more years

The first concern about HPy is that right now, HPy is not mature nor
widely used, and CPython still has to continue supporting a large amount
of C extensions which are not likely to be ported to HPy soon.

The second concern is the inability to evolve CPython internals to
implement new optimizations, and the inefficient implementation of the
current C API in PyPy, GraalPython, etc. Sadly, HPy will only solve
these problems when most C extensions will be fully ported to HPy: when
it will become reasonable to consider dropping the "legacy" Python C
API.

While porting a C extension to HPy can be done incrementally on CPython,
it requires to modify a lot of code and takes time. Porting most C
extensions to HPy is expected to take a few years.

This PEP proposes to make the C API "less bad" by fixing one problem
which is clearily identified as causing practical issues: macros used as
l-values. This PEP only requires updating a minority of C extensions,
and usually only a few lines need to be changed in impacted extensions.

For example, NumPy 1.22 is made of 307,300 lines of C code, and adapting
NumPy to the this PEP only modified 11 lines (use Py_SET_TYPE and
Py_SET_SIZE) and adding 4 lines (to define Py_SET_TYPE and Py_SET_SIZE
for Python 3.8 and older). The beginnings of the NumPy port to HPy
already required modifying more lines than that.

Right now, it's hard to bet which approach is the best: fixing the
current C API, or focusing on HPy. It would be risky to only focus on
HPy.

Rejected Idea: Leave the macros as they are

The documentation of each function can discourage developers to use
macros to modify Python objects.

If these is a need to make an assignment, a setter function can be added
and the macro documentation can require to use the setter function. For
example, a Py_SET_TYPE() function has been added to Python 3.9 and the
Py_TYPE() documentation now requires to use the Py_SET_TYPE() function
to set an object type.

If developers use macros as an l-value, it's their responsibility when
their code breaks, not Python's responsibility. We are operating under
the consenting adults principle: we expect users of the Python C API to
use it as documented and expect them to take care of the fallout, if
things break when they don't.

This idea was rejected because only few developers read the
documentation, and only a minority is tracking changes of the Python C
API documentation. The majority of developers are only using CPython and
so are not aware of compatibility issues with other Python
implementations.

Moreover, continuing to allow using macros as an l-value does not help
the HPy project, and leaves the burden of emulating them on GraalVM's
Python implementation.

Macros already modified

The following C API macros have already been modified to disallow using
them as l-value:

-   PyCell_SET()
-   PyList_SET_ITEM()
-   PyTuple_SET_ITEM()
-   Py_REFCNT() (Python 3.10): Py_SET_REFCNT() must be used
-   _PyGCHead_SET_FINALIZED()
-   _PyGCHead_SET_NEXT()
-   asdl_seq_GET()
-   asdl_seq_GET_UNTYPED()
-   asdl_seq_LEN()
-   asdl_seq_SET()
-   asdl_seq_SET_UNTYPED()

For example, PyList_SET_ITEM(list, 0, item) < 0 now fails with a
compiler error as expected.

Post History

-   PEP 674 "Disallow using macros as l-values" and Python 3.11 (August
    18, 2022)
-   SC reply to PEP 674 -- Disallow using macros as l-values (February
    22, 2022)
-   PEP 674: Disallow using macros as l-value (version 2) (Jan 18, 2022)
-   PEP 674: Disallow using macros as l-value (Nov 30, 2021)

References

-   Python C API: Add functions to access PyObject (October
    2021) article by Victor Stinner
-   [capi-sig] Py_TYPE() and Py_SIZE() become static inline functions
    (September 2021)
-   [C API] Avoid accessing PyObject and PyVarObject members directly:
    add Py_SET_TYPE() and Py_IS_TYPE(), disallow Py_TYPE(obj)=type
    (February 2020)
-   bpo-30459: PyList_SET_ITEM could be safer (May 2017)

Version History

-   Version 3: No longer change PyDescr_TYPE() and PyDescr_NAME() macros
-   Version 2: Add "Relationship with the HPy project" section, remove
    the PyPy section
-   Version 1: First public version

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.