PEP: 683 Title: Immortal Objects, Using a Fixed Refcount Author: Eric
Snow <ericsnowcurrently@gmail.com>, Eddie Elizondo
<eduardo.elizondorueda@gmail.com> Discussions-To:
https://discuss.python.org/t/18183 Status: Final Type: Standards Track
Created: 10-Feb-2022 Python-Version: 3.12 Post-History: 16-Feb-2022,
19-Feb-2022, 28-Feb-2022, 12-Aug-2022, Resolution:
https://discuss.python.org/t/18183/26

reference count

PEP Acceptance Conditions

The PEP was accepted with conditions:

-   the primary proposal in Solutions for Accidental De-Immortalization
    (reset the immortal refcount in tp_dealloc()) was applied
-   types without this were not immortalized (in CPython's code)
-   the PEP was updated with final benchmark results once the
    implementation is finalized (confirming the change is worthwhile)

Abstract

Currently the CPython runtime maintains a small amount of mutable state
in the allocated memory of each object. Because of this, otherwise
immutable objects are actually mutable. This can have a large negative
impact on CPU and memory performance, especially for approaches to
increasing Python's scalability.

This proposal mandates that, internally, CPython will support marking an
object as one for which that runtime state will no longer change.
Consequently, such an object's refcount will never reach 0, and thus the
object will never be cleaned up (except when the runtime knows it's safe
to do so, like during runtime finalization). We call these objects
"immortal". (Normally, only a relatively small number of internal
objects will ever be immortal.) The fundamental improvement here is that
now an object can be truly immutable.

Scope

Object immortality is meant to be an internal-only feature, so this
proposal does not include any changes to public API or behavior (with
one exception). As usual, we may still add some private (yet publicly
accessible) API to do things like immortalize an object or tell if one
is immortal. Any effort to expose this feature to users would need to be
proposed separately.

There is one exception to "no change in behavior": refcounting semantics
for immortal objects will differ in some cases from user expectations.
This exception, and the solution, are discussed below.

Most of this PEP focuses on an internal implementation that satisfies
the above mandate. However, those implementation details are not meant
to be strictly proscriptive. Instead, at the least they are included to
help illustrate the technical considerations required by the mandate.
The actual implementation may deviate somewhat as long as it satisfies
the constraints outlined below. Furthermore, the acceptability of any
specific implementation detail described below does not depend on the
status of this PEP, unless explicitly specified.

For example, the particular details of:

-   how to mark something as immortal
-   how to recognize something as immortal
-   which subset of functionally immortal objects are marked as immortal
-   which memory-management activities are skipped or modified for
    immortal objects

are not only CPython-specific but are also private implementation
details that are expected to change in subsequent versions.

Implementation Summary

Here's a high-level look at the implementation:

If an object's refcount matches a very specific value (defined below)
then that object is treated as immortal. The CPython C-API and runtime
will not modify the refcount (or other runtime state) of an immortal
object. The runtime will now be explicitly responsible for deallocating
all immortal objects during finalization, unless statically allocated.
(See Object Cleanup below.)

Aside from the change to refcounting semantics, there is one other
possible negative impact to consider. The threshold for an "acceptable"
performance penalty for immortal objects is 2% (the consensus at the
2022 Language Summit). A naive implementation of the approach described
below makes CPython roughly 4% slower. However, the implementation is
~performance-neutral~ once known mitigations are applied.

TODO: Update the performance impact for the latest branch (both for GCC
and for clang).

Motivation

As noted above, currently all objects are effectively mutable. That
includes "immutable" objects like str instances. This is because every
object's refcount is frequently modified as the object is used during
execution. This is especially significant for a number of commonly used
global (builtin) objects, e.g. None. Such objects are used a lot, both
in Python code and internally. That adds up to a consistent high volume
of refcount changes.

The effective mutability of all Python objects has a concrete impact on
parts of the Python community, e.g. projects that aim for scalability
like Instagram or the effort to make the GIL per-interpreter. Below we
describe several ways in which refcount modification has a real negative
effect on such projects. None of that would happen for objects that are
truly immutable.

Reducing CPU Cache Invalidation

Every modification of a refcount causes the corresponding CPU cache line
to be invalidated. This has a number of effects.

For one, the write must be propagated to other cache levels and to main
memory. This has small effect on all Python programs. Immortal objects
would provide a slight relief in that regard.

On top of that, multi-core applications pay a price. If two threads
(running simultaneously on distinct cores) are interacting with the same
object (e.g. None) then they will end up invalidating each other's
caches with each incref and decref. This is true even for otherwise
immutable objects like True, 0, and str instances. CPython's GIL helps
reduce this effect, since only one thread runs at a time, but it doesn't
completely eliminate the penalty.

Avoiding Data Races

Speaking of multi-core, we are considering making the GIL a
per-interpreter lock, which would enable true multi-core parallelism.
Among other things, the GIL currently protects against races between
multiple concurrent threads that may incref or decref the same object.
Without a shared GIL, two running interpreters could not safely share
any objects, even otherwise immutable ones like None.

This means that, to have a per-interpreter GIL, each interpreter must
have its own copy of every object. That includes the singletons and
static types. We have a viable strategy for that but it will require a
meaningful amount of extra effort and extra complexity.

The alternative is to ensure that all shared objects are truly
immutable. There would be no races because there would be no
modification. This is something that the immortality proposed here would
enable for otherwise immutable objects. With immortal objects, support
for a per-interpreter GIL becomes much simpler.

Avoiding Copy-on-Write

For some applications it makes sense to get the application into a
desired initial state and then fork the process for each worker. This
can result in a large performance improvement, especially memory usage.
Several enterprise Python users (e.g. Instagram, YouTube) have taken
advantage of this. However, the above refcount semantics drastically
reduce the benefits and have led to some sub-optimal workarounds.

Also note that "fork" isn't the only operating system mechanism that
uses copy-on-write semantics. Another example is mmap. Any such utility
will potentially benefit from fewer copy-on-writes when immortal objects
are involved, when compared to using only "mortal" objects.

Rationale

The proposed solution is obvious enough that both of this proposal's
authors came to the same conclusion (and implementation, more or less)
independently. The Pyston project uses a similar approach. Other designs
were also considered. Several possibilities have also been discussed on
python-dev in past years.

Alternatives include:

-   use a high bit to mark "immortal" but do not change Py_INCREF()
-   add an explicit flag to objects
-   implement via the type (tp_dealloc() is a no-op)
-   track via the object's type object
-   track with a separate table

Each of the above makes objects immortal, but none of them address the
performance penalties from refcount modification described above.

In the case of per-interpreter GIL, the only realistic alternative is to
move all global objects into PyInterpreterState and add one or more
lookup functions to access them. Then we'd have to add some hacks to the
C-API to preserve compatibility for the may objects exposed there. The
story is much, much simpler with immortal objects.

Impact

Benefits

Most notably, the cases described in the above examples stand to benefit
greatly from immortal objects. Projects using pre-fork can drop their
workarounds. For the per-interpreter GIL project, immortal objects
greatly simplifies the solution for existing static types, as well as
objects exposed by the public C-API.

In general, a strong immutability guarantee for objects enables Python
applications to scale better, particularly in multi-process deployments.
This is because they can then leverage multi-core parallelism without
such a significant tradeoff in memory usage as they now have. The cases
we just described, as well as those described above in Motivation,
reflect this improvement.

Performance

A naive implementation shows a 2% slowdown (3% with MSVC). We have
demonstrated a return to ~performance-neutral~ with a handful of basic
mitigations applied. See the mitigations section below.

On the positive side, immortal objects save a significant amount of
memory when used with a pre-fork model. Also, immortal objects provide
opportunities for specialization in the eval loop that would improve
performance.

Backward Compatibility

Ideally this internal-only feature would be completely compatible.
However, it does involve a change to refcount semantics in some cases.
Only immortal objects are affected, but this includes high-use objects
like None, True, and False.

Specifically, when an immortal object is involved:

-   code that inspects the refcount will see a really, really large
    value
-   the new noop behavior may break code that:
    -   depends specifically on the refcount to always increment or
        decrement (or have a specific value from Py_SET_REFCNT())
    -   relies on any specific refcount value, other than 0 or 1
    -   directly manipulates the refcount to store extra information
        there
-   in 32-bit pre-3.12 Stable ABI extensions, objects may leak due to
    Accidental Immortality
-   such extensions may crash due to Accidental De-Immortalizing

Again, those changes in behavior only apply to immortal objects, not the
vast majority of objects a user will use. Furthermore, users cannot mark
an object as immortal so no user-created objects will ever have that
changed behavior. Users that rely on any of the changing behavior for
global (builtin) objects are already in trouble. So the overall impact
should be small.

Also note that code which checks for refleaks should keep working fine,
unless it checks for hard-coded small values relative to some immortal
object. The problems noticed by Pyston shouldn't apply here since we do
not modify the refcount.

See Public Refcount Details below for further discussion.

Accidental Immortality

Hypothetically, a non-immortal object could be incref'ed so much that it
reaches the magic value needed to be considered immortal. That means it
would never be decref'ed all the way back to 0, so it would accidentally
leak (never be cleaned up).

With 64-bit refcounts, this accidental scenario is so unlikely that we
need not worry. Even if done deliberately by using Py_INCREF() in a
tight loop and each iteration only took 1 CPU cycle, it would take 2^60
cycles (if the immortal bit were 2^60). At a fast 5 GHz that would still
take nearly 250,000,000 seconds (over 2,500 days)!

Also note that it is doubly unlikely to be a problem because it wouldn't
matter until the refcount would have gotten back to 0 and the object
cleaned up. So any object that hit that magic "immortal" refcount value
would have to be decref'ed that many times again before the change in
behavior would be noticed.

Again, the only realistic way that the magic refcount would be reached
(and then reversed) is if it were done deliberately. (Of course, the
same thing could be done efficiently using Py_SET_REFCNT() though that
would be even less of an accident.) At that point we don't consider it a
concern of this proposal.

On builds with much smaller maximum refcounts, like 32-bit platforms,
the consequences aren't so obvious. Let's say the magic refcount were
2^30. Using the same specs as above, it would take roughly 4 seconds to
accidentally immortalize an object. Under reasonable conditions, it is
still highly unlikely that an object be accidentally immortalized. It
would have to meet these criteria:

-   targeting a non-immortal object (so not one of the high-use
    builtins)
-   the extension increfs without a corresponding decref (e.g. returns
    from a function or method)
-   no other code decrefs the object in the meantime

Even at a much less frequent rate it would not take long to reach
accidental immortality (on 32-bit). However, then it would have to run
through the same number of (now noop-ing) decrefs before that one object
would be effectively leaking. This is highly unlikely, especially
because the calculations assume no decrefs.

Furthermore, this isn't all that different from how such 32-bit
extensions can already incref an object past 2^31 and turn the refcount
negative. If that were an actual problem then we would have heard about
it.

Between all of the above cases, the proposal doesn't consider accidental
immortality a problem.

Stable ABI

The implementation approach described in this PEP is compatible with
extensions compiled to the stable ABI (with the exception of Accidental
Immortality and Accidental De-Immortalizing). Due to the nature of the
stable ABI, unfortunately, such extensions use versions of Py_INCREF(),
etc. that directly modify the object's ob_refcnt field. This will
invalidate all the performance benefits of immortal objects.

However, we do ensure that immortal objects (mostly) stay immortal in
that situation. We set the initial refcount of immortal objects to a
value for which we can identify the object as immortal and which
continues to do so even if the refcount is modified by an extension.
(For example, suppose we used one of the high refcount bits to indicate
that an object was immortal. We would set the initial refcount to a
higher value that still matches the bit, like halfway to the next bit.
See _Py_IMMORTAL_REFCNT.) At worst, objects in that situation would feel
the effects described in the Motivation section. Even then the overall
impact is unlikely to be significant.

Accidental De-Immortalizing

32-bit builds of older stable ABI extensions can take Accidental
Immortality to the next level.

Hypothetically, such an extension could incref an object to a value on
the next highest bit above the magic refcount value. For example, if the
magic value were 2^30 and the initial immortal refcount were thus 2^30 +
2^29 then it would take 2^29 increfs by the extension to reach a value
of 2^31, making the object non-immortal. (Of course, a refcount that
high would probably already cause a crash, regardless of immortal
objects.)

The more problematic case is where such a 32-bit stable ABI extension
goes crazy decref'ing an already immortal object. Continuing with the
above example, it would take 2^29 asymmetric decrefs to drop below the
magic immortal refcount value. So an object like None could be made
mortal and subject to decref. That still wouldn't be a problem until
somehow the decrefs continue on that object until it reaches 0. For
statically allocated immortal objects, like None, the extension would
crash the process if it tried to dealloc the object. For any other
immortal objects, the dealloc might be okay. However, there might be
runtime code expecting the formerly-immortal object to be around
forever. That code would probably crash.

Again, the likelihood of this happening is extremely small, even on
32-bit builds. It would require roughly a billion decrefs on that one
object without a corresponding incref. The most likely scenario is the
following:

A "new" reference to None is returned by many functions and methods.
Unlike with non-immortal objects, the 3.12 runtime will basically never
incref None before giving it to the extension. However, the extension
will decref it when done with it (unless it returns it). Each time that
exchange happens with the one object, we get one step closer to a crash.

How realistic is it that some form of that exchange (with a single
object) will happen a billion times in the lifetime of a Python process
on 32-bit? If it is a problem, how could it be addressed?

As to how realistic, the answer isn't clear currently. However, the
mitigation is simple enough that we can safely proceed under the
assumption that it would not be a problem.

We look at possible solutions later on.

Alternate Python Implementations

This proposal is CPython-specific. However, it does relate to the
behavior of the C-API, which may affect other Python implementations.
Consequently, the effect of changed behavior described in Backward
Compatibility above also applies here (e.g. if another implementation is
tightly coupled to specific refcount values, other than 0, or on exactly
how refcounts change, then they may impacted).

Security Implications

This feature has no known impact on security.

Maintainability

This is not a complex feature so it should not cause much mental
overhead for maintainers. The basic implementation doesn't touch much
code so it should have much impact on maintainability. There may be some
extra complexity due to performance penalty mitigation. However, that
should be limited to where we immortalize all objects post-init and
later explicitly deallocate them during runtime finalization. The code
for this should be relatively concentrated.

Specification

The approach involves these fundamental changes:

-   add _Py_IMMORTAL_REFCNT (the magic value) to the internal C-API
-   update Py_INCREF() and Py_DECREF() to no-op for objects that match
    the magic refcount
-   do the same for any other API that modifies the refcount
-   stop modifying PyGC_Head for immortal GC objects ("containers")
-   ensure that all immortal objects are cleaned up during runtime
    finalization

Then setting any object's refcount to _Py_IMMORTAL_REFCNT makes it
immortal.

(There are other minor, internal changes which are not described here.)

In the following sub-sections we dive into the most significant details.
First we will cover some conceptual topics, followed by more concrete
aspects like specific affected APIs.

Public Refcount Details

In Backward Compatibility we introduced possible ways that user code
might be broken by the change in this proposal. Any contributing
misunderstanding by users is likely due in large part to the names of
the refcount-related API and to how the documentation explains those API
(and refcounting in general).

Between the names and the docs, we can clearly see answers to the
following questions:

-   what behavior do users expect?
-   what guarantees do we make?
-   do we indicate how to interpret the refcount value they receive?
-   what are the use cases under which a user would set an object's
    refcount to a specific value?
-   are users setting the refcount of objects they did not create?

As part of this proposal, we must make sure that users can clearly
understand on which parts of the refcount behavior they can rely and
which are considered implementation details. Specifically, they should
use the existing public refcount-related API and the only refcount
values with any meaning are 0 and 1. (Some code relies on 1 as an
indicator that the object can be safely modified.) All other values are
considered "not 0 or 1".

This information will be clarified in the documentation.

Arguably, the existing refcount-related API should be modified to
reflect what we want users to expect. Something like the following:

-   Py_INCREF() -> Py_ACQUIRE_REF() (or only support Py_NewRef())
-   Py_DECREF() -> Py_RELEASE_REF()
-   Py_REFCNT() -> Py_HAS_REFS()
-   Py_SET_REFCNT() -> Py_RESET_REFS() and Py_SET_NO_REFS()

However, such a change is not a part of this proposal. It is included
here to demonstrate the tighter focus for user expectations that would
benefit this change.

Constraints

-   ensure that otherwise immutable objects can be truly immutable
-   minimize performance penalty for normal Python use cases
-   be careful when immortalizing objects that we don't actually expect
    to persist until runtime finalization.
-   be careful when immortalizing objects that are not otherwise
    immutable
-   __del__ and weakrefs must continue working properly

Regarding "truly" immutable objects, this PEP doesn't impact the
effective immutability of any objects, other than the per-object runtime
state (e.g. refcount). So whether or not some immortal object is truly
(or even effectively) immutable can only be settled separately from this
proposal. For example, str objects are generally considered immutable,
but PyUnicodeObject holds some lazily cached data. This PEP has no
influence on how that state affects str immutability.

Immortal Mutable Objects

Any object can be marked as immortal. We do not propose any restrictions
or checks. However, in practice the value of making an object immortal
relates to its mutability and depends on the likelihood it would be used
for a sufficient portion of the application's lifetime. Marking a
mutable object as immortal can make sense in some situations.

Many of the use cases for immortal objects center on immutability, so
that threads can safely and efficiently share such objects without
locking. For this reason a mutable object, like a dict or list, would
never be shared (and thus no immortality). However, immortality may be
appropriate if there is sufficient guarantee that the normally mutable
object won't actually be modified.

On the other hand, some mutable objects will never be shared between
threads (at least not without a lock like the GIL). In some cases it may
be practical to make some of those immortal too. For example,
sys.modules is a per-interpreter dict that we do not expect to ever get
freed until the corresponding interpreter is finalized (assuming it
isn't replaced). By making it immortal, we would no longer incur the
extra overhead during incref/decref.

We explore this idea further in the mitigations section below.

Implicitly Immortal Objects

If an immortal object holds a reference to a normal (mortal) object then
that held object is effectively immortal. This is because that object's
refcount can never reach 0 until the immortal object releases it.

Examples:

-   containers like dict and list
-   objects that hold references internally like PyTypeObject with its
    tp_subclasses and tp_weaklist
-   an object's type (held in ob_type)

Such held objects are thus implicitly immortal for as long as they are
held. In practice, this should have no real consequences since it really
isn't a change in behavior. The only difference is that the immortal
object (holding the reference) doesn't ever get cleaned up.

We do not propose that such implicitly immortal objects be changed in
any way. They should not be explicitly marked as immortal just because
they are held by an immortal object. That would provide no advantage
over doing nothing.

Un-Immortalizing Objects

This proposal does not include any mechanism for taking an immortal
object and returning it to a "normal" condition. Currently there is no
need for such an ability.

On top of that, the obvious approach is to simply set the refcount to a
small value. However, at that point there is no way in knowing which
value would be safe. Ideally we'd set it to the value that it would have
been if it hadn't been made immortal. However, that value will have long
been lost. Hence the complexities involved make it less likely that an
object could safely be un-immortalized, even if we had a good reason to
do so.

_Py_IMMORTAL_REFCNT

We will add two internal constants:

    _Py_IMMORTAL_BIT - has the top-most available bit set (e.g. 2^62)
    _Py_IMMORTAL_REFCNT - has the two top-most available bits set

The actual top-most bit depends on existing uses for refcount bits, e.g.
the sign bit or some GC uses. We will use the highest bit possible after
consideration of existing uses.

The refcount for immortal objects will be set to _Py_IMMORTAL_REFCNT
(meaning the value will be halfway between _Py_IMMORTAL_BIT and the
value at the next highest bit). However, to check if an object is
immortal we will compare (bitwise-and) its refcount against just
_Py_IMMORTAL_BIT.

The difference means that an immortal object will still be considered
immortal, even if somehow its refcount were modified (e.g. by an older
stable ABI extension).

Note that top two bits of the refcount are already reserved for other
uses. That's why we are using the third top-most bit.

The implementation is also open to using other values for the immortal
bit, such as the sign bit or 2^31 (for saturated refcounts on 64-bit).

Affected API

API that will now ignore immortal objects:

-   (public) Py_INCREF()
-   (public) Py_DECREF()
-   (public) Py_SET_REFCNT()
-   (private) _Py_NewReference()

API that exposes refcounts (unchanged but may now return large values):

-   (public) Py_REFCNT()
-   (public) sys.getrefcount()

(Note that _Py_RefTotal, and consequently sys.gettotalrefcount(), will
not be affected.)

TODO: clarify the status of _Py_RefTotal.

Also, immortal objects will not participate in GC.

Immortal Global Objects

All runtime-global (builtin) objects will be made immortal. That
includes the following:

-   singletons (None, True, False, Ellipsis, NotImplemented)
-   all static types (e.g. PyLong_Type, PyExc_Exception)
-   all static objects in _PyRuntimeState.global_objects (e.g.
    identifiers, small ints)

The question of making the full objects actually immutable (e.g. for
per-interpreter GIL) is not in the scope of this PEP.

Object Cleanup

In order to clean up all immortal objects during runtime finalization,
we must keep track of them.

For GC objects ("containers") we'll leverage the GC's permanent
generation by pushing all immortalized containers there. During runtime
shutdown, the strategy will be to first let the runtime try to do its
best effort of deallocating these instances normally. Most of the module
deallocation will now be handled by pylifecycle.c:finalize_modules()
where we clean up the remaining modules as best as we can. It will
change which modules are available during __del__, but that's already
explicitly undefined behavior in the docs. Optionally, we could do some
topological ordering to guarantee that user modules will be deallocated
first before the stdlib modules. Finally, anything left over (if any)
can be found through the permanent generation GC list which we can clear
after finalize_modules() is done.

For non-container objects, the tracking approach will vary on a
case-by-case basis. In nearly every case, each such object is directly
accessible on the runtime state, e.g. in a _PyRuntimeState or
PyInterpreterState field. We may need to add a tracking mechanism to the
runtime state for a small number of objects.

None of the cleanup will have a significant effect on performance.

Performance Regression Mitigations

In the interest of clarity, here are some of the ways we are going to
try to recover some of the 4% performance we lose with the naive
implementation of immortal objects.

Note that none of this section is actually part of the proposal.

at the end of runtime init, mark all objects as immortal

We can apply the concept from Immortal Mutable Objects in the pursuit of
getting back some of that 4% performance we lose with the naive
implementation of immortal objects. At the end of runtime init we can
mark all objects as immortal and avoid the extra cost in incref/decref.
We only need to worry about immutability with objects that we plan on
sharing between threads without a GIL.

drop unnecessary hard-coded refcount operations

Parts of the C-API interact specifically with objects that we know to be
immortal, like Py_RETURN_NONE. Such functions and macros can be updated
to drop any refcount operations.

specialize for immortal objects in the eval loop

There are opportunities to optimize operations in the eval loop
involving specific known immortal objects (e.g. None). The general
mechanism is described in PEP 659. Also see Pyston.

other possibilities

-   mark every interned string as immortal
-   mark the "interned" dict as immortal if shared else share all
    interned strings
-   (Larry,MAL) mark all constants unmarshalled for a module as immortal
-   (Larry,MAL) allocate (immutable) immortal objects in their own
    memory page(s)
-   saturated refcounts using the 32 least-significant bits

Solutions for Accidental De-Immortalization

In the Accidental De-Immortalizing section we outlined a possible
negative consequence of immortal objects. Here we look at some of the
options to deal with that.

Note that we enumerate solutions here to illustrate that satisfactory
options are available, rather than to dictate how the problem will be
solved.

Also note the following:

-   this only matters in the 32-bit stable-ABI case
-   it only affects immortal objects
-   there are no user-defined immortal objects, only built-in types
-   most immortal objects will be statically allocated (and thus already
    must fail if tp_dealloc() is called)
-   only a handful of immortal objects will be used often enough to
    possibly face this problem in practice (e.g. None)
-   the main problem to solve is crashes coming from tp_dealloc()

One fundamental observation for a solution is that we can reset an
immortal object's refcount to _Py_IMMORTAL_REFCNT when some condition is
met.

With all that in mind, a simple, yet effective, solution would be to
reset an immortal object's refcount in tp_dealloc(). NoneType and bool
already have a tp_dealloc() that calls Py_FatalError() if triggered. The
same goes for other types based on certain conditions, like
PyUnicodeObject (depending on unicode_is_singleton()), PyTupleObject,
and PyTypeObject. In fact, the same check is important for all
statically declared object. For those types, we would instead reset the
refcount. For the remaining cases we would introduce the check. In all
cases, the overhead of the check in tp_dealloc() should be too small to
matter.

Other (less practical) solutions:

-   periodically reset the refcount for immortal objects
-   only do that for high-use objects
-   only do it if a stable-ABI extension has been imported
-   provide a runtime flag for disabling immortality

(The discussion thread has further detail.)

Regardless of the solution we end up with, we can do something else
later if necessary.

TODO: Add a note indicating that the implemented solution does not
affect the overall ~performance-neutral~ outcome.

Documentation

The immortal objects behavior and API are internal, implementation
details and will not be added to the documentation.

However, we will update the documentation to make public guarantees
about refcount behavior more clear. That includes, specifically:

-   Py_INCREF() - change "Increment the reference count for object o."
    to "Indicate taking a new reference to object o."
-   Py_DECREF() - change "Decrement the reference count for object o."
    to "Indicate no longer using a previously taken reference to object
    o."
-   similar for Py_XINCREF(), Py_XDECREF(), Py_NewRef(), Py_XNewRef(),
    Py_Clear()
-   Py_REFCNT() - add "The refcounts 0 and 1 have specific meanings and
    all others only mean code somewhere is using the object, regardless
    of the value. 0 means the object is not used and will be cleaned up.
    1 means code holds exactly a single reference."
-   Py_SET_REFCNT() - refer to Py_REFCNT() about how values over 1 may
    be substituted with some over value

We may also add a note about immortal objects to the following, to help
reduce any surprise users may have with the change:

-   Py_SET_REFCNT() (a no-op for immortal objects)
-   Py_REFCNT() (value may be surprisingly large)
-   sys.getrefcount() (value may be surprisingly large)

Other API that might benefit from such notes are currently undocumented.
We wouldn't add such a note anywhere else (including for Py_INCREF() and
Py_DECREF()) since the feature is otherwise transparent to users.

Reference Implementation

The implementation is proposed on GitHub:

https://github.com/python/cpython/pull/19474

Open Issues

-   how realistic is the Accidental De-Immortalizing concern?

References

Prior Art

-   Pyston

Discussions

This was discussed in December 2021 on python-dev:

-   https://mail.python.org/archives/list/python-dev@python.org/thread/7O3FUA52QGTVDC6MDAV5WXKNFEDRK5D6/#TBTHSOI2XRWRO6WQOLUW3X7S5DUXFAOV
-   https://mail.python.org/archives/list/python-dev@python.org/thread/PNLBJBNIQDMG2YYGPBCTGOKOAVXRBJWY

Runtime Object State

Here is the internal state that the CPython runtime keeps for each
Python object:

-   PyObject.ob_refcnt: the object's refcount
-   _PyGC_Head: (optional) the object's node in a list of "GC" objects
-   _PyObject_HEAD_EXTRA: (optional) the object's node in the list of
    heap objects

ob_refcnt is part of the memory allocated for every object. However,
_PyObject_HEAD_EXTRA is allocated only if CPython was built with
Py_TRACE_REFS defined. PyGC_Head is allocated only if the object's type
has Py_TPFLAGS_HAVE_GC set. Typically this is only container types (e.g.
list). Also note that PyObject.ob_refcnt and _PyObject_HEAD_EXTRA are
part of PyObject_HEAD.

Reference Counting, with Cyclic Garbage Collection

Garbage collection is a memory management feature of some programming
languages. It means objects are cleaned up (e.g. memory freed) once they
are no longer used.

Refcounting is one approach to garbage collection. The language runtime
tracks how many references are held to an object. When code takes
ownership of a reference to an object or releases it, the runtime is
notified and it increments or decrements the refcount accordingly. When
the refcount reaches 0, the runtime cleans up the object.

With CPython, code must explicitly take or release references using the
C-API's Py_INCREF() and Py_DECREF(). These macros happen to directly
modify the object's refcount (unfortunately, since that causes ABI
compatibility issues if we want to change our garbage collection
scheme). Also, when an object is cleaned up in CPython, it also releases
any references (and resources) it owns (before it's memory is freed).

Sometimes objects may be involved in reference cycles, e.g. where object
A holds a reference to object B and object B holds a reference to object
A. Consequently, neither object would ever be cleaned up even if no
other references were held (i.e. a memory leak). The most common objects
involved in cycles are containers.

CPython has dedicated machinery to deal with reference cycles, which we
call the "cyclic garbage collector", or often just "garbage collector"
or "GC". Don't let the name confuse you. It only deals with breaking
reference cycles.

See the docs for a more detailed explanation of refcounting and cyclic
garbage collection:

-   https://docs.python.org/3.11/c-api/intro.html#reference-counts
-   https://docs.python.org/3.11/c-api/refcounting.html
-   https://docs.python.org/3.11/c-api/typeobj.html#c.PyObject.ob_refcnt
-   https://docs.python.org/3.11/c-api/gcsupport.html

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.