PEP 720 – Cross-compiling Python packages
- Author:
- Filipe Laíns <lains at riseup.net>
- PEP-Delegate:
- Status:
- Draft
- Type:
- Informational
- Created:
- 01-Jul-2023
- Python-Version:
- 3.12
Abstract
This PEP attempts to document the status of cross-compilation of downstream projects.
It should give an overview of the approaches currently used by distributors (Linux distros, WASM environment providers, etc.) to cross-compile downstream projects (3rd party extensions, etc.).
Motivation
We write this PEP to express the challenges in cross-compilation and act as a supporting document in future improvement proposals.
Analysis
Introduction
There are a couple of different approaches being used to tackle this, with different levels of interaction required from the user, but they all require a significant amount of effort. This is due to the lack of standardized cross-compilation infrastructure in the Python packaging ecosystem, which itself stems from the complexity of cross-builds, making it a huge undertaking.
Upstream support
Some major projects like CPython, setuptools, etc. provide some support to help with cross-compilation, but it’s unofficial and on a best-effort basis. For example, the sysconfig module allows overwriting the data module name via the _PYTHON_SYSCONFIGDATA_NAME environment variable, something that is required for cross-builds, and setuptools accepts patches [1] to tweak/fix its logic to be compatible with popular “environment faking” workflows [2].
The lack of first-party support in upstream projects leads to cross-compilation being fragile and requiring a significant effort from users, but at the same time, the lack of standardization makes it harder for upstreams to improve support as there’s no clarity on how this feature should be provided.
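As a rough illustration of the sysconfig override mentioned above, the sketch below writes a stand-in data module and asks a child interpreter for a configuration variable with _PYTHON_SYSCONFIGDATA_NAME pointing at it. The module name, the values inside it, and the helper function are hypothetical. A real cross-build would use the data module generated by the target Python build. The override is only expected to take effect on POSIX builds of CPython.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical target values; a real data module comes from the target build.
FAKE_SYSCONFIGDATA = textwrap.dedent("""\
    build_time_vars = {
        'SOABI': 'cpython-311-aarch64-linux-gnu',
        'EXT_SUFFIX': '.cpython-311-aarch64-linux-gnu.so',
    }
""")


def query_target_config_var(name: str) -> str:
    """Print a config variable from a child interpreter using the fake data."""
    module_name = '_sysconfigdata_fake_target'  # hypothetical module name
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, module_name + '.py'), 'w') as f:
            f.write(FAKE_SYSCONFIGDATA)
        # The child reads this variable when sysconfig loads its data module.
        env = dict(os.environ, _PYTHON_SYSCONFIGDATA_NAME=module_name)
        code = (
            'import sys; sys.path.insert(0, sys.argv[1]); '
            'import sysconfig; print(sysconfig.get_config_var(sys.argv[2]))'
        )
        # -S skips site, so sysconfig is first imported after sys.path is set.
        result = subprocess.run(
            [sys.executable, '-S', '-c', code, tmp, name],
            env=env, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
```

Note that only the sysconfig data changes here. Everything else the child interpreter reports (sys.platform, importlib.machinery.EXTENSION_SUFFIXES, etc.) still describes the build machine.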
Projects with decent cross-build support
It seems relevant to point out that there are a few modern Python package build-backends with, at least, decent cross-compilation support, those being scikit-build and meson-python. Both these projects integrate external mature build-systems into Python packaging — CMake and Meson, respectively — so cross-build support is inherited from them.
Downstream approaches
Cross-compilation approaches fall on a spectrum that goes from, by design, requiring extensive user interaction to (ideally) almost none. Usually, they’ll be based on one of two main strategies: using a cross-build environment, or faking the target environment.
Cross-build environment
This consists of running the Python interpreter normally and utilizing the cross-build support provided by the projects’ build-system. However, as we saw above, upstream support is lacking, so this approach only works for a small-ish set of projects. When this fails, the usual strategy is to patch the build-system code to use the correct toolchain, system details, etc. [3].
Since this approach often requires package-specific patching, it requires a lot of user interaction.
Examples
python-for-android, kivy-ios, etc.
Faking the target environment
Aiming to drop the requirement for user input, a popular approach is trying to fake the target environment. It generally consists of monkeypatching the Python interpreter to get it to mimic the interpreter on the target system, which consists of changing many of the sys module attributes, the sysconfig data, etc. Using this strategy, build-backends do not need to have any cross-build support, and should just work without any code changes.
Unfortunately, though, it isn’t possible to truly fake the target environment. There are many reasons for this, one of the main ones being that it breaks code that actually needs to introspect the running interpreter. As a result, monkeypatching Python to look like the target is very tricky: to achieve the least amount of breakage, we can only patch certain aspects of the interpreter. Consequently, build-backends may need some code changes, but these are generally much smaller than in the previous approach. This is an inherent limitation of the technique, meaning this strategy still requires some user interaction.
Nonetheless, this strategy still works out-of-the-box with significantly more projects than the approach above, and requires much less effort in these cases. It is successful in decreasing the amount of user interaction needed, even though it doesn’t succeed in being generic.
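The following is a toy sketch of the monkeypatching idea, not a description of how crossenv or any other tool implements it. The fake_target helper and the values passed to it are hypothetical. It patches only sys.platform and sysconfig.get_config_var, and falls back to the build machine's values for everything else.

```python
import sys
import sysconfig
from contextlib import contextmanager
from unittest import mock


@contextmanager
def fake_target(platform: str, config_vars: dict):
    """Temporarily make two introspection points report target values."""
    real_get_config_var = sysconfig.get_config_var

    def fake_get_config_var(name):
        # Anything not overridden still comes from the build machine.
        if name in config_vars:
            return config_vars[name]
        return real_get_config_var(name)

    with mock.patch.object(sys, 'platform', platform), \
            mock.patch.object(sysconfig, 'get_config_var', fake_get_config_var):
        yield


# Hypothetical usage: pretend to be an aarch64 Linux target.
with fake_target('linux', {'EXT_SUFFIX': '.cpython-311-aarch64-linux-gnu.so'}):
    suffix = sysconfig.get_config_var('EXT_SUFFIX')
```

The limitation described above is visible even in this small sketch: code running inside the with block that genuinely needs the build machine's sys.platform would now get the wrong answer.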
Examples
crossenv, conda-forge, etc.
Environment introspection
As explained above, most build system code is written with the assumption that the target system is the same as where the build is occurring, so introspection is usually used to guide the build.
In this section, we try to document most of the ways this is accomplished. It should give a decent overview of the environment details that are required by build systems.
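To make the kind of introspection catalogued in this section concrete, here is a minimal sketch that gathers a handful of these values from the running interpreter. The snapshot_interpreter function and the selection of keys are illustrative assumptions, not an interface used by any particular build system.

```python
import importlib.machinery
import sys
import sysconfig


def snapshot_interpreter() -> dict:
    """Collect a few values that build systems commonly introspect."""
    return {
        'extension_suffixes': list(importlib.machinery.EXTENSION_SUFFIXES),
        'platform': sys.platform,
        'byteorder': sys.byteorder,
        'maxsize': sys.maxsize,
        'version_info': tuple(sys.version_info),
        'implementation': sys.implementation.name,
        # Some attributes are missing on some platforms, so read defensively.
        'abiflags': getattr(sys, 'abiflags', None),
        # These may be None where the variable is not defined.
        'soabi': sysconfig.get_config_var('SOABI'),
        'ext_suffix': sysconfig.get_config_var('EXT_SUFFIX'),
    }
```

Every value in the resulting dict describes the interpreter running the build. In a cross-build, each of them is a potential source of incorrect assumptions about the target.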
Snippet | Description | Variance |
---|---|---|
>>> importlib.machinery.EXTENSION_SUFFIXES
[
'.cpython-311-x86_64-linux-gnu.so',
'.abi3.so',
'.so',
]
|
Extension (native module) suffixes supported by this interpreter. | This is implementation-defined, but it usually differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists. |
>>> importlib.machinery.SOURCE_SUFFIXES
['.py']
|
Source (pure-Python) suffixes supported by this interpreter. | This is implementation-defined, but it usually doesn’t differ (outside exotic implementations or systems). |
>>> importlib.machinery.all_suffixes()
[
'.py',
'.pyc',
'.cpython-311-x86_64-linux-gnu.so',
'.abi3.so',
'.so',
]
|
All module file suffixes supported by this interpreter. It should be the
union of all importlib.machinery.*_SUFFIXES attributes. |
This is implementation-defined, but it usually differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists. See the entries above for more information. |
>>> sys.abiflags
''
|
ABI flags, as specified in PEP 3149. | Differs based on the build configuration. |
>>> sys.api_version
1013
|
C API version. | Differs based on the Python installation. |
>>> sys.base_prefix
/usr
|
Prefix of the installation-wide directories where platform independent files are installed. | Differs based on the platform, and installation. |
>>> sys.base_exec_prefix
/usr
|
Prefix of the installation-wide directories where platform dependent files are installed. | Differs based on the platform, and installation. |
>>> sys.byteorder
'little'
|
Native byte order. | Differs based on the platform. |
>>> sys.builtin_module_names
('_abc', '_ast', '_codecs', ...)
|
Names of all modules that are compiled into the Python interpreter. | Differs based on the platform, system architecture, and build configuration. |
>>> sys.exec_prefix
/usr
|
Prefix of the site-specific directories where platform dependent files are installed. Because it concerns the site-specific directories, in standard virtual environment implementations, it will be a virtual-environment-specific path. | Differs based on the platform, installation, and environment. |
>>> sys.executable
'/usr/bin/python'
|
Path of the Python interpreter being used. | Differs based on the installation. |
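>>> with open(sys.executable, 'rb') as f:
...     header = f.read(4)
...     if is_elf := (header == b'\x7fELF'):
...         # EI_CLASS is a raw byte: 1 means 32-bit, 2 means 64-bit.
...         elf_class = f.read(1)[0]
...         size = {1: 52, 2: 64}.get(elf_class)
...         # Read the rest of the header (5 bytes were already consumed).
...         elf_header = f.read(size - 5)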
|
Whether the Python interpreter is an ELF file, and the ELF header. This approach is sometimes used to identify the target architecture of the installation (example). | Differs based on the installation. |
>>> sys.float_info
sys.float_info(
max=1.7976931348623157e+308,
max_exp=1024,
max_10_exp=308,
min=2.2250738585072014e-308,
min_exp=-1021,
min_10_exp=-307,
dig=15,
mant_dig=53,
epsilon=2.220446049250313e-16,
radix=2,
rounds=1,
)
|
Low level information about the float type, as defined by float.h . |
Differs based on the architecture, and platform. |
>>> sys.getandroidapilevel()
21
|
Integer representing the Android API level. | Differs based on the platform. |
>>> sys.getwindowsversion()
sys.getwindowsversion(
major=10,
minor=0,
build=19045,
platform=2,
service_pack='',
)
|
Windows version of the system. | Differs based on the platform. |
>>> sys.hexversion
0x30b03f0
|
Python version encoded as an integer. | Differs based on the Python language version. |
>>> sys.implementation
namespace(
name='cpython',
cache_tag='cpython-311',
version=sys.version_info(
major=3,
minor=11,
micro=3,
releaselevel='final',
serial=0,
),
hexversion=0x30b03f0,
_multiarch='x86_64-linux-gnu',
)
|
Interpreter implementation details. | Differs based on the interpreter implementation, Python language version, and implementation version — if one exists. It may also include architecture-dependent information, so it may also differ based on the system architecture. |
>>> sys.int_info
sys.int_info(
bits_per_digit=30,
sizeof_digit=4,
default_max_str_digits=4300,
str_digits_check_threshold=640,
)
|
Low level information about Python’s internal integer representation. | Differs based on the architecture, platform, implementation, build, and runtime flags. |
>>> sys.maxsize
0x7fffffffffffffff
|
Maximum value a variable of type Py_ssize_t can take. |
Differs based on the architecture, platform, and implementation. |
>>> sys.maxunicode
0x10ffff
|
Value of the largest Unicode code point. | Differs based on the implementation, and on Python versions older than 3.3, the build. |
>>> sys.platform
linux
|
Platform identifier. | Differs based on the platform. |
>>> sys.prefix
/usr
|
Prefix of the site-specific directories where platform independent files are installed. Because it concerns the site-specific directories, in standard virtual environment implementations, it will be a virtual-environment-specific path. | Differs based on the platform, installation, and environment. |
>>> sys.platlibdir
lib
|
Platform-specific library directory. | Differs based on the platform, and vendor. |
>>> sys.version_info
sys.version_info(
major=3,
minor=11,
micro=3,
releaselevel='final',
serial=0,
)
|
Python language version implemented by the interpreter. | Differs if the target Python version is not the same [4]. |
>>> sys.thread_info
sys.thread_info(
name='pthread',
lock='semaphore',
version='NPTL 2.37',
)
|
Information about the thread implementation. | Differs based on the platform, and implementation. |
>>> sys.winver
3.8-32
|
Version number used to form Windows registry keys. | Differs based on the platform, and implementation. |
>>> sysconfig.get_config_vars()
{ ... }
>>> sysconfig.get_config_var(...)
...
|
Python distribution configuration variables. It includes a set of variables [5] (like prefix, exec_prefix, etc.) based on the running context [6], and may include some extra variables based on the Python implementation and system. In CPython and most other implementations that use the same build-system, the “extra” variables mentioned above are: on POSIX, all variables from the |
This is implementation-defined, but it usually differs between non-identical builds. Please refer to the sysconfig configuration variables table for an overview of the different configuration variables that are usually present. |
[...] sys.prefix and other attributes accordingly.
[...] Makefile, and instead use the Visual Studio build system. A subset of the most relevant Makefile variables is provided to make user code that uses them simpler.
CPython (and similar)
Name | Example Value | Description | Variance |
---|---|---|---|
SOABI |
cpython-311-x86_64-linux-gnu |
ABI string — defined by PEP 3149. | Differs based on the implementation, system architecture, Python language version, and implementation version — if one exists. |
SHLIB_SUFFIX |
.so |
Shared library suffix. | Differs based on the platform. |
EXT_SUFFIX |
.cpython-311-x86_64-linux-gnu.so |
Interpreter-specific Python extension (native module) suffix, generally defined as .{SOABI}{SHLIB_SUFFIX}. |
Differs based on the implementation, system architecture, Python language version, and implementation version — if one exists. |
LDLIBRARY |
libpython3.11.so |
Shared libpython library name, if available. If unavailable [8], the variable will be empty; if available, the library should be located in LIBDIR. |
Differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists. |
PY3LIBRARY |
libpython3.so |
Shared Python 3 only (major version bound only) [9] libpython library name, if available. If unavailable [8], the variable will be empty; if available, the library should be located in LIBDIR. |
Differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists. |
LIBRARY |
libpython3.11.a |
Static libpython library name, if available. If unavailable [8], the variable will be empty; if available, the library should be located in LIBDIR. |
Differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists. |
Py_DEBUG |
0 |
Whether this is a debug build. | Differs based on the build configuration. |
WITH_PYMALLOC |
1 |
Whether this build has pymalloc support. | Differs based on the build configuration. |
Py_TRACE_REFS |
0 |
Whether reference tracing (debug build only) is enabled. | Differs based on the build configuration. |
Py_UNICODE_SIZE |
Size of the Py_UNICODE object, in bytes. This variable is only
present in CPython versions older than 3.3, and was commonly used to
detect if the build uses UCS2 or UCS4 for unicode objects — before
PEP 393. |
Differs based on the build configuration. | |
Py_ENABLE_SHARED |
1 |
Whether a shared libpython is available. |
Differs based on the build configuration. |
PY_ENABLE_SHARED |
1 |
Whether a shared libpython is available. |
Differs based on the build configuration. |
CC |
gcc |
The C compiler used to build the Python distribution. | Differs based on the build configuration. |
CXX |
g++ |
The C++ compiler used to build the Python distribution. | Differs based on the build configuration. |
CFLAGS |
-DNDEBUG -g -fwrapv ... |
The C compiler flags used to build the Python distribution. | Differs based on the build configuration. |
py_version |
3.11.3 |
Full form of the Python version. | Differs based on the Python language version. |
py_version_short |
3.11 |
Custom form of the Python version, containing only the major and minor numbers. | Differs based on the Python language version. |
py_version_nodot |
311 |
Custom form of the Python version, containing only the major and minor numbers, and no dots. | Differs based on the Python language version. |
prefix |
/usr |
Same as sys.prefix , please refer to the entry in table above. |
Differs based on the platform, installation, and environment. |
base |
/usr |
Same as sys.prefix , please refer to the entry in table above. |
Differs based on the platform, installation, and environment. |
exec_prefix |
/usr |
Same as sys.exec_prefix , please refer to the entry in table above. |
Differs based on the platform, installation, and environment. |
platbase |
/usr |
Same as sys.exec_prefix , please refer to the entry in table above. |
Differs based on the platform, installation, and environment. |
installed_base |
/usr |
Same as sys.base_prefix , please refer to the entry in table above. |
Differs based on the platform, and installation. |
installed_platbase |
/usr |
Same as sys.base_exec_prefix , please refer to the entry in table
above. |
Differs based on the platform, and installation. |
platlibdir |
lib |
Same as sys.platlibdir , please refer to the entry in table above. |
Differs based on the platform, and vendor. |
SIZEOF_* |
4 |
Size of a certain C type (double , float , etc.). |
Differs based on the system architecture, and build details. |
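As a small sketch of how the SOABI, SHLIB_SUFFIX and EXT_SUFFIX entries in the table relate on CPython-like POSIX builds, the helper below composes an extension suffix from the first two. The compose_ext_suffix name is hypothetical, and the result is not guaranteed to match EXT_SUFFIX on every implementation or platform; on builds where either variable is undefined, it returns None.

```python
import sysconfig


def compose_ext_suffix():
    """Build '.{SOABI}{SHLIB_SUFFIX}' from the running interpreter's data."""
    soabi = sysconfig.get_config_var('SOABI')
    shlib_suffix = sysconfig.get_config_var('SHLIB_SUFFIX')
    if not soabi or not shlib_suffix:
        # One of the variables is not defined for this build.
        return None
    # SHLIB_SUFFIX already carries its leading dot (e.g. '.so').
    return f'.{soabi}{shlib_suffix}'


composed = compose_ext_suffix()
reported = sysconfig.get_config_var('EXT_SUFFIX')
```

Comparing composed with reported is a quick way to see whether a given build follows the usual convention. In a cross-build, both values would need to come from the target's configuration data rather than from the running interpreter.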
[...] libpython support, respectively.
[...] libpython library that users of the stable ABI should link against, if they need to link against libpython.
Relevant Information
There are some bits of information required by build systems (e.g. platform particularities) scattered across many places, and it is often difficult to identify code with assumptions based on them. In this section, we try to document the most relevant cases.
When should extensions be linked against libpython?
- Short answer
- Yes, on Windows. No on POSIX platforms, except Android, Cygwin, and other Windows-based POSIX-like platforms.
When building extensions for dynamic loading, depending on the target platform, they may need to be linked against libpython.
On Windows, extensions need to link against libpython, because all symbols must be resolvable at link time. POSIX-like platforms based on Windows (like Cygwin, MinGW, or MSYS) will also require linking against libpython.
On most POSIX platforms, it is not necessary to link against libpython, as the symbols will already be available due to the interpreter (or, when embedding, the executable/library in question) already linking to libpython. Not linking an extension module against libpython will allow it to be loaded by static Python builds, so when possible, it is desirable to do so (see GH-65735).
This might not be the case on all POSIX platforms, so make sure you check. One example is Android, where only the main executable and LD_PRELOAD entries are considered to be RTLD_GLOBAL (meaning dependencies are RTLD_LOCAL) [10], which causes the libpython symbols to be unavailable when loading the extension.
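The rules above can be condensed into a small decision helper. This is only a sketch: the should_link_libpython function is hypothetical, it assumes the target is identified by a sys.platform-style string, and the 'msys' value is an assumption about how such an environment identifies itself. Older interpreters may report 'linux' on Android, so real code would need another signal there.

```python
def should_link_libpython(target_platform: str) -> bool:
    """Guess whether extensions for the target must link against libpython.

    ``target_platform`` is assumed to look like the *target* interpreter's
    ``sys.platform`` value, e.g. 'win32', 'linux', 'android' or 'cygwin'.
    """
    # Windows and Windows-based POSIX-like platforms resolve symbols at
    # link time, so libpython has to be on the link line.
    if target_platform in ('win32', 'cygwin', 'msys'):
        return True
    # On Android, dependencies of the main executable are RTLD_LOCAL, so the
    # libpython symbols are not visible to extensions unless they link it.
    if target_platform == 'android':
        return True
    # Most other POSIX platforms find the symbols in the running process.
    return False
```

The function takes the target platform as an argument instead of reading sys.platform, because in a cross-build the running interpreter describes the build machine.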
What are prefix, exec_prefix, base_prefix, and base_exec_prefix?
These are sys attributes set in the Python initialization that describe the running environment. They refer to the prefix of directories where installation/environment files are installed, according to the table below.
Name | Target files | Environment Scope |
---|---|---|
prefix |
platform independent (eg. pure Python) | site-specific |
exec_prefix |
platform dependent (eg. native code) | site-specific |
base_prefix |
platform independent (eg. pure Python) | installation-wide |
base_exec_prefix |
platform dependent (eg. native code) | installation-wide |
Because the site-specific prefixes will be different inside virtual environments, checking sys.prefix != sys.base_prefix is commonly used to check if we are in a virtual environment.
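A minimal sketch of that check follows. The in_virtual_environment name is only illustrative.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the site-specific prefix differs from the base one."""
    return sys.prefix != sys.base_prefix
```

Like the other introspection shown in this document, this describes the interpreter that is running the build, which is not necessarily the environment the built artifacts will be installed into.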
Case studies
crossenv
- Description:
- Virtual Environments for Cross-Compiling Python Extension Modules.
- URL:
- https://github.com/benfogle/crossenv
crossenv is a tool to create a virtual environment with a monkeypatched Python installation that tries to emulate the target machine in certain scenarios. More about this approach can be found in the Faking the target environment section.
conda-forge
- Description:
- A community-led collection of recipes, build infrastructure and distributions for the conda package manager.
- URL:
- https://conda-forge.org/
XXX: Jaime will write a quick summary once the PEP draft is public.
XXX Uses a modified crossenv.
Yocto Project
- Description:
- The Yocto Project is an open source collaboration project that helps developers create custom Linux-based systems regardless of the hardware architecture.
- URL:
- https://www.yoctoproject.org/
XXX: Sent email to the mailing list.
TODO
Buildroot
- Description:
- Buildroot is a simple, efficient and easy-to-use tool to generate embedded Linux systems through cross-compilation.
- URL:
- https://buildroot.org/
TODO
Pyodide
- Description:
- Pyodide is a Python distribution for the browser and Node.js based on WebAssembly.
- URL:
- https://pyodide.org/en/stable/
XXX: Hood should review/expand this section.
Pyodide provides a Python distribution compiled to WebAssembly using the Emscripten toolchain.
It patches several aspects of the CPython installation and some external components. A custom package manager, micropip, supporting both pure-Python and wasm32/Emscripten wheels, is also provided as part of the distribution. On top of this, a repo with a selected set of 3rd party packages is also provided and enabled by default.
Beeware
- Description:
- BeeWare allows you to write your app in Python and release it on multiple platforms.
- URL:
- https://beeware.org/
TODO
python-for-android
- Description:
- Turn your Python application into an Android APK.
- URL:
- https://github.com/kivy/python-for-android
resource https://github.com/Android-for-Python/Android-for-Python-Users
python-for-android is a tool to package Python apps on Android. It creates a Python distribution with your app and its dependencies.
Pure-Python dependencies are handled automatically and in a generic way, but native dependencies need recipes. A set of recipes for popular dependencies is provided, but users need to provide their own recipes for any other native dependencies.
kivy-ios
- Description:
- Toolchain for compiling Python / Kivy / other libraries for iOS.
- URL:
- https://github.com/kivy/kivy-ios
kivy-ios is a tool to package Python apps on iOS. It provides a toolchain to build a Python distribution with your app and its dependencies, as well as a CLI to create and manage Xcode projects that integrate with the toolchain.
It uses the same approach as python-for-android (also maintained by the Kivy project) for app dependencies — pure-Python dependencies are handled automatically, but native dependencies need recipes, and the project provides recipes for popular dependencies.
AidLearning
- Description:
- AI, Android, Linux, ARM: AI application development platform based on Android+Linux integrated ecology.
- URL:
- https://github.com/aidlearning/AidLearning-FrameWork
TODO
QPython
- Description:
- QPython is the Python engine for android.
- URL:
- https://github.com/qpython-android/qpython
TODO
pyqtdeploy
- Description:
- pyqtdeploy is a tool for deploying PyQt applications.
- URL:
- https://www.riverbankcomputing.com/software/pyqtdeploy/
contact https://www.riverbankcomputing.com/pipermail/pyqt/2023-May/thread.html contacted Phil, the maintainer
TODO
Chaquopy
- Description:
- Chaquopy provides everything you need to include Python components in an Android app.
- URL:
- https://chaquo.com/chaquopy/
TODO
EDK II
- Description:
- EDK II is a modern, feature-rich, cross-platform firmware development environment for the UEFI and PI specifications.
- URL:
- https://github.com/tianocore/edk2-libc/tree/master/AppPkg/Applications/Python
TODO
ActivePython
- Description:
- Commercial-grade, quality-assured Python distribution focusing on easy installation and cross-platform compatibility on Windows, Linux, Mac OS X, Solaris, HP-UX and AIX.
- URL:
- https://www.activestate.com/products/python/
TODO
Termux
- Description:
- Termux is an Android terminal emulator and Linux environment app that works directly with no rooting or setup required.
- URL:
- https://termux.dev/en/
TODO
Source: https://github.com/python/peps/blob/main/peps/pep-0720.rst
Last modified: 2023-09-09 17:39:29 GMT