Following system colour scheme Selected dark colour scheme Selected light colour scheme

Python Enhancement Proposals

PEP 451 – A ModuleSpec Type for the Import System

Author:
Eric Snow <ericsnowcurrently at gmail.com>
BDFL-Delegate:
Brett Cannon <brett at python.org>, Alyssa Coghlan <ncoghlan at gmail.com>
Discussions-To:
Import-SIG list
Status:
Final
Type:
Standards Track
Created:
08-Aug-2013
Python-Version:
3.4
Post-History:
08-Aug-2013, 28-Aug-2013, 18-Sep-2013, 24-Sep-2013, 04-Oct-2013
Resolution:
Python-Dev message

Table of Contents

Abstract

This PEP proposes to add a new class to importlib.machinery called “ModuleSpec”. It will provide all the import-related information used to load a module and will be available without needing to load the module first. Finders will directly provide a module’s spec instead of a loader (which they will continue to provide indirectly). The import machinery will be adjusted to take advantage of module specs, including using them to load modules.

Terms and Concepts

The changes in this proposal are an opportunity to make several existing terms and concepts more clear, whereas currently they are (unfortunately) ambiguous. New concepts are also introduced in this proposal. Finally, it’s worth explaining a few other existing terms with which people may not be so familiar. For the sake of context, here is a brief summary of all three groups of terms and concepts. A more detailed explanation of the import system is found at [2].

name

In this proposal, a module’s “name” refers to its fully-qualified name, meaning the fully-qualified name of the module’s parent (if any) joined to the simple name of the module by a period.

finder

A “finder” is an object that identifies the loader that the import system should use to load a module. Currently this is accomplished by calling the finder’s find_module() method, which returns the loader.

Finders are strictly responsible for providing the loader, which they do through their find_module() method. The import system then uses that loader to load the module.

loader

A “loader” is an object that is used to load a module during import. Currently this is done by calling the loader’s load_module() method. A loader may also provide APIs for getting information about the modules it can load, as well as about data from sources associated with such a module.

Right now loaders (via load_module()) are responsible for certain boilerplate, import-related operations. These are:

  1. Perform some (module-related) validation
  2. Create the module object
  3. Set import-related attributes on the module
  4. “Register” the module to sys.modules
  5. Exec the module
  6. Clean up in the event of failure while loading the module

This all takes place during the import system’s call to Loader.load_module().

origin

This is a new term and concept. The idea of it exists subtly in the import system already, but this proposal makes the concept explicit.

“origin” in an import context means the system (or resource within a system) from which a module originates. For the purposes of this proposal, “origin” is also a string which identifies such a resource or system. “origin” is applicable to all modules.

For example, the origin for built-in and frozen modules is the interpreter itself. The import system already identifies this origin as “built-in” and “frozen”, respectively. This is demonstrated in the following module repr: “<module ‘sys’ (built-in)>”.

In fact, the module repr is already a relatively reliable, though implicit, indicator of a module’s origin. Other modules also indicate their origin through other means, as described in the entry for “location”.

It is up to the loader to decide on how to interpret and use a module’s origin, if at all.

location

This is a new term. However the concept already exists clearly in the import system, as associated with the __file__ and __path__ attributes of modules, as well as the name/term “path” elsewhere.

A “location” is a resource or “place”, rather than a system at large, from which a module is loaded. It qualifies as an “origin”. Examples of locations include filesystem paths and URLs. A location is identified by the name of the resource, but may not necessarily identify the system to which the resource pertains. In such cases the loader would have to identify the system itself.

In contrast to other kinds of module origin, a location cannot be inferred by the loader just by the module name. Instead, the loader must be provided with a string to identify the location, usually by the finder that generates the loader. The loader then uses this information to locate the resource from which it will load the module. In theory you could load the module at a given location under various names.

The most common example of locations in the import system are the files from which source and extension modules are loaded. For these modules the location is identified by the string in the __file__ attribute. Although __file__ isn’t particularly accurate for some modules (e.g. zipped), it is currently the only way that the import system indicates that a module has a location.

A module that has a location may be called “locatable”.

cache

The import system stores compiled modules in the __pycache__ directory as an optimization. This module cache that we use today was provided by PEP 3147. For this proposal, the relevant API for module caching is the __cache__ attribute of modules and the cache_from_source() function in importlib.util. Loaders are responsible for putting modules into the cache (and loading out of the cache). Currently the cache is only used for compiled source modules. However, loaders may take advantage of the module cache for other kinds of modules.

package

The concept does not change, nor does the term. However, the distinction between modules and packages is mostly superficial. Packages are modules. They simply have a __path__ attribute and import may add attributes bound to submodules. The typically perceived difference is a source of confusion. This proposal explicitly de-emphasizes the distinction between packages and modules where it makes sense to do so.

Motivation

The import system has evolved over the lifetime of Python. In late 2002 PEP 302 introduced standardized import hooks via finders and loaders and sys.meta_path. The importlib module, introduced with Python 3.1, now exposes a pure Python implementation of the APIs described by PEP 302, as well as of the full import system. It is now much easier to understand and extend the import system. While a benefit to the Python community, this greater accessibility also presents a challenge.

As more developers come to understand and customize the import system, any weaknesses in the finder and loader APIs will be more impactful. So the sooner we can address any such weaknesses the import system, the better…and there are a couple we hope to take care of with this proposal.

Firstly, any time the import system needs to save information about a module we end up with more attributes on module objects that are generally only meaningful to the import system. It would be nice to have a per-module namespace in which to put future import-related information and to pass around within the import system. Secondly, there’s an API void between finders and loaders that causes undue complexity when encountered. The PEP 420 (namespace packages) implementation had to work around this. The complexity surfaced again during recent efforts on a separate proposal. [1]

The finder and loader sections above detail current responsibility of both. Notably, loaders are not required to provide any of the functionality of their load_module() method through other methods. Thus, though the import-related information about a module is likely available without loading the module, it is not otherwise exposed.

Furthermore, the requirements associated with load_module() are common to all loaders and mostly are implemented in exactly the same way. This means every loader has to duplicate the same boilerplate code. importlib.util provides some tools that help with this, but it would be more helpful if the import system simply took charge of these responsibilities. The trouble is that this would limit the degree of customization that load_module() could easily continue to facilitate.

More importantly, While a finder could provide the information that the loader’s load_module() would need, it currently has no consistent way to get it to the loader. This is a gap between finders and loaders which this proposal aims to fill.

Finally, when the import system calls a finder’s find_module(), the finder makes use of a variety of information about the module that is useful outside the context of the method. Currently the options are limited for persisting that per-module information past the method call, since it only returns the loader. Popular options for this limitation are to store the information in a module-to-info mapping somewhere on the finder itself, or store it on the loader.

Unfortunately, loaders are not required to be module-specific. On top of that, some of the useful information finders could provide is common to all finders, so ideally the import system could take care of those details. This is the same gap as before between finders and loaders.

As an example of complexity attributable to this flaw, the implementation of namespace packages in Python 3.3 (see PEP 420) added FileFinder.find_loader() because there was no good way for find_module() to provide the namespace search locations.

The answer to this gap is a ModuleSpec object that contains the per-module information and takes care of the boilerplate functionality involved with loading the module.

Specification

The goal is to address the gap between finders and loaders while changing as little of their semantics as possible. Though some functionality and information is moved to the new ModuleSpec type, their behavior should remain the same. However, for the sake of clarity the finder and loader semantics will be explicitly identified.

Here is a high-level summary of the changes described by this PEP. More detail is available in later sections.

importlib.machinery.ModuleSpec (new)

An encapsulation of a module’s import-system-related state during import. See the ModuleSpec section below for a more detailed description.

  • ModuleSpec(name, loader, *, origin=None, loader_state=None, is_package=None)

Attributes:

  • name - a string for the fully-qualified name of the module.
  • loader - the loader to use for loading.
  • origin - the name of the place from which the module is loaded, e.g. “builtin” for built-in modules and the filename for modules loaded from source.
  • submodule_search_locations - list of strings for where to find submodules, if a package (None otherwise).
  • loader_state - a container of extra module-specific data for use during loading.
  • cached (property) - a string for where the compiled module should be stored.
  • parent (RO-property) - the fully-qualified name of the package to which the module belongs as a submodule (or None).
  • has_location (RO-property) - a flag indicating whether or not the module’s “origin” attribute refers to a location.

importlib.util Additions

These are ModuleSpec factory functions, meant as a convenience for finders. See the Factory Functions section below for more detail.

  • spec_from_file_location(name, location, *, loader=None, submodule_search_locations=None) - build a spec from file-oriented information and loader APIs.
  • spec_from_loader(name, loader, *, origin=None, is_package=None) - build a spec with missing information filled in by using loader APIs.

Other API Additions

  • importlib.find_spec(name, path=None, target=None) will work exactly the same as importlib.find_loader() (which it replaces), but return a spec instead of a loader.

For finders:

  • importlib.abc.MetaPathFinder.find_spec(name, path, target) and importlib.abc.PathEntryFinder.find_spec(name, target) will return a module spec to use during import.

For loaders:

  • importlib.abc.Loader.exec_module(module) will execute a module in its own namespace. It replaces importlib.abc.Loader.load_module(), taking over its module execution functionality.
  • importlib.abc.Loader.create_module(spec) (optional) will return the module to use for loading.

For modules:

  • Module objects will have a new attribute: __spec__.

API Changes

  • InspectLoader.is_package() will become optional.

Deprecations

  • importlib.abc.MetaPathFinder.find_module()
  • importlib.abc.PathEntryFinder.find_module()
  • importlib.abc.PathEntryFinder.find_loader()
  • importlib.abc.Loader.load_module()
  • importlib.abc.Loader.module_repr()
  • importlib.util.set_package()
  • importlib.util.set_loader()
  • importlib.find_loader()

Removals

These were introduced prior to Python 3.4’s release, so they can simply be removed.

  • importlib.abc.Loader.init_module_attrs()
  • importlib.util.module_to_load()

Other Changes

  • The import system implementation in importlib will be changed to make use of ModuleSpec.
  • importlib.reload() will make use of ModuleSpec.
  • A module’s import-related attributes (other than __spec__) will no longer be used directly by the import system during that module’s import. However, this does not impact use of those attributes (e.g. __path__) when loading other modules (e.g. submodules).
  • Import-related attributes should no longer be added to modules directly, except by the import system.
  • The module type’s __repr__() will be a thin wrapper around a pure Python implementation which will leverage ModuleSpec.
  • The spec for the __main__ module will reflect the appropriate name and origin.

Backward-Compatibility

  • If a finder does not define find_spec(), a spec is derived from the loader returned by find_module().
  • PathEntryFinder.find_loader() still takes priority over find_module().
  • Loader.load_module() is used if exec_module() is not defined.

What Will not Change?

  • The syntax and semantics of the import statement.
  • Existing finders and loaders will continue to work normally.
  • The import-related module attributes will still be initialized with the same information.
  • Finders will still create loaders (now storing them in specs).
  • Loader.load_module(), if a module defines it, will have all the same requirements and may still be called directly.
  • Loaders will still be responsible for module data APIs.
  • importlib.reload() will still overwrite the import-related attributes.

Responsibilities

Here’s a quick breakdown of where responsibilities lie after this PEP.

finders:

  • create/identify a loader that can load the module.
  • create the spec for the module.

loaders:

  • create the module (optional).
  • execute the module.

ModuleSpec:

  • orchestrate module loading
  • boilerplate for module loading, including managing sys.modules and setting import-related attributes
  • create module if loader doesn’t
  • call loader.exec_module(), passing in the module in which to exec
  • contain all the information the loader needs to exec the module
  • provide the repr for modules

What Will Existing Finders and Loaders Have to Do Differently?

Immediately? Nothing. The status quo will be deprecated, but will continue working. However, here are the things that the authors of finders and loaders should change relative to this PEP:

  • Implement find_spec() on finders.
  • Implement exec_module() on loaders, if possible.

The ModuleSpec factory functions in importlib.util are intended to be helpful for converting existing finders. spec_from_loader() and spec_from_file_location() are both straightforward utilities in this regard.

For existing loaders, exec_module() should be a relatively direct conversion from the non-boilerplate portion of load_module(). In some uncommon cases the loader should also implement create_module().

ModuleSpec Users

ModuleSpec objects have 3 distinct target audiences: Python itself, import hooks, and normal Python users.

Python will use specs in the import machinery, in interpreter startup, and in various standard library modules. Some modules are import-oriented, like pkgutil, and others are not, like pickle and pydoc. In all cases, the full ModuleSpec API will get used.

Import hooks (finders and loaders) will make use of the spec in specific ways. First of all, finders may use the spec factory functions in importlib.util to create spec objects. They may also directly adjust the spec attributes after the spec is created. Secondly, the finder may bind additional information to the spec (in finder_extras) for the loader to consume during module creation/execution. Finally, loaders will make use of the attributes on a spec when creating and/or executing a module.

Python users will be able to inspect a module’s __spec__ to get import-related information about the object. Generally, Python applications and interactive users will not be using the ModuleSpec factory functions nor any the instance methods.

How Loading Will Work

Here is an outline of what the import machinery does during loading, adjusted to take advantage of the module’s spec and the new loader API:

module = None
if spec.loader is not None and hasattr(spec.loader, 'create_module'):
    module = spec.loader.create_module(spec)
if module is None:
    module = ModuleType(spec.name)
# The import-related module attributes get set here:
_init_module_attrs(spec, module)

if spec.loader is None and spec.submodule_search_locations is not None:
    # Namespace package
    sys.modules[spec.name] = module
elif not hasattr(spec.loader, 'exec_module'):
    spec.loader.load_module(spec.name)
    # __loader__ and __package__ would be explicitly set here for
    # backwards-compatibility.
else:
    sys.modules[spec.name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        try:
            del sys.modules[spec.name]
        except KeyError:
            pass
        raise
module_to_return = sys.modules[spec.name]

These steps are exactly what Loader.load_module() is already expected to do. Loaders will thus be simplified since they will only need to implement exec_module().

Note that we must return the module from sys.modules. During loading the module may have replaced itself in sys.modules. Since we don’t have a post-import hook API to accommodate the use case, we have to deal with it. However, in the replacement case we do not worry about setting the import-related module attributes on the object. The module writer is on their own if they are doing this.

How Reloading Will Work

Here is the corresponding outline for reload():

_RELOADING = {}

def reload(module):
    try:
        name = module.__spec__.name
    except AttributeError:
        name = module.__name__
    spec = find_spec(name, target=module)

    if sys.modules.get(name) is not module:
        raise ImportError
    if spec in _RELOADING:
        return _RELOADING[name]
    _RELOADING[name] = module
    try:
        if spec.loader is None:
            # Namespace loader
            _init_module_attrs(spec, module)
            return module
        if spec.parent and spec.parent not in sys.modules:
            raise ImportError

        _init_module_attrs(spec, module)
        # Ignoring backwards-compatibility call to load_module()
        # for simplicity.
        spec.loader.exec_module(module)
        return sys.modules[name]
    finally:
        del _RELOADING[name]

A key point here is the switch to Loader.exec_module() means that loaders will no longer have an easy way to know at execution time if it is a reload or not. Before this proposal, they could simply check to see if the module was already in sys.modules. Now, by the time exec_module() is called during load (not reload) the import machinery would already have placed the module in sys.modules. This is part of the reason why find_spec() has the “target” parameter.

The semantics of reload will remain essentially the same as they exist already [5]. The impact of this PEP on some kinds of lazy loading modules was a point of discussion. [4]

ModuleSpec

Attributes

Each of the following names is an attribute on ModuleSpec objects. A value of None indicates “not set”. This contrasts with module objects where the attribute simply doesn’t exist. Most of the attributes correspond to the import-related attributes of modules. Here is the mapping. The reverse of this mapping describes how the import machinery sets the module attributes right before calling exec_module().

On ModuleSpec On Modules
name __name__
loader __loader__
parent __package__
origin __file__*
cached __cached__*,**
submodule_search_locations __path__**
loader_state -
has_location -
* Set on the module only if spec.has_location is true.
** Set on the module only if the spec attribute is not None.

While parent and has_location are read-only properties, the remaining attributes can be replaced after the module spec is created and even after import is complete. This allows for unusual cases where directly modifying the spec is the best option. However, typical use should not involve changing the state of a module’s spec.

origin

“origin” is a string for the name of the place from which the module originates. See origin above. Aside from the informational value, it is also used in the module’s repr. In the case of a spec where “has_location” is true, __file__ is set to the value of “origin”. For built-in modules “origin” would be set to “built-in”.

has_location

As explained in the location section above, many modules are “locatable”, meaning there is a corresponding resource from which the module will be loaded and that resource can be described by a string. In contrast, non-locatable modules can’t be loaded in this fashion, e.g. builtin modules and modules dynamically created in code. For these, the name is the only way to access them, so they have an “origin” but not a “location”.

“has_location” is true if the module is locatable. In that case the spec’s origin is used as the location and __file__ is set to spec.origin. If additional location information is required (e.g. zipimport), that information may be stored in spec.loader_state.

“has_location” may be implied from the existence of a load_data() method on the loader.

Incidentally, not all locatable modules will be cache-able, but most will.

submodule_search_locations

The list of location strings, typically directory paths, in which to search for submodules. If the module is a package this will be set to a list (even an empty one). Otherwise it is None.

The name of the corresponding module attribute, __path__, is relatively ambiguous. Instead of mirroring it, we use a more explicit attribute name that makes the purpose clear.

loader_state

A finder may set loader_state to any value to provide additional data for the loader to use during loading. A value of None is the default and indicates that there is no additional data. Otherwise it can be set to any object, such as a dict, list, or types.SimpleNamespace, containing the relevant extra information.

For example, zipimporter could use it to pass the zip archive name to the loader directly, rather than needing to derive it from origin or create a custom loader for each find operation.

loader_state is meant for use by the finder and corresponding loader. It is not guaranteed to be a stable resource for any other use.

Factory Functions

spec_from_file_location(name, location, *, loader=None, submodule_search_locations=None)

Build a spec from file-oriented information and loader APIs.

  • “origin” will be set to the location.
  • “has_location” will be set to True.
  • “cached” will be set to the result of calling cache_from_source().
  • “origin” can be deduced from loader.get_filename() (if “location” is not passed in.
  • “loader” can be deduced from suffix if the location is a filename.
  • “submodule_search_locations” can be deduced from loader.is_package() and from os.path.dirname(location) if location is a filename.

spec_from_loader(name, loader, *, origin=None, is_package=None)

Build a spec with missing information filled in by using loader APIs.

  • “has_location” can be deduced from loader.get_data.
  • “origin” can be deduced from loader.get_filename().
  • “submodule_search_locations” can be deduced from loader.is_package() and from os.path.dirname(location) if location is a filename.

Backward Compatibility

ModuleSpec doesn’t have any. This would be a different story if Finder.find_module() were to return a module spec instead of loader. In that case, specs would have to act like the loader that would have been returned instead. Doing so would be relatively simple, but is an unnecessary complication. It was part of earlier versions of this PEP.

Subclassing

Subclasses of ModuleSpec are allowed, but should not be necessary. Simply setting loader_state or adding functionality to a custom finder or loader will likely be a better fit and should be tried first. However, as long as a subclass still fulfills the requirements of the import system, objects of that type are completely fine as the return value of Finder.find_spec(). The same points apply to duck-typing.

Existing Types

Module Objects

Other than adding __spec__, none of the import-related module attributes will be changed or deprecated, though some of them could be; any such deprecation can wait until Python 4.

A module’s spec will not be kept in sync with the corresponding import-related attributes. Though they may differ, in practice they will typically be the same.

One notable exception is that case where a module is run as a script by using the -m flag. In that case module.__spec__.name will reflect the actual module name while module.__name__ will be __main__.

A module’s spec is not guaranteed to be identical between two modules with the same name. Likewise there is no guarantee that successive calls to importlib.find_spec() will return the same object or even an equivalent object, though at least the latter is likely.

Finders

Finders are still responsible for identifying, and typically creating, the loader that should be used to load a module. That loader will now be stored in the module spec returned by find_spec() rather than returned directly. As is currently the case without the PEP, if a loader would be costly to create, that loader can be designed to defer the cost until later.

MetaPathFinder.find_spec(name, path=None, target=None)

PathEntryFinder.find_spec(name, target=None)

Finders must return ModuleSpec objects when find_spec() is called. This new method replaces find_module() and find_loader() (in the PathEntryFinder case). If a loader does not have find_spec(), find_module() and find_loader() are used instead, for backward-compatibility.

Adding yet another similar method to loaders is a case of practicality. find_module() could be changed to return specs instead of loaders. This is tempting because the import APIs have suffered enough, especially considering PathEntryFinder.find_loader() was just added in Python 3.3. However, the extra complexity and a less-than-explicit method name aren’t worth it.

The “target” parameter of find_spec()

A call to find_spec() may optionally include a “target” argument. This is the module object that will be used subsequently as the target of loading. During normal import (and by default) “target” is None, meaning the target module has yet to be created. During reloading the module passed in to reload() is passed through to find_spec() as the target. This argument allows the finder to build the module spec with more information than is otherwise available. Doing so is particularly relevant in identifying the loader to use.

Through find_spec() the finder will always identify the loader it will return in the spec (or return None). At the point the loader is identified, the finder should also decide whether or not the loader supports loading into the target module, in the case that “target” is passed in. This decision may entail consulting with the loader.

If the finder determines that the loader does not support loading into the target module, it should either find another loader or raise ImportError (completely stopping import of the module). This determination is especially important during reload since, as noted in How Reloading Will Work, loaders will no longer be able to trivially identify a reload situation on their own.

Two alternatives were presented to the “target” parameter: Loader.supports_reload() and adding “target” to Loader.exec_module() instead of find_spec(). supports_reload() was the initial approach to the reload situation. [6] However, there was some opposition to the loader-specific, reload-centric approach. [7]

As to “target” on exec_module(), the loader may need other information from the target module (or spec) during reload, more than just “does this loader support reloading this module”, that is no longer available with the move away from load_module(). A proposal on the table was to add something like “target” to exec_module(). [8] However, putting “target” on find_spec() instead is more in line with the goals of this PEP. Furthermore, it obviates the need for supports_reload().

Namespace Packages

Currently a path entry finder may return (None, portions) from find_loader() to indicate it found part of a possible namespace package. To achieve the same effect, find_spec() must return a spec with “loader” set to None (a.k.a. not set) and with submodule_search_locations set to the same portions as would have been provided by find_loader(). It’s up to PathFinder how to handle such specs.

Loaders

Loader.exec_module(module)

Loaders will have a new method, exec_module(). Its only job is to “exec” the module and consequently populate the module’s namespace. It is not responsible for creating or preparing the module object, nor for any cleanup afterward. It has no return value. exec_module() will be used during both loading and reloading.

exec_module() should properly handle the case where it is called more than once. For some kinds of modules this may mean raising ImportError every time after the first time the method is called. This is particularly relevant for reloading, where some kinds of modules do not support in-place reloading.

Loader.create_module(spec)

Loaders may also implement create_module() that will return a new module to exec. It may return None to indicate that the default module creation code should be used. One use case, though atypical, for create_module() is to provide a module that is a subclass of the builtin module type. Most loaders will not need to implement create_module(),

create_module() should properly handle the case where it is called more than once for the same spec/module. This may include returning None or raising ImportError.

Note

exec_module() and create_module() should not set any import-related module attributes. The fact that load_module() does is a design flaw that this proposal aims to correct.

Other changes:

PEP 420 introduced the optional module_repr() loader method to limit the amount of special-casing in the module type’s __repr__(). Since this method is part of ModuleSpec, it will be deprecated on loaders. However, if it exists on a loader it will be used exclusively.

Loader.init_module_attr() method, added prior to Python 3.4’s release, will be removed in favor of the same method on ModuleSpec.

However, InspectLoader.is_package() will not be deprecated even though the same information is found on ModuleSpec. ModuleSpec can use it to populate its own is_package if that information is not otherwise available. Still, it will be made optional.

In addition to executing a module during loading, loaders will still be directly responsible for providing APIs concerning module-related data.

Other Changes

  • The various finders and loaders provided by importlib will be updated to comply with this proposal.
  • Any other implementations of or dependencies on the import-related APIs (particularly finders and loaders) in the stdlib will be likewise adjusted to this PEP. While they should continue to work, any such changes that get missed should be considered bugs for the Python 3.4.x series.
  • The spec for the __main__ module will reflect how the interpreter was started. For instance, with -m the spec’s name will be that of the module used, while __main__.__name__ will still be “__main__”.
  • We will add importlib.find_spec() to mirror importlib.find_loader() (which becomes deprecated).
  • importlib.reload() is changed to use ModuleSpec.
  • importlib.reload() will now make use of the per-module import lock.

Reference Implementation

A reference implementation is available at http://bugs.python.org/issue18864.

Implementation Notes

* The implementation of this PEP needs to be cognizant of its impact on pkgutil (and setuptools). pkgutil has some generic function-based extensions to PEP 302 which may break if importlib starts wrapping loaders without the tools’ knowledge.

* Other modules to look at: runpy (and pythonrun.c), pickle, pydoc, inspect.

For instance, pickle should be updated in the __main__ case to look at module.__spec__.name.

Rejected Additions to the PEP

There were a few proposed additions to this proposal that did not fit well enough into its scope.

There is no “PathModuleSpec” subclass of ModuleSpec that separates out has_location, cached, and submodule_search_locations. While that might make the separation cleaner, module objects don’t have that distinction. ModuleSpec will support both cases equally well.

While “ModuleSpec.is_package” would be a simple additional attribute (aliasing self.submodule_search_locations is not None), it perpetuates the artificial (and mostly erroneous) distinction between modules and packages.

The module spec Factory Functions could be classmethods on ModuleSpec. However that would expose them on all modules via __spec__, which has the potential to unnecessarily confuse non-advanced Python users. The factory functions have a specific use case, to support finder authors. See ModuleSpec Users.

Likewise, several other methods could be added to ModuleSpec that expose the specific uses of module specs by the import machinery:

  • create() - a wrapper around Loader.create_module().
  • exec(module) - a wrapper around Loader.exec_module().
  • load() - an analogue to the deprecated Loader.load_module().

As with the factory functions, exposing these methods via module.__spec__ is less than desirable. They would end up being an attractive nuisance, even if only exposed as “private” attributes (as they were in previous versions of this PEP). If someone finds a need for these methods later, we can expose the via an appropriate API (separate from ModuleSpec) at that point, perhaps relative to PEP 406 (import engine).

Conceivably, the load() method could optionally take a list of modules with which to interact instead of sys.modules. Also, load() could be leveraged to implement multi-version imports. Both are interesting ideas, but definitely outside the scope of this proposal.

Others left out:

  • Add ModuleSpec.submodules (RO-property) - returns possible submodules relative to the spec.
  • Add ModuleSpec.loaded (RO-property) - the module in sys.module, if any.
  • Add ModuleSpec.data - a descriptor that wraps the data API of the spec’s loader.
  • Also see [3].

References


Source: https://github.com/python/peps/blob/main/peps/pep-0451.rst

Last modified: 2023-10-11 12:05:51 GMT