PEP: 451
Title: A ModuleSpec Type for the Import System
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently@gmail.com>
BDFL-Delegate: Brett Cannon <brett@python.org>, Alyssa Coghlan <ncoghlan@gmail.com>
Discussions-To: import-sig@python.org
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Aug-2013
Python-Version: 3.4
Post-History: 08-Aug-2013, 28-Aug-2013, 18-Sep-2013, 24-Sep-2013, 04-Oct-2013
Resolution: https://mail.python.org/pipermail/python-dev/2013-November/130104.html

Abstract

This PEP proposes to add a new class to importlib.machinery called
"ModuleSpec". It will provide all the import-related information used to
load a module and will be available without needing to load the module
first. Finders will directly provide a module's spec instead of a loader
(which they will continue to provide indirectly). The import machinery
will be adjusted to take advantage of module specs, including using them
to load modules.

Terms and Concepts

The changes in this proposal are an opportunity to clarify several
existing terms and concepts that are currently (and unfortunately)
ambiguous. New concepts are also introduced in this proposal. Finally,
it's worth explaining a few other existing terms with
which people may not be so familiar. For the sake of context, here is a
brief summary of all three groups of terms and concepts. A more detailed
explanation of the import system is found at [1].

name

In this proposal, a module's "name" refers to its fully-qualified name,
meaning the fully-qualified name of the module's parent (if any) joined
to the simple name of the module by a period.

finder

A "finder" is an object that identifies the loader that the import
system should use to load a module. Currently this is accomplished by
calling the finder's find_module() method, which returns the loader.

Finders are strictly responsible for providing the loader, which they do
through their find_module() method. The import system then uses that
loader to load the module.

loader

A "loader" is an object that is used to load a module during import.
Currently this is done by calling the loader's load_module() method. A
loader may also provide APIs for getting information about the modules
it can load, as well as about data from sources associated with such a
module.

Right now loaders (via load_module()) are responsible for certain
boilerplate, import-related operations. These are:

1.  Perform some (module-related) validation
2.  Create the module object
3.  Set import-related attributes on the module
4.  "Register" the module to sys.modules
5.  Exec the module
6.  Clean up in the event of failure while loading the module

This all takes place during the import system's call to
Loader.load_module().
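
As an illustration of the six steps above, here is a minimal sketch of a legacy-style loader. The class name LegacyStringLoader, the in-memory source mapping and the module name "legacy_demo" are hypothetical and exist only for this example; a real loader would do more validation.

```python
import sys
import types


class LegacyStringLoader:
    """Hypothetical pre-ModuleSpec loader that execs source held in memory."""

    def __init__(self, sources):
        self.sources = sources  # maps module name -> source text

    def load_module(self, name):
        # 1. Validate that this loader can handle the module.
        if name not in self.sources:
            raise ImportError(f"cannot load {name!r}", name=name)
        # 2. Create the module object (or reuse it when reloading).
        is_reload = name in sys.modules
        module = sys.modules.get(name) or types.ModuleType(name)
        # 3. Set the import-related attributes.
        module.__loader__ = self
        module.__package__ = name.rpartition(".")[0]
        # 4. Register the module before executing it.
        sys.modules[name] = module
        try:
            # 5. Execute the module's code in its own namespace.
            exec(self.sources[name], module.__dict__)
        except BaseException:
            # 6. Clean up after a failed first-time load.
            if not is_reload:
                del sys.modules[name]
            raise
        return module


loader = LegacyStringLoader({"legacy_demo": "ANSWER = 6 * 7"})
module = loader.load_module("legacy_demo")
print(module.ANSWER)  # 42
```

Every loader written this way repeats steps 1 through 4 and step 6; only step 5 is specific to the loader.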

origin

This is a new term and concept. The idea of it exists subtly in the
import system already, but this proposal makes the concept explicit.

"origin" in an import context means the system (or resource within a
system) from which a module originates. For the purposes of this
proposal, "origin" is also a string which identifies such a resource or
system. "origin" is applicable to all modules.

For example, the origin for built-in and frozen modules is the
interpreter itself. The import system already identifies this origin as
"built-in" and "frozen", respectively. This is demonstrated in the
following module repr: "<module 'sys' (built-in)>".

In fact, the module repr is already a relatively reliable, though
implicit, indicator of a module's origin. Other modules also indicate
their origin through other means, as described in the entry for
"location".

It is up to the loader to decide on how to interpret and use a module's
origin, if at all.

location

This is a new term. However, the concept already exists clearly in the
import system, as associated with the __file__ and __path__ attributes
of modules, as well as the name/term "path" elsewhere.

A "location" is a resource or "place", rather than a system at large,
from which a module is loaded. It qualifies as an "origin". Examples of
locations include filesystem paths and URLs. A location is identified by
the name of the resource, but may not necessarily identify the system to
which the resource pertains. In such cases the loader would have to
identify the system itself.

In contrast to other kinds of module origin, a location cannot be
inferred by the loader just by the module name. Instead, the loader must
be provided with a string to identify the location, usually by the
finder that generates the loader. The loader then uses this information
to locate the resource from which it will load the module. In theory you
could load the module at a given location under various names.

The most common examples of locations in the import system are the files
from which source and extension modules are loaded. For these modules
the location is identified by the string in the __file__ attribute.
Although __file__ isn't particularly accurate for some modules (e.g.
zipped), it is currently the only way that the import system indicates
that a module has a location.

A module that has a location may be called "locatable".
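
The difference between an origin and a location can be observed on any interpreter that implements this proposal. The following sketch assumes a typical CPython install in which sys is a built-in module and json is loaded from a source file.

```python
import json
import sys

# A built-in module has an origin but no location.
print(repr(sys))                   # <module 'sys' (built-in)>
print(sys.__spec__.origin)         # built-in
print(sys.__spec__.has_location)   # False

# A module loaded from a file is locatable: its origin is its location.
print(json.__spec__.has_location)              # True
print(json.__spec__.origin == json.__file__)   # True
```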

cache

The import system stores compiled modules in the __pycache__ directory
as an optimization. This module cache that we use today was provided by
PEP 3147. For this proposal, the relevant API for module caching is the
__cached__ attribute of modules and the cache_from_source() function in
importlib.util. Loaders are responsible for putting modules into the
cache (and loading out of the cache). Currently the cache is only used
for compiled source modules. However, loaders may take advantage of the
module cache for other kinds of modules.
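
For illustration, cache_from_source() maps a source path to the path of its cached bytecode. The path below is hypothetical and does not need to exist; the exact file name depends on the interpreter's cache tag, so no specific output is shown.

```python
import importlib.util
import os

# Hypothetical source path, used only to show the mapping.
source_path = os.path.join("example", "spam.py")
cached_path = importlib.util.cache_from_source(source_path)

# Typically something like example/__pycache__/spam.<cache tag>.pyc
print(cached_path)
```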

package

The concept does not change, nor does the term. However, the distinction
between modules and packages is mostly superficial. Packages are
modules. They simply have a __path__ attribute and import may add
attributes bound to submodules. The typically perceived difference is a
source of confusion. This proposal explicitly de-emphasizes the
distinction between packages and modules where it makes sense to do so.

Motivation

The import system has evolved over the lifetime of Python. In late 2002
PEP 302 introduced standardized import hooks via finders and loaders and
sys.meta_path. The importlib module, introduced with Python 3.1, now
exposes a pure Python implementation of the APIs described by PEP 302,
as well as of the full import system. It is now much easier to
understand and extend the import system. While a benefit to the Python
community, this greater accessibility also presents a challenge.

As more developers come to understand and customize the import system,
any weaknesses in the finder and loader APIs will be more impactful. So
the sooner we can address any such weaknesses in the import system, the
better, and there are a couple we hope to take care of with this
proposal.

Firstly, any time the import system needs to save information about a
module we end up with more attributes on module objects that are
generally only meaningful to the import system. It would be nice to have
a per-module namespace in which to put future import-related information
and to pass around within the import system. Secondly, there's an API
void between finders and loaders that causes undue complexity when
encountered. The PEP 420 (namespace packages) implementation had to work
around this. The complexity surfaced again during recent efforts on a
separate proposal.[2]

The finder and loader sections above detail the current responsibilities
of both. Notably, loaders are not required to provide any of the
functionality of their load_module() method through other methods. Thus,
though the import-related information about a module is likely available
without loading the module, it is not otherwise exposed.

Furthermore, the requirements associated with load_module() are common
to all loaders and mostly are implemented in exactly the same way. This
means every loader has to duplicate the same boilerplate code.
importlib.util provides some tools that help with this, but it would be
more helpful if the import system simply took charge of these
responsibilities. The trouble is that this would limit the degree of
customization that load_module() could easily continue to facilitate.

More importantly, while a finder could provide the information that the
loader's load_module() would need, it currently has no consistent way to
get it to the loader. This is a gap between finders and loaders which
this proposal aims to fill.

Finally, when the import system calls a finder's find_module(), the
finder makes use of a variety of information about the module that is
useful outside the context of the method. Currently the options are
limited for persisting that per-module information past the method call,
since it only returns the loader. Popular workarounds for this limitation
are to store the information in a module-to-info mapping somewhere on
the finder itself, or store it on the loader.

Unfortunately, loaders are not required to be module-specific. On top of
that, some of the useful information finders could provide is common to
all finders, so ideally the import system could take care of those
details. This is the same gap as before between finders and loaders.

As an example of complexity attributable to this flaw, the
implementation of namespace packages in Python 3.3 (see PEP 420) added
FileFinder.find_loader() because there was no good way for find_module()
to provide the namespace search locations.

The answer to this gap is a ModuleSpec object that contains the
per-module information and takes care of the boilerplate functionality
involved with loading the module.

Specification

The goal is to address the gap between finders and loaders while
changing as little of their semantics as possible. Though some
functionality and information is moved to the new ModuleSpec type, their
behavior should remain the same. However, for the sake of clarity the
finder and loader semantics will be explicitly identified.

Here is a high-level summary of the changes described by this PEP. More
detail is available in later sections.

importlib.machinery.ModuleSpec (new)

An encapsulation of a module's import-system-related state during
import. See the ModuleSpec section below for a more detailed
description.

-   ModuleSpec(name, loader, *, origin=None, loader_state=None,
    is_package=None)

Attributes:

-   name - a string for the fully-qualified name of the module.
-   loader - the loader to use for loading.
-   origin - the name of the place from which the module is loaded, e.g.
    "built-in" for built-in modules and the filename for modules loaded
    from source.
-   submodule_search_locations - list of strings for where to find
    submodules, if a package (None otherwise).
-   loader_state - a container of extra module-specific data for use
    during loading.
-   cached (property) - a string for where the compiled module should be
    stored.
-   parent (RO-property) - the fully-qualified name of the package to
    which the module belongs as a submodule (or None).
-   has_location (RO-property) - a flag indicating whether or not the
    module's "origin" attribute refers to a location.
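
The attributes listed above can be inspected by constructing specs directly. This is only a sketch: the names "pkg" and "pkg.mod" and the origin "<demo>" are made up, and loader=None is used to keep loader details out of the picture.

```python
from importlib.machinery import ModuleSpec

# Hypothetical specs for a plain module and for a package.
mod_spec = ModuleSpec("pkg.mod", None, origin="<demo>")
pkg_spec = ModuleSpec("pkg", None, origin="<demo>", is_package=True)

print(mod_spec.name)                         # pkg.mod
print(mod_spec.parent)                       # pkg
print(mod_spec.submodule_search_locations)   # None
print(mod_spec.has_location)                 # False
print(mod_spec.cached)                       # None

print(pkg_spec.submodule_search_locations)   # []
print(pkg_spec.parent)                       # pkg
```

Note that the parent of a package is the package's own name, since that is the package its submodules belong to.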

importlib.util Additions

These are ModuleSpec factory functions, meant as a convenience for
finders. See the Factory Functions section below for more detail.

-   spec_from_file_location(name, location, *, loader=None,
    submodule_search_locations=None)
    -   build a spec from file-oriented information and loader APIs.
-   spec_from_loader(name, loader, *, origin=None, is_package=None)
    - build a spec with missing information filled in by using loader
    APIs.

Other API Additions

-   importlib.find_spec(name, path=None, target=None) will work exactly
    the same as importlib.find_loader() (which it replaces), but return
    a spec instead of a loader.

For finders:

-   importlib.abc.MetaPathFinder.find_spec(name, path, target) and
    importlib.abc.PathEntryFinder.find_spec(name, target) will return a
    module spec to use during import.

For loaders:

-   importlib.abc.Loader.exec_module(module) will execute a module in
    its own namespace. It replaces importlib.abc.Loader.load_module(),
    taking over its module execution functionality.
-   importlib.abc.Loader.create_module(spec) (optional) will return the
    module to use for loading.

For modules:

-   Module objects will have a new attribute: __spec__.

API Changes

-   InspectLoader.is_package() will become optional.

Deprecations

-   importlib.abc.MetaPathFinder.find_module()
-   importlib.abc.PathEntryFinder.find_module()
-   importlib.abc.PathEntryFinder.find_loader()
-   importlib.abc.Loader.load_module()
-   importlib.abc.Loader.module_repr()
-   importlib.util.set_package()
-   importlib.util.set_loader()
-   importlib.find_loader()

Removals

These were introduced prior to Python 3.4's release, so they can simply
be removed.

-   importlib.abc.Loader.init_module_attrs()
-   importlib.util.module_to_load()

Other Changes

-   The import system implementation in importlib will be changed to
    make use of ModuleSpec.
-   importlib.reload() will make use of ModuleSpec.
-   A module's import-related attributes (other than __spec__) will no
    longer be used directly by the import system during that module's
    import. However, this does not impact use of those attributes (e.g.
    __path__) when loading other modules (e.g. submodules).
-   Import-related attributes should no longer be added to modules
    directly, except by the import system.
-   The module type's __repr__() will be a thin wrapper around a pure
    Python implementation which will leverage ModuleSpec.
-   The spec for the __main__ module will reflect the appropriate name
    and origin.

Backward-Compatibility

-   If a finder does not define find_spec(), a spec is derived from the
    loader returned by find_module().
-   PathEntryFinder.find_loader() still takes priority over
    find_module().
-   Loader.load_module() is used if exec_module() is not defined.

What Will not Change?

-   The syntax and semantics of the import statement.
-   Existing finders and loaders will continue to work normally.
-   The import-related module attributes will still be initialized with
    the same information.
-   Finders will still create loaders (now storing them in specs).
-   Loader.load_module(), if a loader defines it, will have all the same
    requirements and may still be called directly.
-   Loaders will still be responsible for module data APIs.
-   importlib.reload() will still overwrite the import-related
    attributes.

Responsibilities

Here's a quick breakdown of where responsibilities lie after this PEP.

finders:

-   create/identify a loader that can load the module.
-   create the spec for the module.

loaders:

-   create the module (optional).
-   execute the module.

ModuleSpec:

-   orchestrate module loading
-   boilerplate for module loading, including managing sys.modules and
    setting import-related attributes
-   create module if loader doesn't
-   call loader.exec_module(), passing in the module in which to exec
-   contain all the information the loader needs to exec the module
-   provide the repr for modules

What Will Existing Finders and Loaders Have to Do Differently?

Immediately? Nothing. The status quo will be deprecated, but will
continue working. However, here are the things that the authors of
finders and loaders should change relative to this PEP:

-   Implement find_spec() on finders.
-   Implement exec_module() on loaders, if possible.

The ModuleSpec factory functions in importlib.util are intended to be
helpful for converting existing finders. spec_from_loader() and
spec_from_file_location() are both straightforward utilities in this
regard.

For existing loaders, exec_module() should be a relatively direct
conversion from the non-boilerplate portion of load_module(). In some
uncommon cases the loader should also implement create_module().
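
A minimal sketch of a finder and loader written against the new API might look like the following. StringFinder, StringLoader, the module name "pep451_demo" and the origin "<string>" are hypothetical. The loader inherits the default create_module() from importlib.abc.Loader, which returns None so that the import machinery creates the module.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class StringLoader(importlib.abc.Loader):
    """Hypothetical loader that execs source text held in memory."""

    def __init__(self, source):
        self.source = source

    def exec_module(self, module):
        # Only the non-boilerplate part of load_module() remains.
        exec(self.source, module.__dict__)


class StringFinder(importlib.abc.MetaPathFinder):
    """Hypothetical meta path finder backed by a name -> source mapping."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, name, path=None, target=None):
        if name not in self.sources:
            return None  # let the next finder on sys.meta_path try
        loader = StringLoader(self.sources[name])
        return importlib.util.spec_from_loader(name, loader, origin="<string>")


finder = StringFinder({"pep451_demo": "GREETING = 'hello'"})
sys.meta_path.insert(0, finder)
try:
    demo = importlib.import_module("pep451_demo")
finally:
    sys.meta_path.remove(finder)

print(demo.GREETING)          # hello
print(demo.__spec__.origin)   # <string>
```

Neither class touches sys.modules or sets import-related attributes; the import machinery handles both.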

ModuleSpec Users

ModuleSpec objects have 3 distinct target audiences: Python itself,
import hooks, and normal Python users.

Python will use specs in the import machinery, in interpreter startup,
and in various standard library modules. Some modules are
import-oriented, like pkgutil, and others are not, like pickle and
pydoc. In all cases, the full ModuleSpec API will get used.

Import hooks (finders and loaders) will make use of the spec in specific
ways. First of all, finders may use the spec factory functions in
importlib.util to create spec objects. They may also directly adjust the
spec attributes after the spec is created. Secondly, the finder may bind
additional information to the spec (in loader_state) for the loader to
consume during module creation/execution. Finally, loaders will make use
of the attributes on a spec when creating and/or executing a module.

Python users will be able to inspect a module's __spec__ to get
import-related information about the object. Generally, Python
applications and interactive users will not be using the ModuleSpec
factory functions nor any of the instance methods.

How Loading Will Work

Here is an outline of what the import machinery does during loading,
adjusted to take advantage of the module's spec and the new loader API:

    module = None
    if spec.loader is not None and hasattr(spec.loader, 'create_module'):
        module = spec.loader.create_module(spec)
    if module is None:
        module = ModuleType(spec.name)
    # The import-related module attributes get set here:
    _init_module_attrs(spec, module)

    if spec.loader is None and spec.submodule_search_locations is not None:
        # Namespace package
        sys.modules[spec.name] = module
    elif not hasattr(spec.loader, 'exec_module'):
        spec.loader.load_module(spec.name)
        # __loader__ and __package__ would be explicitly set here for
        # backwards-compatibility.
    else:
        sys.modules[spec.name] = module
        try:
            spec.loader.exec_module(module)
        except BaseException:
            try:
                del sys.modules[spec.name]
            except KeyError:
                pass
            raise
    module_to_return = sys.modules[spec.name]

These steps are exactly what Loader.load_module() is already expected to
do. Loaders will thus be simplified since they will only need to
implement exec_module().

Note that we must return the module from sys.modules. During loading the
module may have replaced itself in sys.modules. Since we don't have a
post-import hook API to accommodate the use case, we have to deal with
it. However, in the replacement case we do not worry about setting the
import-related module attributes on the object. The module writer is on
their own if they are doing this.
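
The same sequence can be driven by hand from user code. The sketch below assumes importlib.util.module_from_spec(), which was added to the standard library after this PEP (in Python 3.5) and performs the module creation and attribute initialization steps. The file name and module name are hypothetical.

```python
import importlib.util
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "manual_demo.py")
    with open(path, "w") as f:
        f.write("VALUE = 1 + 1\n")

    spec = importlib.util.spec_from_file_location("manual_demo", path)
    # Create the module and set its import-related attributes.
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Mirror the cleanup performed by the import machinery.
        sys.modules.pop(spec.name, None)
        raise
    # The module may have replaced itself in sys.modules.
    module = sys.modules[spec.name]

print(module.VALUE)              # 2
print(module.__spec__ is spec)   # True
```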

How Reloading Will Work

Here is the corresponding outline for reload():

    _RELOADING = {}

    def reload(module):
        try:
            name = module.__spec__.name
        except AttributeError:
            name = module.__name__
        spec = find_spec(name, target=module)

        if sys.modules.get(name) is not module:
            raise ImportError
        if name in _RELOADING:
            return _RELOADING[name]
        _RELOADING[name] = module
        try:
            if spec.loader is None:
                # Namespace loader
                _init_module_attrs(spec, module)
                return module
            if spec.parent and spec.parent not in sys.modules:
                raise ImportError

            _init_module_attrs(spec, module)
            # Ignoring backwards-compatibility call to load_module()
            # for simplicity.
            spec.loader.exec_module(module)
            return sys.modules[name]
        finally:
            del _RELOADING[name]

A key point here is the switch to Loader.exec_module() means that
loaders will no longer have an easy way to know at execution time if it
is a reload or not. Before this proposal, they could simply check to see
if the module was already in sys.modules. Now, by the time exec_module()
is called during load (not reload) the import machinery would already
have placed the module in sys.modules. This is part of the reason why
find_spec() has the "target" parameter.

The semantics of reload will remain essentially the same as they exist
already[3]. The impact of this PEP on some kinds of lazy loading modules
was a point of discussion.[4]
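
As a small illustration of these semantics, reloading executes the new source in the existing module object rather than creating a new one. The module name "reload_demo" and the temporary directory are hypothetical. The second version of the source deliberately differs in size from the first so that any cached bytecode is treated as stale.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reload_demo.py")
    with open(path, "w") as f:
        f.write("VERSION = 1\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        demo = importlib.import_module("reload_demo")
        first = demo.VERSION

        # Edit the source, then reload into the same module object.
        with open(path, "w") as f:
            f.write("VERSION = 2  # edited source\n")
        reloaded = importlib.reload(demo)
    finally:
        sys.path.remove(tmp)

print(first)              # 1
print(reloaded is demo)   # True
print(demo.VERSION)       # 2
```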

ModuleSpec

Attributes

Each of the following names is an attribute on ModuleSpec objects. A
value of None indicates "not set". This contrasts with module objects
where the attribute simply doesn't exist. Most of the attributes
correspond to the import-related attributes of modules. Here is the
mapping. The reverse of this mapping describes how the import machinery
sets the module attributes right before calling exec_module().

+----------------------------+----------------+
| On ModuleSpec              | On Modules     |
+============================+================+
| name                       | __name__       |
+----------------------------+----------------+
| loader                     | __loader__     |
+----------------------------+----------------+
| parent                     | __package__    |
+----------------------------+----------------+
| origin                     | __file__*      |
+----------------------------+----------------+
| cached                     | __cached__*,** |
+----------------------------+----------------+
| submodule_search_locations | __path__**     |
+----------------------------+----------------+
| loader_state               |   -            |
+----------------------------+----------------+
| has_location               |   -            |
+----------------------------+----------------+

* Set on the module only if spec.has_location is true.
** Set on the module only if the spec attribute is not None.

While parent and has_location are read-only properties, the remaining
attributes can be replaced after the module spec is created and even
after import is complete. This allows for unusual cases where directly
modifying the spec is the best option. However, typical use should not
involve changing the state of a module's spec.
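
The mapping can be checked against an imported module. This sketch uses json as an arbitrary example of a package loaded from source. Because later Python versions may stop setting some of the module attributes, it skips any that are absent instead of assuming they exist.

```python
import json

spec = json.__spec__

# (spec attribute, module attribute) pairs taken from the table above.
PAIRS = [
    ("name", "__name__"),
    ("loader", "__loader__"),
    ("parent", "__package__"),
    ("origin", "__file__"),
    ("cached", "__cached__"),
    ("submodule_search_locations", "__path__"),
]

_MISSING = object()
matches = {}
for spec_attr, module_attr in PAIRS:
    module_value = getattr(json, module_attr, _MISSING)
    if module_value is _MISSING:
        continue  # not set on this interpreter
    matches[module_attr] = module_value == getattr(spec, spec_attr)

for module_attr, same in matches.items():
    print(f"{module_attr} matches spec: {same}")
```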

origin

"origin" is a string for the name of the place from which the module
originates. See origin above. Aside from the informational value, it is
also used in the module's repr. In the case of a spec where
"has_location" is true, __file__ is set to the value of "origin". For
built-in modules "origin" would be set to "built-in".

has_location

As explained in the location section above, many modules are
"locatable", meaning there is a corresponding resource from which the
module will be loaded and that resource can be described by a string. In
contrast, non-locatable modules can't be loaded in this fashion, e.g.
builtin modules and modules dynamically created in code. For these, the
name is the only way to access them, so they have an "origin" but not a
"location".

"has_location" is true if the module is locatable. In that case the
spec's origin is used as the location and __file__ is set to
spec.origin. If additional location information is required (e.g.
zipimport), that information may be stored in spec.loader_state.

"has_location" may be implied from the existence of a get_data() method
on the loader.

Incidentally, not all locatable modules will be cache-able, but most
will.

submodule_search_locations

The list of location strings, typically directory paths, in which to
search for submodules. If the module is a package this will be set to a
list (even an empty one). Otherwise it is None.

The name of the corresponding module attribute, __path__, is relatively
ambiguous. Instead of mirroring it, we use a more explicit attribute
name that makes the purpose clear.

loader_state

A finder may set loader_state to any value to provide additional data
for the loader to use during loading. A value of None is the default and
indicates that there is no additional data. Otherwise it can be set to
any object, such as a dict, list, or types.SimpleNamespace, containing
the relevant extra information.

For example, zipimporter could use it to pass the zip archive name to
the loader directly, rather than needing to derive it from origin or
create a custom loader for each find operation.

loader_state is meant for use by the finder and corresponding loader. It
is not guaranteed to be a stable resource for any other use.
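
A minimal sketch of this hand-off, assuming a hypothetical ConstantsLoader that is shared by every module and receives its per-module data through the spec. The dict layout of loader_state is an arbitrary choice for the example, and importlib.util.module_from_spec() is a later addition to the standard library (Python 3.5).

```python
import importlib.abc
import importlib.util
from importlib.machinery import ModuleSpec


class ConstantsLoader(importlib.abc.Loader):
    """Hypothetical loader with no per-module state of its own."""

    def exec_module(self, module):
        # The per-module data arrives through the spec, not the loader.
        state = module.__spec__.loader_state or {}
        module.__dict__.update(state.get("constants", {}))


shared_loader = ConstantsLoader()


def make_spec(name, constants):
    """Play the finder's role: build a spec carrying data for the loader."""
    return ModuleSpec(name, shared_loader, origin="<constants>",
                      loader_state={"constants": dict(constants)})


spec = make_spec("state_demo", {"PI": 3.14159})
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
print(module.PI)  # 3.14159
```

Because the data travels on the spec, one loader instance can serve any number of modules.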

Factory Functions

spec_from_file_location(name, location, *, loader=None,
submodule_search_locations=None)

Build a spec from file-oriented information and loader APIs.

-   "origin" will be set to the location.
-   "has_location" will be set to True.
-   "cached" will be set to the result of calling cache_from_source().
-   "origin" can be deduced from loader.get_filename() (if "location" is
    not passed in).
-   "loader" can be deduced from suffix if the location is a filename.
-   "submodule_search_locations" can be deduced from loader.is_package()
    and from os.path.dirname(location) if location is a filename.

spec_from_loader(name, loader, *, origin=None, is_package=None)

Build a spec with missing information filled in by using loader APIs.

-   "has_location" can be deduced from loader.get_data.
-   "origin" can be deduced from loader.get_filename().
-   "submodule_search_locations" can be deduced from loader.is_package()
    and from os.path.dirname(location) if location is a filename.
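
Both factory functions can be exercised without touching the filesystem. In this sketch the paths are hypothetical and are made absolute up front, and DemoLoader with its is_package() method is a stand-in for a real loader.

```python
import importlib.abc
import importlib.machinery
import importlib.util
import os

# Hypothetical absolute paths; the files do not need to exist.
base = os.path.abspath(os.path.join("some", "dir"))
module_path = os.path.join(base, "factory_demo.py")
package_init = os.path.join(base, "factory_pkg", "__init__.py")

file_spec = importlib.util.spec_from_file_location("factory_demo", module_path)
pkg_spec = importlib.util.spec_from_file_location("factory_pkg", package_init)

print(file_spec.has_location)                 # True
print(type(file_spec.loader).__name__)        # SourceFileLoader
print(file_spec.submodule_search_locations)   # None
print(pkg_spec.submodule_search_locations)    # [<directory of __init__.py>]


class DemoLoader(importlib.abc.Loader):
    """Hypothetical loader that reports one name as a package."""

    def is_package(self, name):
        return name == "loader_pkg"

    def exec_module(self, module):
        pass


loader_spec = importlib.util.spec_from_loader(
    "loader_pkg", DemoLoader(), origin="<demo>")
print(loader_spec.has_location)                 # False
print(loader_spec.submodule_search_locations)   # []
```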

Backward Compatibility

ModuleSpec itself has no backward-compatibility concerns. This would be
a different story if Finder.find_module() were to return a module spec
instead of a loader. In that case, specs would have to act like the
loader that would have been
returned instead. Doing so would be relatively simple, but is an
unnecessary complication. It was part of earlier versions of this PEP.

Subclassing

Subclasses of ModuleSpec are allowed, but should not be necessary.
Simply setting loader_state or adding functionality to a custom finder
or loader will likely be a better fit and should be tried first.
However, as long as a subclass still fulfills the requirements of the
import system, objects of that type are completely fine as the return
value of Finder.find_spec(). The same points apply to duck-typing.

Existing Types

Module Objects

Other than adding __spec__, none of the import-related module attributes
will be changed or deprecated, though some of them could be; any such
deprecation can wait until Python 4.

A module's spec will not be kept in sync with the corresponding
import-related attributes. Though they may differ, in practice they will
typically be the same.

One notable exception is the case where a module is run as a script by
using the -m flag. In that case module.__spec__.name will reflect the
actual module name while module.__name__ will be __main__.

A module's spec is not guaranteed to be identical between two modules
with the same name. Likewise there is no guarantee that successive calls
to importlib.find_spec() will return the same object or even an
equivalent object, though at least the latter is likely.

Finders

Finders are still responsible for identifying, and typically creating,
the loader that should be used to load a module. That loader will now be
stored in the module spec returned by find_spec() rather than returned
directly. As is currently the case without the PEP, if a loader would be
costly to create, that loader can be designed to defer the cost until
later.

MetaPathFinder.find_spec(name, path=None, target=None)

PathEntryFinder.find_spec(name, target=None)

Finders must return ModuleSpec objects when find_spec() is called. This
new method replaces find_module() and find_loader() (in the
PathEntryFinder case). If a finder does not have find_spec(),
find_module() and find_loader() are used instead, for
backward-compatibility.

Adding yet another similar method to finders is a case of practicality.
find_module() could be changed to return specs instead of loaders. This
is tempting because the import APIs have suffered enough, especially
considering PathEntryFinder.find_loader() was just added in Python 3.3.
However, the extra complexity and a less-than-explicit method name
aren't worth it.

The "target" parameter of find_spec()

A call to find_spec() may optionally include a "target" argument. This
is the module object that will be used subsequently as the target of
loading. During normal import (and by default) "target" is None, meaning
the target module has yet to be created. During reloading the module
passed in to reload() is passed through to find_spec() as the target.
This argument allows the finder to build the module spec with more
information than is otherwise available. Doing so is particularly
relevant in identifying the loader to use.

Through find_spec() the finder will always identify the loader it will
return in the spec (or return None). At the point the loader is
identified, the finder should also decide whether or not the loader
supports loading into the target module, in the case that "target" is
passed in. This decision may entail consulting with the loader.

If the finder determines that the loader does not support loading into
the target module, it should either find another loader or raise
ImportError (completely stopping import of the module). This
determination is especially important during reload since, as noted in
How Reloading Will Work, loaders will no longer be able to trivially
identify a reload situation on their own.
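
One way a finder might act on "target" is sketched below. NoReloadFinder and NoReloadLoader are hypothetical, and the finder is called directly rather than through the import machinery to keep the example small.

```python
import importlib.abc
import importlib.util
import types


class NoReloadLoader(importlib.abc.Loader):
    """Hypothetical loader that cannot exec into an existing module."""

    def exec_module(self, module):
        module.LOADED = True


class NoReloadFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder that refuses to load into a target module."""

    def find_spec(self, name, path=None, target=None):
        if name != "target_demo":
            return None
        if target is not None:
            # A target means "load into this existing module", as in reload().
            raise ImportError(f"{name!r} does not support reloading", name=name)
        return importlib.util.spec_from_loader(name, NoReloadLoader())


finder = NoReloadFinder()

# Normal import: no target, so a spec is returned.
spec = finder.find_spec("target_demo")
print(spec.name)  # target_demo

# Reload: the existing module is passed as the target and is rejected.
existing = types.ModuleType("target_demo")
try:
    finder.find_spec("target_demo", target=existing)
except ImportError as exc:
    refusal = exc
    print("refused:", exc)
```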

Two alternatives were presented to the "target" parameter:
Loader.supports_reload() and adding "target" to Loader.exec_module()
instead of find_spec(). supports_reload() was the initial approach to
the reload situation.[5] However, there was some opposition to the
loader-specific, reload-centric approach. [6]

As to "target" on exec_module(), the loader may need other information
from the target module (or spec) during reload, more than just "does
this loader support reloading this module", that is no longer available
with the move away from load_module(). A proposal on the table was to
add something like "target" to exec_module().[7] However, putting
"target" on find_spec() instead is more in line with the goals of this
PEP. Furthermore, it obviates the need for supports_reload().

Namespace Packages

Currently a path entry finder may return (None, portions) from
find_loader() to indicate it found part of a possible namespace package.
To achieve the same effect, find_spec() must return a spec with "loader"
set to None (a.k.a. not set) and with submodule_search_locations set to
the same portions as would have been provided by find_loader(). It's up
to PathFinder how to handle such specs.
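
The shape of such a spec can be seen by asking a path entry finder about a directory that has no __init__.py. This sketch assumes CPython's importlib.machinery.FileFinder and a hypothetical directory named "nspkg_demo"; the finder is called directly, before PathFinder does any further processing.

```python
import os
import tempfile
from importlib.machinery import FileFinder, SourceFileLoader, SOURCE_SUFFIXES

with tempfile.TemporaryDirectory() as root:
    # A directory without __init__.py is a namespace package portion.
    os.mkdir(os.path.join(root, "nspkg_demo"))

    finder = FileFinder(root, (SourceFileLoader, SOURCE_SUFFIXES))
    spec = finder.find_spec("nspkg_demo")

    portions = [os.path.normpath(p) for p in spec.submodule_search_locations]
    expected = os.path.normpath(os.path.join(root, "nspkg_demo"))

print(spec.loader)             # None
print(portions == [expected])  # True
```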

Loaders

Loader.exec_module(module)

Loaders will have a new method, exec_module(). Its only job is to "exec"
the module and consequently populate the module's namespace. It is not
responsible for creating or preparing the module object, nor for any
cleanup afterward. It has no return value. exec_module() will be used
during both loading and reloading.

exec_module() should properly handle the case where it is called more
than once. For some kinds of modules this may mean raising ImportError
every time after the first time the method is called. This is
particularly relevant for reloading, where some kinds of modules do not
support in-place reloading.

Loader.create_module(spec)

Loaders may also implement create_module() that will return a new module
to exec. It may return None to indicate that the default module creation
code should be used. One use case, though atypical, for create_module()
is to provide a module that is a subclass of the builtin module type.
Most loaders will not need to implement create_module().

create_module() should properly handle the case where it is called more
than once for the same spec/module. This may include returning None or
raising ImportError.
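
The atypical use case mentioned above could look roughly like the following sketch. TracingModule and CustomTypeLoader are hypothetical names, and importlib.util.module_from_spec() (from later Python releases) again stands in for the import machinery.

```python
import importlib.abc
import importlib.util
import types
from importlib.machinery import ModuleSpec


class TracingModule(types.ModuleType):
    """Hypothetical subclass of the builtin module type."""


class CustomTypeLoader(importlib.abc.Loader):
    """Hypothetical loader that supplies its own module object."""

    def create_module(self, spec):
        # Returning None instead would request default module creation.
        return TracingModule(spec.name)

    def exec_module(self, module):
        # No import-related attributes are set here; only module content.
        module.answer = 42


spec = ModuleSpec("custom_mod", CustomTypeLoader())
module = importlib.util.module_from_spec(spec)  # calls create_module()
spec.loader.exec_module(module)
```

Note that neither method touches __spec__ or __loader__; in this sketch those attributes are filled in by module_from_spec(), not by the loader.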

Note

exec_module() and create_module() should not set any import-related
module attributes. The fact that load_module() does is a design flaw
that this proposal aims to correct.

Other changes:

PEP 420 introduced the optional module_repr() loader method to limit the
amount of special-casing in the module type's __repr__(). Since this
method is part of ModuleSpec, it will be deprecated on loaders. However,
if it exists on a loader it will be used exclusively.

The Loader.init_module_attr() method, added prior to Python 3.4's
release, will be removed in favor of the same method on ModuleSpec.

However, InspectLoader.is_package() will not be deprecated even though
the same information is found on ModuleSpec. ModuleSpec can use it to
populate its own is_package if that information is not otherwise
available. Still, it will be made optional.

In addition to executing a module during loading, loaders will still be
directly responsible for providing APIs concerning module-related data.

Other Changes

-   The various finders and loaders provided by importlib will be
    updated to comply with this proposal.
-   Any other implementations of or dependencies on the import-related
    APIs (particularly finders and loaders) in the stdlib will be
    likewise adjusted to this PEP. While they should continue to work,
    any such changes that get missed should be considered bugs for the
    Python 3.4.x series.
-   The spec for the __main__ module will reflect how the interpreter
    was started. For instance, with -m the spec's name will be that of
    the module used, while __main__.__name__ will still be "__main__".
-   We will add importlib.find_spec() to mirror importlib.find_loader()
    (which becomes deprecated).
-   importlib.reload() is changed to use ModuleSpec.
-   importlib.reload() will now make use of the per-module import lock.
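
The find_spec() and reload() items can be illustrated with a short sketch. One caveat: in released versions of Python the function is available as importlib.util.find_spec(), which is the spelling used here. The module name "pep451_no_such_module" is a made-up name assumed not to exist.

```python
import importlib
import importlib.util
import json

# Look up a spec without going through a loader-returning API.
spec = importlib.util.find_spec("json")
is_package = spec.submodule_search_locations is not None

# A module that cannot be found yields None rather than an exception.
missing = importlib.util.find_spec("pep451_no_such_module")

# reload() re-executes the module in place and returns the same object.
reloaded = importlib.reload(json)
```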

Reference Implementation

A reference implementation is available at
http://bugs.python.org/issue18864.

Implementation Notes

-   The implementation of this PEP needs to be cognizant of its impact on
    pkgutil (and setuptools). pkgutil has some generic function-based
    extensions to PEP 302 which may break if importlib starts wrapping
    loaders without the tools' knowledge.
-   Other modules to look at: runpy (and pythonrun.c), pickle, pydoc,
    inspect.

    For instance, pickle should be updated in the __main__ case to look
    at module.__spec__.name.
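
The difference between __name__ and __spec__.name in the __main__ case can be observed with a sketch like the one below. It writes a throwaway module (the file name demo_main_spec.py is hypothetical) to a temporary directory and runs it with -m in a child interpreter, assuming the child process is allowed to import from its working directory.

```python
import os
import subprocess
import sys
import tempfile

# The child module reports both names so they can be compared.
source = "print(__name__, __spec__.name)\n"

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "demo_main_spec.py"), "w") as f:
        f.write(source)
    # With -m, the working directory is searched for the module.
    result = subprocess.run(
        [sys.executable, "-m", "demo_main_spec"],
        cwd=tmp, capture_output=True, text=True, check=True,
    )

main_name, spec_name = result.stdout.split()
```

Here main_name is the familiar "__main__" while spec_name carries the real module name, which is the value a tool such as pickle would need.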

Rejected Additions to the PEP

There were a few proposed additions to this proposal that did not fit
well enough into its scope.

There is no "PathModuleSpec" subclass of ModuleSpec that separates out
has_location, cached, and submodule_search_locations. While that might
make the separation cleaner, module objects don't have that distinction.
ModuleSpec will support both cases equally well.

While "ModuleSpec.is_package" would be a simple additional attribute
(aliasing self.submodule_search_locations is not None), it perpetuates
the artificial (and mostly erroneous) distinction between modules and
packages.
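
For code that still wants this check, the aliasing described above is a one-liner outside of ModuleSpec. The helper name spec_is_package() is hypothetical; this is a sketch, not part of the proposal.

```python
from importlib.machinery import ModuleSpec


def spec_is_package(spec):
    """Hypothetical helper mirroring the rejected is_package attribute."""
    return spec.submodule_search_locations is not None


plain = ModuleSpec("mod", None)
package = ModuleSpec("pkg", None, is_package=True)
```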

The module spec Factory Functions could be classmethods on ModuleSpec.
However that would expose them on all modules via __spec__, which has
the potential to unnecessarily confuse non-advanced Python users. The
factory functions have a specific use case, to support finder authors.
See ModuleSpec Users.

Likewise, several other methods could be added to ModuleSpec that expose
the specific uses of module specs by the import machinery:

-   create() - a wrapper around Loader.create_module().
-   exec(module) - a wrapper around Loader.exec_module().
-   load() - an analogue to the deprecated Loader.load_module().

As with the factory functions, exposing these methods via
module.__spec__ is less than desirable. They would end up being an
attractive nuisance, even if only exposed as "private" attributes (as
they were in previous versions of this PEP). If someone finds a need for
these methods later, we can expose them via an appropriate API (separate
from ModuleSpec) at that point, perhaps relative to PEP 406 (import
engine).
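
To make the shape of such a separate API concrete, here is one possible stand-alone helper combining the create and exec steps. load_from_spec() and ConstantLoader are hypothetical names, the error handling is an assumption about reasonable behavior rather than a description of the import machinery, and importlib.util.module_from_spec() comes from later Python releases.

```python
import importlib.abc
import importlib.util
import sys
from importlib.machinery import ModuleSpec


def load_from_spec(spec):
    """Hypothetical stand-alone counterpart of the rejected load()."""
    module = importlib.util.module_from_spec(spec)  # create step
    sys.modules[spec.name] = module
    try:
        spec.loader.exec_module(module)             # exec step
    except BaseException:
        # Do not leave a partially initialized module behind.
        sys.modules.pop(spec.name, None)
        raise
    # The module may have replaced itself in sys.modules while executing.
    return sys.modules[spec.name]


class ConstantLoader(importlib.abc.Loader):
    """Hypothetical loader used only to exercise the helper."""

    def exec_module(self, module):
        module.VALUE = "loaded"


module = load_from_spec(ModuleSpec("pep451_demo_load", ConstantLoader()))
```

Keeping the helper as a plain function means nothing extra is exposed through module.__spec__, which is the concern raised above.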

Conceivably, the load() method could optionally take a list of modules
with which to interact instead of sys.modules. Also, load() could be
leveraged to implement multi-version imports. Both are interesting
ideas, but definitely outside the scope of this proposal.

Others left out:

-   Add ModuleSpec.submodules (RO-property) - returns possible
    submodules relative to the spec.
-   Add ModuleSpec.loaded (RO-property) - the module in sys.modules, if
    any.
-   Add ModuleSpec.data - a descriptor that wraps the data API of the
    spec's loader.
-   Also see [8].

References

Copyright

This document has been placed in the public domain.




[1] http://docs.python.org/3/reference/import.html

[2] https://mail.python.org/pipermail/import-sig/2013-August/000658.html

[3] http://bugs.python.org/issue19413

[4] https://mail.python.org/pipermail/python-dev/2013-August/128129.html

[5] https://mail.python.org/pipermail/python-dev/2013-October/129913.html

[6] https://mail.python.org/pipermail/python-dev/2013-October/129971.html

[7] https://mail.python.org/pipermail/python-dev/2013-October/129933.html

[8] https://mail.python.org/pipermail/import-sig/2013-September/000735.html