PEP 3151 – Reworking the OS and IO exception hierarchy
- Antoine Pitrou <solipsis at pitrou.net>
- Barry Warsaw
- Standards Track
- Python-Dev message
Table of Contents
- Compatibility strategy
- Step 1: coalesce exception types
- Step 2: define additional subclasses
- Possible objections
- Earlier discussion
- Possible alternative
- Exceptions ignored by this PEP
- Appendix A: Survey of common errnos
- Appendix B: Survey of raised OS and IO errors
The standard exception hierarchy is an important part of the Python language. It has two defining qualities: it is both generic and selective. Generic in that the same exception type can be raised - and handled - regardless of the context (for example, whether you are trying to add something to an integer, to call a string method, or to write an object on a socket, a TypeError will be raised for bad argument types). Selective in that it allows the user to easily handle (silence, examine, process, store or encapsulate…) specific kinds of error conditions while letting other errors bubble up to higher calling contexts. For example, you can choose to catch ZeroDivisionErrors without affecting the default handling of other ArithmeticErrors (such as OverflowErrors).
This PEP proposes changes to a part of the exception hierarchy in order to better embody the qualities mentioned above: the errors related to operating system calls (OSError, IOError, mmap.error, select.error, and all their subclasses).
Lack of fine-grained exceptions
The current variety of OS-related exceptions doesn’t allow the user to filter easily for the desired kinds of failures. As an example, consider the task of deleting a file if it exists. The Look Before You Leap (LBYL) idiom suffers from an obvious race condition:
if os.path.exists(filename): os.remove(filename)
If a file named as
filename is created by another thread or process
between the calls to
os.remove, it won’t be
deleted. This can produce bugs in the application, or even security issues.
Therefore, the solution is to try to remove the file, and ignore the error if the file doesn’t exist (an idiom known as Easier to Ask Forgiveness than to get Permission, or EAFP). Careful code will read like the following (which works under both POSIX and Windows systems):
try: os.remove(filename) except OSError as e: if e.errno != errno.ENOENT: raise
try: os.remove(filename) except EnvironmentError as e: if e.errno != errno.ENOENT: raise
This is a lot more to type, and also forces the user to remember the various
cryptic mnemonics from the
errno module. It imposes an additional
cognitive burden and gets tiresome rather quickly. Consequently, many
programmers will instead write the following code, which silences exceptions
try: os.remove(filename) except OSError: pass
os.remove can raise an OSError not only when the file doesn’t exist,
but in other possible situations (for example, the filename points to a
directory, or the current process doesn’t have permission to remove
the file), which all indicate bugs in the application logic and therefore
shouldn’t be silenced. What the programmer would like to write instead is
something such as:
try: os.remove(filename) except FileNotFoundError: pass
Reworking the exception hierarchy will obviously change the exact semantics of at least some existing code. While it is not possible to improve on the current situation without changing exact semantics, it is possible to define a narrower type of compatibility, which we will call useful compatibility.
For this we first must explain what we will call careful and careless
exception handling. Careless (or “naïve”) code is defined as code which
blindly catches any of
select.error without checking the
attribute. This is because such exception types are much too broad to signify
anything. Any of them can be raised for error conditions as diverse as: a
bad file descriptor (which will usually indicate a programming error), an
unconnected socket (ditto), a socket timeout, a file type mismatch, an invalid
argument, a transmission failure, insufficient permissions, a non-existent
directory, a full filesystem, etc.
(moreover, the use of certain of these exceptions is irregular; Appendix B exposes the case of the select module, which raises different exceptions depending on the implementation)
Careful code is defined as code which, when catching any of the above
exceptions, examines the
errno attribute to determine the actual error
condition and takes action depending on it.
Then we can define useful compatibility as follows:
- useful compatibility doesn’t make exception catching any narrower, but
it can be broader for careless exception-catching code. Given the following
kind of snippet, all exceptions caught before this PEP will also be
caught after this PEP, but the reverse may be false (because the coalescing
IOErrorand others means the
exceptclause throws a slightly broader net):
try: ... os.remove(filename) ... except OSError: pass
- useful compatibility doesn’t alter the behaviour of careful
exception-catching code. Given the following kind of snippet, the same
errors should be silenced or re-raised, regardless of whether this PEP
has been implemented or not:
try: os.remove(filename) except OSError as e: if e.errno != errno.ENOENT: raise
The rationale for this compromise is that careless code can’t really be helped, but at least code which “works” won’t suddenly raise errors and crash. This is important since such code is likely to be present in scripts used as cron tasks or automated system administration programs.
Careful code, on the other hand, should not be penalized. Actually, one purpose of this PEP is to ease writing careful code.
Step 1: coalesce exception types
The first step of the resolution is to coalesce existing exception types. The following changes are proposed:
- alias both socket.error and select.error to OSError
- alias mmap.error to OSError
- alias both WindowsError and VMSError to OSError
- alias IOError to OSError
- coalesce EnvironmentError into OSError
Each of these changes doesn’t preserve exact compatibility, but it does preserve useful compatibility (see “compatibility” section above).
Each of these changes can be accepted or refused individually, but of course it is considered that the greatest impact can be achieved if this first step is accepted in full. In this case, the IO exception sub-hierarchy would become:
+-- OSError (replacing IOError, WindowsError, EnvironmentError, etc.) +-- io.BlockingIOError +-- io.UnsupportedOperation (also inherits from ValueError) +-- socket.gaierror +-- socket.herror +-- socket.timeout
Not only does this first step present the user a simpler landscape as explained in the rationale section, but it also allows for a better and more complete resolution of Step 2 (see Prerequisite).
The rationale for keeping
OSError as the official name for generic
OS-related exceptions is that it, precisely, is more generic than
EnvironmentError is more tedious to type and also much lesser-known.
The survey in Appendix B shows that IOError is the dominant error today in the standard library. As for third-party Python code, Google Code Search shows IOError being ten times more popular than EnvironmentError in user code, and three times more popular than OSError . However, with no intention to deprecate IOError in the middle term, the lesser popularity of OSError is not a problem.
Since WindowsError is coalesced into OSError, the latter gains a
attribute under Windows. It is set to None under situations where it is not
meaningful, as is already the case with the
strerror attributes (for example when OSError is raised directly by
Deprecation of names
The following paragraphs outline a possible deprecation strategy for old exception names. However, it has been decided to keep them as aliases for the time being. This decision could be revised in time for Python 4.0.
Deprecating the old built-in exceptions cannot be done in a straightforward fashion by intercepting all lookups in the builtins namespace, since these are performance-critical. We also cannot work at the object level, since the deprecated names will be aliased to non-deprecated objects.
A solution is to recognize these names at compilation time, and
then emit a separate
LOAD_OLD_GLOBAL opcode instead of the regular
LOAD_GLOBAL. This specialized opcode will handle the output of a
DeprecationWarning (or PendingDeprecationWarning, depending on the policy
decided upon) when the name doesn’t exist in the globals namespace, but
only in the builtins one. This will be enough to avoid false positives
(for example if someone defines their own
OSError in a module), and
false negatives will be rare (for example when someone accesses
builtins module rather than directly).
The above approach cannot be used easily, since it would require special-casing some modules when compiling code objects. However, these names are by construction much less visible (they don’t appear in the builtins namespace), and lesser-known too, so we might decide to let them live in their own namespaces.
Step 2: define additional subclasses
The second step of the resolution is to extend the hierarchy by defining subclasses which will be raised, rather than their parent, for specific errno values. Which errno values is subject to discussion, but a survey of existing exception matching practices (see Appendix A) helps us propose a reasonable subset of all values. Trying to map all errno mnemonics, indeed, seems foolish, pointless, and would pollute the root namespace.
Furthermore, in a couple of cases, different errno values could raise
the same exception subclass. For example, EAGAIN, EALREADY, EWOULDBLOCK
and EINPROGRESS are all used to signal that an operation on a non-blocking
socket would block (and therefore needs trying again later). They could
therefore all raise an identical subclass and let the user examine the
errno attribute if (s)he so desires (see below “exception
Step 1 is a loose prerequisite for this.
Prerequisite, because some errnos can currently be attached to different
exception classes: for example, ENOENT can be attached to both OSError and
IOError, depending on the context. If we don’t want to break useful
compatibility, we can’t make an
except OSError (or IOError) fail to
match an exception where it would succeed today.
Loose, because we could decide for a partial resolution of step 2 if existing exception classes are not coalesced: for example, ENOENT could raise a hypothetical FileNotFoundError where an IOError was previously raised, but continue to raise OSError otherwise.
The dependency on step 1 could be totally removed if the new subclasses used multiple inheritance to match with all of the existing superclasses (or, at least, OSError and IOError, which are arguable the most prevalent ones). It would, however, make the hierarchy more complicated and therefore harder to grasp for the user.
New exception classes
The following tentative list of subclasses, along with a description and the list of errnos mapped to them, is submitted to discussion:
FileExistsError: trying to create a file or directory which already exists (EEXIST)
FileNotFoundError: for all circumstances where a file and directory is requested but doesn’t exist (ENOENT)
IsADirectoryError: file-level operation (open(), os.remove()…) requested on a directory (EISDIR)
NotADirectoryError: directory-level operation requested on something else (ENOTDIR)
PermissionError: trying to run an operation without the adequate access rights - for example filesystem permissions (EACCES, EPERM)
BlockingIOError: an operation would block on an object (e.g. socket) set for non-blocking operation (EAGAIN, EALREADY, EWOULDBLOCK, EINPROGRESS); this is the existing
io.BlockingIOErrorwith an extended role
BrokenPipeError: trying to write on a pipe while the other end has been closed, or trying to write on a socket which has been shutdown for writing (EPIPE, ESHUTDOWN)
InterruptedError: a system call was interrupted by an incoming signal (EINTR)
ConnectionAbortedError: connection attempt aborted by peer (ECONNABORTED)
ConnectionRefusedError: connection reset by peer (ECONNREFUSED)
ConnectionResetError: connection reset by peer (ECONNRESET)
TimeoutError: connection timed out (ETIMEDOUT); this can be re-cast as a generic timeout exception, replacing
socket.timeoutand also useful for other types of timeout (for example in Lock.acquire())
ChildProcessError: operation on a child process failed (ECHILD); this is raised mainly by the wait() family of functions.
ProcessLookupError: the given process (as identified by, e.g., its process id) doesn’t exist (ESRCH).
In addition, the following exception class is proposed for inclusion:
ConnectionError: a base class for
The following drawing tries to sum up the proposed additions, along with the corresponding errno values (where applicable). The root of the sub-hierarchy (OSError, assuming Step 1 is accepted in full) is not shown:
+-- BlockingIOError EAGAIN, EALREADY, EWOULDBLOCK, EINPROGRESS +-- ChildProcessError ECHILD +-- ConnectionError +-- BrokenPipeError EPIPE, ESHUTDOWN +-- ConnectionAbortedError ECONNABORTED +-- ConnectionRefusedError ECONNREFUSED +-- ConnectionResetError ECONNRESET +-- FileExistsError EEXIST +-- FileNotFoundError ENOENT +-- InterruptedError EINTR +-- IsADirectoryError EISDIR +-- NotADirectoryError ENOTDIR +-- PermissionError EACCES, EPERM +-- ProcessLookupError ESRCH +-- TimeoutError ETIMEDOUT
Various naming controversies can arise. One of them is whether all
exception class names should end in “
Error”. In favour is consistency
with the rest of the exception hierarchy, against is concision (especially
with long names such as
In order to preserve useful compatibility, these subclasses should still
set adequate values for the various exception attributes defined on the
superclass (for example
filename, and optionally
Since it is proposed that the subclasses are raised based purely on the
errno, little or no changes should be required in extension
modules (either standard or third-party).
The first possibility is to adapt the
of functions (
PyErr_SetFromWindowsErr() under Windows) to raise the
appropriate OSError subclass. This wouldn’t cover, however, Python
code raising OSError directly, using the following idiom (seen in
raise IOError(_errno.EEXIST, "No usable temporary file name found")
A second possibility, suggested by Marc-Andre Lemburg, is to adapt
OSError.__new__ to instantiate the appropriate subclass. This has
the benefit of also covering Python code such as the above.
Making the exception hierarchy finer-grained makes the root (or builtins) namespace larger. This is to be moderated, however, as:
- only a handful of additional classes are proposed;
- while standard exception types live in the root namespace, they are visually distinguished by the fact that they use the CamelCase convention, while almost all other builtins use lowercase naming (except True, False, None, Ellipsis and NotImplemented)
An alternative would be to provide a separate module containing the finer-grained exceptions, but that would defeat the purpose of encouraging careful code over careless code, since the user would first have to import the new module instead of using names already accessible.
While this is the first time such as formal proposal is made, the idea has received informal support in the past ; both the introduction of finer-grained exception classes and the coalescing of OSError and IOError.
The removal of WindowsError alone has been discussed and rejected as part of another PEP, but there seemed to be a consensus that the distinction with OSError wasn’t meaningful. This supports at least its aliasing with OSError.
The reference implementation has been integrated into Python 3.3.
It was formerly developed in http://hg.python.org/features/pep-3151/ in
pep-3151, and also tracked on the bug tracker at
It has been successfully tested on a variety of systems: Linux, Windows,
OpenIndiana and FreeBSD buildbots.
One source of trouble has been with the respective constructors of
WindowsError, which were incompatible. The way it is solved is by
OSError signature and adding a fourth optional argument
to allow passing the Windows error code (which is different from the POSIX
errno). The fourth argument is stored as
winerror and its POSIX
PyErr_SetFromWindowsErr* functions have
been adapted to use the right constructor call.
A slight complication is when the
are called with
OSError rather than
attribute of the exception object would store the Windows error code (such
as 109 for ERROR_BROKEN_PIPE) rather than its POSIX translation (such as 32
for EPIPE), which it does now. For non-socket error codes, this only occurs
in the private
_multiprocessing module for which there is no compatibility
For socket errors, the “POSIX errno” as reflected by the
is numerically equal to the Windows Socket error code
returned by the
WSAGetLastError system call:
>>> errno.EWOULDBLOCK 10035 >>> errno.WSAEWOULDBLOCK 10035
Another possibility would be to introduce an advanced pattern matching syntax when catching exceptions. For example:
try: os.remove(filename) except OSError as e if e.errno == errno.ENOENT: pass
Several problems with this proposal:
- it introduces new syntax, which is perceived by the author to be a heavier change compared to reworking the exception hierarchy
- it doesn’t decrease typing effort significantly
- it doesn’t relieve the programmer from the burden of having to remember errno mnemonics
Exceptions ignored by this PEP
This PEP ignores
EOFError, which signals a truncated input stream in
various protocol and file format implementations (for example
EOFError is not OS- or IO-related, it is a logical error raised at
a higher level.
This PEP also ignores
SSLError, which is raised by the
in order to propagate errors signalled by the
OpenSSL library. Ideally,
SSLError would benefit from a similar but separate treatment since it
defines its own constants for error types (
etc.). In Python 3.2,
SSLError is already replaced with
when it signals a socket timeout (see issue 10272).
Endly, the fate of
socket.herror is not settled.
While they would deserve less cryptic names, this can be handled separately
from the exception hierarchy reorganization effort.
Appendix A: Survey of common errnos
This is a quick inventory of the various errno mnemonics checked for in
the standard library and its tests, as part of
Common errnos with OSError
EBADF: bad file descriptor (usually means the file descriptor was closed)
EEXIST: file or directory exists
EINTR: interrupted function call
EISDIR: is a directory
ENOTDIR: not a directory
ENOENT: no such file or directory
EOPNOTSUPP: operation not supported on socket (possible confusion with the existing io.UnsupportedOperation)
EPERM: operation not permitted (when using e.g. os.setuid())
Common errnos with IOError
EACCES: permission denied (for filesystem operations)
EBADF: bad file descriptor (with select.epoll); read operation on a write-only GzipFile, or vice-versa
EBUSY: device or resource busy
EISDIR: is a directory (when trying to open())
ENODEV: no such device
ENOENT: no such file or directory (when trying to open())
ETIMEDOUT: connection timed out
Common errnos with socket.error
All these errors may also be associated with a plain IOError, for example when calling read() on a socket’s file descriptor.
EAGAIN: resource temporarily unavailable (during a non-blocking socket call except connect())
EALREADY: connection already in progress (during a non-blocking connect())
EINPROGRESS: operation in progress (during a non-blocking connect())
EINTR: interrupted function call
EISCONN: the socket is connected
ECONNABORTED: connection aborted by peer (during an accept() call)
ECONNREFUSED: connection refused by peer
ECONNRESET: connection reset by peer
ENOTCONN: socket not connected
ESHUTDOWN: cannot send after transport endpoint shutdown
EWOULDBLOCK: same reasons as
Common errnos with select.error
EINTR: interrupted function call
Appendix B: Survey of raised OS and IO errors
VMSError is completely unused by the interpreter core and the standard library. It was added as part of the OpenVMS patches submitted in 2002 by Jean-François Piéronne ; the motivation for including VMSError was that it could be raised by third-party packages.
Handling of PYTHONSTARTUP raises IOError (but the error gets discarded):
$ PYTHONSTARTUP=foox ./python Python 3.2a0 (py3k:82920M, Jul 16 2010, 22:53:23) [GCC 4.4.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. Could not open PYTHONSTARTUP IOError: [Errno 2] No such file or directory: 'foox'
PyObject_Print() raises IOError when ferror() signals an error on the
FILE * parameter (which, in the source tree, is always either stdout or
Unicode encoding and decoding using the
mbcs encoding can raise
WindowsError for some error conditions.
Raises IOError throughout (OSError is unused):
>>> bz2.BZ2File("foox", "rb") Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: [Errno 2] No such file or directory >>> bz2.BZ2File("LICENSE", "rb").read() Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: invalid data stream >>> bz2.BZ2File("/tmp/zzz.bz2", "wb").read() Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: file is not ready for reading
_dbm.error and _gdbm.error inherit from IOError:
>>> dbm.gnu.open("foox") Traceback (most recent call last): File "<stdin>", line 1, in <module> _gdbm.error: [Errno 2] No such file or directory
Raises IOError throughout (OSError is unused).
Raises IOError for bad file descriptors:
>>> imp.load_source("foo", "foo", 123) Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: [Errno 9] Bad file descriptor
Raises IOError when trying to open a directory under Unix:
>>> open("Python/", "r") Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: [Errno 21] Is a directory: 'Python/'
Raises IOError or io.UnsupportedOperation (which inherits from the former) for unsupported operations:
>>> open("LICENSE").write("bar") Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: not writable >>> io.StringIO().fileno() Traceback (most recent call last): File "<stdin>", line 1, in <module> io.UnsupportedOperation: fileno >>> open("LICENSE").seek(1, 1) Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: can't do nonzero cur-relative seeks
Raises either IOError or TypeError when the inferior I/O layer misbehaves (i.e. violates the API it is expected to implement).
Raises IOError when the underlying OS resource becomes invalid:
>>> f = open("LICENSE") >>> os.close(f.fileno()) >>> f.read() Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: [Errno 9] Bad file descriptor
…or for implementation-specific optimizations:
>>> f = open("LICENSE") >>> next(f) 'A. HISTORY OF THE SOFTWARE\n' >>> f.tell() Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: telling position disabled by next() call
Raises BlockingIOError (inheriting from IOError) when a call on a non-blocking object would block.
Under Unix, raises its own
mmap.error (inheriting from EnvironmentError)
>>> mmap.mmap(123, 10) Traceback (most recent call last): File "<stdin>", line 1, in <module> mmap.error: [Errno 9] Bad file descriptor >>> mmap.mmap(os.open("/tmp", os.O_RDONLY), 10) Traceback (most recent call last): File "<stdin>", line 1, in <module> mmap.error: [Errno 13] Permission denied
Under Windows, however, it mostly raises WindowsError (the source code
also shows a few occurrences of
>>> fd = os.open("LICENSE", os.O_RDONLY) >>> m = mmap.mmap(fd, 16384) Traceback (most recent call last): File "<stdin>", line 1, in <module> WindowsError: [Error 5] Accès refusé >>> sys.last_value.errno 13 >>> errno.errorcode 'EACCES' >>> m = mmap.mmap(-1, 4096) >>> m.resize(16384) Traceback (most recent call last): File "<stdin>", line 1, in <module> WindowsError: [Error 87] Paramètre incorrect >>> sys.last_value.errno 22 >>> errno.errorcode 'EINVAL'
os / posix
posix) module raises OSError throughout, except under
Windows where WindowsError can be raised instead.
Raises IOError throughout (OSError is unused):
>>> ossaudiodev.open("foo", "r") Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: [Errno 2] No such file or directory: 'foo'
Raises IOError in various file-handling functions:
>>> readline.read_history_file("foo") Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: [Errno 2] No such file or directory >>> readline.read_init_file("foo") Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: [Errno 2] No such file or directory >>> readline.write_history_file("/dev/nonexistent") Traceback (most recent call last): File "<stdin>", line 1, in <module> IOError: [Errno 13] Permission denied
- select() and poll objects raise
select.error, which doesn’t inherit from anything (but poll.modify() raises IOError);
- epoll objects raise IOError;
- kqueue objects raise both OSError and IOError.
As a side-note, not deriving from
does not get the useful
errno attribute. User code must check
>>> signal.alarm(1); select.select(, , ) 0 Traceback (most recent call last): File "<stdin>", line 1, in <module> select.error: (4, 'Interrupted system call') >>> e = sys.last_value >>> e error(4, 'Interrupted system call') >>> e.errno == errno.EINTR Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'error' object has no attribute 'errno' >>> e.args == errno.EINTR True
signal.ItimerError inherits from IOError.
socket.error inherits from IOError.
sys.getwindowsversion() raises WindowsError with a bogus error number
GetVersionEx() call fails.
Raises IOError for internal errors in time.time() and time.sleep().
zipimporter.get_data() can raise IOError.
Significant input has been received from Nick Coghlan.
This document has been placed in the public domain.
Last modified: 2022-01-21 11:03:51 GMT