PEP: 391 Title: Dictionary-Based Configuration For Logging Version:
$Revision$ Last-Modified: $Date$ Author: Vinay Sajip <vinay_sajip at
red-dove.com> Status: Final Type: Standards Track Content-Type:
text/x-rst Created: 15-Oct-2009 Python-Version: 2.7, 3.2 Post-History:

Abstract

This PEP describes a new way of configuring logging using a dictionary
to hold configuration information.

Rationale

The present means for configuring Python's logging package is either by
using the logging API to configure logging programmatically, or else by
means of ConfigParser-based configuration files.

Programmatic configuration, while offering maximal control, fixes the
configuration in Python code. This does not facilitate changing it
easily at runtime, and, as a result, the ability to flexibly turn the
verbosity of logging up and down for different parts of a using
application is lost. This limits the usability of logging as an aid to
diagnosing problems - and sometimes, logging is the only diagnostic aid
available in production environments.

The ConfigParser-based configuration system is usable, but does not
allow its users to configure all aspects of the logging package. For
example, Filters cannot be configured using this system. Furthermore,
the ConfigParser format appears to engender dislike (sometimes strong
dislike) in some quarters. Though it was chosen because it was the only
configuration format supported in the Python standard at that time, many
people regard it (or perhaps just the particular schema chosen for
logging's configuration) as 'crufty' or 'ugly', in some cases apparently
on purely aesthetic grounds.

Recent versions of Python include JSON support in the standard library,
and this is also usable as a configuration format. In other
environments, such as Google App Engine, YAML is used to configure
applications, and usually the configuration of logging would be
considered an integral part of the application configuration. Although
the standard library does not contain YAML support at present, support
for both JSON and YAML can be provided in a common way because both of
these serialization formats allow deserialization to Python
dictionaries.

By providing a way to configure logging by passing the configuration in
a dictionary, logging will be easier to configure not only for users of
JSON and/or YAML, but also for users of custom configuration methods, by
providing a common format in which to describe the desired
configuration.

Another drawback of the current ConfigParser-based configuration system
is that it does not support incremental configuration: a new
configuration completely replaces the existing configuration. Although
full flexibility for incremental configuration is difficult to provide
in a multi-threaded environment, the new configuration mechanism will
allow the provision of limited support for incremental configuration.

Specification

The specification consists of two parts: the API and the format of the
dictionary used to convey configuration information (i.e. the schema to
which it must conform).

Naming

Historically, the logging package has not been PEP 8 conformant. At some
future time, this will be corrected by changing method and function
names in the package in order to conform with PEP 8. However, in the
interests of uniformity, the proposed additions to the API use a naming
scheme which is consistent with the present scheme used by logging.

API

The logging.config module will have the following addition:

-   A function, called dictConfig(), which takes a single argument
    - the dictionary holding the configuration. Exceptions will be
    raised if there are errors while processing the dictionary.

It will be possible to customize this API - see the section on API
Customization. Incremental configuration is covered in its own section.

Dictionary Schema - Overview

Before describing the schema in detail, it is worth saying a few words
about object connections, support for user-defined objects and access to
external and internal objects.

Object connections

The schema is intended to describe a set of logging objects - loggers,
handlers, formatters, filters - which are connected to each other in an
object graph. Thus, the schema needs to represent connections between
the objects. For example, say that, once configured, a particular logger
has attached to it a particular handler. For the purposes of this
discussion, we can say that the logger represents the source, and the
handler the destination, of a connection between the two. Of course in
the configured objects this is represented by the logger holding a
reference to the handler. In the configuration dict, this is done by
giving each destination object an id which identifies it unambiguously,
and then using the id in the source object's configuration to indicate
that a connection exists between the source and the destination object
with that id.

So, for example, consider the following YAML snippet:

    formatters:
      brief:
        # configuration for formatter with id 'brief' goes here
      precise:
        # configuration for formatter with id 'precise' goes here
    handlers:
      h1: #This is an id
       # configuration of handler with id 'h1' goes here
       formatter: brief
      h2: #This is another id
       # configuration of handler with id 'h2' goes here
       formatter: precise
    loggers:
      foo.bar.baz:
        # other configuration for logger 'foo.bar.baz'
        handlers: [h1, h2]

(Note: YAML will be used in this document as it is a little more
readable than the equivalent Python source form for the dictionary.)

The ids for loggers are the logger names which would be used
programmatically to obtain a reference to those loggers, e.g.
foo.bar.baz. The ids for Formatters and Filters can be any string value
(such as brief, precise above) and they are transient, in that they are
only meaningful for processing the configuration dictionary and used to
determine connections between objects, and are not persisted anywhere
when the configuration call is complete.

Handler ids are treated specially, see the section on Handler Ids,
below.

The above snippet indicates that logger named foo.bar.baz should have
two handlers attached to it, which are described by the handler ids h1
and h2. The formatter for h1 is that described by id brief, and the
formatter for h2 is that described by id precise.

User-defined objects

The schema should support user-defined objects for handlers, filters and
formatters. (Loggers do not need to have different types for different
instances, so there is no support - in the configuration -for
user-defined logger classes.)

Objects to be configured will typically be described by dictionaries
which detail their configuration. In some places, the logging system
will be able to infer from the context how an object is to be
instantiated, but when a user-defined object is to be instantiated, the
system will not know how to do this. In order to provide complete
flexibility for user-defined object instantiation, the user will need to
provide a 'factory' - a callable which is called with a configuration
dictionary and which returns the instantiated object. This will be
signalled by an absolute import path to the factory being made available
under the special key '()'. Here's a concrete example:

    formatters:
      brief:
        format: '%(message)s'
      default:
        format: '%(asctime)s %(levelname)-8s %(name)-15s %(message)s'
        datefmt: '%Y-%m-%d %H:%M:%S'
      custom:
          (): my.package.customFormatterFactory
          bar: baz
          spam: 99.9
          answer: 42

The above YAML snippet defines three formatters. The first, with id
brief, is a standard logging.Formatter instance with the specified
format string. The second, with id default, has a longer format and also
defines the time format explicitly, and will result in a
logging.Formatter initialized with those two format strings. Shown in
Python source form, the brief and default formatters have configuration
sub-dictionaries:

    {
      'format' : '%(message)s'
    }

and:

    {
      'format' : '%(asctime)s %(levelname)-8s %(name)-15s %(message)s',
      'datefmt' : '%Y-%m-%d %H:%M:%S'
    }

respectively, and as these dictionaries do not contain the special key
'()', the instantiation is inferred from the context: as a result,
standard logging.Formatter instances are created. The configuration
sub-dictionary for the third formatter, with id custom, is:

    {
      '()' : 'my.package.customFormatterFactory',
      'bar' : 'baz',
      'spam' : 99.9,
      'answer' : 42
    }

and this contains the special key '()', which means that user-defined
instantiation is wanted. In this case, the specified factory callable
will be used. If it is an actual callable it will be used directly -
otherwise, if you specify a string (as in the example) the actual
callable will be located using normal import mechanisms. The callable
will be called with the remaining items in the configuration
sub-dictionary as keyword arguments. In the above example, the formatter
with id custom will be assumed to be returned by the call:

    my.package.customFormatterFactory(bar='baz', spam=99.9, answer=42)

The key '()' has been used as the special key because it is not a valid
keyword parameter name, and so will not clash with the names of the
keyword arguments used in the call. The '()' also serves as a mnemonic
that the corresponding value is a callable.

Access to external objects

There are times where a configuration will need to refer to objects
external to the configuration, for example sys.stderr. If the
configuration dict is constructed using Python code then this is
straightforward, but a problem arises when the configuration is provided
via a text file (e.g. JSON, YAML). In a text file, there is no standard
way to distinguish sys.stderr from the literal string 'sys.stderr'. To
facilitate this distinction, the configuration system will look for
certain special prefixes in string values and treat them specially. For
example, if the literal string 'ext://sys.stderr' is provided as a value
in the configuration, then the ext:// will be stripped off and the
remainder of the value processed using normal import mechanisms.

The handling of such prefixes will be done in a way analogous to
protocol handling: there will be a generic mechanism to look for
prefixes which match the regular expression
^(?P<prefix>[a-z]+)://(?P<suffix>.*)$ whereby, if the prefix is
recognised, the suffix is processed in a prefix-dependent manner and the
result of the processing replaces the string value. If the prefix is not
recognised, then the string value will be left as-is.

The implementation will provide for a set of standard prefixes such as
ext:// but it will be possible to disable the mechanism completely or
provide additional or different prefixes for special handling.

Access to internal objects

As well as external objects, there is sometimes also a need to refer to
objects in the configuration. This will be done implicitly by the
configuration system for things that it knows about. For example, the
string value 'DEBUG' for a level in a logger or handler will
automatically be converted to the value logging.DEBUG, and the handlers,
filters and formatter entries will take an object id and resolve to the
appropriate destination object.

However, a more generic mechanism needs to be provided for the case of
user-defined objects which are not known to logging. For example, take
the instance of logging.handlers.MemoryHandler, which takes a target
which is another handler to delegate to. Since the system already knows
about this class, then in the configuration, the given target just needs
to be the object id of the relevant target handler, and the system will
resolve to the handler from the id. If, however, a user defines a
my.package.MyHandler which has a alternate handler, the configuration
system would not know that the alternate referred to a handler. To cater
for this, a generic resolution system will be provided which allows the
user to specify:

    handlers:
      file:
        # configuration of file handler goes here

      custom:
        (): my.package.MyHandler
        alternate: cfg://handlers.file

The literal string 'cfg://handlers.file' will be resolved in an
analogous way to the strings with the ext:// prefix, but looking in the
configuration itself rather than the import namespace. The mechanism
will allow access by dot or by index, in a similar way to that provided
by str.format. Thus, given the following snippet:

    handlers:
      email:
        class: logging.handlers.SMTPHandler
        mailhost: localhost
        fromaddr: my_app@domain.tld
        toaddrs:
          - support_team@domain.tld
          - dev_team@domain.tld
        subject: Houston, we have a problem.

in the configuration, the string 'cfg://handlers' would resolve to the
dict with key handlers, the string 'cfg://handlers.email would resolve
to the dict with key email in the handlers dict, and so on. The string
'cfg://handlers.email.toaddrs[1] would resolve to 'dev_team.domain.tld'
and the string 'cfg://handlers.email.toaddrs[0]' would resolve to the
value 'support_team@domain.tld'. The subject value could be accessed
using either 'cfg://handlers.email.subject' or, equivalently,
'cfg://handlers.email[subject]'. The latter form only needs to be used
if the key contains spaces or non-alphanumeric characters. If an index
value consists only of decimal digits, access will be attempted using
the corresponding integer value, falling back to the string value if
needed.

Given a string cfg://handlers.myhandler.mykey.123, this will resolve to
config_dict['handlers']['myhandler']['mykey']['123']. If the string is
specified as cfg://handlers.myhandler.mykey[123], the system will
attempt to retrieve the value from
config_dict['handlers']['myhandler']['mykey'][123], and fall back to
config_dict['handlers']['myhandler']['mykey']['123'] if that fails.

Handler Ids

Some specific logging configurations require the use of handler levels
to achieve the desired effect. However, unlike loggers which can always
be identified by their names, handlers have no persistent handles
whereby levels can be changed via an incremental configuration call.

Therefore, this PEP proposes to add an optional name property to
handlers. If used, this will add an entry in a dictionary which maps the
name to the handler. (The entry will be removed when the handler is
closed.) When an incremental configuration call is made, handlers will
be looked up in this dictionary to set the handler level according to
the value in the configuration. See the section on incremental
configuration for more details.

In theory, such a "persistent name" facility could also be provided for
Filters and Formatters. However, there is not a strong case to be made
for being able to configure these incrementally. On the basis that
practicality beats purity, only Handlers will be given this new name
property. The id of a handler in the configuration will become its name.

The handler name lookup dictionary is for configuration use only and
will not become part of the public API for the package.

Dictionary Schema - Detail

The dictionary passed to dictConfig() must contain the following keys:

-   version - to be set to an integer value representing the schema
    version. The only valid value at present is 1, but having this key
    allows the schema to evolve while still preserving backwards
    compatibility.

All other keys are optional, but if present they will be interpreted as
described below. In all cases below where a 'configuring dict' is
mentioned, it will be checked for the special '()' key to see if a
custom instantiation is required. If so, the mechanism described above
is used to instantiate; otherwise, the context is used to determine how
to instantiate.

-   formatters - the corresponding value will be a dict in which each
    key is a formatter id and each value is a dict describing how to
    configure the corresponding Formatter instance.

    The configuring dict is searched for keys format and datefmt (with
    defaults of None) and these are used to construct a
    logging.Formatter instance.

-   filters - the corresponding value will be a dict in which each key
    is a filter id and each value is a dict describing how to configure
    the corresponding Filter instance.

    The configuring dict is searched for key name (defaulting to the
    empty string) and this is used to construct a logging.Filter
    instance.

-   handlers - the corresponding value will be a dict in which each key
    is a handler id and each value is a dict describing how to configure
    the corresponding Handler instance.

    The configuring dict is searched for the following keys:

    -   class (mandatory). This is the fully qualified name of the
        handler class.
    -   level (optional). The level of the handler.
    -   formatter (optional). The id of the formatter for this handler.
    -   filters (optional). A list of ids of the filters for this
        handler.

    All other keys are passed through as keyword arguments to the
    handler's constructor. For example, given the snippet:

        handlers:
          console:
            class : logging.StreamHandler
            formatter: brief
            level   : INFO
            filters: [allow_foo]
            stream  : ext://sys.stdout
          file:
            class : logging.handlers.RotatingFileHandler
            formatter: precise
            filename: logconfig.log
            maxBytes: 1024
            backupCount: 3

    the handler with id console is instantiated as a
    logging.StreamHandler, using sys.stdout as the underlying stream.
    The handler with id file is instantiated as a
    logging.handlers.RotatingFileHandler with the keyword arguments
    filename='logconfig.log', maxBytes=1024, backupCount=3.

-   loggers - the corresponding value will be a dict in which each key
    is a logger name and each value is a dict describing how to
    configure the corresponding Logger instance.

    The configuring dict is searched for the following keys:

    -   level (optional). The level of the logger.
    -   propagate (optional). The propagation setting of the logger.
    -   filters (optional). A list of ids of the filters for this
        logger.
    -   handlers (optional). A list of ids of the handlers for this
        logger.

    The specified loggers will be configured according to the level,
    propagation, filters and handlers specified.

-   root - this will be the configuration for the root logger.
    Processing of the configuration will be as for any logger, except
    that the propagate setting will not be applicable.

-   incremental - whether the configuration is to be interpreted as
    incremental to the existing configuration. This value defaults to
    False, which means that the specified configuration replaces the
    existing configuration with the same semantics as used by the
    existing fileConfig() API.

    If the specified value is True, the configuration is processed as
    described in the section on Incremental Configuration, below.

-   disable_existing_loggers - whether any existing loggers are to be
    disabled. This setting mirrors the parameter of the same name in
    fileConfig(). If absent, this parameter defaults to True. This value
    is ignored if incremental is True.

A Working Example

The following is an actual working configuration in YAML format (except
that the email addresses are bogus):

    formatters:
      brief:
        format: '%(levelname)-8s: %(name)-15s: %(message)s'
      precise:
        format: '%(asctime)s %(name)-15s %(levelname)-8s %(message)s'
    filters:
      allow_foo:
        name: foo
    handlers:
      console:
        class : logging.StreamHandler
        formatter: brief
        level   : INFO
        stream  : ext://sys.stdout
        filters: [allow_foo]
      file:
        class : logging.handlers.RotatingFileHandler
        formatter: precise
        filename: logconfig.log
        maxBytes: 1024
        backupCount: 3
      debugfile:
        class : logging.FileHandler
        formatter: precise
        filename: logconfig-detail.log
        mode: a
      email:
        class: logging.handlers.SMTPHandler
        mailhost: localhost
        fromaddr: my_app@domain.tld
        toaddrs:
          - support_team@domain.tld
          - dev_team@domain.tld
        subject: Houston, we have a problem.
    loggers:
      foo:
        level : ERROR
        handlers: [debugfile]
      spam:
        level : CRITICAL
        handlers: [debugfile]
        propagate: no
      bar.baz:
        level: WARNING
    root:
      level     : DEBUG
      handlers  : [console, file]

Incremental Configuration

It is difficult to provide complete flexibility for incremental
configuration. For example, because objects such as filters and
formatters are anonymous, once a configuration is set up, it is not
possible to refer to such anonymous objects when augmenting a
configuration.

Furthermore, there is not a compelling case for arbitrarily altering the
object graph of loggers, handlers, filters, formatters at run-time, once
a configuration is set up; the verbosity of loggers and handlers can be
controlled just by setting levels (and, in the case of loggers,
propagation flags). Changing the object graph arbitrarily in a safe way
is problematic in a multi-threaded environment; while not impossible,
the benefits are not worth the complexity it adds to the implementation.

Thus, when the incremental key of a configuration dict is present and is
True, the system will ignore any formatters and filters entries
completely, and process only the level settings in the handlers entries,
and the level and propagate settings in the loggers and root entries.

It's certainly possible to provide incremental configuration by other
means, for example making dictConfig() take an incremental keyword
argument which defaults to False. The reason for suggesting that a value
in the configuration dict be used is that it allows for configurations
to be sent over the wire as pickled dicts to a socket listener. Thus,
the logging verbosity of a long-running application can be altered over
time with no need to stop and restart the application.

Note: Feedback on incremental configuration needs based on your
practical experience will be particularly welcome.

API Customization

The bare-bones dictConfig() API will not be sufficient for all use
cases. Provision for customization of the API will be made by providing
the following:

-   A class, called DictConfigurator, whose constructor is passed the
    dictionary used for configuration, and which has a configure()
    method.
-   A callable, called dictConfigClass, which will (by default) be set
    to DictConfigurator. This is provided so that if desired,
    DictConfigurator can be replaced with a suitable user-defined
    implementation.

The dictConfig() function will call dictConfigClass passing the
specified dictionary, and then call the configure() method on the
returned object to actually put the configuration into effect:

    def dictConfig(config):
        dictConfigClass(config).configure()

This should cater to all customization needs. For example, a subclass of
DictConfigurator could call DictConfigurator.__init__() in its own
__init__(), then set up custom prefixes which would be usable in the
subsequent configure() call. The dictConfigClass would be bound to the
subclass, and then dictConfig() could be called exactly as in the
default, uncustomized state.

Change to Socket Listener Implementation

The existing socket listener implementation will be modified as follows:
when a configuration message is received, an attempt will be made to
deserialize to a dictionary using the json module. If this step fails,
the message will be assumed to be in the fileConfig format and processed
as before. If deserialization is successful, then dictConfig() will be
called to process the resulting dictionary.

Configuration Errors

If an error is encountered during configuration, the system will raise a
ValueError, TypeError, AttributeError or ImportError with a suitably
descriptive message. The following is a (possibly incomplete) list of
conditions which will raise an error:

-   A level which is not a string or which is a string not corresponding
    to an actual logging level
-   A propagate value which is not a boolean
-   An id which does not have a corresponding destination
-   A non-existent handler id found during an incremental call
-   An invalid logger name
-   Inability to resolve to an internal or external object

Discussion in the community

The PEP has been announced on python-dev and python-list. While there
hasn't been a huge amount of discussion, this is perhaps to be expected
for a niche topic.

Discussion threads on python-dev:

https://mail.python.org/pipermail/python-dev/2009-October/092695.html
https://mail.python.org/pipermail/python-dev/2009-October/092782.html
https://mail.python.org/pipermail/python-dev/2009-October/093062.html

And on python-list:

https://mail.python.org/pipermail/python-list/2009-October/1223658.html
https://mail.python.org/pipermail/python-list/2009-October/1224228.html

There have been some comments in favour of the proposal, no objections
to the proposal as a whole, and some questions and objections about
specific details. These are believed by the author to have been
addressed by making changes to the PEP.

Reference implementation

A reference implementation of the changes is available as a module
dictconfig.py with accompanying unit tests in test_dictconfig.py, at:

http://bitbucket.org/vinay.sajip/dictconfig

This incorporates all features other than the socket listener change.

Copyright

This document has been placed in the public domain.



  Local Variables: mode: indented-text indent-tabs-mode: nil
  sentence-end-double-space: t fill-column: 70 coding: utf-8 End: