The Python logging package is a lightweight but extensible package for keeping better track of what your own code does. Using it gives you much more flexibility than littering your code with superfluous print() calls.
However, Python’s logging package can be complicated in certain spots. Handlers, loggers, levels, namespaces, filters: it’s not easy to keep track of all of these pieces and how they interact.
One way to tie up the loose ends in your understanding of logging is to peek under the hood at its CPython source code. The Python code behind logging is concise and modular, and reading through it can help you get that aha moment.
This article is meant to complement the logging HOWTO document as well as Logging in Python, which is a walkthrough on how to use the package.
By the end of this article, you’ll be familiar with the following:
- logging levels and how they work
- Thread-safety versus process-safety in logging
- The design of logging from an OOP perspective
- Logging in libraries vs applications
- Best practices and design patterns for using logging
For the most part, we’ll go line-by-line down the core module in Python’s logging package in order to build a picture of how it’s laid out.
How to Follow Along
Because the logging source code is central to this article, you can assume that any code block or link is based on a specific commit in the Python 3.7 CPython repository, namely commit d730719. You can find the logging package itself in the Lib/ directory within the CPython source.
Within the logging package, most of the heavy lifting occurs within logging/__init__.py, which is the file you’ll spend the most time on here:
cpython/
│
├── Lib/
│ ├── logging/
│ │ ├── __init__.py
│ │ ├── config.py
│ │ └── handlers.py
│ ├── ...
├── Modules/
├── Include/
...
With that, let’s jump in.
Preliminaries
Before we get to the heavyweight classes, the top hundred lines or so of __init__.py introduce a few subtle but important concepts.
Preliminary #1: A Level Is Just an int!
Objects like logging.INFO or logging.DEBUG can seem a bit opaque. What are these variables internally, and how are they defined?
In fact, the uppercase constants from Python’s logging are just integers, forming an enum-like collection of numerical levels:
CRITICAL = 50
FATAL = CRITICAL
ERROR = 40
WARNING = 30
WARN = WARNING
INFO = 20
DEBUG = 10
NOTSET = 0
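Because levels are plain integers, you can check their values and compare them with ordinary integer operations. A quick sketch:

```python
import logging

# The uppercase constants are plain ints, so comparing severities
# is ordinary integer comparison.
print(logging.CRITICAL)                # 50
print(logging.DEBUG < logging.INFO)    # True
print(logging.WARN == logging.WARNING) # True: WARN is just an alias
print(logging.getLevelName(40))        # 'ERROR'
```

The getLevelName() helper maps in both directions, which is handy when a level arrives as a bare number.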
Why not just use the strings "INFO" or "DEBUG"? Levels are int constants to allow for the simple, unambiguous comparison of one level with another. They are given names as well to lend them semantic meaning. Saying that a message has a severity of 50 may not be immediately clear, but saying that it has a level of CRITICAL lets you know that you’ve got a flashing red light somewhere in your program.
Now, technically, you can pass just the str form of a level in some places, such as logger.setLevel("DEBUG"). Internally, this will call _checkLevel(), which ultimately does a dict lookup for the corresponding int:
_nameToLevel = {
    'CRITICAL': CRITICAL,
    'FATAL': FATAL,
    'ERROR': ERROR,
    'WARN': WARNING,
    'WARNING': WARNING,
    'INFO': INFO,
    'DEBUG': DEBUG,
    'NOTSET': NOTSET,
}

def _checkLevel(level):
    if isinstance(level, int):
        rv = level
    elif str(level) == level:
        if level not in _nameToLevel:
            raise ValueError("Unknown level: %r" % level)
        rv = _nameToLevel[level]
    else:
        raise TypeError("Level not an integer or a valid string: %r" % level)
    return rv
Which should you prefer? I’m not too opinionated on this, but it’s notable that the logging docs consistently use the form logging.DEBUG rather than "DEBUG" or 10. Also, passing the str form isn’t an option in Python 2, and some logging methods such as logger.isEnabledFor() will accept only an int, not its str cousin.
Preliminary #2: Logging Is Thread-Safe, but Not Process-Safe
A few lines down, you’ll find the following short code block, which is sneakily critical to the whole package:
import threading

_lock = threading.RLock()

def _acquireLock():
    if _lock:
        _lock.acquire()

def _releaseLock():
    if _lock:
        _lock.release()
The _lock object is a reentrant lock that sits in the global namespace of the logging/__init__.py module. It makes pretty much every object and operation in the entire logging package thread-safe, enabling threads to do read and write operations without the threat of a race condition. You can see in the module source code that _acquireLock() and _releaseLock() are used throughout the module and its classes.
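A small sketch of what that thread-safety buys you: several threads can share one logger and one handler, and each emitted record comes out as a whole line. The logger name and thread count here are arbitrary choices for illustration:

```python
import io
import logging
import threading

# Several threads share one handler; the module-level RLock
# serializes emit() calls, so lines never interleave mid-write.
stream = io.StringIO()
logger = logging.getLogger("thread-demo")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(stream))

def worker(n):
    logger.info("message from thread %d", n)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = stream.getvalue().splitlines()
print(len(lines))  # 5
```

Each of the five messages arrives intact as its own line, regardless of the order in which the threads are scheduled.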
There’s something not accounted for here, though: what about process safety? The short answer is that the logging module is not process-safe. This isn’t inherently a fault of logging: generally, two processes can’t write to the same file without a lot of proactive effort from the programmer first.
This means that you’ll want to be careful before using a class such as logging.FileHandler with multiprocessing involved. If two processes want to read from and write to the same underlying file concurrently, then you can run into a nasty bug halfway through a long-running routine.
If you want to get around this limitation, there’s a thorough recipe in the official Logging Cookbook. Because this entails a decent amount of setup, one alternative is to have each process log to a separate file based on its process ID, which you can grab with os.getpid().
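Here’s a minimal sketch of the per-process-file alternative. The filename pattern and logger name are assumptions made for this example:

```python
import logging
import os
import tempfile

# Sketch: one log file per process, keyed by PID, so processes
# never contend for the same file. The "app-<pid>.log" pattern
# is purely illustrative.
pid = os.getpid()
logfile = os.path.join(tempfile.gettempdir(), "app-%d.log" % pid)

logger = logging.getLogger("per-process")
logger.addHandler(logging.FileHandler(logfile))
logger.warning("hello from process %d", pid)

print(os.path.exists(logfile))  # True
```

Because every process owns its own file, no cross-process locking is needed; the tradeoff is that you have to aggregate the files afterward if you want a single log.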
Package Architecture: Logging’s MRO
Now that we’ve covered some preliminary setup code, let’s take a high-level look at how logging is laid out. The logging package uses a healthy dose of OOP and inheritance. Here’s a partial look at the method resolution order (MRO) for some of the most important classes in the package:
object
│
├── LogRecord
├── Filterer
│ ├── Logger
│ │ └── RootLogger
│ └── Handler
│ ├── StreamHandler
│ └── NullHandler
├── Filter
└── Manager
The tree diagram above doesn’t cover all of the classes in the module, just those that are most worth highlighting.
Note: You can use the dunder attribute logging.StreamHandler.__mro__ to see the chain of inheritance. A definitive guide to the MRO can be found in the Python 2 docs, though it is applicable to Python 3 as well.
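For instance, inspecting StreamHandler confirms the branch of the tree above that runs from Filterer down through Handler:

```python
import logging

# __mro__ exposes the inheritance chain as a tuple of classes.
print([cls.__name__ for cls in logging.StreamHandler.__mro__])
# ['StreamHandler', 'Handler', 'Filterer', 'object']
```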
This litany of classes is typically one source of confusion because there’s a lot going on, and it’s all jargon-heavy. Filter versus Filterer? Logger versus Handler? It can be challenging to keep track of everything, much less visualize how it fits together. A picture is worth a thousand words, so here’s a diagram of a scenario where one logger with two handlers attached to it writes a log message with level logging.INFO:

In Python code, everything above would look like this:
import logging
import sys
logger = logging.getLogger("pylog")
logger.setLevel(logging.DEBUG)
h1 = logging.FileHandler(filename="/tmp/records.log")
h1.setLevel(logging.INFO)
h2 = logging.StreamHandler(sys.stderr)
h2.setLevel(logging.ERROR)
logger.addHandler(h1)
logger.addHandler(h2)
logger.info("testing %d.. %d.. %d..", 1, 2, 3)
There’s a more detailed map of this flow in the Logging HOWTO. What’s shown above is a simplified scenario.
Your code defines just one Logger instance, logger, along with two Handler instances, h1 and h2.
When you call logger.info("testing %d.. %d.. %d..", 1, 2, 3), the logger object serves as a filter because it also has a level associated with it. Only if the message level is severe enough will the logger do anything with the message. Because the logger has level DEBUG, and the message carries a higher INFO level, it gets the go-ahead to move on.
Internally, logger calls logger.makeRecord() to put the message string "testing %d.. %d.. %d.." and its arguments (1, 2, 3) into a bona fide class instance of a LogRecord, which is just a container for the message and its metadata.
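One detail worth noticing is that the format string and its arguments are stored separately on the record, and the % interpolation only happens when the message is actually rendered. You can see this by building a LogRecord by hand; the constructor values below are illustrative:

```python
import logging

# The format string and args live on the record separately;
# getMessage() performs the % interpolation lazily.
record = logging.LogRecord(
    name="demo", level=logging.INFO, pathname="app.py", lineno=1,
    msg="testing %d.. %d.. %d..", args=(1, 2, 3), exc_info=None,
)
print(record.msg)           # 'testing %d.. %d.. %d..'
print(record.getMessage())  # 'testing 1.. 2.. 3..'
```

This lazy formatting is why the docs recommend logger.info("x=%s", x) over logger.info("x=%s" % x): if the message is filtered out, the interpolation never runs.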
The logger object looks around for its handlers (instances of Handler), which may be tied directly to logger itself or to its parents (a concept that we’ll touch on later). In this example, it finds two handlers:

- One with level INFO that dumps log data to a file at /tmp/records.log
- One that writes to sys.stderr but only if the incoming message is at level ERROR or higher
At this point, there’s another round of tests that kicks in. Because the LogRecord and its message only carry level INFO, the record gets written by Handler 1 (green arrow), but not to Handler 2’s stderr stream (red arrow). For handlers, writing the LogRecord to their stream is called emitting it, which is captured in their .emit() method.
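You can verify this handler-level filtering directly by swapping the file and stderr for in-memory streams. The logger name is arbitrary; everything else mirrors the scenario above:

```python
import io
import logging

# Same scenario as above, but with in-memory streams so the
# handler-level filtering is easy to verify.
logger = logging.getLogger("filter-demo")
logger.setLevel(logging.DEBUG)

buf_info, buf_error = io.StringIO(), io.StringIO()
h1 = logging.StreamHandler(buf_info)
h1.setLevel(logging.INFO)
h2 = logging.StreamHandler(buf_error)
h2.setLevel(logging.ERROR)
logger.addHandler(h1)
logger.addHandler(h2)

logger.info("testing %d.. %d.. %d..", 1, 2, 3)
print(repr(buf_info.getvalue()))   # 'testing 1.. 2.. 3..\n'
print(repr(buf_error.getvalue()))  # ''
```

The INFO-level handler emitted the record, while the ERROR-level handler left its stream untouched.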
Next, let’s further dissect everything from above.
The LogRecord Class
What is a LogRecord? When you log a message, an instance of the LogRecord class is the object you send to be logged. It’s created for you by a Logger instance and encapsulates all the pertinent info about that event. Internally, it’s little more than a wrapper around a dict that contains attributes for the record. A Logger instance sends a LogRecord instance to zero or more Handler instances.
The LogRecord contains some metadata, such as the following:

- A name
- The creation time as a Unix timestamp
- The message itself
- Information on what function made the logging call
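You can also see these attributes by constructing a LogRecord directly; the constructor arguments below are illustrative placeholders:

```python
import logging

# Build a LogRecord by hand and inspect a few of its metadata attributes.
record = logging.LogRecord(
    name="demo", level=logging.ERROR, pathname="app.py", lineno=42,
    msg="something broke", args=None, exc_info=None,
)
print(record.name)          # 'demo'
print(record.levelname)     # 'ERROR'
print(record.getMessage())  # 'something broke'
print(record.created > 0)   # True: creation time as a Unix timestamp
```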
Here’s a peek into the metadata that it carries with it, which you can introspect by stepping through a logging.error() call with the pdb module: