An Unexpected Encounter with Python Class Attributes
Recently, I encountered a mysterious bug in my Python application. The cause of the bug was my poor understanding of:
- how class instances in Python initialise their attributes, and
- how instance attributes and class attributes are resolved.
In this post I will explain my incorrect assumptions about this part of Python.
People with Hobbies
The following Python snippet defines the Person class, with a method for adding a hobby to the person's list of hobbies:
class Person:
    hobbies = []

    def add_hobby(self, hobby):
        self.hobbies.append(hobby)
That's marvellous. With this class, we can instantiate some persons (people in human-speak) and give them hobbies:
tolkien = Person()
tolkien.add_hobby("writing")
elvis = Person()
elvis.add_hobby("music")
Now, pause for a moment.
(elevator music starts playing)
Ponder for a while.
(elevator music suddenly stops)
Did you spot the mistake?
I had been programming in Python for more than five years, and I didn't!
Unforeseen Consequences
What do you think the following two lines would print?
print(f"Tolkien's hobbies: {tolkien.hobbies}")
print(f"Elvis's hobbies: {elvis.hobbies}")
This is what I expected it to print:
Tolkien's hobbies: ['writing']
Elvis's hobbies: ['music']
Here's what it actually prints:
Tolkien's hobbies: ['writing', 'music']
Elvis's hobbies: ['writing', 'music']
What?! Why are both hobbies listed for both persons?
The Incident
My bug was rooted in this statement in the Person class:
hobbies = []
I thought this statement made sure that every new Person's hobbies attribute was set to a new empty list. [1]
However, it actually sets the class attribute Person.hobbies to an empty list.
The statement is evaluated only once, when the class definition itself is executed, and no new hobby lists are created when new Person instances are created.
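One way to convince yourself of this is a small sketch along the following lines. The make_list helper and the calls list are made up for the illustration and are not part of the original code; they only record how often the right-hand side of the assignment runs.

```python
calls = []  # records every time make_list() runs


def make_list():
    """Hypothetical helper: return a new list and note that we were called."""
    calls.append("called")
    return []


class Person:
    # This line runs once, while the class statement itself is executed.
    hobbies = make_list()


first = Person()
second = Person()
third = Person()

print(len(calls))                       # 1
print(first.hobbies is Person.hobbies)  # True
```

Three instances were created, yet the helper ran only once, so there is only one list to go around.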
Resolving Hobbies
When appending to the person's hobby list in add_hobby, the attribute self.hobbies is used.
When looking up that attribute, Python first looks for an attribute named hobbies in the instance's namespace.
If that fails, it looks in the namespace of the instance's class (and then in those of its base classes)!
So, it finds the Person.hobbies attribute in the class's namespace.
The calls to add_hobby mutated that single list in the class attribute, so the hobbies of every person instance ended up in the same list.
We can verify that they resolve to the same list by evaluating the expression:
tolkien.hobbies is elvis.hobbies
which results in True.
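The two namespaces can also be inspected directly with the built-in vars(). The following is a minimal sketch using the buggy Person class from above; it additionally shows that assigning through an instance (rather than mutating) creates an instance attribute that shadows the class attribute.

```python
class Person:
    hobbies = []

    def add_hobby(self, hobby):
        self.hobbies.append(hobby)


tolkien = Person()
tolkien.add_hobby("writing")

print(vars(tolkien))                      # {} (nothing is stored on the instance)
print("hobbies" in vars(Person))          # True
print(tolkien.hobbies is Person.hobbies)  # True

# Assignment through the instance creates an instance attribute,
# which shadows the class attribute for that instance only.
elvis = Person()
elvis.hobbies = ["music"]
print(vars(elvis))     # {'hobbies': ['music']}
print(Person.hobbies)  # ['writing']
```

So appending never touches the instance's namespace at all, which is why the shared list went unnoticed.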
This behaviour is well documented in the Python tutorial, which goes to show that I should have RTFM.
Why it behaves like this is unclear to me (perhaps I should read some more).
To me, it would make sense for Python to raise an AttributeError when trying to access an instance attribute that doesn't exist on the instance.
Feel free to reach out if you can enlighten me.
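One observation that may be part of the answer: methods are themselves class attributes, so the same fallback from instance to class is what lets a call like tolkien.add_hobby(...) be resolved at all. A minimal sketch, with a stripped-down Person whose method does nothing:

```python
class Person:
    def add_hobby(self, hobby):
        pass  # body is irrelevant here; only the lookup matters


tolkien = Person()

print("add_hobby" in vars(tolkien))  # False
print("add_hobby" in vars(Person))   # True

# The call still works, because the lookup falls back to the class.
tolkien.add_hobby("writing")
```

If a missing instance attribute raised an AttributeError straight away, this call would presumably fail too.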
The Fix
Here's a corrected version of the Person class:
class Person:
    def __init__(self):
        self.hobbies = []

    def add_hobby(self, hobby):
        self.hobbies.append(hobby)
In this version, the hobbies attribute is assigned to the instance's namespace in the __init__ method, which is executed when an instance is initialised.
This way, every person instance gets its own list of hobbies in its instance attributes.
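Re-running the earlier experiment against the corrected class is a quick sanity check. This sketch simply repeats the Tolkien and Elvis example from above:

```python
class Person:
    def __init__(self):
        self.hobbies = []  # a new list for every instance

    def add_hobby(self, hobby):
        self.hobbies.append(hobby)


tolkien = Person()
tolkien.add_hobby("writing")
elvis = Person()
elvis.add_hobby("music")

print(f"Tolkien's hobbies: {tolkien.hobbies}")  # Tolkien's hobbies: ['writing']
print(f"Elvis's hobbies: {elvis.hobbies}")      # Elvis's hobbies: ['music']
print(tolkien.hobbies is elvis.hobbies)         # False
print(hasattr(Person, "hobbies"))               # False
```

The class itself no longer has a hobbies attribute, so there is nothing left to share by accident.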
Footnotes
[1] This is vaguely how it works in some other programming languages, such as Java and PHP.