Thursday, December 3, 2009

Pitfalls of class and instance variables in Python

Here is how you can get really confused by class (static) and instance variables in Python.  In this short tutorial I am using IPython, which is why each input starts with 'In' and each output begins with 'Out'.

Create a class C
In [1]: class C(object):
   ...:     x = 10
   ...:  

Then check that x exists in the namespace of C
In [2]: print C.x
10

In [3]: print C.__dict__
{'__dict__': <attribute '__dict__' of 'C' objects>, 'x': 10, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'C' objects>, '__doc__': None}



Now create an object of the class C
In [4]: foo = C()






And now check that x exists in the namespace of foo
In [5]: foo.x
Out[5]: 10


Just to double-check, let's examine foo's __dict__
In [6]: print foo.__dict__
{}





What actually happens when you evaluate foo.x is that Python first looks for an instance variable x in the namespace of the object foo, and that lookup fails.  Then the class C (which is foo's type) is checked for the class variable x.  The variable x is found there and its value is printed.  The pitfall here is that it now looks as if the object foo has an instance variable called x with a value of 10, when it does not.
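
To make the lookup order concrete, here is a rough sketch of it in plain Python.  The helper simple_lookup is my own hypothetical name and not part of Python; it is a simplified model that ignores descriptors, __slots__ and __getattr__, so treat it as an illustration rather than what the interpreter really does.

```python
class C(object):
    x = 10  # class variable


def simple_lookup(obj, name):
    """Hypothetical helper: a simplified model of attribute lookup.

    Ignores descriptors, __slots__ and __getattr__.
    """
    # 1. Look in the instance's own namespace first.
    if name in vars(obj):
        return "instance", vars(obj)[name]
    # 2. Fall back to the class and its base classes, in MRO order.
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return klass.__name__, vars(klass)[name]
    raise AttributeError(name)


foo = C()
found_before = simple_lookup(foo, "x")  # found on the class C
foo.x = 12                              # creates an instance variable
found_after = simple_lookup(foo, "x")   # now found on the instance

print(found_before)
print(found_after)
```

The first result reports that x came from the class, the second that it came from the instance, which is the same distinction the __dict__ checks in this post reveal.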

Now let's try to modify the value of x in the class
In [7]: C.x = 11

In [8]: print C.x
11


In [28]: print foo.x
11

Here you might be surprised that foo.x was updated, unless you know that foo.x is actually looking up the value of C.x.


Now let's try to modify the value of x in the object foo
In [9]: foo.x = 12

In [10]: print foo.x
12

In [11]: print C.x
11


Note that foo.x was set to 12 as we expected, but C.x is still 11.
The reason is that assigning to foo.x creates a new instance variable called 'x' on the object foo; it does not touch the class variable.  This can be checked by looking at the namespace of foo:
In [12]: print foo.__dict__
{'x': 12}


Remember that before we assigned a value to foo.x, the namespace of foo did not contain any variables.
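
The same behaviour can be reproduced in a small script.  This is only a sketch: the second object, other, is a name I am adding for illustration, and the del step is not part of the session above.  It shows that the new instance variable only hides the class variable for that one object, and that deleting it makes the class variable visible again.

```python
class C(object):
    x = 11  # class variable


foo = C()
other = C()  # hypothetical second instance, added for comparison

foo.x = 12          # creates an instance variable on foo only
print(vars(foo))    # {'x': 12}
print(vars(other))  # {}
print(other.x)      # 11, still read from the class C

del foo.x           # remove the instance variable again
print(vars(foo))    # {}
print(foo.x)        # 11, the class variable is visible again
```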

The difference between using instance and class variables inside a method is that you use 'self' to refer to instance variables, while class variables can be reached through the class name (for example C.x).  Here we create a class C which has both a class variable and an instance variable called 'x':

In [13]: class C:
   ....:     x = 10
   ....:     def __init__(self):
   ....:         self.x = 11
   ....:

In [14]: bar = C()

In [15]: print C.__dict__
{'x': 10, '__module__': '__main__', '__doc__': None, '__init__': <function __init__ at 0x...>}


In [16]: print bar.__dict__
{'x': 11}


This shows that the class C has a class variable x with a value of 10 and the object bar has an instance variable x (self.x) with a value of 11.
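
Both values can also be read from inside a method.  In the sketch below the describe method is a hypothetical addition of mine, and I derive the class from object so that the code behaves the same on old and new Python versions.

```python
class C(object):
    x = 10  # class variable

    def __init__(self):
        self.x = 11  # instance variable with the same name

    def describe(self):
        # self.x finds the instance variable first;
        # self.__class__.x is looked up on the class, so it
        # returns the class variable instead.
        return self.x, self.__class__.x


bar = C()
print(bar.describe())  # (11, 10)
```

Because the two variables share a name, the only way to tell them apart inside the method is by what you look them up on, which is exactly why the shared name is easy to get wrong.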

My recommendation is not to use the same name for a class variable and an instance variable in the same class.
