.. _muppy_tutorial:

==================================
Tutorial - Track Down Memory Leaks
==================================

This tutorial shows how :term:`muppy` can be used to track down memory leaks.
In my experience, this can be done in three steps, each answering a different
question.

#. Is there a leak?
#. What objects leak?
#. Where does it leak?

IDLE
====
My first real-life test for :term:`muppy` was IDLE_, which is "the Python
IDE built with the Tkinter GUI toolkit." It offers the following features:

- coded in 100% pure Python, using the Tkinter GUI toolkit
- cross-platform: works on Windows and Unix (on Mac OS, there are currently
  problems with Tcl/Tk)
- multi-window text editor with multiple undo, Python colorizing and many other
  features, e.g. smart indent and call tips
- Python shell window (a.k.a. interactive interpreter)
- debugger (not complete, but you can set breakpoints, view and step)

Because it is included in every Python distribution, runs locally, and provides
easy interactive feedback, it was a nice first candidate for testing the tools
of muppy.

The task was to check whether IDLE leaks memory and, if so, which objects are
leaking and why.

Preparations
------------
IDLE is part of every Python distribution and can be found at
:file:`Lib/idlelib`. The modified version which makes use of muppy can be found
at http://code.google.com/p/muppy/source/browse/trunk#trunk/playground/idlelib.

With IDLE having a GUI, I also wanted to be able to interact with muppy through
the GUI. This can be done in :file:`Lib/idlelib/Bindings.py` and
:file:`Lib/idlelib/PyShell.py`. For details, please refer to the modified
version mentioned above.

Task 1: Is there a leak?
------------------------
First, we need to find out whether any objects are leaking at all. We will
have a look at the objects, invoke an action, and look at the objects again.

.. code-block:: python

    from pympler import tracker
    self.memory_tracker = tracker.SummaryTracker()
    self.memory_tracker.print_diff()

The last step is repeated after each invocation. Let's start with something
simple which should not leak. We will check the window resize. You can invoke
it in the menu at `Windows->Zoom Height`.

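
The look, act, look again cycle does not depend on pympler. As a rough
illustration, the following stdlib-only sketch counts live objects per type
with `gc.get_objects()` and compares two snapshots. It is not how
`SummaryTracker` is implemented. The helpers `snapshot` and `count_diff` and
the `Window` class are made up for this example, and unlike pympler the sketch
only sees objects tracked by the garbage collector.

.. code-block:: python

    import gc
    from collections import Counter


    def snapshot():
        """Count live, gc-tracked objects by type name (hypothetical helper)."""
        gc.collect()  # discard unreachable objects before counting
        return Counter(type(obj).__name__ for obj in gc.get_objects())


    def count_diff(before, after):
        """Return {type name: change in count} for every type that changed."""
        names = set(before) | set(after)
        return {n: after[n] - before[n] for n in names if after[n] != before[n]}


    class Window(object):
        """Made-up stand-in for a window object."""


    open_windows = []  # a reference that is never cleaned up

    before = snapshot()
    open_windows.append(Window())  # the "action" under test
    after = snapshot()

    changes = count_diff(before, after)
    print(changes.get("Window", 0))  # number of new Window objects

The result usually contains a few unrelated entries too, since taking a
snapshot creates objects of its own (the `Counter` returned by the first call,
for example).
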
First, call `print_diff` until it has calibrated. That is, the first one or two
times, you will get some output because there is still something going on in
the background. But then you should get this::

     types |   # objects |   total size
    ====== | =========== | ============

This means nothing has changed since the last invocation of `print_diff`. Now
let's call `Windows->Zoom Height` and invoke `print_diff` again::

                 types |   # objects |   total size
    ================== | =========== | ============
                  dict |           1 |        280 B
                  list |           1 |        176 B
      _sre.SRE_Pattern |           1 |         88 B
                 tuple |           1 |         80 B
                   str |           0 |          7 B

It seems that resizing creates a few objects of the types listed above. Let's
repeat it::

     types |   # objects |   total size
    ====== | =========== | ============

Okay, nothing changed, so nothing is leaking. But we see that the first call to
a function often creates some objects which then already exist on the second
invocation.

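
One common reason for such one-time objects is lazy initialization or caching:
the first call builds something and keeps it for later calls. The sketch below
illustrates the pattern with a compiled regular expression, in the spirit of
the `_sre.SRE_Pattern` entry in the diff above. The function `parse_geometry`
and the cache `_patterns` are hypothetical and not taken from IDLE.

.. code-block:: python

    import re

    _patterns = {}  # hypothetical module-level cache, filled on first use


    def parse_geometry(geometry):
        """Split a Tk-style geometry string such as '800x600+10+20' into ints."""
        pattern = _patterns.get("geometry")
        if pattern is None:
            # Only the first call compiles and stores the pattern.
            pattern = re.compile(r"(\d+)x(\d+)\+(-?\d+)\+(-?\d+)")
            _patterns["geometry"] = pattern
        return tuple(int(part) for part in pattern.match(geometry).groups())


    first = parse_geometry("800x600+10+20")
    cached_after_first = len(_patterns)
    second = parse_geometry("1024x768+0+0")
    cached_after_second = len(_patterns)

    print(first, cached_after_first, cached_after_second)

The cache holds one entry after the first call and still one after the second,
which is the "some output once, then nothing" pattern seen in the diffs.
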
Next, we try something different. We will open a new window. Let's have a look
at the Path Browser at `File->Path Browser`::

                                                        types |   # objects |   total size
    ===================================================== | =========== | ============
                                                         dict |          18 |     14.26 KB
                                                        tuple |         146 |     13.17 KB
                                                         list |           2 |     11.67 KB
                                                          str |          97 |      7.85 KB
                                                         code |          46 |      5.52 KB
                                                     function |          45 |      5.40 KB
                                                     classobj |           9 |        864 B
                           instancemethod (<function wakeup>) |           3 |        240 B
                         instancemethod (<function __call__>) |           3 |        240 B
                        instance(<class Tkinter.CallWrapper>) |           3 |        216 B
                                                       module |           3 |        168 B
          instance(<class idlelib.WindowList.ListedToplevel>) |           1 |         72 B

Let's repeat it::

                                                        types |   # objects |   total size
    ===================================================== | =========== | ============
                                                         dict |           5 |      2.17 KB
                                                         list |           0 |        384 B
                                                          str |           5 |        259 B
                           instancemethod (<function wakeup>) |           3 |        240 B
                         instancemethod (<function __call__>) |           3 |        240 B
                        instance(<class Tkinter.CallWrapper>) |           3 |        216 B
          instance(<class idlelib.WindowList.ListedToplevel>) |           1 |         72 B

Hm, still some new objects. Repeating this procedure several times reveals
that we do indeed have a leak here.

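
Repeating the action is what separates a leak from one-time setup costs: setup
shows up once, while a leak grows with every repetition. Here is a minimal
sketch of that check using only the standard library. The names
`count_instances`, `Browser` and `leaked` are invented for the example.

.. code-block:: python

    import gc


    def count_instances(cls):
        """Count live instances of exactly this class among gc-tracked objects."""
        gc.collect()
        return sum(1 for obj in gc.get_objects() if type(obj) is cls)


    class Browser(object):
        """Made-up stand-in for a window that is opened and closed."""


    leaked = []  # simulates a registry that is never cleaned up


    def open_and_close():
        browser = Browser()
        leaked.append(browser)  # the bug: this reference outlives the window


    counts = []
    for _ in range(3):
        open_and_close()
        counts.append(count_instances(Browser))

    print(counts)  # grows by one per repetition
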

Task 2: What objects leak?
--------------------------
So let's have a closer look at the diff. We see 5 new `dicts` and `strings`, a
bit more memory usage by `lists`, 3 `wakeup` and `__call__` instance methods, 3
`CallWrapper` and 1 `ListedToplevel`. We know the standard types, but the last
couple of objects seem to be from IDLE.

We ignore the standard type objects for now. It is more likely that these are
only children of some other instances which are causing the leak.

We start with the `ListedToplevel` object. Each invocation of `File->Path
Browser` adds one more object of this type, so it looks like this object is not
garbage collected, although it should have been. Searching for `ListedToplevel`
in `idlelib/` reveals that it is the base class of all window objects in IDLE.
We can assume that when the Path Browser is opened, a new window object is
created, but closing the window does not remove the reference to it.

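
Whether a particular object survives can be checked directly with a weak
reference, which does not keep its target alive. The sketch below is a generic
illustration, not IDLE code. `Toplevel` and `window_list` are placeholders.

.. code-block:: python

    import gc
    import weakref


    class Toplevel(object):
        """Placeholder for a window class."""


    window_list = []  # placeholder for whatever still references the window

    window = Toplevel()
    window_list.append(window)
    probe = weakref.ref(window)  # a weak reference does not keep it alive

    del window  # "close" the window: drop our own reference
    gc.collect()
    alive_while_listed = probe() is not None

    del window_list[0]  # remove the forgotten reference
    gc.collect()
    alive_after_cleanup = probe() is not None

    print(alive_while_listed, alive_after_cleanup)

As long as the list still holds the window, the probe resolves to it. Once the
last strong reference is gone, the probe returns `None`.
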
Next, we take a look at the `wakeup` instance method, of which we have three
more on each invocation. Searching the code, we find it to be defined in
`idlelib/WindowList.py`. This piece of code is used to give users of IDLE a list
of currently open windows. Every time a new window is created, it is added
to the `Windows` menu, from where the user can select any open window. `wakeup`
is the method which brings the selected window to the front. Adding a window
calls `menu.add_command`, linking the menu and the `wakeup` command together.

.. _menu_add_command:

.. code-block:: python

    menu.add_command(label=title, command=window.wakeup)

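
The important detail here is that `window.wakeup` is a bound method, and a
bound method holds a reference to its instance. As long as something stores the
method, the window cannot be collected. The following sketch shows this with
plain Python objects. The `Window` class and the `commands` dict are stand-ins,
not Tkinter code.

.. code-block:: python

    import gc
    import types


    class Window(object):
        """Stand-in for an IDLE window."""

        def wakeup(self):
            return "raised"


    commands = {}  # stand-in for the place where the menu keeps its callbacks
    window = Window()
    commands["Path Browser"] = window.wakeup  # stores a bound method

    bound = commands["Path Browser"]
    holds_window = bound.__self__ is window  # the method references its instance

    # Ask the garbage collector which objects refer to the window.
    referrers = gc.get_referrers(window)
    found_method = any(isinstance(r, types.MethodType) for r in referrers)

    print(holds_window, found_method)

`gc.get_referrers` is a handy way to answer the question "who still holds this
object?" once a suspicious instance has been identified.
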
So we are getting closer. Only `__call__` and `Tkinter.CallWrapper` are left. As
the name indicates, the latter is located in the Tkinter module, which is part
of the Python standard library. So let's dive into it. The `CallWrapper`
docstring states::

    Internal class. Stores function to call when some user defined Tcl function is
    called e.g. after an event occurred.

Also, `CallWrapper` contains a method called `__call__`, which is used to invoke
the stored function. A `CallWrapper` is created by the method `_register`,
which then creates a command (Tk speak) and adds its name to a list called
`self._tclCommands`.

So what do we know so far? Every time a Path Browser is opened, a window is
created, but not deleted when closed again. It has something to do with the
`wakeup` method of the window. This method is wrapped as a Tcl command and then
linked to the window list menu. Also, we have traced this wrapping back to the
Tkinter library, where the names of the function wrappers are stored in an
attribute called `_tclCommands`.

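
To make this chain of references easier to follow, here is a deliberately
simplified, pure-Python model of it. It is not the Tkinter source:
`FakeInterpreter`, `FakeMenu` and their methods only imitate the behaviour
described above, namely that a callback is wrapped, registered under a
generated name, and that name is appended to `_tclCommands`.

.. code-block:: python

    class CallWrapper(object):
        """Simplified model: stores the function to call later."""

        def __init__(self, func):
            self.func = func

        def __call__(self, *args):
            return self.func(*args)


    class FakeInterpreter(object):
        """Imitates the interpreter's table of named commands."""

        def __init__(self):
            self.commands = {}


    class FakeMenu(object):
        """Imitates a menu that registers a command for each entry."""

        def __init__(self, interp):
            self.tk = interp
            self._tclCommands = []
            self.entries = []

        def _register(self, func):
            wrapper = CallWrapper(func).__call__
            name = repr(id(wrapper)) + func.__name__
            self.tk.commands[name] = wrapper  # the interpreter keeps the wrapper
            self._tclCommands.append(name)
            return name

        def add_command(self, label, command):
            self.entries.append((label, self._register(command)))

        def delete(self, index):
            # Removes the menu entry only; the registered command stays behind.
            del self.entries[index]


    class Window(object):
        def wakeup(self):
            return "raised"


    menu = FakeMenu(FakeInterpreter())
    menu.add_command(label="Path Browser", command=Window().wakeup)
    menu.delete(0)

    print(len(menu.entries), len(menu._tclCommands), len(menu.tk.commands))

In this model the entry is gone after `delete`, but the command name and the
wrapper (and through it the window) are still registered.
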
This brings us to the third question.

Task 3: Where is the leak?
--------------------------
`_tclCommands` stores the names of all commands linked to a widget. The base
class for interior widgets (of which the menu is one) has a method called
`destroy` which::

    Delete all Tcl commands created for this widget in the Tcl
    interpreter.

as well as a method `deletecommand` which deletes a single Tcl command. Both
remove commands by their name. Among them, we find the `__call__` of our
`CallWrapper`, used to wrap the `wakeup` of the Path Browser window.

So we should expect at least one of them to be invoked when a window is closed
(ideally only `deletecommand`). This would also be in line with the
`menu.add_command` call we identified :ref:`above<menu_add_command>`. And
indeed, in `idlelib/EditorWindow.py`, `menu.delete` is called. So where is the
problem?

We return to `Tkinter.py` and have a closer look at the `delete` method::

    def delete(self, index1, index2=None):
        """Delete menu items between INDEX1 and INDEX2 (not included)."""
        self.tk.call(self._w, 'delete', index1, index2)

Hm, it looks like the menu item is deleted, but what about the attached
command? Let's ask the Web for "tkinter deletecommand". It turns out that
somebody filed a bug some years ago (see bugreport_) which states::

    Tkinter.Menu.delete does not delete the commands
    defined for the entries it deletes. Those objects
    will be retained until the menu itself is deleted.

    [..]

    the command function will still be referenced and
    kept in memory - until the menu object itself is
    destroyed.

Well, this seems to be the root of our memory leak. Let's adapt the `delete`
method a bit, so that the associated commands are deleted as well::

    def delete(self, index1, index2=None):
        """Delete menu items between INDEX1 and INDEX2 (not included)."""
        if index2 is None:
            index2 = index1
        # Collect the Tcl commands attached to the entries about to be deleted.
        cmds = []
        (num_index1, num_index2) = (self.index(index1), self.index(index2))
        if (num_index1 is not None) and (num_index2 is not None):
            for i in range(num_index1, num_index2 + 1):
                if 'command' in self.entryconfig(i):
                    c = str(self.entrycget(i, 'command'))
                    if c in self._tclCommands:
                        cmds.append(c)
        # Delete the menu entries first, then the commands they referenced.
        self.tk.call(self._w, 'delete', index1, index2)
        for c in cmds:
            self.deletecommand(c)

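
The effect of such a change can be reproduced without a display by using a
small stand-in for the menu. The model below is a sketch under assumptions, not
Tkinter: `ModelMenu` keeps its callbacks in a plain dict named `commands`,
`delete` mirrors the original behaviour, and `delete_fixed` mirrors the idea of
the patch (remove the entry, then the command it referenced).

.. code-block:: python

    import gc
    import weakref


    class Window(object):
        """Stand-in for a Path Browser window."""

        def wakeup(self):
            return "raised"


    class ModelMenu(object):
        """Toy model of a menu whose entries refer to registered commands."""

        def __init__(self):
            self.commands = {}  # command name -> callback
            self.entries = []   # (label, command name)
            self._next_id = 0

        def add_command(self, label, command):
            name = "cmd%d" % self._next_id
            self._next_id += 1
            self.commands[name] = command
            self.entries.append((label, name))

        def delete(self, index):
            # Original behaviour: only the entry goes away.
            del self.entries[index]

        def delete_fixed(self, index):
            # Patched behaviour: delete the entry and the command it used.
            _label, name = self.entries[index]
            del self.entries[index]
            self.commands.pop(name, None)


    def window_survives(delete_name):
        """Open and close one window; report whether it is still alive."""
        menu = ModelMenu()
        window = Window()
        probe = weakref.ref(window)
        menu.add_command("Path Browser", window.wakeup)
        del window
        getattr(menu, delete_name)(0)
        gc.collect()
        return probe() is not None


    print(window_survives("delete"), window_survives("delete_fixed"))

With the original `delete`, the window stays alive as long as the menu does.
With `delete_fixed`, it is released as soon as the entry is removed.
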
Now we restart IDLE, calibrate our tracker and do another round of `print_diff`.
After the first time the Path Browser is opened we get this::

             types |   # objects |   total size
    ========== | =========== | ============
             tuple |         146 |     13.17 KB
              dict |          13 |     12.01 KB
              list |           2 |     11.26 KB
               str |          92 |      7.59 KB
              code |          46 |      5.52 KB
          function |          45 |      5.40 KB
          classobj |           9 |        864 B
            module |           3 |        168 B

Okay, still some objects created, but no more instances and instance
methods. Let's do it again::

      types |   # objects |   total size
    ======= | =========== | ============

Yes, this definitely looks better. The memory leak is gone.

The problem is fixed in Python versions 2.5 and higher, so updated
installations will not face this leak.

.. http://bugs.python.org/issue1342811
.. http://www.uk.debian.org/~graham/python/tkleak.py

.. _IDLE: http://docs.python.org/lib/idle.html
.. _bugreport: http://bugs.python.org/issue1342811