.. _muppy:

========================
Identifying memory leaks
========================

Muppy helps developers identify memory leaks in Python applications. It tracks
memory usage at runtime and identifies objects that are leaking. It also
provides tools for locating the source of objects that are not released.

Muppy is (yet another) memory usage profiler for Python. The toolset focuses
on identifying memory leaks. Let's have a look at what you can do with muppy.

The muppy module
================

Muppy allows you to get hold of all objects,

>>> from pympler import muppy
>>> all_objects = muppy.get_objects()
>>> len(all_objects)                           # doctest: +SKIP
19700

or select only the objects of a certain type.

>>> my_types = muppy.filter(all_objects, Type=type)
>>> len(my_types)                                    # doctest: +SKIP
72
>>> for t in my_types:
...     print(t)
...                                               # doctest: +SKIP
UserDict.IterableUserDict
UserDict.UserDict
UserDict.DictMixin
os._Environ
sre_parse.Tokenizer
sre_parse.SubPattern
re.Scanner
string._multimap
distutils.log.Log
encodings.utf_8.StreamWriter
encodings.utf_8.StreamReader
codecs.StreamWriter
codecs.StreamReader
codecs.StreamReaderWriter
codecs.Codec
codecs.StreamRecoder
tokenize.Untokenizer
inspect.BlockFinder
sre_parse.Pattern
. . .
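
To make the two steps above more concrete, the following sketch imitates them
with the standard library only. It is not pympler's implementation:
`gc.get_objects()` returns only the objects tracked by the garbage collector,
and the names `filter_by_type` and `Marker` are hypothetical, introduced here
for illustration.

.. code-block:: python

    import gc


    class Marker:
        """Hypothetical class, defined only so the filter has a known match."""


    def filter_by_type(objects, wanted_type):
        """Keep only the objects that are instances of wanted_type."""
        return [obj for obj in objects if isinstance(obj, wanted_type)]


    # Only container objects tracked by the collector are listed here, so
    # this is a subset of what a full memory profiler can reach.
    tracked_objects = gc.get_objects()
    classes = filter_by_type(tracked_objects, type)
    print(len(tracked_objects), len(classes))

The exact counts depend on the interpreter version and on the modules that
happen to be loaded, so they will differ from run to run.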

Comparing two such snapshots can tell us, for example, that the number of lists
remained the same, but the memory allocated by lists has increased by 8 bytes.
That is the correct increase for an LP64 system, where a pointer occupies
8 bytes (see 64-Bit_Programming_Models_).
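
The 8-byte figure mentioned above is the size of one pointer on an LP64
system. The following sketch, which uses only the standard library, shows how
to check the pointer size on the current platform. The list lengths 10 and 11
are arbitrary choices for illustration.

.. code-block:: python

    import struct
    import sys

    # Size in bytes of a C pointer on this platform (8 on LP64 systems).
    pointer_size = struct.calcsize("P")

    # A list stores one pointer per allocated slot, so two lists that differ
    # only in length differ in size by a multiple of the pointer size.
    growth = sys.getsizeof([None] * 11) - sys.getsizeof([None] * 10)
    print(pointer_size, growth)

Note that `sys.getsizeof` reports the shallow size of the list only; the
objects the list refers to are not included.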

The summary module
==================

You can create summaries

>>> from pympler import summary
>>> sum1 = summary.summarize(all_objects)
>>> summary.print_(sum1)                          # doctest: +SKIP
                       types |   # objects |   total size
============================ | =========== | ============
                        dict |         546 |    953.30 KB
                         str |        8270 |    616.46 KB
                        list |         127 |    529.44 KB
                       tuple |        5021 |    410.62 KB
                        code |        1378 |    161.48 KB
                        type |          70 |     61.80 KB
          wrapper_descriptor |         508 |     39.69 KB
  builtin_function_or_method |         515 |     36.21 KB
                         int |         900 |     21.09 KB
           method_descriptor |         269 |     18.91 KB
                     weakref |         177 |     15.21 KB
         <class 'abc.ABCMeta |          16 |     14.12 KB
                         set |          48 |     10.88 KB
         function (__init__) |          81 |      9.49 KB
           member_descriptor |         131 |      9.21 KB

and compare them with other summaries.

>>> sum2 = summary.summarize(muppy.get_objects())
>>> diff = summary.get_diff(sum1, sum2)
>>> summary.print_(diff)                          # doctest: +SKIP
                          types |   # objects |   total size
=============================== | =========== | ============
                           list |        1097 |      1.07 MB
                            str |        1105 |     68.21 KB
                           dict |          14 |     21.08 KB
             wrapper_descriptor |         215 |     16.80 KB
                            int |         121 |      2.84 KB
                          tuple |          30 |      2.02 KB
              member_descriptor |          25 |      1.76 KB
                        weakref |          14 |      1.20 KB
              getset_descriptor |          15 |      1.05 KB
              method_descriptor |          12 |    864     B
  frame (codename: get_objects) |           1 |    488     B
     builtin_function_or_method |           6 |    432     B
     frame (codename: <module>) |           1 |    424     B
         classmethod_descriptor |           3 |    216     B
                           code |           1 |    120     B
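
Conceptually, a summary groups objects by type, and a diff subtracts one
summary from another. The sketch below illustrates that idea with the standard
library. It is an assumption-laden simplification, not pympler's code: sizes
come from `sys.getsizeof`, the functions `summarize` and `diff` are local
helpers that merely share their names with the concepts above, and the two
snapshots are made-up sample data.

.. code-block:: python

    import sys
    from collections import Counter


    def summarize(objects):
        """Map each type name to (object count, total shallow size in bytes)."""
        counts, sizes = Counter(), Counter()
        for obj in objects:
            name = type(obj).__name__
            counts[name] += 1
            sizes[name] += sys.getsizeof(obj)
        return {name: (counts[name], sizes[name]) for name in counts}


    def diff(before, after):
        """Return per-type changes in count and size, omitting unchanged types."""
        changes = {}
        for name in set(before) | set(after):
            b_count, b_size = before.get(name, (0, 0))
            a_count, a_size = after.get(name, (0, 0))
            if (a_count, a_size) != (b_count, b_size):
                changes[name] = (a_count - b_count, a_size - b_size)
        return changes


    # Hypothetical sample data: two snapshots of a small object population.
    snapshot1 = [[], {}, "a"]
    snapshot2 = snapshot1 + [[1, 2, 3], [4]]
    delta = diff(summarize(snapshot1), summarize(snapshot2))
    print(delta)

Only the `list` entry appears in the result, because the two added lists are
the only difference between the snapshots.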

The tracker module
==================

Of course, we don't have to do all these steps manually. Instead, we can use
muppy's tracker.

>>> from pympler import tracker
>>> tr = tracker.SummaryTracker()
>>> tr.print_diff()                               # doctest: +SKIP
                                 types |   # objects |   total size
====================================== | =========== | ============
                                  list |        1095 |    160.78 KB
                                   str |        1093 |     66.33 KB
                                   int |         120 |      2.81 KB
                                  dict |           3 |    840     B
      frame (codename: create_summary) |           1 |    560     B
          frame (codename: print_diff) |           1 |    480     B
                frame (codename: diff) |           1 |    464     B
                 function (store_info) |           1 |    120     B
                                  cell |           2 |    112     B

On initialization, a tracker object creates a summary and remembers it.
Whenever you call tr.print_diff(), a new summary of the current state is
created, compared to the previous summary, and the difference is printed to
the console. As you can see here, quite a few objects were created between
these two invocations. But if you don't do anything, nothing will change.

>>> tr.print_diff()                               # doctest: +SKIP
  types |   # objects |   total size
======= | =========== | ============
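
The remember-and-compare behaviour can be sketched in a few lines. The class
below is a hypothetical stand-in written for illustration, not pympler's
tracker: it only counts objects per type, and it is handed the object
population explicitly instead of collecting it from the interpreter.

.. code-block:: python

    from collections import Counter


    class CountTracker:
        """Hypothetical tracker that remembers the last snapshot it was given."""

        def __init__(self, objects):
            self._last = self._count(objects)

        @staticmethod
        def _count(objects):
            return Counter(type(obj).__name__ for obj in objects)

        def diff(self, objects):
            """Compare objects with the remembered snapshot, then remember them."""
            current = self._count(objects)
            changes = {
                name: current[name] - self._last[name]
                for name in set(current) | set(self._last)
                if current[name] != self._last[name]
            }
            self._last = current
            return changes


    population = [1.5, "x"]            # made-up stand-in for "all objects"
    sketch_tracker = CountTracker(population)
    population.append([1, 2, 3, 4])
    first = sketch_tracker.diff(population)    # the new list shows up once
    second = sketch_tracker.diff(population)   # nothing changed since then
    print(first, second)

Because every call to `diff` replaces the remembered snapshot, a change is
reported once and then disappears from later diffs.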

Now check out this code snippet:

>>> i = 1
>>> l = [1,2,3,4]
>>> d = {}
>>> tr.print_diff()                               # doctest: +SKIP
  types |   # objects |   total size
======= | =========== | ============
   dict |           1 |    280     B
   list |           1 |    192     B

As you can see, both the new list and the new dict appear in the summary, but
the 4 integers do not. Why is that? Because they already existed before they
were used here: some other part of the Python interpreter code already makes
use of them. Thus, they are not new.
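
The explanation above can be checked directly. The sketch below relies on an
implementation detail of CPython rather than a language guarantee: small
integers (from -5 to 256) are preallocated and shared. The variable names are
arbitrary.

.. code-block:: python

    # "Creating" a small integer returns an object that already exists.
    small_a = int("4")
    small_b = int("4")
    small_is_shared = small_a is small_b

    # Larger integers are built on demand, so in CPython these are usually
    # two distinct objects.
    large_a = int("1000000")
    large_b = int("1000000")
    large_is_shared = large_a is large_b

    print(small_is_shared, large_is_shared)

Since the small integers are never newly allocated, they cannot show up as
new objects in a diff.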

The refbrowser module
=====================

If some objects are leaking and you don't know where they are still
referenced, you can use the referrers browser.
First, let's create a root object which we then reference from a tuple and a
list.

>>> from pympler import refbrowser
>>> root = "some root object"
>>> root_ref1 = [root]
>>> root_ref2 = (root, )

>>> def output_function(o):
...     return str(type(o))
...
>>> cb = refbrowser.ConsoleBrowser(root, maxdepth=2, str_func=output_function)

Here we created a ConsoleBrowser, which gives us a referrers tree starting at
`root`, prints it to a maximum depth of 2, and uses `str_func` to represent
objects. Now it's time to see where we are at.

>>> cb.print_tree()                               # doctest: +SKIP
<type 'str'>-+-<type 'dict'>-+-<type 'list'>
             |               +-<type 'list'>
             |               +-<type 'list'>
             |
             +-<type 'dict'>-+-<type 'module'>
             |               +-<type 'list'>
             |               +-<type 'frame'>
             |               +-<type 'function'>
             |               +-<type 'list'>
             |               +-<type 'frame'>
             |               +-<type 'list'>
             |               +-<type 'function'>
             |               +-<type 'frame'>
             |
             +-<type 'list'>--<type 'dict'>
             +-<type 'tuple'>--<type 'dict'>
             +-<type 'dict'>--<class 'muppy.refbrowser.ConsoleBrowser'>

What we see is that the root object is referenced by the tuple and the list, as
well as by three dictionaries. These dictionaries belong to the environment,
e.g. the ConsoleBrowser we just started and the current execution context.
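
The first level of such a tree can be approximated with `gc.get_referrers`
from the standard library. This is a minimal sketch under stated assumptions,
not the referrers browser itself: `Node` and `referrer_types` are hypothetical
names, and a class instance is used as the root so that its referrers are
easy to predict.

.. code-block:: python

    import gc


    class Node:
        """Hypothetical placeholder class for the object being investigated."""


    def referrer_types(obj):
        """Return the sorted type names of the objects directly referring to obj."""
        return sorted(type(ref).__name__ for ref in gc.get_referrers(obj))


    node = Node()
    node_ref1 = [node]       # referenced from a list
    node_ref2 = (node,)      # referenced from a tuple
    names = referrer_types(node)
    print(names)

Besides the list and the tuple, the result typically contains entries that
belong to the environment, such as the namespace dictionary in which `node` is
defined. Which of these appear depends on the interpreter version.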

Browsing on the console is of course rather inconvenient. An
InteractiveBrowser is much better. Let's see what we get.

>>> from pympler import refbrowser
>>> ib = refbrowser.InteractiveBrowser(root)
>>> ib.main()                                     # doctest: +SKIP

.. image:: images/muppy_guibrowser.png

Now you can click through all referrers of the root object.

.. _64-Bit_Programming_Models: http://www.unix.org/version2/whatsnew/lp64_wp.html