The ``run`` subcommand
======================
The ``run`` subcommand is used to launch a new Python process and track the memory allocations it
performs while it runs.
Basic tracking
--------------
The general form of the ``run`` subcommand is one of:
.. code:: shell

    memray run [options] file.py [args]
    memray run [options] -m module [args]
Like the Python interpreter itself, the ``run`` subcommand can take a path to a Python file to run,
or the name of a Python module to run if you use the ``-m`` flag. While it runs, memory allocations
and deallocations throughout the program are tracked. By default they are saved into a file with the
following pattern:
``memray-<script>.<pid>.bin``
where ``<script>`` is the name of the executed script and ``<pid>`` is the process id it ran with.
A different filename can be provided with the ``-o`` or ``--output`` argument.
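For example, running a script named ``example.py`` might produce a capture file like the following (the process ID shown is hypothetical):

.. code:: shell

    # Default: capture file name derived from the script name and process ID
    memray run example.py
    # -> writes memray-example.py.4242.bin

    # Explicitly name the capture file instead
    memray run -o my-capture.bin example.py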
.. _Native tracking:
Native tracking
---------------
Overview
~~~~~~~~
Memray supports tracking native C/C++ functions as well as Python functions. This can be especially useful
when profiling libraries that have extension modules (such as ``numpy`` or ``pandas``), since it gives
a holistic view of how much memory is allocated by the extension and how much is allocated by Python itself.
For instance, consider the Mandelbrot example from the :ref:`example-applications` section with native tracking
disabled. Some of the most important allocations happen when operating on NumPy arrays:
.. image:: _static/images/mandelbrot_operation_non_native.png
Here, we can see that the allocation happens while doing some math on NumPy arrays, but unfortunately this doesn't tell
us which exact operation is allocating memory or how temporaries are being used. We also don't know whether the memory
was allocated by NumPy or by the interpreter itself. By using Memray's native tracking mode, we can get a much richer report:
.. image:: _static/images/mandelbrot_operation_native.png
In this native report, we can see all the internal C calls that are underneath. We can see that the memory allocation
happens when the NumPy arrays are being added, due to ``PyNumber_Add`` appearing in the stack trace. Based on
``PyNumber_Multiply`` not appearing in the stack trace, we can conclude that the temporary array created by NumPy is
immediately freed (or that it didn't need to allocate memory in the first place, perhaps because it could reuse some
already allocated memory).
.. tip::

    Memray will also include *inlined* functions and *macros* when native tracking is enabled.
.. caution::

    Activating native tracking has a moderate impact on performance, as every instruction pointer in the call
    stack needs to be resolved whenever an allocation happens. This effect is more noticeable the more
    allocations the traced application performs.
Check the :doc:`section on native symbolification <native_mode>` for more
information on how to obtain the best reports with native information, and on
how to debug problems in such reports.
Usage
~~~~~
To activate native tracking, you need to provide the ``--native`` argument when using the ``run`` subcommand:
.. code:: shell

    memray run --native example.py
This will add native stack information to the result file, which any reporter will automatically use.
.. important::

    When generating reports for result files that contain native frames, the report needs to be generated **on the
    same machine** where the result file was generated. This is because the shared libraries that were loaded by
    the process need to be inspected by Memray to get the correct symbol names.
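For example, a complete native-tracking workflow might look like the following, capturing and reporting on the same machine (``example.py`` and the capture file name are placeholders):

.. code:: shell

    # Capture with native stack information
    memray run --native -o native-capture.bin example.py

    # Generate the report on the same machine, so Memray can still inspect
    # the shared libraries that were loaded to resolve symbol names
    memray flamegraph native-capture.bin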
When reporters display native information they will normally use a different color for the Python frames than the
native frames. The two can also be distinguished by looking at the file name in a frame, since Python frames will
generally come from source files with a ``.py`` extension.
Python allocator tracking
-------------------------
Memray normally tracks allocation and deallocation requests made to the system
allocator, but by default it won't see individual Python objects being created.
That's because the Python interpreter normally uses its own memory pools for
creating most objects, only making calls to the system allocator as needed to
grow or shrink its memory pools. Our documentation on :doc:`python allocators
<python_allocators>` describes this memory pooling in greater detail. This
behavior speeds the Python interpreter up, and by extension speeds up profiling
with Memray, while still allowing Memray to show you each place where your
program needs to acquire more memory.
You can ask Memray to show you each individual object being created and
destroyed, instead, by providing the ``--trace-python-allocators`` argument to
the ``run`` subcommand. This records a lot more data and makes profiling much
slower. It will show you all allocations, even ones that don't result in your
program requesting more memory from the system because the interpreter already
had memory available for reuse. It can be useful in some cases, though,
especially when tracking down memory leaks.
.. note::

    This also acts as an alternative to running with ``PYTHONMALLOC=malloc``, but in a way that allows
    distinguishing allocations made by using the system allocator directly from ones made by using the
    Python allocator.
.. code:: shell

    memray run --trace-python-allocators example.py
.. caution::

    Tracking the Python allocators will result in much larger report files and slower profiling due to the
    larger amount of data that needs to be collected.
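For example, you might capture the same workload in both modes and compare the reports (the file names here are just illustrative):

.. code:: shell

    # Default mode: only requests that reach the system allocator
    memray run -o system-level.bin example.py

    # Allocator tracing: every individual Python object allocation too,
    # at the cost of a larger capture file and slower profiling
    memray run --trace-python-allocators -o object-level.bin example.py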
.. _Live tracking:
Live tracking
-------------
Overview
~~~~~~~~
Memray supports presenting a "live" view for observing the memory usage of a running Python program.
.. image:: _static/images/live_running.png
Usage
~~~~~
You can run a program in live mode using ``run --live``:
.. code:: shell

    memray3.9 run --live application.py
Memray will immediately start your application in the background and run a TUI in the foreground that you can use
to analyze your application's memory usage. If you don't want to run your program in the background, you can instead
use ``run --live-remote``:
.. code:: shell

    memray3.9 run --live-remote application.py
In this mode, Memray will choose an unused port, bind to it, and display a message saying:
.. code:: text

    Run 'memray live <port>' in another shell to see live results
It will wait for you to run:
.. code:: shell

    memray3.9 live <port>
in another terminal window to attach to it. Regardless of whether you choose to use one terminal or two, the resulting
TUI is exactly the same. See :doc:`live` for details on how to interpret and control the TUI.
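For example, a two-terminal session might look like this. The port is pinned here with the ``--live-port`` option so that the second command is predictable; if your Memray version lacks that option, omit it and use the port number Memray prints instead:

.. code:: shell

    # Terminal 1: start the application and wait for a viewer to attach
    memray run --live-remote --live-port 9999 application.py

    # Terminal 2: attach the TUI to the waiting process
    memray live 9999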
.. _Tracking across forks:
Tracking across forks
---------------------
Overview
~~~~~~~~
Memray can optionally continue tracking in a child process after a parent process forks. This can be useful when using
``multiprocessing``, or a framework utilizing a pre-fork pattern like Celery or Gunicorn.
Usage
~~~~~
To activate tracking through forks, you need to provide the ``--follow-fork`` argument to the ``run`` subcommand:
.. code:: shell

    memray run --follow-fork example.py
In this mode, each time the process forks, a new output file will be created for the new child process, with the new
child's process ID appended to the original capture file's name. The capture files for child processes are exactly like
any other capture file, and can be fed into any reporter of your choosing.
.. note::

    ``--follow-fork`` mode can only be used with an output file. It is incompatible with ``--live``
    mode and ``--live-remote`` mode, since the TUI can't be attached to multiple processes at once.
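For example, a ``multiprocessing`` script that forks two workers might leave behind one capture file per process (the process IDs shown are hypothetical):

.. code:: shell

    memray run --follow-fork -o capture.bin example.py
    # -> capture.bin        (parent process)
    # -> capture.bin.101    (first forked child, its PID appended)
    # -> capture.bin.102    (second forked child, its PID appended)

    # Each file is an ordinary capture file and works with any reporter
    memray flamegraph capture.bin.101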
Losing capture files after OOM Errors
-------------------------------------
When a process exhausts its available memory, it commonly hits an Out Of Memory error, or "OOM error", and is
killed by the operating system. Within orchestration systems like Kubernetes, the termination of the main process
might immediately lead to the destruction of the container and the loss of the files that Memray uses to collect
its results.
When you run ``memray run myprogram.py``, a capture file gets created on the
file system, but that entire file system will be thrown away as soon as the
orchestration cleans up the container. If the program exits unexpectedly,
perhaps because the kernel kills it due to an OOM error, the orchestration might
throw away the capture file before you ever get a chance to use it. Since Memray
is often used to chase memory leaks, this condition might happen more often than
you'd like.
Since Memray is running in the same process as your application, it has no way
to prevent this data loss (by sending it over the network, for example) because
any work it does will be terminated when the process crashes.
Instead of directly calling ``memray run myprogram.py``, you can wrap it in
a script that runs Memray and then performs post-processing operations on the
capture file. That way, the container won't be destroyed by the orchestration
until after the wrapper script exits, rather than as soon as the Python script
being tracked by Memray exits.
.. code-block:: shell

    memray run --output /tmp/capture.bin myprogram.py
    echo "Program finished"
    # Do your post-processing here. This example just logs a summary of what's
    # in the capture file, but you might want to generate reports from it and
    # copy them over the network to some persistent storage, for instance.
    memray summary /tmp/capture.bin
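The post-processing step can do anything that preserves the results before the container goes away. As a sketch, it might render a flame graph and copy it to persistent storage (the destination path here is just an example):

.. code-block:: shell

    # Render an HTML flame graph from the capture file
    memray flamegraph --output /tmp/flamegraph.html /tmp/capture.bin
    # Copy the report somewhere that survives the container teardown
    cp /tmp/flamegraph.html /mnt/persistent-volume/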
.. _aggregated capture files:
Aggregated capture files
------------------------
If you supply the ``--aggregate`` argument to ``memray run``, it will write
much smaller capture files. Instead of containing information about every
individual allocation performed by the tracked program, a capture file produced
using ``--aggregate`` will contain some statistics about the process's
allocations aggregated by the location where the allocation happened.
Specifically, for every location where the tracked process performed any
allocations, an aggregated capture file includes a count of:
- How many allocations at that location had not yet been deallocated when the
  process reached its heap memory high water mark
- How many bytes had been allocated at that location and not yet deallocated
  when the process reached its heap memory high water mark
- How many allocations at that location were leaked (i.e. not deallocated
  before tracking stopped)
- How many bytes were leaked by allocations at that location
These counts provide enough information to generate flame graphs. In fact, this
information is enough to run most of our reporters, with just a few exceptions:
- You cannot find :doc:`temporary allocations </temporary_allocations>` using
  this capture file format, since finding temporary allocations requires
  knowing when each individual allocation was deallocated.
- You cannot use the :doc:`stats reporter <stats>` with this capture file
  format, because it needs to see each individual allocation's size.
- You cannot use ``--aggregate`` with :ref:`live tracking <live tracking>`,
  since the live TUI needs to see each allocation as it happens.
Also, note that if the process is killed before tracking ends (for instance, by
the Linux OOM killer), then the process will die before it finishes calculating
its statistics, and so no useful information is ever written to the capture
file. With the default file format, the capture file is usually still usable
even if the process crashes or is killed, but with the aggregated file format
it is not.
If you can live with these limitations, then using ``--aggregate`` results in
much smaller capture files that can be used seamlessly with most reporters.
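For example (``example.py`` stands in for your own script):

.. code:: shell

    # Write an aggregated capture file with per-location statistics
    memray run --aggregate -o aggregated.bin example.py

    # Most reporters work on it as usual
    memray flamegraph aggregated.bin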
CLI Reference
-------------
.. argparse::
    :ref: memray.commands.get_argument_parser
    :path: run
    :prog: memray