# xrprof
`xrprof` (formerly `rtrace`) is an external sampling profiler for R on Linux and
Windows.
Many R users will be familiar with using the built-in sampling profiler
`Rprof()` to generate data on what their code is doing, and there are several
excellent tools to facilitate understanding these samples (or serve as a
front-end), including the [**profvis**](https://rstudio.github.io/profvis/)
package.
However, the reach of `Rprof()` and related tools is limited: the profiler is
"internal", in the sense that it must be manually switched on to work, either
during interactive work (for example, to profile an individual function), or
perhaps by modifying the script to include `Rprof()` calls before running it
again.
In contrast, `xrprof` can be used to profile code that is *already running*:
```console
$ Rscript myscript.R &
# sudo may be required.
$ xrprof -p <PID> -F 50 > Rprof.out
```
External sampling profilers have proven extremely useful for diagnosing and
fixing performance issues (or other bugs) in production environments. This
project joins a long list of similar tools for other languages and platforms,
such as `perf` (the Linux system profiler), `jstack` (for Java), `rbspy` (for
Ruby), `Pyflame` (for Python), `VSPerfCmd` (for C#/.NET), and many others.
## Building
### On Linux
`xrprof` depends on libelf and libunwind, so you must have their headers to
compile the program. For example, on Debian-based systems (including Ubuntu),
you can install these with
```console
$ sudo apt-get install libelf-dev libunwind-dev
```
A simple `Makefile` is provided. Build the binary with
```console
$ make
```
To install the profiler to your system, use
```console
$ sudo make install
```
This will install the binary to `/usr/local/bin` and use `setcap` to mark it for
use without `sudo`. The `install` target supports `prefix` and `DESTDIR`.
### On Windows
You must have a build environment set up. For R users, the best option is to use
R's own [Rtools for Windows](https://cran.r-project.org/bin/windows/Rtools/)
(which is also used to install packages from source). You can then launch
"Rtools MinGW 64-bit" from the Start Menu and navigate to the source directory;
then run
```console
$ make -f Makefile.win
```
The resulting `xrprof.exe` program can be run from `cmd.exe` or PowerShell.
## Usage
The profiler has a simple interface:

    Usage: xrprof [-F <freq>] [-d <duration>] -p <pid>

The `Rprof.out` format is written to standard output and errors or other
messages are written to standard error.
On Windows, R's process ID (PID) can be looked up in Task Manager.
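
If you want to drive the profiler from a script, the split between standard output (samples) and standard error (messages) is the main thing to preserve. The following is only a minimal sketch in Python: the helper names `xrprof_argv` and `profile_to_file` are hypothetical, it uses only the `-F`, `-d`, and `-p` flags from the usage string above, and it assumes `xrprof` is on your `PATH`.

```python
import subprocess

def xrprof_argv(pid, freq=None, duration=None):
    """Build an xrprof command line from the flags in the usage string."""
    argv = ["xrprof"]
    if freq is not None:
        argv += ["-F", str(freq)]
    if duration is not None:
        argv += ["-d", str(duration)]
    argv += ["-p", str(pid)]
    return argv

def profile_to_file(pid, out_path, **options):
    """Write samples (stdout) to out_path; return exit code and stderr text."""
    with open(out_path, "wb") as out:
        result = subprocess.run(
            xrprof_argv(pid, **options), stdout=out, stderr=subprocess.PIPE
        )
    return result.returncode, result.stderr.decode(errors="replace")
```

Calling `profile_to_file(1234, "Rprof.out", freq=50)` would then mirror the shell example near the top of this README, with any error messages returned to the caller instead of being mixed into the profile.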
Along with the sampling profiler itself, there is also a `stackcollapse-Rprof.R`
script in `tools/` that converts the `Rprof.out` format to one that can be
understood by Brendan Gregg's [FlameGraph](http://www.brendangregg.com/flamegraphs.html)
tool. You can use this to produce flame graphs as follows:
```shell
$ stackcollapse-Rprof.R Rprof.out | flamegraph.pl > Rprof.svg
```
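
To make the conversion less of a black box, here is a rough Python sketch of what "collapsing" means. It is not the `stackcollapse-Rprof.R` script itself. It assumes the basic `Rprof.out` layout (a `sample.interval=` header line, then one sample per line as quoted function names with the innermost call first) and ignores line-profiling and memory-profiling annotations. The function name `collapse_rprof` is hypothetical.

```python
import re
from collections import Counter

def collapse_rprof(lines):
    """Fold Rprof-style samples into 'root;...;leaf count' lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blanks, the header, and '#File ...' annotations.
        if not line or line.startswith("#") or "sample.interval=" in line:
            continue
        frames = re.findall(r'"([^"]*)"', line)
        if not frames:
            continue
        # Rprof lists the innermost call first; folded stacks start at the root.
        counts[";".join(reversed(frames))] += 1
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]

sample = [
    "sample.interval=20000",
    '"rnorm" "f" "main"',
    '"rnorm" "f" "main"',
    '"mean" "main"',
]
print("\n".join(collapse_rprof(sample)))
# main;f;rnorm 2
# main;mean 1
```

Each output line is one distinct call stack followed by the number of samples in which it was seen, which is the input that `flamegraph.pl` expects.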

## Running Under Docker
A public Docker image is available at `atheriel/xrprof`. Since `xrprof` reads
the memory of other running programs, it must be run as a privileged container
in the host PID namespace. For example:
```console
$ docker run --privileged --pid=host -it atheriel/xrprof -p <PID>
```
## Okay, How Does it Work?
Much like other sampling profilers, the program uses Linux's `ptrace` system
calls to attach to running R processes and a mix of `ptrace` and
`process_vm_readv` to read the memory contents of that process, following
pointers along the way.
The R-specific aspect of this is to locate and decode the `R_GlobalContext`
structure inside of the R interpreter that stores information on the currently
executing R code.
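
The "following pointers" part can be illustrated with a toy model. The Python sketch below is not how `xrprof` is written (it is a C program) and it does not use R's real context layout: the two-field record, the `walk_contexts` name, and the fake memory buffer are all assumptions made for illustration. The only thing it takes from R is the idea that each context points to the next outer one, so the call stack can be recovered by walking a linked list through the target's memory.

```python
import struct

# Toy layout, NOT R's real context struct: an 8-byte pointer to the next
# (outer) context, then an 8-byte integer standing in for "which call".
CONTEXT = struct.Struct("<QQ")

def walk_contexts(read_memory, head_address, max_depth=1000):
    """Follow next-context pointers from head_address until a NULL pointer."""
    calls = []
    address = head_address
    while address != 0 and len(calls) < max_depth:
        raw = read_memory(address, CONTEXT.size)
        next_address, call_id = CONTEXT.unpack(raw)
        calls.append(call_id)
        address = next_address
    return calls

# Fake "remote memory": one flat buffer holding three chained contexts.
memory = bytearray(64)
CONTEXT.pack_into(memory, 16, 32, 3)  # innermost call, points at offset 32
CONTEXT.pack_into(memory, 32, 48, 2)  # middle call, points at offset 48
CONTEXT.pack_into(memory, 48, 0, 1)   # outermost call, NULL ends the chain

def read_memory(address, size):
    """Stand-in for a remote read such as process_vm_readv."""
    return bytes(memory[address:address + size])

print(walk_contexts(read_memory, 16))  # [3, 2, 1]
```

In a real profiler, `read_memory` would be backed by a remote read of the stopped process, and `max_depth` guards against cycles or garbage pointers in a sample taken at an awkward moment.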
In order to defeat address space layout randomization (ASLR), `xrprof` will
search through the ELF files loaded into memory (as listed in
`/proc/<pid>/maps`) for the symbols required,
either in the executable itself or in `libR.so` (if it appears R has been
compiled to use it).
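
As a rough illustration of the first step, the Python sketch below scans text in the `/proc/<pid>/maps` format for a file whose path ends in `libR.so` and reports the lowest address at which it is mapped. The function name `find_mapping` and the sample paths and addresses are hypothetical, and this is not the lookup code `xrprof` uses. Turning that base address into the address of a symbol additionally requires reading the ELF symbol table, which is not shown.

```python
def find_mapping(maps_text, suffix="libR.so"):
    """Return (base_address, path) for the lowest mapping of a matching file."""
    best = None
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        if not path.endswith(suffix):
            continue
        start = int(fields[0].split("-")[0], 16)
        if best is None or start < best[0]:
            best = (start, path)
    return best

# Hypothetical excerpt; a real one comes from open(f"/proc/{pid}/maps").read().
sample_maps = """\
55d0c0a00000-55d0c0a01000 r--p 00000000 08:01 123 /usr/lib/R/bin/exec/R
7f3a10000000-7f3a10030000 r--p 00000000 08:01 456 /usr/lib/R/lib/libR.so
7f3a10030000-7f3a10300000 r-xp 00030000 08:01 456 /usr/lib/R/lib/libR.so
7f3a20000000-7f3a20021000 rw-p 00000000 00:00 0
"""
base, path = find_mapping(sample_maps)
print(hex(base), path)
```

Because the base address changes on every run under ASLR, this lookup has to be repeated for each process being profiled.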
`xrprof` is mount-namespace-aware, so it supports profiling R processes running
inside Docker containers.
On Windows, `xrprof` makes use of APIs like `ReadProcessMemory()`,
`NtSuspendProcess()`, and `SymFromName()` to achieve the analogous result.
## Credits
The project was inspired by Julia Evans's blog posts on writing
[`rbspy`](https://rbspy.github.io/) and later by my discovery of Evan Klitzke's
work (and writing) on [Pyflame](https://github.com/uber/pyflame).
## License
This project contains portions of the source code of R itself, which is
copyright the R Core Developers and licensed under the GPLv2.
The remaining code is copyright its authors and is available under the same
license.