File: instrumented-runtime.html

package info (click to toggle)
ocaml-doc 4.11-2
  • links: PTS, VCS
  • area: non-free
  • in suites: bookworm, bullseye, forky, sid, trixie
  • size: 20,580 kB
  • sloc: sh: 37; makefile: 11
file content (249 lines) | stat: -rw-r--r-- 14,948 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
<!DOCTYPE html>
<html>
<head>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="generator" content="hevea 2.32">

  <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
<link rel="stylesheet" type="text/css" href="manual.css">
<title>Chapter 24  Runtime tracing with the instrumented runtime</title>
</head>
<body>
<a href="afl-fuzz.html"><img src="previous_motif.svg" alt="Previous"></a>
<a href="index.html"><img src="contents_motif.svg" alt="Up"></a>
<a href="core.html"><img src="next_motif.svg" alt="Next"></a>
<hr>
<h1 class="chapter" id="sec560">Chapter 24  Runtime tracing with the instrumented runtime</h1>
<ul>
<li><a href="instrumented-runtime.html#s%3Ainstr-runtime-overview">24.1  Overview</a>
</li><li><a href="instrumented-runtime.html#s%3Ainstr-runtime-enabling">24.2  Enabling runtime instrumentation</a>
</li><li><a href="instrumented-runtime.html#s%3Ainstr-runtime-read">24.3  Reading traces</a>
</li><li><a href="instrumented-runtime.html#s%3Ainstr-runtime-more">24.4  Controlling instrumentation and limitations</a>
</li></ul>
<p>This chapter describes the OCaml instrumented runtime, a runtime variant
allowing the collection of events and metrics.</p><p>Collected metrics include time spent executing the <em>garbage collector</em>.
The overall execution time of individual pauses are measured
down to the time spent in specific parts of the garbage collection.
Insight is also given on memory allocation and motion by recording
the size of allocated memory blocks, as well as value promotions from the
<em>minor heap</em> to the <em>major heap</em>.</p>
<h2 class="section" id="s:instr-runtime-overview"><a class="section-anchor" href="#s:instr-runtime-overview" aria-hidden="true"></a>24.1  Overview</h2>
<p>Once compiled and linked with the instrumented runtime, any OCaml program
can generate <em>trace files</em> that can then be read
and analyzed by users in order to understand specific runtime behaviors.</p><p>The generated trace files are stored using the <em>Common Trace Format</em>, which
is a general purpose binary tracing format.
A complete trace consists of:
</p><ul class="itemize"><li class="li-itemize">
a <em>metadata file</em>, part of the OCaml distribution
</li><li class="li-itemize">and a <em>trace file</em>, generated by the runtime
 in the program being traced.
</li></ul><p>For more information on the <em>Common Trace Format</em>, see
<a href="https://diamon.org/ctf/">https://diamon.org/ctf/</a>.</p>
<h2 class="section" id="s:instr-runtime-enabling"><a class="section-anchor" href="#s:instr-runtime-enabling" aria-hidden="true"></a>24.2  Enabling runtime instrumentation</h2>
<p>For the following examples, we will use the following example program:</p><div class="caml-example verbatim">

<div class="ocaml">



<div class="pre caml-input"> <span class="ocamlkeyword">module</span> SMap = Map.Make(String)

 <span class="ocamlkeyword">let</span> s i = String.make 512 (Char.chr (i <span class="ocamlkeyword">mod</span> 256))

 <span class="ocamlkeyword">let</span> clear map = SMap.fold (<span class="ocamlkeyword">fun</span> k _ m -&gt; SMap.remove k m) map map

 <span class="ocamlkeyword">let</span> <span class="ocamlkeyword">rec</span> seq i =
   <span class="ocamlkeyword">if</span> i = 0 <span class="ocamlkeyword">then</span> Seq.empty <span class="ocamlkeyword">else</span> <span class="ocamlkeyword">fun</span> () -&gt; (Seq.Cons (i, seq (i - 1)))

 <span class="ocamlkeyword">let</span> () =
   seq 1_000_000
   |&gt; Seq.fold_left (<span class="ocamlkeyword">fun</span> m i -&gt; SMap.add (s i) i m) SMap.empty
   |&gt; clear
   |&gt; ignore</div></div>

</div><p>The next step is to compile and link the program with the instrumented runtime.
This can be done by using the <span class="c003">-runtime-variant</span> flag:</p><pre>       ocamlopt -runtime-variant i program.ml -o program
</pre><p>
Note that the instrumented runtime is an alternative runtime for OCaml
programs. It is only referenced during the linking stage of the final
executable. This means that the compilation stage does not need to be altered
to enable instrumentation.</p><p>The resulting program can then be traced by running it with the environment
variable <span class="c003">OCAML_EVENTLOG_ENABLED</span>:</p><pre>        OCAML_EVENTLOG_ENABLED=1 ./program
</pre><p>
During execution, a trace file will be generated in the
program’s current working directory.</p><h4 class="subsubsection" id="sss:instr-runtime-build-more"><a class="section-anchor" href="#sss:instr-runtime-build-more" aria-hidden="true"></a>More build examples</h4>
<p>When using the <em>dune</em> build system, this compiler invocation can be
replicated using the <span class="c003">flags</span> <span class="c003">stanza</span> when building an executable.</p><pre>       (executable
         (name program)
         (flags "-runtime-variant=i"))
</pre><p>
The instrumented runtime can also be used with the OCaml bytecode interpreter.
This can be done by either using the
<span class="c003">-runtime-variant=i</span> flag when linking the program with <span class="c003">ocamlc</span>, or by running the generated
bytecode through <span class="c003">ocamlruni</span>:</p><pre>       ocamlc program.ml -o program.byte
       OCAML_EVENTLOG_ENABLED=1 ocamlruni program.byte
</pre><p>
See chapter <a href="comp.html#c%3Acamlc">9</a> and chapter <a href="runtime.html#c%3Aruntime">11</a> for more information about
<span class="c003">ocamlc</span> and <span class="c003">ocamlrun</span>.</p>
<h2 class="section" id="s:instr-runtime-read"><a class="section-anchor" href="#s:instr-runtime-read" aria-hidden="true"></a>24.3  Reading traces</h2>
<p>Traces generated by the instrumented runtime can be analyzed with tooling
available outside of the OCaml distribution.</p><p>A complete trace consists of a <em>metadata file</em> and a <em>trace file</em>.
Two simple ways to work with the traces are the <em>eventlog-tools</em> and
<em>babeltrace</em> libraries.</p>
<h3 class="subsection" id="ss:instr-runtime-tools"><a class="section-anchor" href="#ss:instr-runtime-tools" aria-hidden="true"></a>24.3.1  eventlog-tools</h3>
<p>
<em>eventlog-tools</em> is a library implementing a parser, as well as a
a set of tools that allows to perform basic format conversions and analysis.</p><p>For more information about <em>eventlog-tools</em>, refer to the project’s
main page: <a href="https://github.com/ocaml-multicore/eventlog-tools">https://github.com/ocaml-multicore/eventlog-tools</a></p>
<h3 class="subsection" id="ss:instr-runtime-babeltrace"><a class="section-anchor" href="#ss:instr-runtime-babeltrace" aria-hidden="true"></a>24.3.2  babeltrace</h3>
<p><em>babeltrace</em> is a C library, as well as a Python binding and set of tools
that serve as the reference implementation for the <em>Common Trace Format</em>.
The <em>babeltrace</em> command line utility allows for a basic rendering
of a trace’s content, while the high level Python API can be used to
decode the trace and process them programmatically with libraries
such as <em>numpy</em> or <em>Jupyter</em>.</p><p>Unlike <em>eventlog-tools</em>, which possesses a specific knowledge of
OCaml’s <em>Common Trace Format</em> schema, it is required to provide
the OCaml <em>metadata</em> file to <em>babeltrace</em>.</p><p>The metadata file is available in the OCaml installation.
Its location can be obtained using the following command:</p><pre>        ocamlc -where
</pre><p>
The <em>eventlog_metadata</em> file can be found at this path and
copied in the same directory as the generated trace file.
However, <em>babeltrace</em> expects the file to be named
<span class="c003">metadata</span> in order to process the trace.
Thus, it will need to be renamed when copied to the trace’s directory.</p><p>Here is a naive decoder example, using <em>babeltrace</em>’s Python
library, and <em>Python 3.8</em>:</p><pre>
import subprocess
import shutil
import sys
import babeltrace as bt

def print_event(ev):
    print(ev['timestamp'])
    print(ev['pid'])
    if ev.name == "entry":
        print('entry_event')
        print(ev['phase'])
    if ev.name == "exit":
        print('exit_event')
        print(ev['phase'])
    if ev.name == "alloc":
        print(ev['count'])
        print(ev['bucket'])
    if ev.name == "counter":
        print(ev['count'])
        print(ev['kind'])
    if ev.name == "flush":
        print("flush")

def get_ocaml_dir():
    # Fetching OCaml's installation directory to extract the CTF metadata
    ocamlc_where = subprocess.run(['ocamlc', '-where'], stdout=subprocess.PIPE)
    ocaml_dir = ocamlc_where.stdout.decode('utf-8').rstrip('\n')
    return(ocaml_dir)

def main():
    trace_dir = sys.argv[1]
    ocaml_dir = get_ocaml_dir()
    metadata_path = ocaml_dir + "/eventlog_metadata"
    # copying the metadata to the trace's directory,
    # and renaming it to 'metadata'.
    shutil.copyfile(metadata_path, trace_dir + "/metadata")
    tr = bt.TraceCollection()
    tr.add_trace(trace_dir, 'ctf')
    for event in tr.events:
        print_event(event)

if __name__ == '__main__':
    main()

</pre><p>
This script expect to receive as an argument the directory containing the
trace file. It will then copy the <em>CTF</em> metadata file to the trace’s
directory, and then decode the trace, printing each event in the process.</p><p>For more information on <em>babeltrace</em>, see the website at:
<a href="https://babeltrace.org/">https://babeltrace.org/</a></p>
<h2 class="section" id="s:instr-runtime-more"><a class="section-anchor" href="#s:instr-runtime-more" aria-hidden="true"></a>24.4  Controlling instrumentation and limitations</h2>
<h3 class="subsection" id="ss:instr-runtime-prefix"><a class="section-anchor" href="#ss:instr-runtime-prefix" aria-hidden="true"></a>24.4.1  Trace filename</h3>
<p>The default trace filename is <span class="c003">caml-{PID}.eventlog</span>, where <span class="c003">{PID}</span>
is the process identifier of the traced program.</p><p>This filename can also be specified using the
<span class="c003">OCAML_EVENTLOG_PREFIX</span> environment variable.
The given path will be suffixed with <span class="c003">{.PID}.eventlog</span>.</p><pre>        OCAML_EVENTLOG_PREFIX=/tmp/a_prefix OCAML_EVENTLOG_ENABLED=1 ./program
</pre><p>
In this example, the trace will be available at path
<span class="c003">/tmp/a_prefix.{PID}.eventlog</span>.</p><p>Note that this will only affect the prefix of the trace file, there is no
option to specify the full effective file name.
This restriction is in place to make room for future improvements to the
instrumented runtime, where the single trace file per session design
may be replaced.</p><p>For scripting purpose, matching against ‘{PID}‘, as well as the
<span class="c003">.eventlog</span> file extension should provide enough control over
the generated files.</p><p>Note as well that parent directories in the given path will not be created
when opening the trace. The runtime assumes the path is
accessible for creating and writing the trace. The program will
fail to start if this requirement isn’t met.</p>
<h3 class="subsection" id="ss:instr-runtime-pause"><a class="section-anchor" href="#ss:instr-runtime-pause" aria-hidden="true"></a>24.4.2  Pausing and resuming tracing</h3>
<p>
Mechanisms are available to control event collection at runtime.</p><p><span class="c003">OCAML_EVENTLOG_ENABLED</span> can be set to the <span class="c003">p</span> flag in order
to start the program with event collection paused.</p><pre>        OCAML_EVENTLOG_ENABLED=p ./program
</pre><p>
The program will have to start event collection explicitly.
Starting and stopping event collection programmatically can be done by calling
<span class="c003">Gc.eventlog_resume</span> and <span class="c003">Gc.eventlog_pause</span>) from within the program.
Refer to the <a href="libref/Gc.html"><span class="c003">Gc</span></a> module documentation for more information.</p><p>Running the program provided earlier with <span class="c003">OCAML_EVENTLOG_ENABLED=p</span>
will for example yield the following result.</p><pre>$ OCAML_EVENTLOG_ENABLED=p ./program
$ ocaml-eventlog-report caml-{PID}.eventlog
==== eventlog/flush
median flush time: 58ns
total flush time: 58ns
flush count: 1
</pre><p>
The resulting trace contains only one event payload, namely a <em>flush</em> event,
indicating how much time was spent flushing the trace file to disk.</p><p>However, if the program is changed to include a call to
<span class="c003">Gc.eventlog_resume</span>, events payloads can be seen again
in the trace file.</p><div class="caml-example verbatim">

<div class="ocaml">



<div class="pre caml-input">        <span class="ocamlkeyword">let</span> () =
          Gc.eventlog_resume();
          seq 1_000_000
          |&gt; Seq.fold_left (<span class="ocamlkeyword">fun</span> m i -&gt; SMap.add (s i) i m) SMap.empty
          |&gt; clear
          |&gt; ignore</div></div>

</div><p>The resulting trace will contain all events encountered during
the program’s execution:</p><pre>        $ ocaml-eventlog-report caml-{PID}.eventlog
        [..omitted..]
        ==== force_minor/alloc_small
        100.0K..200.0K: 174
        20.0K..30.0K: 1
        0..100: 1

        ==== eventlog/flush
        median flush time: 207.8us
        total flush time: 938.1us
        flush count: 5
</pre>
<h3 class="subsection" id="ss:instr-runtime-limitations"><a class="section-anchor" href="#ss:instr-runtime-limitations" aria-hidden="true"></a>24.4.3  Limitations</h3>
<p>The instrumented runtime does not support the <span class="c003">fork</span> system call.
A child process forked from an instrumented program will not be traced.</p><p>The instrumented runtime aims to provide insight into the runtime’s execution
while maintaining a low overhead.
However, this overhead may become more noticeable depending on how a program
executes.
The instrumented runtime currently puts a strong emphasis on
tracing <em>garbage collection</em> events. This means that programs
with heavy garbage collection activity may be more susceptible to
tracing induced performance penalties.</p><p>While providing an accurate estimate of potential performance loss is difficult,
test on various OCaml programs showed a total running time increase ranging
from 1% to 8%.</p><p>For a program with an extended running time where the collection of only a
small sample of events is required, using the <em>eventlog_resume</em> and
<em>eventlog_pause</em> primitives may help relieve some of the
tracing induced performance impact.
</p>
<hr>
<a href="afl-fuzz.html"><img src="previous_motif.svg" alt="Previous"></a>
<a href="index.html"><img src="contents_motif.svg" alt="Up"></a>
<a href="core.html"><img src="next_motif.svg" alt="Next"></a>
</body>
</html>