1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284
|
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/2002/REC-xhtml1-20020801/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>The LTTng trace format</title>
</head>
<body>
<h1>The LTTng trace format</h1>
<p>
<em>Last update: 2008/06/02</em>
</p>
<p>
This document describes the LTTng trace format. It should be useful mainly to
developers who code the LTTng tracer or the traceread LTTV library, as this
library offers all the necessary abstractions on top of the raw trace data.
</p>
<p>
A trace is contained in a directory tree. To send a trace remotely, the
directory tree may be tar-gzipped. The trace <tt>foo</tt>, placed in the home
directory of user john, /home/john, would have the following contents:
</p>
<pre><tt>
$ cd /home/john
$ tree foo
foo/
|-- control
| |-- facilities_0
| |-- facilities_1
| |-- facilities_...
| |-- interrupts_0
| |-- interrupts_1
| |-- interrupts_...
| |-- modules_0
| |-- modules_1
| |-- modules_...
| |-- network_0
| |-- network_1
| |-- network_...
| |-- processes_0
| |-- processes_1
| `-- processes_...
|-- cpu_0
|-- cpu_1
`-- cpu_...
</tt></pre>
<p>
The root directory contains a tracefile for each cpu, numbered from 0,
in .trace format. A uniprocessor thus only contains the file cpu_0.
A multi-processor with some unused (possibly hotplug) CPU slots may have some
unused CPU numbers. For instance an 8 way SMP board with 6 CPUs randomly
installed may produce tracefiles named 0, 1, 2, 4, 6, 7.
</p>
<p>
The files in the control directory also follow the .trace format and are
also per cpu. The "facilities" files only contain "core" marker_id,
marker_format and time_heartbeat events. The first two are used to describe the
events that are in the trace. The other control files contain the initial
system state and various subsequent important events, for example process
creations and exit. The interest of placing such subsequent events in control
trace files instead of (or in addition to) in the per cpu trace files is that
they may be accessed more quickly/conveniently and that they may be kept even
when the per cpu files are overwritten in "flight recorder mode".
</p>
<h2>Trace format</h2>
<p>
Each tracefile is divided into equal size blocks with a header at the beginning
of the block. Events are packed sequentially in the block starting right after
the block header.
</p>
<p>
Each block consists of :
</p>
<pre><tt>
block start/end header
trace header
event 1 header
event 1 variable length data
event 2 header
event 2 variable length data
....
padding
</tt></pre>
<h3>The block start/end header</h3>
<pre><tt>
begin
* the beginning of buffer information
uint64 cycle_count
* TSC at the beginning of the buffer
uint64 freq
* frequency of the CPUs at the beginning of the buffer.
end
* the end of buffer information
uint64 cycle_count
* TSC at the end of the buffer
uint64 freq
* frequency of the CPUs at the end of the buffer.
uint32 lost_size
* number of bytes of padding at the end of the buffer.
uint32 buf_size
* size of the sub-buffer.
</tt></pre>
<h3>The trace header</h3>
<pre><tt>
uint32 magic_number
* 0x00D6B7ED, used to check the trace byte order vs host byte order.
uint32 arch_type
* Architecture type of the traced machine.
uint32 arch_variant
* Architecture variant of the traced machine. May be unused on some arch.
uint32 float_word_order
* Byte order of floats and doubles, sometimes different from integer byte
order. Useful only for user space traces.
uint8 arch_size
* Size (in bytes) of the void * on the traced machine.
uint8 major_version
* major version of the trace.
uint8 minor_version
* minor version of the trace.
uint8 flight_recorder
* Is flight recorder mode activated ? If yes, data might be missing
(overwritten) in the trace.
uint8 has_heartbeat
* Does this trace have heartbeat timer event activated ?
Yes (1) -> Event header has 32 bits TSC
No (0) -> Event header has 64 bits TSC
uint8 alignment
* Are event headers in this trace aligned ?
Yes -> the value indicates the alignment
No (0) -> data is packed.
uint8 tsc_lsb_truncate
* Used for compact channels
uint8 tscbits
* Used for compact channels
uint8 compact_data_shift
* Used for compact channels
uint32 freq_scale
event time is always calculated from :
trace_start_time + ((event_tsc - trace_start_tsc) * (freq / freq_scale))
uint64 start_freq
* CPUs clock frequency at the beginnig of the trace.
uint64 start_tsc
* TSC at the beginning of the trace.
uint64 start_monotonic
* monotonically increasing time at the beginning of the trace.
(currently not supported)
start_time
* Real time at the beginning of the trace (as given by date, adjusted by NTP)
This is the only time reference with the real world : the rest of the trace
has monotonically increasing time from this point (with TSC difference and
clock frequency).
uint32 seconds
uint32 nanoseconds
</tt></pre>
<h3>Event header</h3>
<p>
Event headers differ according to the following conditions : does the
traced system have a heartbeat timer? Is tracing alignment activated?
</p>
<p>
Event header :
</p>
<pre><tt>
{ uint32 timestamp
or
uint64 timestamp }
* if has_heartbeat : 32 LSB of the cycle counter at the event record time.
* else : 64 bits complete cycle counter.
uint8 facility_id
* Numerical ID of the facility corresponding to the event. See the facility
tracefile to know which facility ID matches which facility name and
description.
uint8 event_id
* Numerical ID of the event inside the facility.
uint16 event_size
* Size of the variable length data that follows this header.
</tt></pre>
<p>
Event header alignment
</p>
<p>
If trace alignment is activated (<tt>alignment</tt>), the event header is
aligned. In addition, padding is automatically added after the event header so
the variable length data is automatically aligned on the architecture size.
</p>
<!--
<h2>System description</h2>
<p>
The system type description, in system.xml, looks like:
</p>
<pre><tt>
<system
node_name="vaucluse"
domainname="polymtl.ca"
cpu=4
arch_size="ILP32"
endian="little"
kernel_name="Linux"
kernel_release="2.4.18-686-smp"
kernel_version="#1 SMP Sun Apr 14 12:07:19 EST 2002"
machine="i686"
processor="unknown"
hardware_platform="unknown"
operating_system="Linux"
ltt_major_version="2"
ltt_minor_version="0"
ltt_block_size="100000"
>
Some comments about the system
</system>
</tt></pre>
<p>
The system attributes kernel_name, node_name, kernel_release,
kernel_version, machine, processor, hardware_platform and operating_system
come from the uname(1) program. The domainname attribute is obtained from
the "hostname --domain" command. The arch_size attribute is one of
LP32, ILP32, LP64 or ILP64 and specifies the length in bits of integers (I),
long (L) and pointers (P). The endian attribute is "little" or "big".
While the arch_size and endian attributes could be deduced from the platform
type, having these explicit allows analysing traces from yet unknown
platforms. The cpu attribute specifies the maximum number of processors in
the system; only tracefiles 0 to this maximum - 1 may exist in the cpu
directory.
</p>
<p>
Within the system element, the text enclosed may describe further the
system traced.
</p>
<h2>Bookmarks</h2>
<p>
Bookmarks are user supplied information added to a trace. They contain user
annotations attached to a time interval.
</p>
<pre><tt>
<bookmarks>
<location name=name cpu=n start_time=t end_time=t>Some text</location>
...
</bookmarks>
</tt></pre>
<p>
The interval is defined using either "time=" or "start_time=" and
"end_time=", or "cycle=" or "start_cycle=" and "end_cycle=".
The time is in seconds with decimals up to nanoseconds and cycle counts
are unsigned integers with a 64 bits range. The cpu attribute is optional.
</p>
-->
</body>
</html>
|