.TH PFMON 1 "April 2006" "pfmon 3.2" "User's command"
.SH NAME
pfmon \- a hardware-based performance monitoring tool
.SH SYNOPSIS
.nf
.B pfmon [OPTION] [PROGNAME]
.fi
.sp
.SH DESCRIPTION
The pfmon tool is a command line performance monitoring tool that uses the perfmon
interface to access the hardware performance counters of certain processors.
This version supports the following processors:
.TP
.B Itanium processors
Itanium, Itanium 2 (McKinley, Madison and variants), Dual-Core Itanium 2
(Montecito). Pfmon runs with any 2.6.x kernel for Itanium processors.

.TP 
.B AMD Opteron processors
You need a kernel with perfmon v2.2 or higher for pfmon to work.

.TP 
.B Intel Pentium M, PIII, Pentium 4/Xeon, Core Duo/Solo, Core 2 Duo processors
You need a kernel with perfmon v2.2 or higher for pfmon to work.
.LP
.sp
With pfmon, it is possible to monitor a single thread or the entire system. 
It is also possible to monitor multi-process and multi-threaded programs.
For each, it is possible to collect simple counts or profiles.
.sp
The set of events that can be measured depends on the underlying processor.
Similarly, certain options are specific to a processor model. In general,
pfmon gives access to all processor-specific monitoring features.
.sp
.SH GENERIC OPTIONS
Pfmon provides the following options on all processors:
.TP
.B -h or --help
Display the list of available options and exit.
.TP
.B -V or --version
Print pfmon version information and exit.
.TP
.B -l[regex] or --show-events[=regex]
If \fBregex\fR is not provided, pfmon lists the names of all available events for
the current processor. Otherwise only the events matching the regular expression are
printed.
.TP
.B -L or --long-show-events[=regex]
If \fBregex\fR is not provided, pfmon lists all available events for the current
processor with all their unit masks using the event_name:unit_mask_name
notation. Only one unit mask per line is printed, thus multiple lines may be
printed. If \fBregex\fR is provided, it is applied only on the event name and
not on the unit masks. All event names matching the pattern are printed.
.TP
.B -i event or --event-info=event
Display detailed information about an event. The \fBevent\fR parameter can
either be the event code, the event name, or a regular expression. In case
multiple events match the expression, they are all printed.
.TP
.B -u, -3, or --user-level
Monitor at the user level for all events. By default, this option is turned on.
.TP
.B -k, -0, or --kernel-level
Monitor at the kernel level for all events. By default, this option is turned off.
.TP
.B -1
Monitor execution at privilege level 1. By default, this option is turned off.
.TP
.B -2
Monitor execution at privilege level 2. By default, this option is turned off.
.TP
.B -e ev1,ev2,... or --events=ev1,ev2,...
Select events to monitor. The events are specified by name or event code. If
there are multiple events, they must be passed as a comma separated list
\fBwithout\fR spaces. The maximum number of events depends on the underlying
processor. Events requiring a unit mask can be specified using the notation:
event_name:unit_mask1:unit_mask2.... Each \fB-e\fR option forms a set of events;
multiple sets can be defined by specifying the \fB-e\fR option multiple times.
Event-related options always apply to the last defined set. All events from a set
are measured together. Pfmon uses the perfmon interface to multiplex
the sets on the actual processors. In case multiple sets are used, pfmon
scales the final counts to provide estimates of what the actual counts
would have been had all the events been measured throughout the entire
duration of the run. Pfmon does not re-arrange events between sets in case
they cannot be measured together.
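.sp
As an illustration of the set syntax only (the event names, the unit mask
name and the program below are placeholders, not real identifiers; run
\fBpfmon -l\fR to list the events of the current processor):
.sp
.nf
pfmon -e EVENT_A,EVENT_B:MASK_1 -e EVENT_C ./myprog
.fi
.sp
This defines two sets: the first contains EVENT_A and EVENT_B with the unit
mask MASK_1, the second contains only EVENT_C.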
.TP
.B -I or --info
Print information related to the pfmon version, the supported processor models and
built-in sampling modules.
.TP
.B -t secs or --session-timeout=secs
Duration of the monitoring session expressed in seconds. Once the timeout
expires, pfmon stops monitoring and prints the final counts or profiles.
.TP
.B -S format or --smpl-module-info=format
Display information about a sampling module.
.TP
.B --debug
Enable debug output (for experts).
.TP
.B --verbose
Print more information about the execution of pfmon.
.TP
.B --outfile=filename
Print final counts in the file called \fBfilename\fR. By default, all
results (count or profiles) are printed on the terminal.
.TP
.B --append
Append results (counts or profiles) to the current output file. If neither
\fB--outfile\fR nor \fB--smpl-outfile\fR is provided, results are printed on the screen.
.TP
.B --overflow-block
Block the monitored thread when the sampling buffer becomes full. This option is
only available in per-thread mode. By default, this option is turned off, meaning
that the monitored thread keeps on running, with monitoring disabled, while pfmon is
processing the sampling buffer. In other words, there may be blind spots.
.TP
.B --system-wide
Create a system-wide monitoring session where pfmon measures all threads running
on a set of processors. By default, this option is turned off,
i.e., pfmon operates in per-thread mode. By default, system-wide mode measures
the same events on all available processors. It is possible to restrict monitoring to a
subset of processors using the \fB--cpu-list\fR option.
.TP
.B --smpl-outfile=filename
Save profiles into the file called \fBfilename\fR. By default, profiles are
printed on the terminal.
.TP
.B --long-smpl-periods=val1,val2,...
Set the sampling period to reload into the overflowed counter(s) after the last
sample is recorded into the sampling buffer, i.e. when the buffer becomes full.
The values must be passed in the same order as the events they refer to. For
instance, if the events are passed as \fB-eev1,ev2\fR then the sampling periods
for \fBev1\fR must be the first, followed by the period for \fBev2\fR.
It is possible to skip a period, by providing an empty element in the list,
e.g., \fB--long-smpl-periods=,val2\fR. Sampling periods are expressed in the
same unit as the event they refer to. If an event counts the number of
instructions retired, then the sampling period is using the same unit, i.e.,
instructions retired. To sample every 100,000 instructions, you can pass
\fB--long-smpl-periods=100000\fR.
.TP
.B --short-smpl-periods=val1,val2,...
Set the sampling period to reload into the overflowed counter(s) after a sample
is recorded into the buffer and when that sample is not the last, i.e., when
the buffer still has space remaining. Other than that, this option works
exactly like \fB--long-smpl-periods\fR.
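.sp
A possible sampling invocation combining both period options, given as a
sketch (EVENT_A, EVENT_B, the period values and ./myprog are placeholders):
.sp
.nf
pfmon -e EVENT_A,EVENT_B --long-smpl-periods=100000,50000 --short-smpl-periods=100000,50000 ./myprog
.fi
.sp
The first value of each list applies to EVENT_A and the second to EVENT_B.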
.TP
.B --smpl-entries=n
Selects the number of samples that the kernel sampling buffer can hold.
The default size is determined dynamically by pfmon based on the size
of a sample and system resource limits such as the amount of locked
memory allowed for a user process (as reported by ulimit).
.TP
.B --with-header
Generates a header before printing counts or profiles. The header contains
information about the configuration of the host system and about the
measurement being made.
.TP
.B --cpu-list=num,num1-num2,...
For system-wide mode, this option specifies the list of processors to monitor.
Without this option, all available processors are monitored. Processors can
be specified individually with their index, or by range.
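.sp
For example, assuming a machine with at least four processors and a
placeholder event name EVENT_A, the following sketch monitors CPU0, CPU2 and
CPU3 for ten seconds:
.sp
.nf
pfmon --system-wide --cpu-list=0,2-3 -e EVENT_A -t 10
.fi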
.TP
.B --aggregate-results
Aggregate counts and profiles output. By default, this option is off, meaning
that results are per-thread or per-CPU.
.TP
.B --trigger-code-start-address=addr
Start monitoring the first time code executes at address \fBaddr\fR. The address 
can be specified in hexadecimal or with a symbol.
.TP
.B --trigger-code-stop-address=addr
Stop monitoring the first time code executes at address \fBaddr\fR. The address
can be specified in hexadecimal or with a symbol.
.TP
.B --trigger-data-start-address=addr
Start monitoring when the data at address \fBaddr\fR is accessed. By default,
this is for any read or write access.
.TP
.B --trigger-data-stop-address=addr
Stop monitoring when the data at address \fBaddr\fR is accessed. By default,
this is for any read or write access.
.TP
.B --trigger-code-repeat
By default, the start and stop code triggers are activated only the first time they are
reached. With this option, it is possible to repeat the start/stop behavior
each time the execution crosses the trigger address.
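.sp
A sketch of a code-triggered measurement (EVENT_A, the symbol names
compute_start and compute_end, and ./myprog are hypothetical):
.sp
.nf
pfmon -e EVENT_A --trigger-code-start-address=compute_start --trigger-code-stop-address=compute_end --trigger-code-repeat ./myprog
.fi
.sp
Monitoring would start each time execution reaches compute_start and stop
each time it reaches compute_end.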
.TP
.B --trigger-code-follow
Apply the start/stop code triggers to all monitored threads. By default,
triggers are only applied to the first thread. This option has no effect
on system-wide measurements.
.TP
.B --trigger-data-repeat
By default, the start and stop data triggers are activated only the first time they are
reached. With this option, it is possible to repeat the start/stop behavior
each time the data address is accessed.
.TP
.B --trigger-data-follow
Apply the start/stop data triggers to all monitored threads. By default,
triggers are only applied to the first thread. This option has no effect
on system-wide measurements.
.TP
.B --trigger-data-ro
Data triggers are activated on read access only. By default, they are activated
on read or write access.
.TP
.B --trigger-data-wo
Data triggers are activated on write access only. By default, they are activated on
read or write access.
.TP
.B --trigger-start-delay=secs
Number of seconds before activating monitoring. By default, monitoring is
activated immediately, except when code/data triggers are used.
.TP
.B --priv-levels=lvl1,lvl2,...
Set privilege level per event. The levels apply to the current set, i.e. the
last \fB-e\fR option. The levels are specified in the same order as the events.
Accepted values for privileges are: u, k, 0, 1, 2, 3 or any combinations
thereof.
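.sp
For instance, with two placeholder events, the following sketch counts
EVENT_A at the user level only and EVENT_B at the kernel level only:
.sp
.nf
pfmon -e EVENT_A,EVENT_B --priv-levels=u,k ./myprog
.fi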
.TP
.B --us-counter-format
Print counts using commas, e.g., 1,024.
.TP
.B --eu-counter-format
Print counts using points, e.g., 1.024.
.TP
.B --hex-counter-format
Print counts using hexadecimal, e.g., 0x400.
.TP
.B --smpl-module=name
Select the sampling module. By default the first module that matches the
PMU model is used. This is typically the detailed-* module. To figure out
which modules are supported, use the \fB-I\fR option.
.TP
.B --show-time
Show real, user, and system time for the command executed in per-thread mode.
.TP
.B --symbol-file=filename
ELF image containing the symbol table for the command being monitored. 
By default, pfmon uses the binary image on disk. 
.TP
.B --sysmap-file=filename
System.map format file containing the kernel symbol table.
.TP
.B --check-events-only
Verify combination of events and exit. No measurement is performed.
.TP
.B --smpl-periods-random=mask1:seed1,...
Apply randomization to long and short periods. For each period, a seed and
a mask value must be passed. The mask is a bitmask representing the range of
variation for randomization. As of perfmon v2.3, the seed value is now ignored.
.TP
.B --smpl-print-counts
When sampling, the final counts for the counters are not printed by default.
This option forces counts to be printed at the end of a sampling measurement.
.TP
.B --attach-task pid
Attach to thread identified by \fBpid\fR that is already running. User must have
permission to attach to the thread.
.TP
.B --reset-non-smpl-periods
At the end of a sampling period, reset all counters.
.TP
.B --follow-fork
Monitoring continues across fork(). By default monitoring is not propagated to
child processes. This option has no effect in system-wide mode.
.TP
.B --follow-vfork
Monitoring continues across vfork(). By default monitoring is not propagated to
child processes. This option has no effect in system-wide mode.
.TP
.B --follow-pthread
Monitoring continues across pthread_create(). By default monitoring is not propagated to
new threads. This option has no effect in system-wide mode.
.TP
.B --follow-exec[=pattern]
Monitoring follows through the exec*() system call. By default monitoring stops at
exec*(). It is possible to specify a regular expression pattern to filter out
which command gets monitored. Without the pattern all commands are monitored.
.TP
.B --follow-exec-exclude=pattern
Monitoring follows through the exec*() system call. By default monitoring stops
at exec*(). This option is the counter-part of \fB--follow-exec\fR in that
the pattern specifies the command which must be excluded from monitoring.
Depending on the monitored workload, it may be easier to specify the commands to
exclude rather than the commands to include.
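.sp
As a sketch, to follow a build through fork() and exec*() while leaving out
one command (EVENT_A is a placeholder event name; make and the pattern sh are
arbitrary illustrative choices):
.sp
.nf
pfmon --follow-fork --follow-exec-exclude=sh -e EVENT_A make
.fi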
.TP
.B --follow-all
This option is equivalent to specifying all of --follow-fork, --follow-vfork,
--follow-pthread, and --follow-exec.
.TP
.B --no-cmd-output
Redirect all output of executed commands to /dev/null.
.TP
.B --exec-split-results
Generate separate results output for execution before and after exec*(). 
.TP
.B --resolve-addresses
Resolve all code/data addresses in profiles using symbol table information.
If the symbol information is not present, the raw address is printed. By
default, only raw addresses are printed.
.TP
.B --extra-smpl-pmds=num,num1-num2,...
Specify a list of extra PMD registers to include in samples. Those PMD registers
are typically virtual PMD registers not tied to counters.
.TP
.B --demangle-cpp
C++ symbol demangling. By default, no symbol demangling is performed.
.TP
.B --demangle-java
Java symbol demangling. By default, no symbol demangling is performed.
.TP
.B --saturate-smpl-buffer
Stop collecting samples the first time the sampling buffer becomes full. In
other words, simply collect the first N entries when \fB--smpl-entries=N\fR.
By default, this option is off.
.TP
.B --pin-command
Pin executed command on the CPUs specified by --cpu-list. This option is only
relevant in system-wide mode.
.TP
.B --switch-timeout=milliseconds
The number of milliseconds before switching from one event set to the next.
Depending on the granularity of the underlying operating system timer tick,
the timeout may be rounded up. If the difference with the user provided timeout
exceeds 2%, pfmon prints a warning message.
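.sp
As a worked illustration, assuming a timer tick of 10 milliseconds (an
assumption; the actual tick depends on the kernel configuration), a requested
timeout of 25 milliseconds could be rounded up to 30 milliseconds. The
difference of 5 milliseconds is 20% of the requested value, which is above
the 2% threshold, so a warning would be expected. A request of 1000
milliseconds is a multiple of the tick and would need no rounding. With
placeholder events:
.sp
.nf
pfmon -e EVENT_A -e EVENT_B --switch-timeout=1000 ./myprog
.fi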
.TP
.B --dont-start
Do not activate monitoring. This option is useful on architectures where 
it is possible to start/stop counters directly from the user level.
.TP
.B --cpu-set-relative
With this option, CPU identifications for \fB--cpu-list\fR are relative to
cpu_set affinity. By default, they are relative to actual CPU0.
.TP
.B --print-interval=msecs
With this option, intermediate results can be generated when counting in a
system-wide session. Pfmon prints the delta for each event since the last
print. The interval is expressed in milliseconds. This option is not
supported in per-thread mode.
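.sp
For example, the following sketch (EVENT_A is a placeholder event name) prints
the delta every second during a ten second system-wide session:
.sp
.nf
pfmon --system-wide --print-interval=1000 -e EVENT_A -t 10
.fi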
.SH SEE ALSO
Visit \fBhttp://perfmon2.sf.net\fR for more detailed documentation including
processor specific options.

.SH AUTHOR
Stephane Eranian <eranian@hpl.hp.com>
.PP