<link rel=stylesheet href="style.css" type="text/css">
<title>collectl - Logging</title>
Collectl supports 2 very basic data logging mechanisms. In the
first case it will log the data as read from /proc to a file with
the extension <i>raw</i> or <i>raw.gz</i>, depending on whether or not the
perl module Compress::Zlib.pm has been installed.
If not, one can always install compression at a later
time and collectl will happily use it the next time it is started.
One useful property of raw files is that one can play them back
using different switches/options for display or generation of
plottable files from them.
The second major form of logging is writing data to one or more tabularized,
also known as <i>plottable</i>
files, which have the extension <i>tab</i> for data associated with the <i>core</i>
subsystems or one of several other files for the detail data associated
with devices like cpus, disks, networks, etc.
The biggest benefit of raw files is they are very lightweight to create in that
no additional processing is performed on the data.
Since they contain the unaltered /proc data from
which collectl derives its numbers to report, it is always possible to go back
and look at the original data.
In some cases, there is data in the raw file that was easier to
collect than ignore and in these situations one can actually see more data than
is normally available. In fact the <i>--grep</i> switch is available for looking
for data in the raw files and prefacing them with timestamps, something the standard
grep command cannot do.
As their type implies, plottable files have their data in a form that is ready
to be plotted with tools like gnuplot or immediately loadable into a spreadsheet
like OpenOffice or Excel or any other tool that can read space-separated data.
When generated by collectl while it is running, this data can be read
while it is being generated making it possible to do real-time monitoring/display
of it. For situations where a tool requires data be delimited by something other
than spaces, one can change the data separator with --sep. In fact, for the
case where a tool such as rrd requires the date be in UTC format, you can even
change the timestamp format using --utc.
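As a rough illustration of consuming such a file from another tool, the Python sketch below splits separator-delimited rows into fields. The sample text, its column names, and the assumption that header lines begin with # are hypothetical stand-ins rather than actual collectl output; the <i>sep</i> argument simply mirrors whatever character was chosen with --sep.

```python
def parse_plottable(text, sep=" "):
    """Split separator-delimited text into a header list and data rows."""
    header, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Assumption: the last '#' line names the columns.
            header = line.lstrip("#").split(sep)
            continue
        rows.append(line.split(sep))
    return header, rows


# Hypothetical sample, NOT real collectl output.
SAMPLE = """#Date Time User Sys
20110221 10:00:00 12 3
20110221 10:00:10 15 4
"""

header, rows = parse_plottable(SAMPLE)
print(header)
for row in rows:
    # Pair each value with its column name for easier inspection.
    print(dict(zip(header, row)))
```

Because the function only needs the text, it could equally be fed a file that is still being appended to, re-reading it at whatever interval suits the display.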
There are 2 switches you should become familiar with for logging data, noting that
you cannot write to a file and the terminal at the same time.
<li><b>-f</b> tells collectl to write a log file to the specified directory. All files
written by collectl have a predefined name format which includes the hostname and date
and in some cases the time as well. You cannot change this format, but if -f does not specify
a directory name, that string is prepended to the standard output file name.</li>
<li><b>-P</b>, when used in combination with <i>-f</i>, instructs collectl to write
a <a href=Plotfiles.html><i>plottable</i></a> file instead.</li>
As collectl continues to grow in functionality and collect more data, linux is also growing
in complexity and increasing the number of active processes as well as expanding the number
of slabs. There is a 3rd option which has been around for quite a while but has had minimal
use or discussion and that is the -G or --group switch. When specified, this tells collectl
to write process and slab data to a second file named <i>rawp</i> (initially it only
contained process data). The main reason for doing so is because without this switch a typical
raw file, even when compressed, can approach 50MB or more, growing even larger as the number
of active processes grows.
While large files are nothing new to collectl, playing them back either for the purpose of drilling
into the data or to simply generate plot files can become very expensive in terms of time and CPU
load. In some extreme cases it can take tens of minutes to process a single, large raw file and even
in normal cases it will take multiple minutes. Having
collectl write to 2 separate files doesn't add any additional overhead or disk space
but can significantly reduce the playback time when you are not interested in slab or process data,
which is often the case during initial analysis.
As a data point, on my development system, single compressed collectl logs are on the order of 35MB.
When using the -G switch, it generated a pair of files where the process/slab data is about 34MB
and the file with the rest of the data is only 1MB making the <i>raw</i>, where all the subsystem
details are stored, very efficient to process in playback mode, taking about a minute compared to
5 minutes when that file includes slab and process data.
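When choosing what to play back, it can help to separate the small <i>raw</i> file from the large <i>rawp</i> file up front. The Python sketch below groups file names by the extensions mentioned above. The example names are hypothetical (the real name format is fixed by collectl and is not reproduced here), and treating a trailing .gz on <i>rawp</i> or <i>tab</i> files as compression is an assumption.

```python
import os
from collections import defaultdict

# Extensions named in the text above.
KNOWN_KINDS = ("raw", "rawp", "tab")


def group_logs(filenames):
    """Group file names by log type, ignoring an optional .gz suffix."""
    groups = defaultdict(list)
    for name in filenames:
        # Drop the compression suffix before looking at the real extension.
        base = name[:-3] if name.endswith(".gz") else name
        ext = os.path.splitext(base)[1].lstrip(".")
        kind = ext if ext in KNOWN_KINDS else "other"
        groups[kind].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
names = [
    "host-20110221.raw.gz",
    "host-20110221.rawp.gz",
    "host-20110221.tab",
    "notes.txt",
]
groups = group_logs(names)
for kind, files in groups.items():
    print(kind, files)
```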
For most users this is all you need to know. On the other hand, if you
want to use collectl to feed data to other tools or perhaps log to both
raw and plot files at the same time, read on...
<h3>Logging both raw and plottable data at the same time!</h3>
The main benefit in requesting collectl to write its data in plottable form is
that data becomes available for immediate plotting without any post-processing
required, the one expense being some additional processing.
However there are a few potential limits in doing so that should be understood.
First and foremost, once a plottable file has been created the original
data from which it was created is lost forever. In many cases that is fine as many users
feel there is really no need to go back to the original source.
However, one often collects summary data because
that is what they are interested in, but then later decides they want to look at the details.
This can be easily done by just replaying the raw file and requesting details be
displayed or (re)written to a plottable file. If the raw file had not
been generated, this option would not be possible.
A second limitation with plottable
data files is that one cannot easily examine the data by timeframes
and when there are multiple data files involved,
it is not easy to look at all the data together as time-oriented samples without plotting it.
It is always possible to write a script that merges this data together, but that
functionality is natively built into collectl when used in playback mode.
Finally, there are times when one might wish to go back and look at non-normalized
data. For example, if 3 processes are created over a 10 second period,
collectl will report a rate of 0 process creations/second because it rounds down.
The only way to
see what really happened is to play the data back with -on, which tells collectl
not to normalize the data and therefore to report the value of
the counter rather than its rate.
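The arithmetic behind that example can be sketched in a few lines of Python. This assumes the rate is simply the counter change divided by the interval with the fraction dropped, as the text suggests; it is not collectl's own code, and the function name is made up for illustration.

```python
def normalized_rate(counter_delta, interval_seconds):
    """Per-second rate with the fractional part dropped (integer division)."""
    return counter_delta // interval_seconds


created = 3    # processes created during the interval
interval = 10  # length of the interval in seconds

# Normalized view: 3 / 10 = 0.3, which rounds down to 0 per second.
print(normalized_rate(created, interval))

# Non-normalized view: the raw counter change is still visible.
print(created)
```

The normalized value hides the activity entirely, while the raw counter shows that 3 processes were in fact created.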
In most cases none of these restrictions should be a concern, but there
may be occasions in which they are and that is where the --rawtoo switch
comes in. When specified in conjunction with -P,
collectl will generate raw data in addition to the plottable
data, making it possible to go back to the source if/when necessary.
The only real overhead is the amount of disk space required since the raw data
is already sitting in a buffer and ready to be written.
If the plottable files are being generated in uncompressed format,
the size of the compressed raw file becomes even less significant.
<h3>Exported output, the 3rd type of file</h3>
We finally come to a third type of output, intended primarily for feeding
collectl data to other programs, and that is <i>exported</i> output. There are
currently a variety of types of exports delivered with collectl though
only three are capable of generating local data files and their use complicates the
picture. To better understand how logging works in the context of <i>--export</i>,
see the <a href=Export.html> description </a> of how they work and in particular
how collectl decides where and when to do <i>logging</i>.
<li>The first is the <a href=http://en.wikipedia.org/wiki/S-expression>s-expression</a>.
S-expressions have been around for many years, having their earliest roots in
programming languages such as Lisp and Scheme as described in the Wikipedia article,
and offer a semi-structured mechanism for the representation of data. One such
environment in which they are heavily used is supermon,
and by providing a mechanism for collectl to
write s-expressions, one can more easily supply data to supermon or any other
tools that might wish to consume it in close to real-time.
The actual contents of the s-expressions will be driven by the subsystems for which
data is being collected.</li>
<li>The second type is the <i>L-expression</i> which is something that has been
invented purely for collectl as an alternative form of output which in some
environments can be easier to parse than the more complex <i>s-expression</i>.</li>
<li>The third and newest type, named <i>gexpr</i>, writes its output over a UDP
socket in a format understood by the <a href=http://ganglia.info/>ganglia</a>
monitoring daemon, gmond. You can read more about this <a href=Gexpr.html>here</a>.</li>
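For readers unfamiliar with the notation, the Python sketch below shows how nested data can be written as a generic s-expression. It only illustrates the parenthesized form itself: the field names and values are invented, and the layout is not claimed to match what collectl actually emits.

```python
def to_sexpr(name, value):
    """Render a name and a value (possibly a nested dict) as an s-expression."""
    if isinstance(value, dict):
        # Each key/value pair becomes its own parenthesized sub-expression.
        inner = " ".join(to_sexpr(k, v) for k, v in value.items())
        return f"({name} {inner})"
    return f"({name} {value})"


# Invented sample data, NOT collectl's actual field names.
sample = {"cpu": {"user": 12, "sys": 3}, "mem": {"free": 2048}}
text = to_sexpr("sample", sample)
print(text)  # (sample (cpu (user 12) (sys 3)) (mem (free 2048)))
```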
If you are a little confused, and you probably should be, try experimenting with
various combinations of switches and see which files get generated.
So what is the overhead associated with all this logging? From the perspective of
CPU load it can be quite minimal since in most cases the data is already in hand and
all that needs to be done is to write it out to one or more additional files, something that
is a fairly low-overhead operation on Linux systems. If this is really a concern,
<a href=Performance.html>measure it</a> yourself. If you want to see how much disk space is involved,
just examine the sizes of the file(s) created during the performance tests.
<table width=100%><tr><td align=right><i>updated Feb 21, 2011</i></td></tr></table>