<link rel=stylesheet href="style.css" type="text/css">
<title>collectl - Infiniband</title>
Collectl V3.7.3 now supports monitoring Infiniband by reading 64 bit counters
when the HCA supports them, and virtually all of them do. This means several things:
<li>collectl no longer has to read/clear the counters to read them and so is
<li>there is no longer a restriction on multiple instances of collectl monitoring
<li>you can now read these counters at greater intervals (if you really want to)
with no fear of them latching when they hit their 32 bit maximum</li>
The easiest way to tell if your HCA supports 64 bit counters is to run <i>perfquery -x</i>:
if it works, you have 64 bit counters. Alternatively, you can run:
<i>collectl -sx --showheader</i>
and if it displays an X in the flag field, you have them. If you do have 64 bit
counters but collectl doesn't report the X, you have an older version installed.
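The <i>perfquery -x</i> check described above can be scripted. The following is only a minimal sketch: it assumes <i>perfquery</i> is on the search path and treats a zero exit status as "it works". The function name <i>has_64bit_counters</i> and its <i>run</i> parameter are hypothetical names for illustration and are not part of collectl.

```python
import subprocess


def has_64bit_counters(perfquery="perfquery", run=subprocess.run):
    """Return True if `perfquery -x` exits successfully.

    Assumption: a zero exit status means the HCA answered the
    extended (64 bit) counter query.
    """
    try:
        result = run([perfquery, "-x"], capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        # perfquery missing, not executable, or hung
        return False
    return result.returncode == 0
```

Calling <i>has_64bit_counters()</i> with no arguments runs the real utility; the <i>run</i> parameter exists only so the logic can be exercised without an HCA.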
The code to deal with 32 bit counters will be left in place for a while but
will eventually be removed. The rest of this documentation talks about monitoring
the narrower counters and is largely unchanged from before.
<h3>32 Bit Counters</h3>
The most important thing you should know about 32 bit monitoring is that it
is <i>destructive</i>: every time
collectl reads the counters from the HCA it immediately resets them to zero, thereby
<i>destroying</i> their previous contents. You should also note this does
<i>not</i> apply to error counters, which are never reset.
The obvious question is <i>why?</i> and perhaps the less than obvious answer is
because when the hardware specifications were written for the Infiniband HCAs it
was decided that performance counters would not wrap, probably because nobody
thought someone might want to do continuous sampling. In any event, at even modest
traffic rates HCAs with 32-bit counters quickly reach their maximum values and
stop incrementing, rendering them useless for performance monitors like collectl.
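To get a feel for how quickly this happens, here is a rough back-of-the-envelope calculation. It assumes the data counters count 4-byte units rather than single bytes; if your counters use a different unit, change <i>bytes_per_count</i>. The function name is made up for this illustration.

```python
def seconds_until_latched(bytes_per_second, counter_bits=32, bytes_per_count=4):
    """Estimate how long a non-wrapping counter takes to reach its maximum.

    Assumption: each counter increment represents `bytes_per_count` bytes.
    """
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    max_count = 2 ** counter_bits - 1
    return max_count * bytes_per_count / bytes_per_second


if __name__ == "__main__":
    # A steady 1 GB/sec fills a 32 bit counter in well under a minute.
    print(round(seconds_until_latched(1e9), 1), "seconds")
```

Under these assumptions a sustained 1 GB/sec fills a 32 bit counter in roughly 17 seconds, so a longer sampling interval would see a latched value.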
Collectl's solution to this problem is to read the
counters and immediately reset them to 0. As long as the next sampling period occurs
before the counters <i>fill up</i>, this methodology comes reasonably close to
reflecting the traffic rates (some counts are lost between the read and reset).
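The read-then-reset approach can be illustrated with a small simulation. This is not collectl's code, just a sketch of the idea using a made-up <i>LatchingCounter</i> class that stops at its maximum instead of wrapping.

```python
class LatchingCounter:
    """Simulated hardware counter that sticks at its maximum value."""

    def __init__(self, bits=32):
        self.max_value = 2 ** bits - 1
        self.value = 0

    def add(self, count):
        # Hardware would stop incrementing once the maximum is reached.
        self.value = min(self.value + count, self.max_value)

    def read_and_reset(self):
        # Destructive read: return the current value and zero the counter.
        value = self.value
        self.value = 0
        return value


def rate(counter, interval_seconds):
    """Counts per second since the previous read_and_reset()."""
    return counter.read_and_reset() / interval_seconds
```

As long as <i>read_and_reset</i> is called before the counter latches, each sample is simply the traffic since the previous sample. Once the counter has latched, the sample under-reports and there is no way to tell by how much.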
However, this methodology has a downside in that while collectl is monitoring
the Infiniband stats, nobody else can (including other copies of collectl).
Unfortunately there is no solution to this problem short of redesigning the HCA,
and that's simply not going to happen. An alternative would be to come
up with a mechanism in which the read/reset of the counters is moved into an OFED
module which exports these to /proc or /sys as rolling counters. This was in
fact done in a pre-OFED version of Voltaire's IB stack which is currently supported
by collectl. If someone would like to hear more details on how this was done, feel
free to contact me or to post something in a collectl
<a href=http://sourceforge.net/forum/?group_id=196536>forum</a> or to the
<a href=http://sourceforge.net/mail/?group_id=196536>mailing list</a>.
If you want to run collectl but also prevent it from doing destructive monitoring,
simply comment out the line in <i>/etc/collectl.conf</i> that begins with
<i>PQuery =</i> and you will be informed by collectl that Infiniband monitoring
has been disabled whenever someone tries to monitor it.
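If you need to make this change on many machines, the edit can be scripted. The sketch below assumes that <i>#</i> starts a comment in collectl.conf; the function name <i>disable_pquery</i> is made up for this example. It works on the text of the file, so you would read the file, pass the contents through it and write the result back yourself.

```python
def disable_pquery(conf_text):
    """Comment out any line that begins with 'PQuery ='.

    Assumption: '#' is the comment character in collectl.conf.
    """
    lines = []
    for line in conf_text.splitlines(keepends=True):
        if line.lstrip().startswith("PQuery ="):
            line = "#" + line
        lines.append(line)
    return "".join(lines)
```

Lines that are already commented out do not begin with <i>PQuery =</i>, so running it twice leaves the file unchanged the second time.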
The main purpose of this section is to help you understand how monitoring works so that when
it doesn't, you might be able to figure out what went wrong. There are two different ways
collectl can monitor Infiniband: one for the OFED stack, which is the Infiniband stack
of choice these days, and the other for pre-OFED.
The OFED stack can be identified by the presence of the <i>/sys/class/infiniband</i>
directory. If there, collectl looks inside to find which HCAs are present and which
ports are active. This information is then used to query the HCA via the <i>perfquery</i>
utility. Unfortunately, with each release of OFED that utility seems to move to another location,
and collectl tries to react by using a search path in /etc/collectl.conf. As of the 2.5.1
release of collectl, if it still can't find the utility it will try to find its location
with rpm and then add its path to collectl.conf. If a future OFED release eliminates or
replaces perfquery, collectl will break.
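If you want to see for yourself what collectl would find, the discovery step can be approximated as follows. This is a sketch, not collectl's actual code: it assumes the usual sysfs layout of <i>/sys/class/infiniband/HCA/ports/N/state</i>, where the state file contains text ending in <i>ACTIVE</i> for an active port. The function name and the <i>root</i> parameter are hypothetical.

```python
from pathlib import Path


def active_ports(root="/sys/class/infiniband"):
    """Return a list of (hca, port) names whose state file reports ACTIVE.

    Assumption: each port has a 'state' file such as '4: ACTIVE'.
    """
    base = Path(root)
    if not base.is_dir():
        # No OFED stack detected
        return []
    found = []
    for hca in sorted(base.iterdir()):
        ports_dir = hca / "ports"
        if not ports_dir.is_dir():
            continue
        for port in sorted(ports_dir.iterdir()):
            try:
                state = (port / "state").read_text().strip()
            except OSError:
                continue
            if state.endswith("ACTIVE"):
                found.append((hca.name, port.name))
    return found


if __name__ == "__main__":
    for hca, port in active_ports():
        print(hca, port)
```

An empty result means either the directory is missing or no port is active, which are the first things to check when nothing is being reported.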
All pre-OFED monitoring code has been removed.
Collectl has a variety of debugging capabilities built into it, the main one being
the debug switch -d. To use this switch you specify a bit mask which is then applied
against a variety of settings which tell collectl what to display. For
debugging interconnect problems simply use -d2. All possible bit settings
and their meanings are listed in the beginning of collectl itself.
If collectl runs without errors but you're not seeing IB traffic being reported
when you think you should, you can always use -d4 or even -d6, which show
the values of the counters returned by both perfquery and get_pcounter. If they
don't change, something outside of collectl must be wrong.
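Since the debug value is a bit mask, -d6 is simply -d2 and -d4 combined. The small helper below, which is only an illustration and not part of collectl, breaks a mask into its individual bits.

```python
def bits_set(mask):
    """Return the individual bit values that make up a debug mask."""
    return [1 << i for i in range(mask.bit_length()) if mask & (1 << i)]


if __name__ == "__main__":
    print(bits_set(6))
```

For a mask of 6 this yields the bits 2 and 4; what each bit means is listed at the beginning of collectl itself.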
One example of a non-collectl problem was a system that had IB configured and started,
which could be verified by seeing an <i>ib0</i> interface show up with <i>ifconfig</i>.
When running <i>collectl -sN</i>, which will show the traffic over all the network
interfaces, there was never any traffic on the <i>ib</i> interface; however, there
was unexpected traffic on one of the <i>eth</i> interfaces. Clearly something
was wrong, and looking at the routing showed the routes were set such that all
traffic to the Infiniband address was being routed over the <i>eth</i> interface.
<table width=100%><tr><td align=right><i>updated Feb 04, 2014</i></td></tr></table>