\input texinfo   @c -*-texinfo-*-
@c %**start of header
@setfilename netperf.info
@settitle Care and Feeding of Netperf 2.4.X
@c %**end of header

@copying
This is Rick Jones' feeble attempt at a Texinfo-based manual for the
netperf benchmark. 

Copyright @copyright{} 2005-2007 Hewlett-Packard Company
@quotation
Permission is granted to copy, distribute and/or modify this document
per the terms of the netperf source licence, a copy of which can be
found in the file @file{COPYING} of the basic netperf distribution.
@end quotation
@end copying

@titlepage
@title Care and Feeding of Netperf
@subtitle Versions 2.4.3 and Later
@author Rick Jones @email{rick.jones2@@hp.com}
@c this is here to start the copyright page
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@c begin with a table of contents
@contents

@ifnottex
@node Top, Introduction, (dir), (dir)
@top Netperf Manual

@insertcopying
@end ifnottex

@menu
* Introduction::                An introduction to netperf - what it is and what it is not.
* Installing Netperf::          How to go about installing netperf.
* The Design of Netperf::       
* Global Command-line Options::  
* Using Netperf to Measure Bulk Data Transfer::  
* Using Netperf to Measure Request/Response ::  
* Using Netperf to Measure Aggregate Performance::  
* Using Netperf to Measure Bidirectional Transfer::  
* Other Netperf Tests::         
* Address Resolution::          
* Enhancing Netperf::           
* Netperf4::                    
* Concept Index::               
* Option Index::                

@detailmenu
 --- The Detailed Node Listing ---

Introduction

* Conventions::                 

Installing Netperf

* Getting Netperf Bits::        
* Installing Netperf Bits::     
* Verifying Installation::      

The Design of Netperf

* CPU Utilization::             

Global Command-line Options

* Command-line Options Syntax::  
* Global Options::              

Using Netperf to Measure Bulk Data Transfer

* Issues in Bulk Transfer::     
* Options common to TCP UDP and SCTP tests::  

Options common to TCP UDP and SCTP tests

* TCP_STREAM::                  
* TCP_MAERTS::                  
* TCP_SENDFILE::                
* UDP_STREAM::                  
* XTI_TCP_STREAM::              
* XTI_UDP_STREAM::              
* SCTP_STREAM::                 
* DLCO_STREAM::                 
* DLCL_STREAM::                 
* STREAM_STREAM::               
* DG_STREAM::                   

Using Netperf to Measure Request/Response 

* Issues in Request/Response::  
* Options Common to TCP UDP and SCTP _RR tests::  

Options Common to TCP UDP and SCTP _RR tests

* TCP_RR::                      
* TCP_CC::                      
* TCP_CRR::                     
* UDP_RR::                      
* XTI_TCP_RR::                  
* XTI_TCP_CC::                  
* XTI_TCP_CRR::                 
* XTI_UDP_RR::                  
* DLCL_RR::                     
* DLCO_RR::                     
* SCTP_RR::                     

Using Netperf to Measure Aggregate Performance

* Running Concurrent Netperf Tests::  
* Using --enable-burst::        

Using Netperf to Measure Bidirectional Transfer

* Bidirectional Transfer with Concurrent Tests::  
* Bidirectional Transfer with TCP_RR::  

Other Netperf Tests

* CPU rate calibration::        

@end detailmenu
@end menu

@node Introduction, Installing Netperf, Top, Top
@chapter Introduction

@cindex Introduction

Netperf is a benchmark that can be used to measure various aspects of
networking performance.  The primary foci are bulk (aka
unidirectional) data transfer and request/response performance using
either TCP or UDP and the Berkeley Sockets interface.  As of this
writing, the tests available either unconditionally or conditionally
include:

@itemize @bullet
@item
TCP and UDP unidirectional transfer and request/response over IPv4 and
IPv6 using the Sockets interface.
@item
TCP and UDP unidirectional transfer and request/response over IPv4
using the XTI interface.
@item
Link-level unidirectional transfer and request/response using the DLPI
interface. 
@item
Unix domain sockets
@item
SCTP unidirectional transfer and request/response over IPv4 and IPv6
using the sockets interface.
@end itemize

While not every revision of netperf will work on every platform
listed, the intention is that at least some version of netperf will
work on the following platforms:

@itemize @bullet
@item
Unix - at least all the major variants.
@item
Linux
@item
Windows
@item
OpenVMS
@item
Others
@end itemize

Netperf is maintained and informally supported primarily by Rick
Jones, who can perhaps be best described as Netperf Contributing
Editor.  Non-trivial and very appreciated assistance comes from others
in the network performance community, who are too numerous to mention
here. While it is often used by them, netperf is NOT supported via any
of the formal Hewlett-Packard support channels.  You should feel free
to make enhancements and modifications to netperf to suit your
nefarious porpoises, so long as you stay within the guidelines of the
netperf copyright.  If you feel so inclined, you can send your changes
to
@email{netperf-feedback@@netperf.org,netperf-feedback} for possible
inclusion into subsequent versions of netperf.

If you would prefer to make contributions to a networking benchmark
using a certified ``open source'' license, please consider netperf4,
which is distributed under the terms of the GPL.

The @email{netperf-talk@@netperf.org,netperf-talk} mailing list is
available to discuss the care and feeding of netperf with others who
share your interest in network performance benchmarking. The
netperf-talk mailing list is a closed list and you must first
subscribe by sending email to
@email{netperf-talk-request@@netperf.org,netperf-talk-request}.


@menu
* Conventions::                 
@end menu

@node Conventions,  , Introduction, Introduction
@section Conventions

A @dfn{sizespec} is a one or two item, comma-separated list used as an
argument to a command-line option that can set one or two, related
netperf parameters.  If you wish to set both parameters to separate
values, items should be separated by a comma:

@example
parameter1,parameter2
@end example

If you wish to set the first parameter without altering the value of
the second from its default, you should follow the first item with a
comma:

@example
parameter1,
@end example


Likewise, precede the item with a comma if you wish to set only the
second parameter:

@example
,parameter2
@end example

An item with no commas:

@example
parameter1and2
@end example

will set both parameters to the same value.  This last mode is one of
the most frequently used.

There is another variant of the comma-separated, two-item list called
an @dfn{optionspec} which is like a sizespec with the exception that a
single item with no comma:

@example
parameter1
@end example

will only set the value of the first parameter and will leave the
second parameter at its default value.
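
The difference between the two conventions can be summarized in a short
sketch.  The Python below is illustrative only and is not taken from the
netperf sources; the function names @code{parse_sizespec} and
@code{parse_optionspec}, and the use of @code{None} to mean ``leave at
the default,'' are assumptions made for this example.

@example
```python
def _split(spec):
    """Split a one or two item, comma-separated spec.

    Returns (first, second, had_comma).  An empty item becomes None,
    which this sketch uses to mean "leave at the default".
    """
    first, comma, second = spec.partition(",")
    return (first or None), (second or None), bool(comma)


def parse_sizespec(spec):
    """A single item with no comma sets BOTH parameters."""
    first, second, had_comma = _split(spec)
    if not had_comma:
        second = first
    return first, second


def parse_optionspec(spec):
    """A single item with no comma sets only the FIRST parameter."""
    first, second, _ = _split(spec)
    return first, second


print(parse_sizespec("4096"))     # ('4096', '4096')
print(parse_optionspec("4096"))   # ('4096', None)
```
@end example

The two functions differ only in how they treat an item with no comma,
which is the one case where the two conventions disagree.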

Netperf has two types of command-line options.  The first are global
command line options.  They are essentially any option not tied to a
particular test or group of tests.  An example of a global
command-line option is the one which sets the test type - @option{-t}.

The second type of options are test-specific options.  These are
options which are only applicable to a particular test or set of
tests.  An example of a test-specific option would be the send socket
buffer size for a TCP_STREAM test.

Global command-line options are specified first with test-specific
options following after a @code{--} as in:

@example
netperf <global> -- <test-specific>
@end example


@node Installing Netperf, The Design of Netperf, Introduction, Top
@chapter Installing Netperf

@cindex Installation

Netperf's primary form of distribution is source code.  This allows
installation on systems other than those to which the authors have
ready access and thus the ability to create binaries.  There are two
styles of netperf installation.  The first runs the netperf server
program - netserver - as a child of inetd.  This requires the
installer to have sufficient privileges to edit the files
@file{/etc/services} and @file{/etc/inetd.conf} or their
platform-specific equivalents.

The second style is to run netserver as a standalone daemon.  This
second method does not require edit privileges on @file{/etc/services}
and @file{/etc/inetd.conf} but does mean you must remember to run the
netserver program explicitly after every system reboot.

This manual assumes that those wishing to measure networking
performance already know how to use anonymous FTP and/or a web
browser. It is also expected that you have at least a passing
familiarity with the networking protocols and interfaces involved. In
all honesty, if you do not have such familiarity, likely as not you
have some experience to gain before attempting network performance
measurements.  The excellent texts by authors such as Stevens, Fenner
and Rudoff and/or Stallings would be good starting points. There are
likely other excellent sources out there as well.

@menu
* Getting Netperf Bits::        
* Installing Netperf Bits::     
* Verifying Installation::      
@end menu

@node Getting Netperf Bits, Installing Netperf Bits, Installing Netperf, Installing Netperf
@section Getting Netperf Bits

Gzipped tar files of netperf sources can be retrieved via 
@uref{ftp://ftp.netperf.org/netperf,anonymous FTP}
for ``released'' versions of the bits.  Pre-release versions of the
bits can be retrieved via anonymous FTP from the
@uref{ftp://ftp.netperf.org/netperf/experimental,experimental} subdirectory.

For convenience and ease of remembering, a link to the download site
is provided via the 
@uref{http://www.netperf.org/, NetperfPage}.

The bits corresponding to each discrete release of netperf are
@uref{http://www.netperf.org/svn/netperf2/tags,tagged} for retrieval
via subversion.  For example, there is a tag for the first version
corresponding to this version of the manual - 
@uref{http://www.netperf.org/svn/netperf2/tags/netperf-2.4.3,netperf
2.4.3}.  Those wishing to be on the bleeding edge of netperf
development can use subversion to grab the
@uref{http://www.netperf.org/svn/netperf2/trunk,top of trunk}.

There are likely other places around the Internet from which one can
download netperf bits.  These may be simple mirrors of the main
netperf site, or they may be local variants on netperf.  As with
anything one downloads from the Internet, take care to make sure it is
what you really wanted and isn't some malicious Trojan or whatnot.
Caveat downloader.

As a general rule, binaries of netperf and netserver are not
distributed from ftp.netperf.org.  From time to time a kind soul or
souls has packaged netperf as a Debian package available via the
apt-get mechanism or as an RPM.  I would be most interested in
learning how to enhance the makefiles to make that easier for people,
and perhaps to generate HP-UX swinstall ``depots.''

@node Installing Netperf Bits, Verifying Installation, Getting Netperf Bits, Installing Netperf
@section Installing Netperf

Once you have downloaded the tar file of netperf sources onto your
system(s), it is necessary to unpack the tar file, cd to the netperf
directory, run configure and then make.  Most of the time it should be
sufficient to just:

@example
gzcat <netperf-version>.tar.gz | tar xf -
cd <netperf-version>
./configure
make
make install
@end example

Most of the ``usual'' configure script options should be present
dealing with where to install binaries and whatnot.  
@example
./configure --help
@end example
should list all of those and more.

@vindex --enable-cpuutil, Configure
If the netperf configure script does not know how to automagically
detect which CPU utilization mechanism to use on your platform you may
want to add a @code{--enable-cpuutil=mumble} option to the configure
command.   If you have knowledge and/or experience to contribute to
that area, feel free to contact @email{netperf-feedback@@netperf.org}.

@vindex --enable-xti, Configure
@vindex --enable-unix, Configure
@vindex --enable-dlpi, Configure
@vindex --enable-sctp, Configure
Similarly, if you want tests using the XTI interface, Unix Domain
Sockets, DLPI or SCTP it will be necessary to add one or more
@code{--enable-[xti|unix|dlpi|sctp]=yes} options to the configure
command.  As of this writing, the configure script will not include
those tests automagically.

On some platforms, it may be necessary to precede the configure
command with a CFLAGS and/or LIBS variable as the netperf configure
script is not yet smart enough to set them itself.  Whenever possible,
these requirements will be found in @file{README.@var{platform}} files.
Expertise and assistance in making that more automagical in the
configure script would be most welcome.

@cindex Limiting Bandwidth
@cindex Bandwidth Limitation
@vindex --enable-intervals, Configure
@vindex --enable-histogram, Configure
Other optional configure-time settings include
@code{--enable-intervals=yes} to give netperf the ability to ``pace''
its _STREAM tests and @code{--enable-histogram=yes} to have netperf
keep a histogram of interesting times.  Each of these will have some
effect on the measured result.  If your system supports
@code{gethrtime()} the effect of the histogram measurement should be
minimized but probably still measurable.  For example, the histogram
of a netperf TCP_RR test will be of the individual transaction times:
@example
netperf -t TCP_RR -H lag -v 2
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : histogram
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate         
bytes  Bytes  bytes    bytes   secs.    per sec   

16384  87380  1        1       10.00    3538.82   
32768  32768 
Alignment      Offset
Local  Remote  Local  Remote
Send   Recv    Send   Recv
    8      0       0      0
Histogram of request/response times
UNIT_USEC     :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
TEN_USEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
HUNDRED_USEC  :    0: 34480:  111:   13:   12:    6:    9:    3:    4:    7
UNIT_MSEC     :    0:   60:   50:   51:   44:   44:   72:  119:  100:  101
TEN_MSEC      :    0:  105:    0:    0:    0:    0:    0:    0:    0:    0
HUNDRED_MSEC  :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
UNIT_SEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
TEN_SEC       :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
>100_SECS: 0
HIST_TOTAL:      35391
@end example

Long-time users of netperf will notice the expansion of the main test
header.  This stems from the merging-in of IPv6 with the standard IPv4
tests and the addition of code to specify addressing information for
both sides of the data connection.

The histogram you see above is basically a base-10 log histogram where
we can see that most of the transaction times were on the order of one
hundred to one-hundred, ninety-nine microseconds, but they were
occasionally as long as ten to nineteen milliseconds.
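
To make the reading of the rows and columns concrete, here is a minimal
sketch of how a time could be assigned to a cell of such a histogram.
The row names are copied from the output above; the function
@code{bucket} is hypothetical and is not netperf's own code.  It assumes
times are whole microseconds, that each row covers one power of ten, and
that the column is the leading decimal digit.

@example
```python
ROWS = ["UNIT_USEC", "TEN_USEC", "HUNDRED_USEC", "UNIT_MSEC",
        "TEN_MSEC", "HUNDRED_MSEC", "UNIT_SEC", "TEN_SEC"]


def bucket(usec):
    """Map a time in whole microseconds to (row name, column 0-9)."""
    row = 0
    # Strip trailing digits until only the leading digit remains;
    # each digit stripped moves down one row (one power of ten).
    while usec >= 10 and row < len(ROWS):
        usec //= 10
        row += 1
    if row == len(ROWS):
        # 100 seconds or more falls off the end of the table.
        return (">100_SECS", None)
    return (ROWS[row], usec)


print(bucket(150))     # ('HUNDRED_USEC', 1)
print(bucket(12000))   # ('TEN_MSEC', 1)
```
@end example

Under these assumptions a 150 microsecond transaction lands in column 1
of the HUNDRED_USEC row, which is the cell holding the bulk of the
samples in the output above.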

The @option{--enable-demo=yes} configure option will cause code to be
included to report interim results during a test run.  The rate at
which interim results are reported can then be controlled via the
global @option{-D} option.  Here is an example of
@option{--enable-demo} mode output:

@example
src/netperf -D 1.35 -H lag -f M
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : demo
Interim result:    9.66 MBytes/s over 1.67 seconds
Interim result:    9.64 MBytes/s over 1.35 seconds
Interim result:    9.58 MBytes/s over 1.36 seconds
Interim result:    9.51 MBytes/s over 1.36 seconds
Interim result:    9.71 MBytes/s over 1.35 seconds
Interim result:    9.66 MBytes/s over 1.36 seconds
Interim result:    9.61 MBytes/s over 1.36 seconds
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    MBytes/sec  

 32768  16384  16384    10.00       9.61  
@end example

Notice how the units of the interim result track that requested by the
@option{-f} option.  Also notice that sometimes the interval will be
longer than the value specified in the @option{-D} option.  This is
normal and stems from how demo mode is implemented without relying on
interval timers, but by calculating how many units of work must be
performed to take at least the desired interval.
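
The idea of sizing the next batch of work from the observed rate can be
sketched as follows.  This is only one way such an estimate might be
made and is not taken from the netperf sources; the function name
@code{units_for_interval} and its parameters are hypothetical.

@example
```python
import math


def units_for_interval(units_done, elapsed_secs, interval_secs):
    """Estimate how many units of work should fill the next interval.

    The observed rate (units per second) is scaled to the requested
    interval and rounded up, so the estimate errs toward a reporting
    period slightly longer than requested rather than shorter.
    """
    rate = units_done / elapsed_secs
    return max(1, math.ceil(rate * interval_secs))
```
@end example

Because the check is made after a batch of work rather than on a timer,
an estimate of this kind naturally yields intervals at or a little above
the requested value, consistent with the output shown above.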

As of this writing, a @code{make install} will not actually update the
files @file{/etc/services} and/or @file{/etc/inetd.conf} or their
platform-specific equivalents.  It remains necessary to perform that
bit of installation magic by hand.  Patches to the makefile sources to
effect an automagic editing of the necessary files to have netperf
installed as a child of inetd would be most welcome.

Starting the netserver as a standalone daemon should be as easy as:
@example
$ netserver
Starting netserver at port 12865
Starting netserver at hostname 0.0.0.0 port 12865 and family 0
@end example

Over time the specifics of the messages netserver prints to the screen
may change but the gist will remain the same.

If the compilation of netperf or netserver happens to fail, feel free
to contact @email{netperf-feedback@@netperf.org} or join and ask in
@email{netperf-talk@@netperf.org}.  However, it is quite important
that you include the actual compilation errors and perhaps even the
configure log in your email.  Otherwise, it will be that much more
difficult for someone to assist you.

@node Verifying Installation,  , Installing Netperf Bits, Installing Netperf
@section Verifying Installation

Basically, once netperf is installed and netserver is configured as a
child of inetd, or launched as a standalone daemon, simply typing:
@example
netperf
@end example
should result in output similar to the following:
@example
$ netperf
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  16384  16384    10.00    2997.84   
@end example


@node The Design of Netperf, Global Command-line Options, Installing Netperf, Top
@chapter The Design of Netperf

@cindex Design of Netperf

Netperf is designed around a basic client-server model.  There are
two executables - netperf and netserver.  Generally you will only
execute the netperf program, with the netserver program being invoked
by the remote system's inetd or equivalent.
When you execute netperf, the first thing that will happen is the
establishment of a control connection to the remote system.  This
connection will be used to pass test configuration information and
results to and from the remote system.  Regardless of the type of test
to be run, the control connection will be a TCP connection using BSD
sockets.  The control connection can use either IPv4 or IPv6.

Once the control connection is up and the configuration information
has been passed, a separate ``data'' connection will be opened for the
measurement itself using the APIs and protocols appropriate for the
specified test.  When the test is completed, the data connection will
be torn-down and results from the netserver will be passed-back via the
control connection and combined with netperf's result for display to
the user.

Netperf places no traffic on the control connection while a test is in
progress.  Certain TCP options, such as SO_KEEPALIVE, if set as your
system's default, may put packets out on the control connection while
a test is in progress.  Generally speaking this will have no effect on
the results.

@menu
* CPU Utilization::             
@end menu

@cindex CPU Utilization
@node CPU Utilization,  , The Design of Netperf, The Design of Netperf
@section CPU Utilization

CPU utilization is an important, and alas all-too infrequently
reported, component of networking performance.  Unfortunately, it can
be one of the most difficult metrics to measure accurately as many
systems offer mechanisms that are at best ill-suited to measuring CPU
utilization in high interrupt rate (e.g., networking) situations.

CPU utilization in netperf is reported as a value between 0 and 100%
regardless of the number of CPUs involved.  In addition to CPU
utilization, netperf will report a metric called a @dfn{service
demand}.  The service demand is the normalization of CPU utilization
and work performed.  For a _STREAM test it is the microseconds of CPU
time consumed to transfer one KB (K == 1024) of data.  For a _RR test
it is the microseconds of CPU time consumed processing a single
transaction.   For both CPU utilization and service demand, lower is
better. 
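
As a worked illustration of these definitions, the sketch below computes
a service demand from a utilization figure.  It is not netperf's own
code.  The function and parameter names are hypothetical, and the
multiplication by the number of CPUs is an assumption that follows from
utilization being reported on a 0 to 100% scale regardless of the number
of CPUs.

@example
```python
def stream_service_demand(cpu_util_pct, num_cpus, elapsed_secs, bytes_moved):
    """Microseconds of CPU time consumed per KB (K == 1024) transferred."""
    # Total CPU time across all CPUs, in microseconds (assumed scaling).
    cpu_usec = (cpu_util_pct / 100.0) * num_cpus * elapsed_secs * 1e6
    return cpu_usec / (bytes_moved / 1024.0)


def rr_service_demand(cpu_util_pct, num_cpus, elapsed_secs, transactions):
    """Microseconds of CPU time consumed per transaction."""
    cpu_usec = (cpu_util_pct / 100.0) * num_cpus * elapsed_secs * 1e6
    return cpu_usec / transactions
```
@end example

For example, under these assumptions a two-CPU system at 50%
utilization for ten seconds has consumed ten CPU-seconds; if it moved
1,000,000 KB in that time, the service demand is 10 microseconds per KB.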

Service demand can be particularly useful when trying to gauge the
effect of a performance change.  It is essentially a measure of
efficiency, with smaller values being more efficient.

Netperf is coded to be able to use one of several, generally
platform-specific CPU utilization measurement mechanisms.  Single
letter codes will be included in the CPU portion of the test banner to
indicate which mechanism was used on each of the local (netperf) and
remote (netserver) systems.

As of this writing those codes are:

@table @code
@item U
The CPU utilization measurement mechanism was unknown to netperf or
netperf/netserver was not compiled to include CPU utilization
measurements. The code for the null CPU utilization mechanism can be
found in @file{src/netcpu_none.c}.
@item I
An HP-UX-specific CPU utilization mechanism whereby the kernel
incremented a per-CPU counter by one for each trip through the idle
loop. This mechanism was only available on specially-compiled HP-UX
kernels prior to HP-UX 10 and is mentioned here only for the sake of
historical completeness and perhaps as a suggestion to those who might
be altering other operating systems. While rather simple, perhaps even
simplistic, this mechanism was quite robust and was not affected by
the concerns of statistical methods, or methods attempting to track
time in each of user, kernel, interrupt and idle modes which require
quite careful accounting.  It can be thought-of as the in-kernel
version of the looper @code{L} mechanism without the context switch
overhead. This mechanism required calibration.
@item P
An HP-UX-specific CPU utilization mechanism whereby the kernel
keeps-track of time (in the form of CPU cycles) spent in the kernel
idle loop (HP-UX 10.0 to 11.23 inclusive), or where the kernel keeps
track of time spent in idle, user, kernel and interrupt processing
(HP-UX 11.23 and later).  The former requires calibration, the latter
does not.  Values in either case are retrieved via one of the pstat(2)
family of calls, hence the use of the letter @code{P}.  The code for
these mechanisms is found in @file{src/netcpu_pstat.c} and
@file{src/netcpu_pstatnew.c} respectively.
@item K
A Solaris-specific CPU utilization mechanism whereby the kernel
keeps track of ticks (e.g., HZ) spent in the idle loop.  This method is
statistical and is known to be inaccurate when the interrupt rate is
above epsilon as time spent processing interrupts is not subtracted
from idle.  The value is retrieved via a kstat() call - hence the use
of the letter @code{K}.  Since this mechanism uses units of ticks (HZ)
the calibration value should invariably match HZ (e.g., 100).  The code
for this mechanism is implemented in @file{src/netcpu_kstat.c}.
@item M
A Solaris-specific mechanism available on Solaris 10 and later which
uses the new microstate accounting mechanisms.  There are two, alas,
overlapping, mechanisms.  The first tracks nanoseconds spent in user,
kernel, and idle modes. The second mechanism tracks nanoseconds spent
in interrupt.  Since the mechanisms overlap, netperf goes through some
hand-waving to try to ``fix'' the problem.  Since the accuracy of the
handwaving cannot be completely determined, one must presume that
while better than the @code{K} mechanism, this mechanism too is not
without issues.  The values are retrieved via kstat() calls, but the
letter code is set to @code{M} to distinguish this mechanism from the
even less accurate @code{K} mechanism.  The code for this mechanism is
implemented in @file{src/netcpu_kstat10.c}.
@item L
A mechanism based on ``looper'' or ``soaker'' processes which sit in
tight loops counting as fast as they possibly can. This mechanism
starts a looper process for each known CPU on the system.  The effect
of processor hyperthreading on the mechanism is not yet known.  This
mechanism definitely requires calibration.  The code for the
``looper'' mechanism can be found in @file{src/netcpu_looper.c}.
@item N
A Microsoft Windows-specific mechanism, the code for which can be
found in @file{src/netcpu_ntperf.c}.  This mechanism too is based on
what appears to be a form of micro-state accounting and requires no
calibration.  On laptops, or other systems which may dynamically alter
the CPU frequency to minimize power consumption, it has been suggested
that this mechanism may become slightly confused, in which case using
BIOS settings to disable the power saving would be indicated.

@item S
This mechanism uses @file{/proc/stat} on Linux to retrieve time
(ticks) spent in idle mode.  It is thought but not known to be
reasonably accurate.  The code for this mechanism can be found in
@file{src/netcpu_procstat.c}.
@item C
A mechanism somewhat similar to @code{S} but using the sysctl() call
on BSD-like Operating systems (*BSD and MacOS X).  The code for this
mechanism can be found in @file{src/netcpu_sysctl.c}.
@item Others
Other mechanisms included in netperf in the past have included using
the times() and getrusage() calls.  These calls are actually rather
poorly suited to the task of measuring CPU overhead for networking as
they tend to be process-specific and much network-related processing
can happen outside the context of a process, in places where it is not
a given it will be charged to the correct process, or to any process at
all.  They
are mentioned here as a warning to anyone seeing those mechanisms used
in other networking benchmarks.  These mechanisms are not available in
netperf 2.4.0 and later.
@end table



For many platforms, the configure script will choose the best available
CPU utilization mechanism.  However, some platforms have no
particularly good mechanisms.  On those platforms, it is probably best
to use the ``LOOPER'' mechanism which is basically some number of
processes (as many as there are processors) sitting in tight little
loops counting as fast as they can.  The rate at which the loopers
count when the system is believed to be idle is compared with the rate
when the system is running netperf and the ratio is used to compute
CPU utilization.
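
The comparison of counting rates can be sketched as below.  This is a
minimal illustration under stated assumptions rather than the code in
@file{src/netcpu_looper.c}: the function name and the exact formula
(one minus the ratio of loaded to idle counting rate) are assumptions
chosen to match the description above.

@example
```python
def looper_cpu_utilization(idle_rate, loaded_rate):
    """Percent CPU utilization from looper counting rates.

    idle_rate:   counts per second measured during calibration
    loaded_rate: counts per second measured while the test runs
    """
    if idle_rate <= 0:
        raise ValueError("calibration rate must be positive")
    busy_fraction = 1.0 - (loaded_rate / idle_rate)
    # Clamp: measurement noise can push the ratio slightly outside 0..1.
    return 100.0 * min(1.0, max(0.0, busy_fraction))
```
@end example

The calibration step exists to obtain the idle counting rate, which is
why a mechanism of this kind cannot report utilization without it.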

In the past, netperf included some mechanisms that only reported CPU
time charged to the calling process.  Those mechanisms have been
removed from netperf versions 2.4.0 and later because they are
hopelessly inaccurate.  Networking can, and often does, result in CPU
time being spent in places - such as interrupt contexts - that do not
get charged to the correct process, or to any process at all.

In fact, time spent in the processing of interrupts is a common issue
for many CPU utilization mechanisms.  In particular, the ``PSTAT''
mechanism was eventually known to have problems accounting for certain
interrupt time prior to HP-UX 11.11 (11iv1).  HP-UX 11iv1 and later
are known to be good. The ``KSTAT'' mechanism is known to have
problems on all versions of Solaris up to and including Solaris 10.
Even the microstate accounting available via kstat in Solaris 10 has
issues, though perhaps not as bad as those of prior versions.

The /proc/stat mechanism under Linux is in what the author would
consider an ``uncertain'' category as it appears to be statistical,
which may also have issues with time spent processing interrupts.

In summary, be sure to ``sanity-check'' the CPU utilization figures
with other mechanisms.  However, platform tools such as top, vmstat or
mpstat are often based on the same mechanisms used by netperf.

@node Global Command-line Options, Using Netperf to Measure Bulk Data Transfer, The Design of Netperf, Top
@chapter Global Command-line Options

This section describes each of the global command-line options
available in the netperf and netserver binaries.  Essentially, it is
an expanded version of the usage information displayed by netperf or
netserver when invoked with the @option{-h} global command-line
option.

@menu
* Command-line Options Syntax::  
* Global Options::              
@end menu

@node Command-line Options Syntax, Global Options, Global Command-line Options, Global Command-line Options
@comment  node-name,  next,  previous,  up
@section Command-line Options Syntax

Revision 1.8 of netperf introduced enough new functionality to overrun
the English alphabet for mnemonic command-line option names, and the
author was not and is not quite ready to switch to the contemporary
@option{--mumble} style of command-line options. (Call him a Luddite).

For this reason, the command-line options were split into two parts -
the first are the global command-line options.  They are options that
affect nearly any and every test type of netperf.  The second type are
the test-specific command-line options.  Both are entered on the same
command line, but they must be separated from one another by a @code{--}
for correct parsing.  Global command-line options come first, followed
by the @code{--} and then test-specific command-line options.  If there
are no test-specific options to be set, the @code{--} may be omitted.  If
there are no global command-line options to be set, test-specific
options must still be preceded by a @code{--}.  For example:
@example
netperf <global> -- <test-specific>
@end example
sets both global and test-specific options:
@example
netperf <global>
@end example
sets just global options and:
@example
netperf -- <test-specific>
@end example
sets just test-specific options.
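
The separation rule can be expressed in a few lines.  The sketch below
is only an illustration of the rule and is not how netperf itself parses
its command line; the function name @code{split_options} is
hypothetical.

@example
```python
def split_options(args):
    """Split an argument list at the first "--" separator.

    Returns (global_options, test_specific_options).
    """
    if "--" in args:
        i = args.index("--")
        return args[:i], args[i + 1:]
    # No separator: everything is a global option.
    return list(args), []


print(split_options(["<global>", "--", "<test-specific>"]))
# (['<global>'], ['<test-specific>'])
```
@end example

Note that without a @code{--} there is no way to mark an option as
test-specific, which is why the separator is required even when no
global options are given.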

@node Global Options,  , Command-line Options Syntax, Global Command-line Options
@comment  node-name,  next,  previous,  up
@section Global Options

@table @code
@vindex -a, Global
@item -a <sizespec>
This option allows you to alter the alignment of the buffers used in
the sending and receiving calls on the local system.  Changing the
alignment of the buffers can force the system to use different copy
schemes, which can have a measurable effect on performance.  If the
page size for the system were 4096 bytes, and you want to pass
page-aligned buffers beginning on page boundaries, you could use
@samp{-a 4096}.  By default the units are bytes, but a suffix of ``G,''
``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20 (MB) or
2^10 (KB) respectively. A suffix of ``g,'' ``m'' or ``k'' will specify
units of 10^9, 10^6 or 10^3 bytes respectively. [Default: 8 bytes]

@vindex -A, Global
@item -A <sizespec>
This option is identical to the @option{-a} option with the difference
being it affects alignments for the remote system.

@vindex -b, Global
@item -b <size>
This option is only present when netperf has been configured with
@code{--enable-intervals=yes} prior to compilation.  It sets the size of the
burst of send calls in a _STREAM test.  When used in conjunction with
the @option{-w} option it can cause the rate at which data is sent to
be ``paced.''

@vindex -B, Global
@item -B <string>
This option will cause @option{<string>} to be appended to the brief
(see -P) output of netperf.

@vindex -c, Global
@item -c [rate]
This option will ask that CPU utilization and service demand be
calculated for the local system.  For those CPU utilization mechanisms
requiring calibration, the optional rate parameter may be specified to
preclude running another calibration step, saving 40 seconds of time.
For those CPU utilization mechanisms requiring no calibration, the
optional rate parameter will be utterly and completely ignored.
[Default: no CPU measurements]

@vindex -C, Global
@item -C [rate]
This option requests CPU utilization and service demand calculations
for the remote system.  It is otherwise identical to the @option{-c}
option.

@vindex -d, Global
@item -d
Each instance of this option will increase the quantity of debugging
output displayed during a test.  If the debugging output level is set
high enough, it may have a measurable effect on performance.
Debugging information for the local system is printed to stdout.
Debugging information for the remote system is sent by default to the
file @file{/tmp/netperf.debug}. [Default: no debugging output]

@vindex -D, Global
@item -D [interval,units]
This option is only available when netperf is configured with
--enable-demo=yes.  When set, it will cause netperf to emit periodic
reports of performance during the run.  [@var{interval},@var{units}]
follow the semantics of an optionspec. If specified,
@var{interval} gives the minimum interval in real seconds; it does not
have to be whole seconds.  The @var{units} value can be used for the
first guess as to how many units of work (bytes or transactions) must
be done to take at least @var{interval} seconds. If omitted,
@var{interval} defaults to one second and @var{units} to values
specific to each test type.

@vindex -f, Global
@item -f G|M|K|g|m|k
This option can be used to change the reporting units for _STREAM
tests.  Arguments of ``G,'' ``M,'' or ``K'' will set the units to
2^30, 2^20 or 2^10 bytes/s respectively (ie power-of-two GB, MB or
KB).  Arguments of ``g,'' ``m'' or ``k'' will set the units to 10^9,
10^6 or 10^3 bits/s respectively.  [Default: 'm' or 10^6 bits/s]
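
For example, assuming a remote system named ``lag'':
@example
netperf -H lag -f M
@end example
asks for _STREAM results in units of 2^20 bytes/s.  As a rough
conversion, a throughput of 80 x 10^6 bits/s corresponds to 10 x 10^6
bytes/s, or about 9.54 x 2^20 bytes/s.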

@vindex -F, Global
@item -F <fillfile>
This option specifies the file from which the send buffers will be
pre-filled.  While the buffers will contain data from the specified
file, the file is not fully transferred to the remote system as the
receiving end of the test will not write the contents of what it
receives to a file.  This can be used to pre-fill the send buffers
with data having different compressibility and so is useful when
measuring performance over mechanisms which perform compression. 

While optional for most tests, this option is required for a test
utilizing the sendfile() or related calls because sendfile tests need
a name of a file to reference.

@vindex -h, Global
@item -h
This option causes netperf to display its usage string and exit to the
exclusion of all else.

@vindex -H, Global
@item -H <optionspec>
This option will set the name of the remote system and/or the address
family used for the control connection.  For example:
@example
-H linger,4
@end example
will set the name of the remote system to ``linger'' and tells netperf to
use IPv4 addressing only.
@example
-H ,6
@end example
will leave the name of the remote system at its default, and request
that only IPv6 addresses be used for the control connection.
@example
-H lag
@end example
will set the name of the remote system to ``lag'' and leave the
address family to AF_UNSPEC which means selection of IPv4 vs IPv6 is
left to the system's address resolution.  

A value of ``inet'' can be used in place of ``4'' to request IPv4 only
addressing.  Similarly, a value of ``inet6'' can be used in place of
``6'' to request IPv6 only addressing.  A value of ``0'' can be used
to request either IPv4 or IPv6 addressing as name resolution dictates.

By default, the options set with the global @option{-H} option are
inherited by the test for its data connection, unless a test-specific
@option{-H} option is specified.

If a @option{-H} option follows either the @option{-4} or @option{-6}
options, the family setting specified with the -H option will override
the @option{-4} or @option{-6} options for the remote address
family. If no address family is specified, settings from a previous
@option{-4} or @option{-6} option will remain.  In a nutshell, the
last explicit global command-line option wins.

[Default:  ``localhost'' for the remote name/IP address and ``0'' (eg
AF_UNSPEC) for the remote address family.]

@vindex -I, Global
@item -I <optionspec>
This option enables the calculation of confidence intervals and sets
the confidence and width parameters with the first half of the
optionspec being either 99 or 95 for 99% or 95% confidence
respectively.  The second value of the optionspec specifies the width
of the desired confidence interval.  For example
@example
-I 99,5
@end example
asks netperf to be 99% confident that the measured mean values for
throughput and CPU utilization are within +/- 2.5% of the ``real''
mean values.  If the @option{-i} option is specified and the
@option{-I} option is omitted, the confidence defaults to 99% and the
width to 5% (giving +/- 2.5%).

If netperf calculates that the desired confidence intervals have not
been met, it emits a noticeable warning that cannot be suppressed with
the @option{-P} or @option{-v} options:

@example
netperf -H tardy.cup -i 3 -I 99,5
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.cup.hp.com (15.244.44.58) port 0 AF_INET : +/-2.5% @ 99% conf.
!!! WARNING
!!! Desired confidence was not achieved within the specified iterations.
!!! This implies that there was variability in the test environment that
!!! must be investigated before going further.
!!! Confidence intervals: Throughput      :  6.8%
!!!                       Local CPU util  :  0.0%
!!!                       Remote CPU util :  0.0%

Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 32768  16384  16384    10.01      40.23   
@end example

Where we see that netperf did not meet the desired confidence
intervals.  Instead of being 99% confident it was within +/- 2.5% of
the real mean value of throughput it is only confident it was within
+/-3.4%.  In this example, increasing the @option{-i} option
(described below) and/or increasing the iteration length with the
@option{-l} option might resolve the situation.

@vindex -i, Global
@item -i <sizespec>
This option enables the calculation of confidence intervals and sets
the minimum and maximum number of iterations to run in attempting to
achieve the desired confidence interval.  The first value sets the
maximum number of iterations to run, the second, the minimum.  The
maximum number of iterations is silently capped at 30 and the minimum
is silently floored at 3.  Netperf repeats the measurement the minimum
number of iterations and continues until it reaches either the
desired confidence interval, or the maximum number of iterations,
whichever comes first.

If the @option{-I} option is specified and the @option{-i} option
omitted the maximum number of iterations is set to 10 and the minimum
to three.

If netperf determines that the desired confidence intervals have not
been met, it emits a noticeable warning.

The total test time will be somewhere between the minimum and maximum
number of iterations multiplied by the test length supplied by the
@option{-l} option.

@vindex -l, Global
@item -l testlen
This option controls the length of any @b{one} iteration of the requested
test.  A positive value for @var{testlen} will run each iteration of
the test for at least @var{testlen} seconds.  A negative value for
@var{testlen} will run each iteration for the absolute value of
@var{testlen} transactions for a _RR test or bytes for a _STREAM test.
Certain tests, notably those using UDP, can only be timed; they cannot
be limited by transaction or byte count.

In some situations, individual iterations of a test may run for longer
than the number of seconds specified by the @option{-l} option.  In
particular, this may occur for those tests where the socket buffer
size(s) are significantly larger than the bandwidthXdelay product of
the link(s) over which the data connection passes, or those tests
where there may be non-trivial numbers of retransmissions.

If confidence intervals are enabled via either @option{-I} or
@option{-i} the total length of the netperf test will be somewhere
between the minimum and maximum iteration count multiplied by
@var{testlen}.
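
As a worked example, assuming a remote system named ``lag'' and values
picked only for illustration:
@example
netperf -H lag -l 30 -i 10,3 -I 99,5
@end example
runs between 3 and 10 iterations of roughly 30 seconds each, so the
whole test should take somewhere between about 3 x 30 = 90 and
10 x 30 = 300 seconds.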

@vindex -L, Global
@item -L <optionspec>
This option is identical to the @option{-H} option with the difference
being it sets the _local_ hostname/IP and/or address family
information.  This option is generally unnecessary, but can be useful
when you wish to make sure that the netperf control and data
connections go via different paths.  It can also come in handy if one
is trying to run netperf through those evil, end-to-end breaking
things known as firewalls.

[Default: 0.0.0.0 (eg INADDR_ANY) for IPv4 and ::0 for IPv6 for the
local name.  AF_UNSPEC for the local address family.]

@vindex -n, Global
@item -n numcpus
This option tells netperf how many CPUs it should ass-u-me are active
on the system running netperf.  In particular, this is used for the
@ref{CPU Utilization,CPU utilization} and service demand calculations.
On certain systems, netperf is able to determine the number of CPUs
automagically. This option will override any number netperf might be
able to determine on its own.

Note that this option does _not_ set the number of CPUs on the system
running netserver.  When netperf/netserver cannot automagically
determine the number of CPUs, that number can only be set for
netserver via a netserver @option{-n} command-line option.

@vindex -N, Global
@item -N
This option tells netperf to forego establishing a control
connection. This makes it possible to run some limited netperf
tests without a corresponding netserver on the remote system.

With this option set, the test to be run is to get all the addressing
information it needs to establish its data connection from the command
line or internal defaults.  If not otherwise specified by
test-specific command line options, the data connection for a
``STREAM'' or ``SENDFILE'' test will be to the ``discard'' port, an
``RR'' test will be to the ``echo'' port, and a ``MAERTS'' test will
be to the ``chargen'' port.  

The response size of an ``RR'' test will be silently set to be the
same as the request size.  Otherwise the test would hang if the
response size was larger than the request size, or would report an
incorrect, inflated transaction rate if the response size was less
than the request size.

Since there is no control connection when this option is specified, it
is not possible to set ``remote'' properties such as socket buffer
size and the like via the netperf command line. Nor is it possible to
retrieve such interesting remote information as CPU utilization.
These items will be set to values which when displayed should make it
immediately obvious that was the case.

The only way to change remote characteristics such as socket buffer
size or to obtain information such as CPU utilization is to employ
platform-specific methods on the remote system.  Frankly, if one has
access to the remote system to employ those methods one ought to be
able to run a netserver there.  However, that ability may not be
present in certain ``support'' situations, hence the addition of this
option.

Added in netperf 2.4.3.
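
A minimal sketch, assuming the remote system ``lag'' offers an
``echo'' service and that the global @option{-H} option is used to
name it:
@example
netperf -N -H lag -t TCP_RR
@end example
would attempt a TCP_RR test against the echo port of ``lag'' without
contacting a netserver.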

@vindex -o, Global
@item -o <sizespec>
The value(s) passed-in with this option will be used as an offset
added to the alignment specified with the @option{-a} option.  For
example:
@example
-o 3 -a 4096
@end example
will cause the buffers passed to the local send and receive calls to
begin three bytes past an address aligned to 4096 bytes. [Default: 0
bytes]

@vindex -O, Global
@item -O <sizespec>
This option behaves just as the @option{-o} option but on the remote
system and in conjunction with the @option{-A} option. [Default: 0
bytes]

@vindex -p, Global
@item -p <optionspec>
The first value of the optionspec passed-in with this option tells
netperf the port number at which it should expect the remote netserver
to be listening for control connections.  The second value of the
optionspec will request netperf to bind to that local port number
before establishing the control connection.  For example
@example
-p 12345
@end example
tells netperf that the remote netserver is listening on port 12345 and
leaves selection of the local port number for the control connection
up to the local TCP/IP stack whereas
@example
-p ,32109
@end example
leaves the remote netserver port at the default value of 12865 and
causes netperf to bind to the local port number 32109 before
connecting to the remote netserver.

In general, setting the local port number is only necessary when one
is looking to run netperf through those evil, end-to-end breaking
things known as firewalls.

@vindex -P, Global
@item -P 0|1
A value of ``1'' for the @option{-P} option will enable display of
the test banner.  A value of ``0'' will disable display of the test
banner. One might want to disable display of the test banner when
running the same basic test type (eg TCP_STREAM) multiple times in
succession where the test banners would then simply be redundant and
unnecessarily clutter the output. [Default: 1 - display test banners]

@vindex -t, Global
@item -t testname
This option is used to tell netperf which test you wish to run.  As of
this writing, valid values for @var{testname} include:
@itemize
@item
@ref{TCP_STREAM}, @ref{TCP_MAERTS}, @ref{TCP_SENDFILE}, @ref{TCP_RR}, @ref{TCP_CRR}, @ref{TCP_CC}
@item
@ref{UDP_STREAM}, @ref{UDP_RR}
@item
@ref{XTI_TCP_STREAM},  @ref{XTI_TCP_RR}, @ref{XTI_TCP_CRR}, @ref{XTI_TCP_CC}
@item
@ref{XTI_UDP_STREAM}, @ref{XTI_UDP_RR}
@item
@ref{SCTP_STREAM}, @ref{SCTP_RR}
@item
@ref{DLCO_STREAM}, @ref{DLCO_RR},  @ref{DLCL_STREAM}, @ref{DLCL_RR}
@item
@ref{Other Netperf Tests,LOC_CPU}, @ref{Other Netperf Tests,REM_CPU}
@end itemize
Not all tests are always compiled into netperf.  In particular, the
``XTI,'' ``SCTP,'' ``UNIX,'' and ``DL*'' tests are only included in
netperf when configured with
@option{--enable-[xti|sctp|unix|dlpi]=yes}.

Netperf only runs one type of test no matter how many @option{-t}
options may be present on the command-line.  The last @option{-t}
global command-line option will determine the test to be
run. [Default: TCP_STREAM]

@vindex -v, Global
@item -v verbosity
This option controls how verbose netperf will be in its output, and is
often used in conjunction with the @option{-P} option. If the
verbosity is set to a value of ``0'' then only the test's SFM (Single
Figure of Merit) is displayed.  If local @ref{CPU Utilization,CPU
utilization} is requested via the @option{-c} option then the SFM is
the local service demand.  Otherwise, if remote CPU utilization is
requested via the @option{-C} option then the SFM is the remote
service demand.  If neither local nor remote CPU utilization are
requested the SFM will be the measured throughput or transaction rate
as implied by the test specified with the @option{-t} option.

If the verbosity level is set to ``1'' then the ``normal'' netperf
result output for each test is displayed.

If the verbosity level is set to ``2'' then ``extra'' information will
be displayed.  This may include, but is not limited to the number of
send or recv calls made and the average number of bytes per send or
recv call, or a histogram of the time spent in each send() call or for
each transaction if netperf was configured with
@option{--enable-histogram=yes}. [Default: 1 - normal verbosity]
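
For instance, when collecting results from a script it can be
convenient to combine the two options.  Assuming a remote system named
``lag'':
@example
netperf -H lag -P 0 -v 0
@end example
should suppress the test banner and print only the SFM, in this case
the throughput of the default TCP_STREAM test.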

@vindex -w, Global
@item -w time
If netperf was configured with @option{--enable-intervals=yes} then
this value will set the inter-burst time to @var{time} milliseconds, and the
@option{-b} option will set the number of sends per burst.  The actual
inter-burst time may vary depending on the system's timer resolution.
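
As a rough worked example, assuming netperf was configured with
@option{--enable-intervals=yes}, a remote system named ``lag'' and
values chosen only for illustration:
@example
netperf -H lag -b 4 -w 10 -- -m 1024
@end example
asks for bursts of 4 sends every 10 milliseconds, or roughly
4 x (1000 / 10) = 400 sends per second.  With 1024-byte sends that is
about 409600 bytes/s, or roughly 3.3 x 10^6 bits/s, subject to the
timer resolution caveat above.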

@vindex -W, Global
@item -W <sizespec>
This option controls the number of buffers in the send (first or only
value) and/or receive (second or only value) buffer rings.  Unlike
some benchmarks, netperf does not continuously send or receive from a
single buffer.  Instead it rotates through a ring of
buffers. [Default: One more than the size of the send or receive
socket buffer sizes (@option{-s} and/or @option{-S} options) divided
by the send @option{-m} or receive @option{-M} buffer size
respectively]
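
As a worked example of the default, with a 32768-byte send socket
buffer and 16384-byte sends (sizes chosen only for illustration) the
send ring would hold
@example
32768 / 16384 + 1 = 3
@end example
buffers.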

@vindex -4, Global
@item -4
Specifying this option will set both the local and remote address
families to AF_INET - that is use only IPv4 addresses on the control
connection.  This can be overridden by a subsequent @option{-6},
@option{-H} or @option{-L} option.  Basically, the last option
explicitly specifying an address family wins.  Unless overridden by a
test-specific option, this will be inherited for the data connection
as well.

@vindex -6, Global
@item -6
Specifying this option will set both the local and remote address
families to AF_INET6 - that is use only IPv6 addresses on the control
connection.  This can be overridden by a subsequent @option{-4},
@option{-H} or @option{-L} option.  Basically, the last address family
explicitly specified wins.  Unless overridden by a test-specific
option, this will be inherited for the data connection as well.

@end table


@node Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Request/Response , Global Command-line Options, Top
@chapter Using Netperf to Measure Bulk Data Transfer

The most commonly measured aspect of networked system performance is
that of bulk or unidirectional transfer performance.  Everyone wants
to know how many bits or bytes per second they can push across the
network. The netperf convention for a bulk data transfer test name is
to tack a ``_STREAM'' suffix to a test name.

@menu
* Issues in Bulk Transfer::     
* Options common to TCP UDP and SCTP tests::  
@end menu

@node Issues in Bulk Transfer, Options common to TCP UDP and SCTP tests, Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Bulk Data Transfer
@comment  node-name,  next,  previous,  up
@section Issues in Bulk Transfer

There are any number of things which can affect the performance of a
bulk transfer test.  

Certainly, absent compression, bulk-transfer tests can be limited by
the speed of the slowest link in the path from the source to the
destination.  If testing over a gigabit link, you will not see more
than a gigabit :) Such situations can be described as being
@dfn{network-limited} or @dfn{NIC-limited}.

CPU utilization can also affect the results of a bulk-transfer test.
If the networking stack requires a certain number of instructions or
CPU cycles per KB of data transferred, and the CPU is limited in the
number of instructions or cycles it can provide, then the transfer can
be described as being @dfn{CPU-bound}.  

A bulk-transfer test can be CPU bound even when netperf reports less
than 100% CPU utilization.  This can happen on an MP system where one
or more of the CPUs saturate at 100% but other CPUs remain idle.
Typically, a single flow of data, such as that from a single instance
of a netperf _STREAM test cannot make use of much more than the power
of one CPU. Exceptions to this generally occur when netperf and/or
netserver run on CPU(s) other than the CPU(s) taking interrupts from
the NIC(s).

Distance and the speed-of-light can affect performance for a
bulk-transfer; often this can be mitigated by using larger windows.
One common limit to the performance of a transport using window-based
flow-control is:
@example
Throughput <= WindowSize/RoundTripTime
@end example
As the sender can only have a window's-worth of data outstanding on
the network at any one time, and the soonest the sender can receive a
window update from the receiver is one RoundTripTime (RTT).  TCP and
SCTP are examples of such protocols.
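
As a worked example, assuming a 32768-byte window and a round-trip
time of 10 milliseconds (both values picked only for illustration):
@example
Throughput <= 32768 bytes / 0.010 s
            = 3276800 bytes/s
            = 26214400 bits/s  (about 26.2 x 10^6 bits/s)
@end example
Quadrupling the window to 131072 bytes raises the bound to about
104.9 x 10^6 bits/s over the same path.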

Packet losses and their effects can be particularly bad for
performance.  This is especially true if the packet losses result in
retransmission timeouts for the protocol(s) involved.  By the time a
retransmission timeout has happened, the flow or connection has sat
idle for a considerable length of time.

On many platforms, some variant on the @command{netstat} command can
be used to retrieve statistics about packet loss and
retransmission. For example:
@example
netstat -p tcp
@end example
will retrieve TCP statistics on the HP-UX Operating System.  On other
platforms, it may not be possible to retrieve statistics for a
specific protocol and something like:
@example
netstat -s
@end example
would be used instead.

Many times, such network statistics are kept since the time the stack
started, and we are only really interested in statistics from when
netperf was running.  In such situations something along the lines of:
@example
netstat -p tcp > before
netperf -t TCP_mumble...
netstat -p tcp > after
@end example
is indicated.  The
@uref{ftp://ftp.cup.hp.com/dist/networking/tools/,beforeafter} utility
can be used to subtract the statistics in @file{before} from the
statistics in @file{after}
@example
beforeafter before after > delta
@end example
and then one can look at the statistics in @file{delta}.  Beforeafter
is distributed in source form so one can compile it on the platform(s)
of interest. 

While it was written with HP-UX's netstat in mind, the
@uref{ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt,annotated
netstat} writeup may be helpful with other platforms as well.

@node Options common to TCP UDP and SCTP tests,  , Issues in Bulk Transfer, Using Netperf to Measure Bulk Data Transfer
@comment  node-name,  next,  previous,  up
@section Options common to TCP UDP and SCTP tests

Many ``test-specific'' options are actually common across the
different tests.  For those tests involving TCP, UDP and SCTP, whether
using the BSD Sockets or the XTI interface those common options
include:

@table @code
@vindex -h, Test-specific
@item -h
Display the test-suite-specific usage string and exit.  For a TCP_ or
UDP_ test this will be the usage string from the source file
nettest_bsd.c.  For an XTI_ test, this will be the usage string from
the source file nettest_xti.c.  For an SCTP test, this will be the
usage string from the source file nettest_sctp.c.

@vindex -H, Test-specific
@item -H <optionspec>
Normally, the remote hostname|IP and address family information is
inherited from the settings for the control connection (eg global
command-line @option{-H}, @option{-4} and/or @option{-6} options).
The test-specific @option{-H} will override those settings for the
data (aka test) connection only.  Settings for the control connection
are left unchanged.

@vindex -L, Test-specific
@item -L <optionspec>
The test-specific @option{-L} option is identical to the test-specific
@option{-H} option except it affects the local hostname|IP and address
family information.  As with its global command-line counterpart, this
is generally only useful when measuring through those evil, end-to-end
breaking things called firewalls.

@vindex -m, Test-specific
@item -m bytes
Set the size of the buffer passed-in to the ``send'' calls of a
_STREAM test.  Note that this may have only an indirect effect on the
size of the packets sent over the network, and certain Layer 4
protocols do _not_ preserve or enforce message boundaries, so setting
@option{-m} for the send size does not necessarily mean the receiver
will receive that many bytes at any one time. By default the units are
bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units to
be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,''
``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
respectively. For example:
@example
@code{-m 32K}
@end example
will set the size to 32KB or 32768 bytes. [Default: the local send
socket buffer size for the connection - either the system's default or
the value set via the @option{-s} option.]

@vindex -M, Test-specific
@item -M bytes
Set the size of the buffer passed-in to the ``recv'' calls of a
_STREAM test.  This will be an upper bound on the number of bytes
received per receive call. By default the units are bytes, but a suffix
of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20
(MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m'' or ``k''
will specify units of 10^9, 10^6 or 10^3 bytes respectively. For
example:
@example
@code{-M 32K}
@end example
will set the size to 32KB or 32768 bytes. [Default: the remote receive
socket buffer size for the data connection - either the system's
default or the value set via the @option{-S} option.]

@vindex -P, Test-specific
@item -P <optionspec>
Set the local and/or remote port numbers for the data connection.

@vindex -s, Test-specific
@item -s <sizespec>
This option sets the local send and receive socket buffer sizes for
the data connection to the value(s) specified.  Often, this will
affect the advertised and/or effective TCP or other window, but on
some platforms it may not. By default the units are bytes, but a suffix
of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20
(MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m'' or ``k''
will specify units of 10^9, 10^6 or 10^3 bytes respectively. For
example:
@example
@code{-s 128K}
@end example
will request the local send and receive socket buffer sizes to be
128KB or 131072 bytes. 

While the historic expectation is that setting the socket buffer size
has a direct effect on say the TCP window, today that may not hold
true for all stacks. Further, while the historic expectation is that
the value specified in a setsockopt() call will be the value returned
via a getsockopt() call, at least one stack is known to deliberately
ignore history.  When running under Windows a value of 0 may be used
which will be an indication to the stack the user wants to enable a
form of copy avoidance. [Default: -1 - use the system's default socket
buffer sizes]

@vindex -S, Test-specific
@item -S <sizespec>
This option sets the remote send and/or receive socket buffer sizes
for the data connection to the value(s) specified.  Often, this
will affect the advertised and/or effective TCP or other window, but
on some platforms it may not. By default the units are bytes, but a
suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
(GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes respectively.
For example:
@example
@code{-S 128K}
@end example
will request the remote send and receive socket buffer sizes to be
128KB or 131072 bytes. 

While the historic expectation is that setting the socket buffer size
has a direct effect on say the TCP window, today that may not hold
true for all stacks.  Further, while the historic expectation is that
the value specified in a setsockopt() call will be the value returned
via a getsockopt() call, at least one stack is known to deliberately
ignore history.  When running under Windows a value of 0 may be used
which will be an indication to the stack the user wants to enable a
form of copy avoidance. [Default: -1 - use the system's default socket
buffer sizes]

@vindex -4, Test-specific
@item -4
Set the local and remote address family for the data connection to
AF_INET - ie use IPv4 addressing only.  Just as with their global
command-line counterparts the last of the @option{-4}, @option{-6},
@option{-H} or @option{-L} option wins for their respective address
families.

@vindex -6, Test-specific
@item -6
This option is identical to its @option{-4} cousin, but requests IPv6
addresses for the local and remote ends of the data connection.

@end table


@menu
* TCP_STREAM::                  
* TCP_MAERTS::                  
* TCP_SENDFILE::                
* UDP_STREAM::                  
* XTI_TCP_STREAM::              
* XTI_UDP_STREAM::              
* SCTP_STREAM::                 
* DLCO_STREAM::                 
* DLCL_STREAM::                 
* STREAM_STREAM::               
* DG_STREAM::                   
@end menu

@node TCP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests, Options common to TCP UDP and SCTP tests
@subsection TCP_STREAM

The TCP_STREAM test is the default test in netperf.  It is quite
simple, transferring some quantity of data from the system running
netperf to the system running netserver.  While time spent
establishing the connection is not included in the throughput
calculation, time spent flushing the last of the data to the remote at
the end of the test is.  This is how netperf knows that all the data
it sent was received by the remote.  In addition to the @ref{Options
common to TCP UDP and SCTP tests,options common to STREAM tests}, the
following test-specific options can be included to possibly alter the
behavior of the test:

@table @code
@item -C
This option will set TCP_CORK mode on the data connection on those
systems where TCP_CORK is defined (typically Linux).  A full
description of TCP_CORK is beyond the scope of this manual, but in a
nutshell it forces sub-MSS sends to be buffered so every segment sent
is Maximum Segment Size (MSS) unless the application performs an
explicit flush operation or the connection is closed.  At present
netperf does not perform any explicit flush operations.  Setting
TCP_CORK may improve the bitrate of tests where the ``send size''
(@option{-m} option) is smaller than the MSS.  It should also improve
(make smaller) the service demand.

The Linux tcp(7) manpage states that TCP_CORK cannot be used in
conjunction with TCP_NODELAY (set via the @option{-D} option), however
netperf does not validate command-line options to enforce that.

@item -D
This option will set TCP_NODELAY on the data connection on those
systems where TCP_NODELAY is defined.  This disables something known
as the Nagle Algorithm, which is intended to make the segments TCP
sends as large as reasonably possible.  Setting TCP_NODELAY for a
TCP_STREAM test should either have no effect when the send size
(@option{-m} option) is larger than the MSS or will decrease reported
bitrate and increase service demand when the send size is smaller than
the MSS.  This stems from TCP_NODELAY causing each sub-MSS send to be
its own TCP segment rather than being aggregated with other small
sends.  This means more trips up and down the protocol stack per KB of
data transferred, which means greater CPU utilization.

If setting TCP_NODELAY with @option{-D} affects throughput and/or
service demand for tests where the send size (@option{-m}) is larger
than the MSS it suggests the TCP/IP stack's implementation of the
Nagle Algorithm _may_ be broken, perhaps interpreting the Nagle
Algorithm on a segment by segment basis rather than the proper user
send by user send basis.  However, a better test of this can be
achieved with the @ref{TCP_RR} test.
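
One way to observe the effect, sketched here with a remote system
named ``lag'' and a 512-byte send size that is assumed to be smaller
than the MSS of the path, is to run the same test without and then
with @option{-D}:
@example
netperf -H lag -c -C -- -m 512
netperf -H lag -c -C -- -m 512 -D
@end example
and compare the reported throughput and service demand.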

@end table

Here is an example of a basic TCP_STREAM test, in this case from a
Debian Linux (2.6 kernel) system to an HP-UX 11iv2 (HP-UX 11.23)
system:

@example
$ netperf -H lag
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 32768  16384  16384    10.00      80.42   
@end example

We see that the default receive socket buffer size for the receiver
(lag - HP-UX 11.23) is 32768 bytes, and the default socket send buffer
size for the sender (Debian 2.6 kernel) is 16384 bytes.  Throughput is
expressed as 10^6 (aka Mega) bits per second, and the test ran for 10
seconds.  IPv4 addresses (AF_INET) were used.

@node TCP_MAERTS, TCP_SENDFILE, TCP_STREAM, Options common to TCP UDP and SCTP tests
@comment  node-name,  next,  previous,  up
@subsection TCP_MAERTS

A TCP_MAERTS (MAERTS is STREAM backwards) test is ``just like'' a
@ref{TCP_STREAM} test except the data flows from the netserver to the
netperf. The global command-line @option{-F} option is ignored for
this test type.  The test-specific command-line @option{-C} option is
ignored for this test type.

Here is an example of a TCP_MAERTS test between the same two systems
as in the example for the @ref{TCP_STREAM} test.  This time we request
larger socket buffers with @option{-s} and @option{-S} options:

@example
$ netperf -H lag -t TCP_MAERTS -- -s 128K -S 128K
TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

221184 131072 131072    10.03      81.14   
@end example

Where we see that Linux, unlike HP-UX, may not return the same value
in a getsockopt() as was requested in the prior setsockopt().

This test is included more for benchmarking convenience than anything
else.

@node TCP_SENDFILE, UDP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests
@comment  node-name,  next,  previous,  up
@subsection TCP_SENDFILE

The TCP_SENDFILE test is ``just like'' a @ref{TCP_STREAM} test except
netperf uses the platform's @code{sendfile()} call instead of calling
@code{send()}.  Often this results in a @dfn{zero-copy} operation
where data is sent directly from the filesystem buffer cache.  This
_should_ result in lower CPU utilization and possibly higher
throughput.  If it does not, then you may want to contact your
vendor(s) because they have a problem on their hands.

Zero-copy mechanisms may also alter the characteristics (size and
number of buffers per) of packets passed to the NIC.  In many stacks,
when a copy is performed, the stack can ``reserve'' space at the
beginning of the destination buffer for things like TCP, IP and Link
headers.  This then has the packet contained in a single buffer which
can be easier to DMA to the NIC.  When no copy is performed, there is
no opportunity to reserve space for headers and so a packet will be
contained in two or more buffers.

The @ref{Global Options,global @option{-F} option} is required for this test and it must
specify a file of at least the size of the send ring (@pxref{Global
Options,the global @option{-W} option}) multiplied by the send size
(@pxref{Options common to TCP UDP and SCTP tests,the test-specific
@option{-m} option}).  All other TCP-specific options are available
and optional.

In this first example:
@example
$ netperf -H lag -F ../src/netperf -t TCP_SENDFILE -- -s 128K -S 128K
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
alloc_sendfile_buf_ring: specified file too small.
file must be larger than send_width * send_size
@end example

we see what happens when the file is too small.  Here:

@example
$ ../src/netperf -H lag -F /boot/vmlinuz-2.6.8-1-686 -t TCP_SENDFILE -- -s 128K -S 128K
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

131072 221184 221184    10.02      81.83   
@end example

we resolve that issue by selecting a larger file.
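
The size requirement can be checked before running the test.  The
following is a minimal sketch: the send width of 4 is a hypothetical
stand-in for whatever value the global @option{-W} option has on your
system, the send size is the one reported in the second example, and
the strict comparison follows the wording of the error message.

@example
send_width=4                     # hypothetical send ring width (global -W)
send_size=221184                 # send size in bytes, from the output above
file=/boot/vmlinuz-2.6.8-1-686   # candidate file for the global -F option

needed=$((send_width * send_size))

# Treat a missing file as having a size of zero bytes.
if [ -f "$file" ]
then
  actual=$(( $(wc -c < "$file") ))
else
  actual=0
fi

if [ "$actual" -gt "$needed" ]
then
  echo "$file is large enough ($actual > $needed)"
else
  echo "$file is too small ($actual <= $needed)"
fi
@end example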


@node UDP_STREAM, XTI_TCP_STREAM, TCP_SENDFILE, Options common to TCP UDP and SCTP tests
@subsection UDP_STREAM

A UDP_STREAM test is similar to a @ref{TCP_STREAM} test except UDP is
used as the transport rather than TCP.

@cindex Limiting Bandwidth
A UDP_STREAM test has no end-to-end flow control - UDP provides none
and neither does netperf.  However, if you wish, you can configure
netperf with @code{--enable-intervals=yes} to enable the global
command-line @option{-b} and @option{-w} options to pace bursts of
traffic onto the network.

This has a number of implications.

The biggest of these implications is the data which is sent might not
be received by the remote.  For this reason, the output of a
UDP_STREAM test shows both the sending and receiving throughput.  On
some platforms, it may be possible for the sending throughput to be
reported as a value greater than the maximum rate of the link.  This
is common when the CPU(s) are faster than the network and there is no
@dfn{intra-stack} flow-control.
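
When pacing is enabled as mentioned above, the offered load can be
estimated ahead of time.  The arithmetic below is a sketch which
assumes @option{-b} gives the number of sends per burst and
@option{-w} the interval between bursts in milliseconds; the values
themselves are made up for illustration.

@example
burst=10        # sends per burst (assumed meaning of the global -b option)
wait_ms=10      # milliseconds between bursts (assumed meaning of -w)
msg_size=1024   # bytes per send, as set with the test-specific -m option

bits_per_burst=$((burst * msg_size * 8))
offered_bps=$((bits_per_burst * 1000 / wait_ms))

echo "offered load: about $offered_bps bits/s"
@end example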

Here is an example of a UDP_STREAM test between two systems connected
by a 10 Gigabit Ethernet link:
@example
$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 32768
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

124928   32768   10.00      105672      0    2770.20
135168           10.00      104844           2748.50

@end example

The first line of numbers are statistics from the sending (netperf)
side. The second line of numbers are from the receiving (netserver)
side.  In this case, 105672 - 104844 or 828 messages did not make it
all the way to the remote netserver process.
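
The same subtraction, along with a loss ratio, can be sketched in
shell arithmetic using the counts from the output above.  Integer
arithmetic is used, so the loss is first computed in hundredths of a
percent.

@example
sent=105672       # messages reported by the sending (netperf) side
received=104844   # messages reported by the receiving (netserver) side

lost=$((sent - received))
loss_x100=$((lost * 10000 / sent))   # loss in hundredths of a percent

printf '%d messages lost, about %d.%02d percent\n' \
  "$lost" $((loss_x100 / 100)) $((loss_x100 % 100))
@end example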

If the value of the @option{-m} option is larger than the local send
socket buffer size (@option{-s} option) netperf will likely abort with
an error message about how the send call failed:

@example
$ netperf -t UDP_STREAM -H 192.168.2.125
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
udp_send: data send error: Message too long
@end example

If the value of the @option{-m} option is larger than the remote
socket receive buffer, the reported receive throughput will likely be
zero as the remote UDP will discard the messages as being too large to
fit into the socket buffer.

@example
$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 65000 -S 32768
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

124928   65000   10.00       53595      0    2786.99
 65536           10.00           0              0.00
@end example

The example above was between a pair of systems running a ``Linux''
kernel. Notice that the remote Linux system returned a value larger
than that passed-in to the @option{-S} option.  In fact, this value
was larger than the message size set with the @option{-m} option.
That the remote socket buffer size is reported as 65536 bytes would
suggest to any sane person that a message of 65000 bytes would fit,
but the socket isn't _really_ 65536 bytes, even though Linux is
telling us so.  Go figure.

@node XTI_TCP_STREAM, XTI_UDP_STREAM, UDP_STREAM, Options common to TCP UDP and SCTP tests
@subsection XTI_TCP_STREAM

An XTI_TCP_STREAM test is simply a @ref{TCP_STREAM} test using the XTI
rather than BSD Sockets interface.  The test-specific @option{-X
<devspec>} option can be used to specify the name of the local and/or
remote XTI device files, which is required by the @code{t_open()} call
made by netperf XTI tests.

The XTI_TCP_STREAM test is only present if netperf was configured with
@code{--enable-xti=yes}.  The remote netserver must have also been
configured with @code{--enable-xti=yes}.

@node XTI_UDP_STREAM, SCTP_STREAM, XTI_TCP_STREAM, Options common to TCP UDP and SCTP tests
@subsection XTI_UDP_STREAM

An XTI_UDP_STREAM test is simply a @ref{UDP_STREAM} test using the XTI
rather than BSD Sockets interface.  The test-specific @option{-X
<devspec>} option can be used to specify the name of the local and/or
remote XTI device files, which is required by the @code{t_open()} call
made by netperf XTI tests.

The XTI_UDP_STREAM test is only present if netperf was configured with
@code{--enable-xti=yes}. The remote netserver must have also been
configured with @code{--enable-xti=yes}.

@node SCTP_STREAM, DLCO_STREAM, XTI_UDP_STREAM, Options common to TCP UDP and SCTP tests
@subsection SCTP_STREAM

An SCTP_STREAM test is essentially a @ref{TCP_STREAM} test using SCTP
rather than TCP.  The @option{-D} option will set SCTP_NODELAY, which
is much like the TCP_NODELAY option for TCP.  The @option{-C} option
is not applicable to an SCTP test as there is no corresponding
SCTP_CORK option.  The author is still figuring-out what the
test-specific @option{-N} option does :)

The SCTP_STREAM test is only present if netperf was configured with
@code{--enable-sctp=yes}. The remote netserver must have also been
configured with @code{--enable-sctp=yes}.

@node DLCO_STREAM, DLCL_STREAM, SCTP_STREAM, Options common to TCP UDP and SCTP tests
@subsection DLCO_STREAM

A DLPI Connection Oriented Stream (DLCO_STREAM) test is very similar
in concept to a @ref{TCP_STREAM} test.  Both use reliable,
connection-oriented protocols.  The DLPI test differs from the TCP
test in that its protocol operates only at the link-level and does not
include TCP-style segmentation and reassembly.  This last difference
means that the value passed-in with the @option{-m} option must be
less than the interface MTU.  Otherwise, the @option{-m} and
@option{-M} options are just like their TCP/UDP/SCTP counterparts.

Other DLPI-specific options include:

@table @code
@item -D <devspec>
This option is used to provide the fully-qualified names for the local
and/or remote DLPI device files.  The syntax is otherwise identical to
that of a @dfn{sizespec}.
@item -p <ppaspec>
This option is used to specify the local and/or remote DLPI PPA(s).
The PPA is used to identify the interface over which traffic is to be
sent/received. The syntax of a @dfn{ppaspec} is otherwise the same as
a @dfn{sizespec}.
@item -s sap 
This option specifies the 802.2 SAP for the test.  A SAP is somewhat
like either the port field of a TCP or UDP header or the protocol
field of an IP header.  The specified SAP should not conflict with any
other active SAPs on the specified PPAs (@option{-p} option).
@item -w <sizespec>
This option specifies the local send and receive window sizes in units
of frames on those platforms which support setting such things.
@item -W <sizespec>
This option specifies the remote send and receive window sizes in
units of frames on those platforms which support setting such things.
@end table

The DLCO_STREAM test is only present if netperf was configured with
@code{--enable-dlpi=yes}. The remote netserver must have also been
configured with @code{--enable-dlpi=yes}.


@node DLCL_STREAM, STREAM_STREAM, DLCO_STREAM, Options common to TCP UDP and SCTP tests
@subsection DLCL_STREAM

A DLPI ConnectionLess Stream (DLCL_STREAM) test is analogous to a
@ref{UDP_STREAM} test in that both make use of unreliable/best-effort,
connection-less transports.  The DLCL_STREAM test differs from the
@ref{UDP_STREAM} test in that the message size (@option{-m} option) must
always be less than the link MTU as there is no IP-like fragmentation
and reassembly available and netperf does not presume to provide one.

The test-specific command-line options for a DLCL_STREAM test are the
same as those for a @ref{DLCO_STREAM} test.

The DLCL_STREAM test is only present if netperf was configured with
@code{--enable-dlpi=yes}. The remote netserver must have also been
configured with @code{--enable-dlpi=yes}.

@node STREAM_STREAM, DG_STREAM, DLCL_STREAM, Options common to TCP UDP and SCTP tests
@comment  node-name,  next,  previous,  up
@subsection STREAM_STREAM

A Unix Domain Stream Socket Stream test (STREAM_STREAM) is similar in
concept to a @ref{TCP_STREAM} test, but using Unix Domain sockets.  It is,
naturally, limited to intra-machine traffic.  A STREAM_STREAM test
shares the @option{-m}, @option{-M}, @option{-s} and @option{-S}
options of the other _STREAM tests.  In a STREAM_STREAM test the
@option{-p} option sets the directory in which the pipes will be
created rather than setting a port number.  The default is to create
the pipes in the system default for the @code{tempnam()} call.

The STREAM_STREAM test is only present if netperf was configured with
@code{--enable-unix=yes}. The remote netserver must have also been
configured with @code{--enable-unix=yes}.

@node DG_STREAM,  , STREAM_STREAM, Options common to TCP UDP and SCTP tests
@comment  node-name,  next,  previous,  up
@subsection DG_STREAM

A Unix Domain Datagram Socket Stream test (DG_STREAM) is very much
like a @ref{TCP_STREAM} test except that message boundaries are preserved.
In this way, it may also be considered similar to certain flavors of
SCTP test which can also preserve message boundaries.

All the options of a @ref{STREAM_STREAM} test are applicable to a DG_STREAM
test. 

The DG_STREAM test is only present if netperf was configured with
@code{--enable-unix=yes}. The remote netserver must have also been
configured with @code{--enable-unix=yes}.


@node Using Netperf to Measure Request/Response , Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bulk Data Transfer, Top
@chapter Using Netperf to Measure Request/Response 

Request/response performance is often overlooked, yet it is just as
important as bulk-transfer performance.  While things like larger
socket buffers and TCP windows can cover a multitude of latency and
even path-length sins, they cannot easily hide from a request/response
test.  The convention for a request/response test is to have a _RR
suffix.  There are however a few ``request/response'' tests that have
other suffixes.

A request/response test, particularly a synchronous, one transaction
at a time test such as those found in netperf, is especially sensitive
to the path-length of the networking stack.  An _RR test can also
uncover those platforms where the NICs are strapped by default with
overbearing interrupt avoidance settings in an attempt to increase the
bulk-transfer performance (or rather, decrease the CPU utilization of
a bulk-transfer test).  This sensitivity is most acute for small
request and response sizes, such as the single-byte default for a
netperf _RR test.

While a bulk-transfer test reports its results in units of bits or
bytes transferred per second, a mumble_RR test reports transactions per
second where a transaction is defined as the completed exchange of a
request and a response.  One can invert the transaction rate to arrive
at the average round-trip latency.  If one is confident about the
symmetry of the connection, the average one-way latency can be taken
as one-half the average round-trip latency.  Netperf does not do
either of these on its own but leaves them as exercises to the
benchmarker.

@menu
* Issues in Request/Response::  
* Options Common to TCP UDP and SCTP _RR tests::  
@end menu

@node Issues in Request/Response, Options Common to TCP UDP and SCTP _RR tests, Using Netperf to Measure Request/Response , Using Netperf to Measure Request/Response
@comment  node-name,  next,  previous,  up
@section Issues in Request/Response

Most if not all of the @ref{Issues in Bulk Transfer} apply to
request/response.  The issue of round-trip latency is even more
important as netperf generally only has one transaction outstanding at
a time.

A single instance of a one transaction outstanding _RR test should
_never_ completely saturate the CPU of a system.  If testing between
otherwise evenly matched systems, the symmetric nature of a _RR test
with equal request and response sizes should result in equal CPU
loading on both systems. However, this may not hold true on MP
systems, particularly if one binds netperf and netserver to CPUs
differently via the global @option{-T} option.

For smaller request and response sizes packet loss is a bigger issue
as there is no opportunity for a @dfn{fast retransmit} or
retransmission prior to a retransmission timer expiring.

Certain NICs have ways to minimize the number of interrupts sent to
the host.  If these are strapped badly they can significantly reduce
the performance of something like a single-byte request/response test.
Such setups are distinguished by seriously low reported CPU utilization
and what seems like a low (even if in the thousands) transaction per
second rate.  Also, if you run such an OS/driver combination on faster
or slower hardware and do not see a corresponding change in the
transaction rate, chances are good that the driver is strapping the
NIC with aggressive interrupt avoidance settings.  Good for bulk
throughput, but bad for latency.

Some drivers may try to automagically adjust the interrupt avoidance
settings.  If they are not terribly good at it, you will see
considerable run-to-run variation in reported transaction rates,
particularly if you ``mix-up'' _STREAM and _RR tests.


@node Options Common to TCP UDP and SCTP _RR tests,  , Issues in Request/Response, Using Netperf to Measure Request/Response
@comment  node-name,  next,  previous,  up
@section Options Common to TCP UDP and SCTP _RR tests

Many ``test-specific'' options are actually common across the
different tests.  For those tests involving TCP, UDP and SCTP, whether
using the BSD Sockets or the XTI interface those common options
include:

@table @code
@vindex -h, Test-specific
@item -h
Display the test-suite-specific usage string and exit.  For a TCP_ or
UDP_ test this will be the usage string from the source file
@file{src/nettest_bsd.c}.  For an XTI_ test, this will be the usage string
from the source file @file{src/nettest_xti.c}.  For an SCTP test, this
will be the usage string from the source file
@file{src/nettest_sctp.c}.

@vindex -H, Test-specific
@item -H <optionspec>
Normally, the remote hostname|IP and address family information is
inherited from the settings for the control connection (eg global
command-line @option{-H}, @option{-4} and/or @option{-6} options).
The test-specific @option{-H} will override those settings for the
data (aka test) connection only.  Settings for the control connection
are left unchanged.  This might be used to cause the control and data
connections to take different paths through the network.

@vindex -L, Test-specific
@item -L <optionspec>
The test-specific @option{-L} option is identical to the test-specific
@option{-H} option except it affects the local hostname|IP and address
family information.  As with its global command-line counterpart, this
is generally only useful when measuring through those evil, end-to-end
breaking things called firewalls.

@vindex -P, Test-specific
@item -P <optionspec>
Set the local and/or remote port numbers for the data connection.

@vindex -r, Test-specific
@item -r <sizespec>
This option sets the request (first value) and/or response (second
value) sizes for an _RR test. By default the units are bytes, but a
suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
(GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
respectively. For example:
@example
@code{-r 128,16K}
@end example
Will set the request size to 128 bytes and the response size to 16 KB
or 16384 bytes. [Default: 1 - a single-byte request and response ]

@vindex -s, Test-specific
@item -s <sizespec>
This option sets the local send and receive socket buffer sizes for
the data connection to the value(s) specified.  Often, this will
affect the advertised and/or effective TCP or other window, but on
some platforms it may not. By default the units are bytes, but a
suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
(GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
respectively. For example:
@example
@code{-s 128K}
@end example
Will request the local send and receive socket buffer sizes to be
128KB or 131072 bytes. 

While the historic expectation is that setting the socket buffer size
has a direct effect on say the TCP window, today that may not hold
true for all stacks.  When running under Windows a value of 0 may be
used which will be an indication to the stack the user wants to enable
a form of copy avoidance. [Default: -1 - use the system's default
socket buffer sizes]

@vindex -S, Test-specific
@item -S <sizespec>
This option sets the remote send and/or receive socket buffer sizes
for the data connection to the value(s) specified.  Often, this
will affect the advertised and/or effective TCP or other window, but
on some platforms it may not. By default the units are bytes, but a
suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
(GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes respectively.
For example:
@example
@code{-S 128K}
@end example
Will request the remote send and receive socket buffer sizes to be
128KB or 131072 bytes.

While the historic expectation is that setting the socket buffer size
has a direct effect on say the TCP window, today that may not hold
true for all stacks.  When running under Windows a value of 0 may be
used which will be an indication to the stack the user wants to enable
a form of copy avoidance.  [Default: -1 - use the system's default
socket buffer sizes]

@vindex -4, Test-specific
@item -4
Set the local and remote address family for the data connection to
AF_INET - ie use IPv4 addressing only.  Just as with their global
command-line counterparts the last of the @option{-4}, @option{-6},
@option{-H} or @option{-L} option wins for their respective address
families.

@vindex -6, Test-specific
@item -6
This option is identical to its @option{-4} cousin, but requests IPv6
addresses for the local and remote ends of the data connection.

@end table
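
The suffix rules shared by the @option{-r}, @option{-s} and
@option{-S} options can be sketched as a small shell function.  The
name @code{sizespec_bytes} is hypothetical and is not part of netperf;
it handles a single value rather than a full comma-separated
@dfn{sizespec}, and simply mirrors the rules described above.

@example
# Expand one size value such as 16K or 16k into bytes.
sizespec_bytes() (
  digits=$(echo "$1" | tr -d 'GMKgmk')
  case $1 in
    *G) mult=$((1024 * 1024 * 1024)) ;;
    *M) mult=$((1024 * 1024)) ;;
    *K) mult=1024 ;;
    *g) mult=1000000000 ;;
    *m) mult=1000000 ;;
    *k) mult=1000 ;;
    *)  mult=1 ;;
  esac
  echo $((digits * mult))
)

sizespec_bytes 128    # prints 128
sizespec_bytes 16K    # prints 16384
sizespec_bytes 16k    # prints 16000
@end example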

@menu
* TCP_RR::                      
* TCP_CC::                      
* TCP_CRR::                     
* UDP_RR::                      
* XTI_TCP_RR::                  
* XTI_TCP_CC::                  
* XTI_TCP_CRR::                 
* XTI_UDP_RR::                  
* DLCL_RR::                     
* DLCO_RR::                     
* SCTP_RR::                     
@end menu

@node TCP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests, Options Common to TCP UDP and SCTP _RR tests
@cindex Measuring Latency
@cindex Latency, Request-Response
@subsection TCP_RR

A TCP_RR (TCP Request/Response) test is requested by passing a value
of ``TCP_RR'' to the global @option{-t} command-line option.  A TCP_RR
test can be thought of as a user-space to user-space @code{ping} with
no think time - it is a synchronous, one transaction at a time,
request/response test.

The transaction rate is the number of complete transactions exchanged
divided by the length of time it took to perform those transactions.

If the two Systems Under Test are otherwise identical, a TCP_RR test
with the same request and response size should be symmetric - it
should not matter which way the test is run, and the CPU utilization
measured should be virtually the same on each system.  If not, it
suggests that the CPU utilization mechanism being used may have some,
well, issues measuring CPU utilization completely and accurately.

Time to establish the TCP connection is not counted in the result.  If
you want connection setup overheads included, you should consider the
TCP_CC or TCP_CRR tests.

If specifying the @option{-D} option to set TCP_NODELAY and disable
the Nagle Algorithm increases the transaction rate reported by a
TCP_RR test, it implies the stack(s) over which the TCP_RR test is
running have a broken implementation of the Nagle Algorithm.  Likely
as not they are interpreting Nagle on a segment by segment basis
rather than a user send by user send basis.  You should contact your
stack vendor(s) to report the problem to them.

Here is an example of two systems running a basic TCP_RR test over a
10 Gigabit Ethernet link:

@example
$ netperf -t TCP_RR -H 192.168.2.125
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate         
bytes  Bytes  bytes    bytes   secs.    per sec   

16384  87380  1        1       10.00    29150.15   
16384  87380 
@end example

In this example the request and response sizes were one byte, the
socket buffers were left at their defaults, and the test ran for all
of 10 seconds.  The transaction per second rate was rather good :)
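
As noted at the start of this chapter, netperf leaves the conversion
from transaction rate to latency to the benchmarker.  A minimal sketch
of that arithmetic for the result above follows.  The halving for
one-way latency assumes a symmetric path, and integer nanoseconds are
used because plain shell arithmetic has no floating point.

@example
rate_x100=2915015   # 29150.15 transactions/s, scaled by 100

# Average round-trip time is the inverse of the transaction rate.
# The constant is 10^9 ns/s times the scale factor of 100, which
# assumes the shell does 64-bit arithmetic.
rtt_ns=$((100000000000 / rate_x100))
one_way_ns=$((rtt_ns / 2))

echo "round trip: about $rtt_ns ns, one way: about $one_way_ns ns"
@end example

That works out to a little over 34 microseconds per round trip.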

@node TCP_CC, TCP_CRR, TCP_RR, Options Common to TCP UDP and SCTP _RR tests
@cindex Connection Latency
@cindex Latency, Connection Establishment
@subsection TCP_CC

A TCP_CC (TCP Connect/Close) test is requested by passing a value of
``TCP_CC'' to the global @option{-t} option.  A TCP_CC test simply
measures how fast the pair of systems can open and close connections
between one another in a synchronous (one at a time) manner.  While
this is considered an _RR test, no request or response is exchanged
over the connection.

@cindex Port Reuse
@cindex TIME_WAIT
The issue of TIME_WAIT reuse is an important one for a TCP_CC test.
Basically, TIME_WAIT reuse is when a pair of systems churn through
connections fast enough that they wrap the 16-bit port number space in
less time than the length of the TIME_WAIT state.  While it is indeed
theoretically possible to ``reuse'' a connection in TIME_WAIT, the
conditions under which such reuse is possible are rather rare.  An
attempt to reuse a connection in TIME_WAIT can result in a non-trivial
delay in connection establishment.

Basically, any time the connection churn rate approaches:

Sizeof(clientportspace) / Lengthof(TIME_WAIT)

there is the risk of TIME_WAIT reuse.  To minimize the chances of this
happening, netperf will by default select its own client port numbers
from the range of 5000 to 65535.  On systems with a 60 second
TIME_WAIT state, this should allow roughly 1000 transactions per
second.  The size of the client port space used by netperf can be
controlled via the test-specific @option{-p} option, which takes a
@dfn{sizespec} as a value setting the minimum (first value) and
maximum (second value) port numbers used by netperf at the client end.
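
The churn-rate limit above can be worked through for the default port
range.  This sketch assumes the 60 second TIME_WAIT mentioned in the
text; substitute the value for your own platform.

@example
min_port=5000    # lowest client port netperf selects by default
max_port=65535   # highest client port
time_wait=60     # assumed length of TIME_WAIT in seconds

port_space=$((max_port - min_port + 1))
max_rate=$((port_space / time_wait))

echo "$port_space ports allow about $max_rate connections per second"
@end example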

Since no requests or responses are exchanged during a TCP_CC test,
only the @option{-H}, @option{-L}, @option{-4} and @option{-6} of the
``common'' test-specific options are likely to have an effect, if any,
on the results.  The @option{-s} and @option{-S} options _may_ have
some effect if they alter the number and/or type of options carried in
the TCP SYNchronize segments.  The @option{-P}  and @option{-r}
options are utterly ignored.

Since connection establishment and tear-down for TCP is not symmetric,
a TCP_CC test is not symmetric in its loading of the two systems under
test.

@node TCP_CRR, UDP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests
@cindex Latency, Connection Establishment
@cindex Latency, Request-Response
@subsection TCP_CRR

The TCP Connect/Request/Response (TCP_CRR) test is requested by
passing a value of ``TCP_CRR'' to the global @option{-t} command-line
option.  A TCP_CRR test is like a merger of a TCP_RR and TCP_CC test
which measures the performance of establishing a connection, exchanging
a single request/response transaction, and tearing-down that
connection.  This is very much like what happens in an HTTP 1.0 or
HTTP 1.1 connection when HTTP Keepalives are not used.  In fact, the
TCP_CRR test was added to netperf to simulate just that.

Since a request and response are exchanged the @option{-r},
@option{-s} and @option{-S} options can have an effect on the
performance.

The issue of TIME_WAIT reuse exists for the TCP_CRR test just as it
does for the TCP_CC test.  Similarly, since connection establishment
and tear-down is not symmetric, a TCP_CRR test is not symmetric even
when the request and response sizes are the same.

@node UDP_RR, XTI_TCP_RR, TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
@cindex Latency, Request-Response
@cindex Packet Loss
@subsection UDP_RR

A UDP Request/Response (UDP_RR) test is requested by passing a value
of ``UDP_RR'' to the global @option{-t} option.  It is very much the
same as a TCP_RR test except UDP is used rather than TCP.

UDP does not provide for retransmission of lost UDP datagrams, and
netperf does not add anything for that either.  This means that if
_any_ request or response is lost, the exchange of requests and
responses will stop from that point until the test timer expires.
Netperf will not really ``know'' this has happened - the only symptom
will be a low transaction per second rate.

The netperf side of a UDP_RR test will call @code{connect()} on its
data socket and thenceforth use the @code{send()} and @code{recv()}
socket calls.  The netserver side of a UDP_RR test will not call
@code{connect()} and will use @code{recvfrom()} and @code{sendto()}
calls.  This means that even if the request and response sizes are the
same, a UDP_RR test is _not_ symmetric in its loading of the two
systems under test.

Here is an example of a UDP_RR test between two otherwise
identical two-CPU systems joined via a 1 Gigabit Ethernet network:

@example
$ netperf -T 1 -H 192.168.1.213 -t UDP_RR -c -C
UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.213 (192.168.1.213) port 0 AF_INET
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size    Size   Time    Rate     local  remote local   remote
bytes  bytes  bytes   bytes  secs.   per sec  % I    % I    us/Tr   us/Tr

65535  65535  1       1      10.01   15262.48   13.90  16.11  18.221  21.116
65535  65535 
@end example

This example includes the @option{-c} and @option{-C} options to
enable CPU utilization reporting and shows the asymmetry in CPU
loading.  The @option{-T} option was used to make sure netperf and
netserver ran on a given CPU and did not move around during the test.
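
The service demand columns appear consistent with CPU time consumed
per transaction, summed over all CPUs.  The sketch below checks that
reading against the output above.  The formula is an assumption
inferred from the reported numbers rather than something stated in
this section, and small differences are expected from rounding in the
displayed values.

@example
num_cpus=2             # both systems have two CPUs
rate_x100=1526248      # 15262.48 transactions/s, scaled by 100
local_cpu_x100=1390    # 13.90 percent, scaled by 100
remote_cpu_x100=1611   # 16.11 percent, scaled by 100

# ns of CPU per transaction = utilization * CPUs * 10^9 / rate,
# with the scale factors folded into the constant 10^7.
local_ns=$((local_cpu_x100 * num_cpus * 10000000 / rate_x100))
remote_ns=$((remote_cpu_x100 * num_cpus * 10000000 / rate_x100))

echo "local: about $local_ns ns/Tr, remote: about $remote_ns ns/Tr"
@end example

The results, roughly 18.2 and 21.1 microseconds per transaction, are
close to the reported 18.221 and 21.116.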

@node XTI_TCP_RR, XTI_TCP_CC, UDP_RR, Options Common to TCP UDP and SCTP _RR tests
@cindex Latency, Request-Response
@subsection XTI_TCP_RR

An XTI_TCP_RR test is essentially the same as a @ref{TCP_RR} test only
using the XTI rather than BSD Sockets interface. It is requested by
passing a value of ``XTI_TCP_RR'' to the @option{-t} global
command-line option.

The test-specific options for an XTI_TCP_RR test are the same as those
for a TCP_RR test with the addition of the @option{-X <devspec>} option to
specify the names of the local and/or remote XTI device file(s).

@node XTI_TCP_CC, XTI_TCP_CRR, XTI_TCP_RR, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@cindex Latency, Connection Establishment
@subsection XTI_TCP_CC

@node XTI_TCP_CRR, XTI_UDP_RR, XTI_TCP_CC, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@cindex Latency, Connection Establishment
@cindex Latency, Request-Response
@subsection XTI_TCP_CRR

@node XTI_UDP_RR, DLCL_RR, XTI_TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
@cindex Latency, Request-Response
@subsection XTI_UDP_RR

An XTI_UDP_RR test is essentially the same as a UDP_RR test only using
the XTI rather than BSD Sockets interface.  It is requested by passing
a value of ``XTI_UDP_RR'' to the @option{-t} global command-line
option.

The test-specific options for an XTI_UDP_RR test are the same as those
for a UDP_RR test with the addition of the @option{-X <devspec>}
option to specify the name of the local and/or remote XTI device
file(s).

@node DLCL_RR, DLCO_RR, XTI_UDP_RR, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@cindex Latency, Request-Response
@subsection DLCL_RR

@node DLCO_RR, SCTP_RR, DLCL_RR, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@cindex Latency, Request-Response
@subsection DLCO_RR

@node SCTP_RR,  , DLCO_RR, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@cindex Latency, Request-Response
@subsection SCTP_RR

@node Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Request/Response , Top
@comment  node-name,  next,  previous,  up
@cindex Aggregate Performance
@vindex --enable-burst, Configure
@chapter Using Netperf to Measure Aggregate Performance

@ref{Netperf4,Netperf4} is the preferred benchmark to use when one
wants to measure aggregate performance because netperf has no support
for explicit synchronization of concurrent tests.

Basically, there are two ways to measure aggregate performance with
netperf.  The first is to run multiple, concurrent netperf tests; it
can be applied to any of the netperf tests.  The second is to
configure netperf with @code{--enable-burst}; it is applicable to the
TCP_RR and UDP_RR tests.

@menu
* Running Concurrent Netperf Tests::  
* Using --enable-burst::        
@end menu

@node  Running Concurrent Netperf Tests, Using --enable-burst, Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Aggregate Performance
@comment  node-name,  next,  previous,  up
@section Running Concurrent Netperf Tests

@ref{Netperf4,Netperf4} is the preferred benchmark to use when one
wants to measure aggregate performance because netperf has no support
for explicit synchronization of concurrent tests.  This leaves
netperf2 results vulnerable to @dfn{skew} errors.

However, since there are times when netperf4 is unavailable it may be
necessary to run netperf. The skew error can be minimized by making
use of the confidence interval functionality.  Then one simply
launches multiple tests from the shell using a @code{for} loop or the
like:

@example
for i in 1 2 3 4
do
netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 &
done
@end example

which will run four, concurrent @ref{TCP_STREAM,TCP_STREAM} tests from
the system on which it is executed to tardy.cup.hp.com.  Each
concurrent netperf will iterate 10 times thanks to the @option{-i}
option and will omit the test banners (option @option{-P}) for
brevity.  The output looks something like this:

@example
 87380  16384  16384    10.03     235.15   
 87380  16384  16384    10.03     235.09   
 87380  16384  16384    10.03     235.38   
 87380  16384  16384    10.03     233.96
@end example

We can take the sum of the results and be reasonably confident that
the aggregate performance was 940 Mbits/s.
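
The sum need not be taken by hand.  The following is a minimal sketch,
assuming the four result lines above were saved to a file, here given
the hypothetical name @file{results.txt}, and that the throughput is
the fifth column as in the output shown:

```shell
# Hypothetical file holding the four result lines shown above.
cat > results.txt <<'EOF'
 87380  16384  16384    10.03     235.15
 87380  16384  16384    10.03     235.09
 87380  16384  16384    10.03     235.38
 87380  16384  16384    10.03     233.96
EOF

# Sum the fifth column (throughput in 10^6 bits/s) across all lines.
total=$(awk '{ sum += $5 } END { printf "%.2f", sum }' results.txt)
echo "aggregate throughput: $total"
```

The sum of 939.58 is where the figure of roughly 940 Mbits/s comes
from.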

If you see warnings about netperf not achieving the confidence
intervals, the best thing to do is to increase the number of
iterations with @option{-i} and/or increase the run length of each
iteration with @option{-l}.

You can also enable local (@option{-c}) and/or remote (@option{-C})
CPU utilization:

@example
for i in 1 2 3 4
do
netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 -c -C &
done

87380  16384  16384    10.03       235.47   3.67     5.09     10.226  14.180 
87380  16384  16384    10.03       234.73   3.67     5.09     10.260  14.225 
87380  16384  16384    10.03       234.64   3.67     5.10     10.263  14.231 
87380  16384  16384    10.03       234.87   3.67     5.09     10.253  14.215
@end example

If the CPU utilizations reported for the same system are the same or
very close you can be reasonably confident that skew error is
minimized.  Presumably one could then omit @option{-i} but that is
not advised, particularly when/if the CPU utilization approaches 100
percent.  In the example above we see that the CPU utilization on the
local system remains the same for all four tests, and is only off by
0.01 out of 5.09 on the remote system.

@quotation
@b{NOTE: It is very important to remember that netperf is calculating
system-wide CPU utilization.  When calculating the service demand
(those last two columns in the output above) each netperf assumes it
is the only thing running on the system.  This means that for
concurrent tests the service demands reported by netperf will be
wrong.  One has to compute service demands for concurrent tests by
hand.}
@end quotation
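
As a rough sketch of such a by-hand calculation, one can take the
system-wide CPU utilization, scale it by the number of CPUs, and
divide by the aggregate throughput of all the concurrent tests.  The
sketch below uses the local-side figures from the example above.  The
count of 8 CPUs and the use of 1024 bytes per KB are assumptions: the
text does not state them, they are simply values which approximately
reproduce the per-test service demands shown in the output.

```shell
# Local-side figures from the four concurrent tests above.
throughputs="235.47 234.73 234.64 234.87"   # 10^6 bits/s, one per netperf
cpu_util=3.67                               # system-wide percent, local side
num_cpus=8                                  # ASSUMPTION: not stated in the text

# Service demand in microseconds of CPU per KB transferred, assuming
# 1 KB = 1024 bytes: CPU-microseconds consumed per second divided by
# the aggregate KB transferred per second.
service_demand=$(echo "$throughputs" | awk -v util="$cpu_util" -v cpus="$num_cpus" '
  { for (i = 1; i <= NF; i++) mbits += $i }
  END {
    kbytes_per_sec = mbits * 1000000 / 8 / 1024
    cpu_usec_per_sec = util / 100 * cpus * 1000000
    printf "%.3f", cpu_usec_per_sec / kbytes_per_sec
  }')
echo "aggregate local service demand: $service_demand usec/KB"
```

Under those assumptions the result is a little under 2.6 usec/KB,
about one quarter of the figure each individual netperf reported,
which is what one would expect when four tests share the same CPU
utilization.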

If you wish you can add a unique, global @option{-B} option to each
command line to append the given string to the output:

@example
for i in 1 2 3 4
do
netperf -t TCP_STREAM -H tardy.cup.hp.com -B "this is test $i" -i 10 -P 0 &
done

87380  16384  16384    10.03     234.90   this is test 4
87380  16384  16384    10.03     234.41   this is test 2
87380  16384  16384    10.03     235.26   this is test 1
87380  16384  16384    10.03     235.09   this is test 3
@end example

You will notice that the tests completed in an order other than the
one in which they were started from the shell.  This underscores why
there is a threat of skew error and why netperf4 is the preferred tool
for aggregate
tests.  Even if you see the Netperf Contributing Editor acting to the
contrary!-)

@node  Using --enable-burst,  , Running Concurrent Netperf Tests, Using Netperf to Measure Aggregate Performance
@comment  node-name,  next,  previous,  up
@section Using --enable-burst

If one configures netperf with @code{--enable-burst}:

@example
configure --enable-burst
@end example

then a test-specific @option{-b num} option is added to the
@ref{TCP_RR,TCP_RR} and @ref{UDP_RR,UDP_RR} tests. This option causes
TCP_RR and UDP_RR to quickly work their way up to having at least
@option{num} transactions in flight at one time.

This is used as an alternative to or even in conjunction with
multiple-concurrent _RR tests.  When run with just a single instance
of netperf, increasing the burst size can determine the maximum number
of transactions per second that can be serviced by a single process:

@example
for b in 0 1 2 4 8 16 32
do 
 netperf -v 0 -t TCP_RR -B "-b $b" -H hpcpc108 -P 0 -- -b $b
done

9457.59 -b 0
9975.37 -b 1
10000.61 -b 2
20084.47 -b 4
29965.31 -b 8
71929.27 -b 16
109718.17 -b 32
@end example

The global @option{-v} and @option{-P} options were used to minimize
the output to the single figure of merit which in this case is the
transaction rate.  The global @option{-B} option was used to more
clearly label the output, and the test-specific @option{-b} option
enabled by @code{--enable-burst} set the number of transactions in
flight at one time.

Now, since the test-specific @option{-D} option was not specified to
set TCP_NODELAY, the stack was free to ``bundle'' requests and/or
responses into TCP segments as it saw fit, and since the default
request and response size is one byte, there could have been some
considerable bundling.  If one wants to try to achieve a closer to
one-to-one correspondence between a request and response and a TCP
segment, add the test-specific @option{-D} option:

@example
for b in 0 1 2 4 8 16 32
do
 netperf -v 0 -t TCP_RR -B "-b $b -D" -H hpcpc108 -P 0 -- -b $b -D
done

 8695.12 -b 0 -D
 19966.48 -b 1 -D
 20691.07 -b 2 -D
 49893.58 -b 4 -D
 62057.31 -b 8 -D
 108416.88 -b 16 -D
 114411.66 -b 32 -D
@end example

You can see that this has a rather large effect on the reported
transaction rate.  In this particular instance, the author believes it
relates to interactions between the test and interrupt coalescing
settings in the driver for the NICs used.

@quotation
@b{NOTE: Even if you set the @option{-D} option, that is still not a
guarantee that each transaction is in its own TCP segment.  You
should get into the habit of verifying the relationship between the
transaction rate and the packet rate via other means.}
@end quotation

You can also combine @code{--enable-burst} functionality with
concurrent netperf tests.  This would then be an ``aggregate of
aggregates'' if you like:

@example

for i in 1 2 3 4
do
 netperf -H hpcpc108 -v 0 -P 0 -i 10 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done

 46668.38 aggregate 4 -b 8 -D
 44890.64 aggregate 2 -b 8 -D
 45702.04 aggregate 1 -b 8 -D
 46352.48 aggregate 3 -b 8 -D

@end example

Since each netperf did hit the confidence intervals, we can be
reasonably certain that the aggregate transaction per second rate was
the sum of all four concurrent tests, or something just shy of 184,000
transactions per second.  To get some idea if that was also the packet
per second rate, we could bracket that @code{for} loop with something
to gather statistics and run the results through
@uref{ftp://ftp.cup.hp.com/dist/networking/tools,beforeafter}:

@example
/usr/sbin/ethtool -S eth2 > before
for i in 1 2 3 4
do
 netperf -H 192.168.2.108 -l 60 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done
wait
/usr/sbin/ethtool -S eth2 > after

 52312.62 aggregate 2 -b 8 -D
 50105.65 aggregate 4 -b 8 -D
 50890.82 aggregate 1 -b 8 -D
 50869.20 aggregate 3 -b 8 -D

beforeafter before after > delta

grep packets delta
     rx_packets: 12251544
     tx_packets: 12251550

@end example

This example uses @code{ethtool} because the system being used is
running Linux.  Other platforms have other tools - for example HP-UX
has lanadmin:

@example
lanadmin -g mibstats <ppa>
@end example

and of course one could instead use @code{netstat}.

The @code{wait} is important because we are launching concurrent
netperfs in the background.  Without it, the second ethtool command
would be run before the tests finished and perhaps even before the
last of them got started!

The sum of the reported transaction rates is roughly 204178
transactions per second, which over 60 seconds is a total of 12250680
transactions.  Each transaction is the exchange of a request and a
response, so we multiply that by 2 to arrive at 24501360 packets.

The sum of the ethtool stats is 24503094 packets which matches what
netperf was reporting very well. 
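
The comparison can be scripted as well.  This sketch simply repeats
the arithmetic above with the figures from the example; the variable
names are illustrative only.

```shell
# Transaction rates reported by the four concurrent netperfs, and the
# packet counts from the beforeafter delta of the ethtool statistics.
rates="52312.62 50105.65 50890.82 50869.20"
test_length=60
rx_packets=12251544
tx_packets=12251550

# Each transaction is one request plus one response, so the expected
# packet count is rate * time * 2.
expected=$(echo "$rates" | awk -v secs="$test_length" '
  { for (i = 1; i <= NF; i++) sum += $i }
  END { printf "%d", sum * secs * 2 }')
observed=$((rx_packets + tx_packets))

echo "expected packets: $expected"
echo "observed packets: $observed"
```

Because the sketch keeps the fractional part of each rate, it arrives
at an expected count slightly higher than the 24501360 above.  Either
way the difference from the observed count is less than 0.01 percent.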

Had the request or response size differed, we would need to know how
it compared with the @dfn{MSS} for the connection.

Just for grins, here is the exercise repeated, using @code{netstat}
instead of @code{ethtool}:

@example
netstat -s -t > before
for i in 1 2 3 4
do
 netperf -l 60 -H 192.168.2.108 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done
wait
netstat -s -t > after

 51305.88 aggregate 4 -b 8 -D
 51847.73 aggregate 2 -b 8 -D
 50648.19 aggregate 3 -b 8 -D
 53605.86 aggregate 1 -b 8 -D

beforeafter before after > delta

grep segments delta
    12445708 segments received
    12445730 segments send out
    1 segments retransmited
    0 bad segments received.
@end example

The sums are left as an exercise to the reader :)

Things become considerably more complicated if there are non-trivial
packet losses and/or retransmissions.

Of course all this checking is unnecessary if the test is a UDP_RR
test because UDP ``never'' aggregates multiple sends into the same UDP
datagram, and there are no ACKnowledgements in UDP.  The loss of a
single request or response will not bring a ``burst'' UDP_RR test to a
screeching halt, but it will reduce the number of transactions
outstanding at any one time.  A ``burst'' UDP_RR test @b{will} come to a
halt if the sum of the lost requests and responses reaches the value
specified in the test-specific @option{-b} option.


@node Using Netperf to Measure Bidirectional Transfer, Other Netperf Tests, Using Netperf to Measure Aggregate Performance, Top
@comment  node-name,  next,  previous,  up
@chapter Using Netperf to Measure Bidirectional Transfer

There are two ways to use netperf to measure the performance of
bidirectional transfer.  The first is to run concurrent netperf tests
from the command line.  The second is to configure netperf with
@code{--enable-burst} and use a single instance of the
@ref{TCP_RR,TCP_RR} test.

While neither method is more ``correct'' than the other, each measures
bidirectional transfer in a different way, and that has possible
implications.  For
instance, using the concurrent netperf test mechanism means that
multiple TCP connections and multiple processes are involved, whereas
using the single instance of TCP_RR there is only one TCP connection
and one process on each end.  They may behave differently, especially
on an MP system.

@menu
* Bidirectional Transfer with Concurrent Tests::  
* Bidirectional Transfer with TCP_RR::  
@end menu

@node  Bidirectional Transfer with Concurrent Tests, Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Bidirectional Transfer
@comment  node-name,  next,  previous,  up
@section Bidirectional Transfer with Concurrent Tests

If we had two hosts Fred and Ethel, we could simply run a netperf
@ref{TCP_STREAM,TCP_STREAM} test on Fred pointing at Ethel, and a
concurrent netperf TCP_STREAM test on Ethel pointing at Fred, but
since there are no mechanisms to synchronize netperf tests and we
would be starting tests from two different systems, there is a
considerable risk of skew error.

Far better would be to run simultaneous TCP_STREAM and
@ref{TCP_MAERTS,TCP_MAERTS} tests from just @b{one} system, using the
concepts and procedures outlined in @ref{Running Concurrent Netperf
Tests,Running Concurrent Netperf Tests}. Here then is an example:

@example
for i in 1
do
 netperf -H 192.168.2.108 -t TCP_STREAM -B "outbound" -i 10 -P 0 -v 0 -- -s 256K -S 256K &
 netperf -H 192.168.2.108 -t TCP_MAERTS -B "inbound"  -i 10 -P 0 -v 0 -- -s 256K -S 256K &
done

 892.66 outbound
 891.34 inbound

@end example

We have used a @code{for} loop in the shell with just one iteration
because that makes it @b{much} easier to get both tests started at more or
less the same time than doing it by hand.  The global @option{-P} and
@option{-v} options are used because we aren't interested in anything
other than the throughput, and the global @option{-B} option is used
to tag each output so we know which was inbound and which outbound
relative to the system on which we were running netperf.  Of course
that sense is switched on the system running netserver :)  The use of
the global @option{-i} option is explained in @ref{Running Concurrent
Netperf Tests,Running Concurrent Netperf Tests}.

@node  Bidirectional Transfer with TCP_RR,  , Bidirectional Transfer with Concurrent Tests, Using Netperf to Measure Bidirectional Transfer
@comment  node-name,  next,  previous,  up
@section Bidirectional Transfer with TCP_RR

If one configures netperf with @code{--enable-burst} then one can use
the test-specific @option{-b} option to increase the number of
transactions in flight at one time.  If one also uses the
test-specific @option{-r} option to make those transactions larger,
the test starts to look more like a bidirectional transfer than a
request/response test.

Now, the logic behind @code{--enable-burst} is very simple, and there
are no calls to @code{poll()} or @code{select()} which means we want
to make sure that the @code{send()} calls will never block, or we run
the risk of deadlock with each side stuck trying to call @code{send()}
and neither calling @code{recv()}.  

Fortunately, this is easily accomplished by setting a ``large enough''
socket buffer size with the test-specific @option{-s} and @option{-S}
options.  Presently this must be performed by the user.  Future
versions of netperf might attempt to do this automagically, but there
are some issues to be worked-out. 

Here then is an example of a bidirectional transfer test using
@code{--enable-burst} and the @ref{TCP_RR,TCP_RR} test:

@example
netperf -t TCP_RR -H hpcpc108 -- -b 6 -r 32K -s 256K -S 256K
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to hpcpc108.cup.hp.com (16.89.84.108) port 0 AF_INET : first burst 6
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate         
bytes  Bytes  bytes    bytes   secs.    per sec   

524288 524288 32768    32768   10.01    3525.97   
524288 524288

@end example

Now, at present netperf does not include a bit or byte rate in the
output of an _RR test which means we must calculate it ourselves. Each
transaction is the exchange of 32768 bytes of request and 32768 bytes
of response, or 65536 bytes.  Multiply that by 8 and we arrive at
524288 bits per transaction.  Multiply that by 3525.97 and we arrive
at 1848623759 bits per second.  Since things were uniform, we can
divide that by two and arrive at roughly 924311879 bits per second
each way.  That corresponds to ``link-rate'' for 1 Gigabit Ethernet,
which happens to be the type of network used in the example.

A future version of netperf may perform the calculation on behalf of
the user, but it would likely not emit it unless the user specified a
verbosity of 2 or more with the global @option{-v} option.
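
Until then, the calculation described above can be sketched in the
shell.  The values are taken from the example output; the variable
names are illustrative only.

```shell
# Figures from the TCP_RR example above.
request_size=32768     # bytes
response_size=32768    # bytes
trans_rate=3525.97     # transactions per second

# Bits moved per transaction, in both directions combined.
bits_per_trans=$(( (request_size + response_size) * 8 ))

# Total bits per second, and the per-direction rate.  Halving is only
# valid because the request and response sizes are the same.  The
# results are truncated to whole bits to match the figures in the text.
total_bps=$(awk -v b="$bits_per_trans" -v r="$trans_rate" 'BEGIN { printf "%d", b * r }')
each_way_bps=$(awk -v b="$bits_per_trans" -v r="$trans_rate" 'BEGIN { printf "%d", b * r / 2 }')

echo "bits per transaction: $bits_per_trans"
echo "total bits/s:         $total_bps"
echo "each way bits/s:      $each_way_bps"
```

If the request and response sizes differ, each direction has to be
computed separately from its own size rather than by halving the
total.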

@node Other Netperf Tests, Address Resolution, Using Netperf to Measure Bidirectional Transfer, Top
@chapter Other Netperf Tests

Apart from the typical performance tests, netperf contains some tests
which can be used to streamline measurements and reporting.  These
include CPU rate calibration (present) and host identification (future
enhancement).

@menu
* CPU rate calibration::        
@end menu

@node CPU rate calibration,  , Other Netperf Tests, Other Netperf Tests
@section CPU rate calibration

Some of the CPU utilization measurement mechanisms of netperf work by
comparing the rate at which some counter increments when the system is
idle with the rate at which that same counter increments when the
system is running a netperf test.  The ratio of those rates is used to
arrive at a CPU utilization percentage.

This means that netperf must know the rate at which the counter
increments when the system is presumed to be ``idle.''  If it does not
know the rate, netperf will measure it before starting a data transfer
test.  This calibration step takes 40 seconds for each of the local or
remote systems, and if repeated for each netperf test would make taking
repeated measurements rather slow.

Thus, the netperf CPU utilization options @option{-c} and
@option{-C} can take an optional calibration value.  This value is
used as the ``idle rate'' and the calibration step is not
performed. To determine the idle rate, netperf can be used to run
special tests which only report the value of the calibration - they
are the LOC_CPU and REM_CPU tests.  These return the calibration value
for the local and remote system respectively.  A common way to use
these tests is to store their results into an environment variable and
use that in subsequent netperf commands:

@example
LOC_RATE=`netperf -t LOC_CPU`
REM_RATE=`netperf -H <remote> -t REM_CPU`
netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
...
netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
@end example

If you are going to use netperf to measure aggregate results, it is
important to use the LOC_CPU and REM_CPU tests to get the calibration
values first to avoid issues with some of the aggregate netperf tests
transferring data while others are ``idle'' and getting bogus
calibration values.  When running aggregate tests, it is very
important to remember that any one instance of netperf does not know
about the other instances of netperf.  It will report global CPU
utilization and will calculate service demand believing it was the
only thing causing that CPU utilization.  So, you can use the CPU
utilization reported by netperf in an aggregate test, but you have to
calculate service demands by hand.

@node Address Resolution, Enhancing Netperf, Other Netperf Tests, Top
@comment  node-name,  next,  previous,  up
@chapter Address Resolution

Netperf versions 2.4.0 and later have merged IPv4 and IPv6 tests so
the functionality of the tests in @file{src/nettest_ipv6.c} has been
subsumed into the tests in @file{src/nettest_bsd.c}.  This has been
accomplished in part by switching from @code{gethostbyname()} to
@code{getaddrinfo()} exclusively.  While it was theoretically possible
to get multiple results for a hostname from @code{gethostbyname()} it
was generally unlikely and netperf's ignoring of the second and later
results was not much of an issue.

Now with @code{getaddrinfo()} and particularly with AF_UNSPEC it is
increasingly likely that a given hostname will have multiple
associated addresses.  The @code{establish_control()} routine of
@file{src/netlib.c} will indeed attempt to choose from among all the
matching IP addresses when establishing the control connection.
Netperf does not _really_ care if the control connection is IPv4 or
IPv6 or even mixed on either end.

However, the individual tests still ass-u-me that the first result in
the address list is the one to be used.  Whether or not this will
turn-out to be an issue has yet to be determined.

If you do run into problems with this, the easiest workaround is to
specify IP addresses for the data connection explicitly in the
test-specific @option{-H} and @option{-L} options.  At some point, the
netperf tests _may_ try to be more sophisticated in their parsing of
returns from @code{getaddrinfo()} - straw-man patches to
@email{netperf-feedback@@netperf.org} would of course be most welcome
:)

Netperf has leveraged code from other open-source projects with
amenable licensing to provide a replacement @code{getaddrinfo()} call
on those platforms where the @command{configure} script believes there
is no native getaddrinfo call.  As of this writing, the replacement
@code{getaddrinfo()} has been tested on HP-UX 11.0 and then presumed to
run elsewhere.

@node Enhancing Netperf, Netperf4, Address Resolution, Top
@comment  node-name,  next,  previous,  up
@chapter Enhancing Netperf

Netperf is constantly evolving.  If you find you want to make
enhancements to netperf, by all means do so.  If you wish to add a new
``suite'' of tests to netperf the general idea is to

@enumerate
@item
Add files @file{src/nettest_mumble.c} and @file{src/nettest_mumble.h}
where mumble is replaced with something meaningful for the test-suite.
@item
Add support for an appropriate @option{--enable-mumble} option in
@file{configure.ac}.
@item
Edit @file{src/netperf.c}, @file{netsh.c}, and @file{netserver.c} as
required, using @code{#ifdef WANT_MUMBLE}.
@item
Compile and test
@end enumerate

If you wish to submit your changes for possible inclusion into the
mainline sources, please try to base your changes on the latest
available sources (@pxref{Getting Netperf Bits}) and then send email
describing the changes at a high level to
@email{netperf-feedback@@netperf.org} or perhaps
@email{netperf-talk@@netperf.org}.  If the consensus is positive, then
sending context @command{diff} results to
@email{netperf-feedback@@netperf.org} is the next step.  From that
point, it is a matter of pestering the Netperf Contributing Editor
until he gets the changes incorporated :)

@node  Netperf4, Concept Index, Enhancing Netperf, Top
@comment  node-name,  next,  previous,  up
@chapter Netperf4

Netperf4 is the shorthand name given to version 4.X.X of netperf.
This is really a separate benchmark more than a newer version of
netperf, but it is a descendant of netperf so the netperf name is
kept.  The facetious way to describe netperf4 is to say it is the
egg-laying-woolly-milk-pig version of netperf :)  The more respectful
way to describe it is to say it is the version of netperf with support
for synchronized, multiple-thread, multiple-test, multiple-system,
network-oriented benchmarking.

Netperf4 is still undergoing rapid evolution. Those wishing to work
with or on netperf4 are encouraged to join the
@uref{http://www.netperf.org/cgi-bin/mailman/listinfo/netperf-dev,netperf-dev}
mailing list and/or peruse the
@uref{http://www.netperf.org/svn/netperf4/trunk,current sources}.

@node Concept Index, Option Index, Netperf4, Top
@unnumbered Concept Index

@printindex cp

@node Option Index,  , Concept Index, Top
@comment  node-name,  next,  previous,  up
@unnumbered Option Index

@printindex vr
@bye                                      

@c  LocalWords:  texinfo setfilename settitle titlepage vskip pt filll ifnottex
@c  LocalWords:  insertcopying cindex dfn uref printindex cp