1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195
|
= Rate Management and the Kiss-o'-Death Packet
include::include-html.ad[]
[cols="10%,90%",frame="none",grid="none",style="verse"]
|==============================
|image:pic/boom4.gif[]|
{millshome}pictures.html[from 'Pogo', Walt Kelly]
Our junior managers and the administrators.
|==============================
== Related Links
include::includes/hand.adoc[]
== Table of Contents
* link:#intro[Introduction]
* link:#guard[Minimum Headway Time]
* link:#mah[Minimum Average Headway Time]
* link:#kiss[The Kiss-o'-Death Packet]
* link:#ref[References]
'''''
[[intro]]
== Introduction
This page describes the various rate management provisions in NTPv4.
Some national time metrology laboratories, including NIST and USNO, use
the NTP reference implementation in their very busy public time servers.
They operate multiple servers behind load-balancing devices to support
aggregate rates up to ten thousand packets per second. The servers need
to defend themselves against all manner of broken client implementations
that can clog the server and network infrastructure. On the other hand,
friendly clients need to avoid configurations that can result in
unfriendly behavior.
A review of past client abuse incidence shows the most frequent scenario
is a broken client that attempts to send packets at rates of one per
second or more. On one occasion due to a defective client design
footnote:[1. Mills, D.L., J. Levine, R. Schmidt and D. Plonka. Coping with
overload on the Network Time Protocol public servers. _Proc. Precision
Time and Time Interval (PTTI) Applications and Planning Meeting_
(Washington DC, December 2004), 5-16. Paper:
{millshome}database/papers/ptti/ptti04a.pdf[PDF],
Slides: {millshome}database/brief/ptti/ptti04.pdf[PDF] |
{millshome}database/brief/ptti/ptti04.ppt[PowerPoint]]
over 750,000 clients demonstrated this abuse. There have been occasions
where this abuse has persisted for days at a time. These scenarios are
the most damaging, as they can threaten not only the victim server but
the network infrastructure as well.
There are several features in the reference implementation designed to
defend the servers and network against accidental or intentional flood
attack. Other features are used to insure that the client is a good
citizen, even if configured in unfriendly ways. The ground rules are:
* Send at the lowest rate consistent with the expected accuracy
requirements.
* Maintain strict guard time and minimum average headway time, even if
multiple burst options are operating.
* When the first packet of a burst is sent to a server, do not send
further packets until the first packet has been received from the
server.
* Upon receiving a Kiss-o'-Death packet (KoD, see below), immediately
reduce the sending rate.
Rate management involves four algorithms to manage resources: (1) poll
rate control, (2) burst control, (3) average headway time and (4) guard
time. The first two algorithms are described on the link:poll.html[Poll
Program] page; the remaining two are described in following sections.
[[guard]]
== Minimum Headway Time
The headway is defined for each source as the interval between the last
packet sent or received and the next packet for that source. The minimum
receive headway is defined as the guard time. In the reference
implementation, if the receive headway is less than the guard time, the
packet is discarded. The guard time defaults to 2 s, but this can be
changed using the +minimum+ option of the
link:accopt.html#discard[+discard+] command. By design, the minimum
interval between +burst+ and +iburst+ packets sent by any client is 2 s,
which does not violate this constraint. Packets sent by other
implementations that violate this constraint will be dropped and a KoD
packet returned, if enabled.
[[mah]]
== Minimum Average Headway Time
There are two features in the reference implementation to manage the
minimum average headway time between one packet and the next, and thus
the maximum average rate for each source. The transmit throttle limits
the rate for transmit packets, while the receive discard limits the rate
for receive packets. These features make use of a pair of counters: a
client output counter for each association and a server input counter
for each distinct client IP address. For each packet received, the input
counter increments by a value equal to the minimum average headway (MAH)
and then decrements by one each second. For each packet transmitted, the
output counter increments by the MAH and then decrements by one each
second. The default MAH is 8 s, but this can be changed using the
+average+ option of the link:accopt.html#discard[+discard+] command.
If the +iburst+ or +burst+ options are present, the poll algorithm sends
a burst of packets instead of a single packet at each poll opportunity.
The NTPv4 specification requires that bursts contain no more than eight
packets. Starting from an output counter value of zero, the maximum
counter value, called the ceiling, can be no more than eight times the
MAH. However, if the burst starts with a counter value other than zero,
there is a potential to exceed the ceiling; this can result from
protocol restarts. In these cases, the
poll algorithm throttles the output rate by computing an additional
headway time so that the next packet sent will not exceed the ceiling.
Designs such as this are often called leaky buckets.
The reference implementation uses a special most-recently used (MRU)
list of entries, one entry for each distinct client IP address found.
Each entry includes the IP address, input counter and process time at
the last packet arrival. As each packet arrives, the IP source address
is compared to the IP address in each entry in turn. If a match is found
the entry is removed and inserted first on the list. If the IP source
address does not match any entry, a new entry is created and inserted
first, possibly discarding the last entry if the list is full. Observers
will note this is the same algorithm used for page replacement in
virtual memory systems. However, in the virtual memory algorithm the
entry of interest is the last, whereas here the entry of interest is the
first.
The input counter for the first entry on the MRU list, representing the
current input packet, is decreased by the interval since the entry was
last referenced, but not below zero. If the input counter is greater
than the ceiling, the packet is discarded. Otherwise, the counter is
increased by the MAH and the packet is processed. The result is, if the
client maintains an average headway greater than the ceiling and
transmits no more than eight packets in a burst, the input counter will
not exceed the ceiling. Packets sent by other implementations that
violate this constraint will be dropped and a KoD packet returned, if
enabled.
The reference implementation has a maximum MRU list size of a few
hundred entries. The national time servers operated by NIST and USNO
have an aggregate packet rate in the thousands of packets per second
from many thousands of customers. Under these conditions, the list
overflows after only a few seconds of traffic. However, analysis shows
that the vast majority of the abusive traffic is due to a tiny minority
of the customers, some of which send at over one packet per second. This
means that the few seconds retained on the list is sufficient to
identify and discard by far the majority of the abusive traffic.
[[kiss]]
== The Kiss-of-Death Packet
Ordinarily, packets denied service are simply dropped with no further
action except incrementing statistics counters. Sometimes a more
proactive response is needed to cause the client to slow down. A special
packet has been created for this purpose called the kiss-o'-death (KoD)
packet. KoD packets have leap indicator 3, stratum 0 and the reference
identifier set to a four-octet ASCII code. At present, only one code
+RATE+ is sent by the server if the +limited+ and +kod+ flags of the
link:accopt.html#restrict[+restrict+] command are present and either the
guard time or MAH time are violated.
A client receiving a KoD packet is expected to slow down; however, no
explicit mechanism is specified in the protocol to do this. In the
reference implementation, the server sets the poll field of the KoD
packet to the greater of (a) the server MAH and (b) client packet poll
field. In response to the KoD packet, the client sets the peer poll
interval to the maximum of (a) the client MAH and (b) the server packet
poll field. This automatically increases the headway for following
client packets.
In order to make sure the client notices the KoD packet, the server sets
the receive and transmit timestamps to the transmit timestamp of the
client packet. Thus, even if the client ignores all except the
timestamps, it cannot do any useful time computations. KoD packets
themselves are rate limited to no more than one packet per guard time,
in order to defend against flood attacks.
[[ref]]
== References
1. Mills, D.L., J. Levine, R. Schmidt and D. Plonka. Coping with
overload on the Network Time Protocol public servers. _Proc. Precision
Time and Time Interval (PTTI) Applications and Planning Meeting_
(Washington DC, December 2004), 5-16. Paper:
{millshome}database/papers/ptti/ptti04a.pdf[PDF],
Slides: {millshome}database/brief/ptti/ptti04.pdf[PDF]
|
{millshome}database/brief/ptti/ptti04.ppt[PowerPoint]
'''''
include::includes/footer.adoc[]
|