1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173
|
= Clock State Machine
include::include-html.ad[]
== Table of Contents
* link:#intro[General Overview]
* link:#panic[Panic Threshold]
* link:#step[Step and Stepout Thresholds]
* link:#hold[Hold Timer]
* link:#inter[Operating Intervals]
* link:#state[State Transition Function]
'''''
[[intro]]
== General Overview
In the NTPv4 specification and reference implementation a state machine
is used to manage the system clock under exceptional conditions, as when
the daemon is first started or when encountering severe network
congestion. This page describes the design and operation of the state
machine in detail.
The state machine is activated upon receipt of an update by the clock
discipline algorithm. Its primary purpose is to determine whether the
clock is slewed or stepped and how the initial time and frequency are
determined using three thresholds: _panic_, _step_ and _stepout_, and
one timer: _hold_.
[[panic]]
== Panic Threshold
Most computers today incorporate a real-time-clock (RTC) chip (sometimes
referred to as a "time-of-year" (TOY) chip in the past) to maintain
the time when the power is off. When the computer is restarted, the chip
is used to initialize the operating system time. In case there is no RTC
chip or the RTC time is different from NTP time by more than the panic
threshold, the daemon assumes something must be terribly wrong, so
exits with a message to the system operator to set the time manually.
With the +-g+ option on the command line, the daemon sets the clock to
NTP time at the first update, but exits if the offset exceeds the panic
threshold at subsequent updates. The panic threshold default is 1000 s,
but it can be changed with the +panic+ option of the
link:miscopt.html#tinker[+tinker+] command.
[[step]]
== Step and Stepout Thresholds
Under ordinary conditions, the clock discipline gradually slews the
clock to the correct time, so that the time is effectively continuous
and never stepped forward or backward. If, due to extreme network
congestion, an offset spike exceeds the step threshold, by default 128
ms, the spike is discarded. However, if offset spikes greater than the
step threshold persist for an interval more than the stepout threshold,
by default 300 s, the system clock is stepped to the correct time.
In practice, the need for a step has been extremely rare and almost
always the result of a hardware failure or operator error. The step
threshold and stepout threshold can be changed using the +step+ and
+stepout+ options of the link:miscopt.html#tinker[+tinker+] command,
respectively. If the step threshold is set to zero, the step function is
entirely disabled and the clock is always slewed. The daemon sets the
step threshold to 600 s using the +-x+ option on the command line. If
the +-g+ option is used or the step threshold is set greater than 0.5 s,
the precision time kernel support is disabled.
Historically, the most important application of the step function was
when a leap second was inserted in the Coordinated Universal Time (UTC)
timescale and the kernel precision time support was not available. This
also happened with older reference clocks that indicated an impending
leap second, but the radio itself did not respond until it
resynchronized some minutes later. Further details are on the
link:leap.html[Leap Second Processing] page.
In some applications the clock can never be set backward, even if it is
accidentally set forward a week by some evil means. The issues should be
carefully considered before using these options. The slew rate is fixed
at 500 parts-per-million (ppm) by the Unix kernel. As a result, the
clock can take 33 minutes to amortize each second the clock is outside
the acceptable range. During this interval the clock will not be
consistent with any other network clock and the system cannot be used
for distributed applications that require correctly synchronized network
time.
[[hold]]
== Hold Timer
When the daemon is started after a considerable downtime, it could be
that the RTC chip clock has drifted significantly from NTP time. This can
cause a transient at system startup. In the past, this has produced a
phase transient and resulted in a frequency surge that could take some
time, even hours, to subside. When the highest accuracy is required,
some means is necessary to manage the startup process so that the
clock is quickly set correctly and the frequency is undisturbed. The
hold timer is used to suppress frequency adjustments during the training
and startup intervals described below. At the beginning of the interval
the hold timer is set to the stepout threshold and decrements at one
second intervals until reaching zero. However, the hold timer is forced
to zero if the residual clock offset is less than 0.5 ms. When nonzero,
the discipline algorithm uses a small time constant (equivalent to a
poll exponent of 2), but does not adjust the frequency. Assuming that
the frequency has been set to within 1 ppm, either from the frequency
file or by the training interval described later, the clock is set to
within 0.5 ms in less than 300 s.
[[inter]]
== Operating Intervals
The state machine operates in one of four nonoverlapping intervals.
Training interval::
This interval is used at startup when the frequency file is not
present. It begins when the first update is received by the
discipline algorithm and ends when an update is received following the
stepout threshold. The clock phase is steered to the offset presented
at the beginning of the interval, but without affecting the frequency.
During this interval further updates are ignored. At the end of the
interval the frequency is calculated as the phase change during the
interval divided by the length of the interval. This generally results
in a frequency error of less than 0.5 ppm. Note that if the intrinsic
oscillator frequency error is large, the offset will in general have
significant error. This is corrected during the subsequent startup
interval.
Startup interval::
This interval is used at startup to amortize the residual offset while
not affecting the frequency. If the frequency file is present, it
begins when the first update is received by the discipline. If not, it
begins after the training interval. It ends when the hold timer
decrements to zero or when the residual offset falls below 0.5 ms.
Step interval::
This interval is used as a spike blanker during periods when the
offsets exceed the step threshold. The interval continues as long as
offsets are received that are greater than the step threshold, but
ends when either an offset is received less than the step threshold or
until the time since the last valid update exceeds the stepout
threshold.
Sync Interval::
This interval is implicit; that is, it is used when none of the above
intervals are used.
[[state]]
== State Transition Function
The state machine consists of five states. An event is created when an
update is received by the discipline algorithm. Depending on the state
and the offset magnitude, the machine performs some actions and
transitions to the same or another state. Following is a short
description of the states.
FSET - The frequency file is present::
Load the frequency file, initialize the hold timer and continue in
SYNC state.
NSET - The frequency file is not present::
Initialize the hold timer and continue in FREQ state.
FREQ - Frequency training state::
Disable the clock discipline until the time since the last update
exceeds the stepout threshold. When this happens, calculate the
frequency, initialize the hold counter and transition to SYNC state.
SPIK - Spike state::
A update greater than the step threshold has occurred. Ignore the
update and continue in this state as long as updates greater than the
step threshold occur. If a valid update is received, continue in SYNC
state. When the time since the last valid update was received exceeds
the stepout threshold, step the system clock and continue in SYNC
state.
SYNC - Ordinary clock discipline state::
Discipline the system clock time and frequency using the hybrid
phase/frequency feedback loop. However, do not discipline the
frequency if the hold timer is nonzero.
'''''
include::includes/footer.adoc[]
|