[ntp:questions] Re: ntpd PLL and clock overshoot

user at domain.invalid user at domain.invalid
Wed Oct 11 16:06:48 UTC 2006


The modern NTP feedback loop is much more intricate than you report. It 
is represented as a hybrid phase/frequency feedback loop with a 
state-machine driven initial frequency measurement. Details are in the 
book and in the documents recently posted for the IETF NTP Working Group. 
See the stuff linked from the NTP project page.

There are lots of nasty little approximations in the PLL/FLL code due to 
imprecise measurement of some time intervals. While the design target for 
overshoot is 5-6 percent, I would not be surprised if in some cases it 
is 10 percent.
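
To make the overshoot discussion concrete, here is a minimal sketch of a
generic continuous-time phase/frequency (proportional plus integral) loop
responding to an initial phase step. It is not the ntpd discipline code:
the function name step_overshoot, the gains kp and ki, and the parameters
zeta and omega are assumptions chosen for illustration, and the loop is
integrated with a plain Euler step.

```python
def step_overshoot(zeta, omega=1.0, theta0=1.0, dt=1e-3, t_end=30.0):
    """Return the fractional overshoot past zero after an initial phase step.

    Hypothetical toy model, not ntpd: the clock rate is steered by a
    proportional term (kp * offset) plus an integral term (integ), giving
    offset'' + 2*zeta*omega*offset' + omega**2 * offset = 0.
    """
    kp = 2.0 * zeta * omega      # proportional (phase) gain
    ki = omega * omega           # integral (frequency) gain
    offset = theta0              # initial phase error, e.g. 90 ms -> 1.0
    integ = 0.0                  # accumulated frequency correction
    most_negative = 0.0
    for _ in range(int(t_end / dt)):
        rate = -(kp * offset + integ)   # commanded slew of the offset
        integ += ki * offset * dt       # frequency term learns from offset
        offset += rate * dt
        most_negative = min(most_negative, offset)
    return -most_negative / theta0


if __name__ == "__main__":
    for zeta in (0.5, 1.0, 2.0):
        print(f"zeta={zeta}: overshoot={step_overshoot(zeta):.3f}")
```

In this toy model the frequency term integrates the initial offset, so
the phase error crosses zero for any damping; with zeta = 1 the excursion
is about 13.5 percent of the initial step, and heavier damping shrinks it.
How closely the real loop matches depends on the approximations mentioned
above.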


David Woolley wrote:

> In article <V4WdnThd8_TCzbfYnZ2dnUVZ_omdnZ2d at comcast.com>,
> Richard B. Gilbert <rgilbert88 at comcast.net> wrote:
>>I can't confirm the 100 percent but the current version doesn't work too 
>>well with my GPS reference clock at startup!  I had something like a 90 
>>millisecond offset when I started ntpd.  Over the next few minutes it 
>>corrected that offset but didn't stop, or even slow down, when it hit 
>>the zero line.  It kept right on going until it had a -9 millisecond 
> That's only a 10% overshoot, which is only twice the design target, so
> is a different problem.
> The problem you are seeing is that (ignoring its ability to modify the
> loop time constants) ntpd uses a simple analogue process controller 
> type mechanism to control the phase, based on measured phase errors.
> Such processes don't have any prior knowledge of the amount of noise in
> the phase error signal, whereas a human does.  The human realises that,
> for example, 89.9 out of the initial 90ms are the initial transient, whereas
> the ntpd control loop assumes it could all be a random excursion and the
> actual clock may be correct.  (Note that some instances of ntpd may be
> operating in contexts where all the 90ms is phase noise.)
> Such linear control can either overshoot and converge quickly, or can
> be over-damped or critically damped, but take longer to converge in
> the first place.
> My feeling is that there is scope for ntpd to learn the likely phase
> noise and to use a fast and dead beat way of getting into the noise band
> before applying the linear control loop.  Once the systematic errors
> have been removed, the Gaussian noise assumptions that underlie the
> analysis of the behaviour of the current algorithm may well apply and
> it may then be the best algorithm for maintaining lock.
> I think there may well be a good case for using Nick McClaren's
> statistics-based least squares fit, at least during initial acquisition,
> rather than the linear controller that is currently used.
> Note, it may be necessary to ensure that time is not served before the
> error is within the noise region, as that may cause downstream servers
> to do their initial acquisition based on the initial catch up of their
> server, rather than the true time, and might cause instabilities in the
> network, taken as a whole.
> A related problem is that ntpd has no built-in knowledge that crystals
> only vary by 1 or 2 ppm with temperature, so when presented with transients
> can end up believing it needs a long term frequency correction of 500ppm.
> My feeling is that there should be a coarse control loop that can cope
> with long term changes (including sudden ones like changing motherboards)
> and a fine control loop with only a limited control range (although maybe
> the whole range can be used for the phase correction).
> A real life example of this is a CD drive, where fast, fine, tracking
> is applied to the read head itself, using voice coils, and longer term
> corrections are applied using the head positioner.
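
The least squares idea in the quoted message can be sketched as an
ordinary straight-line fit of measured offsets against time: the slope
estimates the frequency error and the fitted value at the last sample
estimates the current phase error. This is a plain textbook fit written
for illustration, not the specific method referred to above and not
anything taken from ntpd; the function name fit_offset_and_drift and the
sample values are assumptions.

```python
def fit_offset_and_drift(times, offsets):
    """Ordinary least squares fit of offsets (s) against times (s).

    Returns (offset_at_last_sample, drift), where drift is the fitted
    slope in seconds per second. Hypothetical helper for illustration.
    """
    n = len(times)
    if n < 2 or n != len(offsets):
        raise ValueError("need at least two matching samples")
    mean_t = sum(times) / n
    mean_o = sum(offsets) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0.0:
        raise ValueError("sample times must not all be equal")
    sxy = sum((t - mean_t) * (o - mean_o) for t, o in zip(times, offsets))
    drift = sxy / sxx
    # Evaluate the fitted line at the most recent sample time.
    offset_now = mean_o + drift * (times[-1] - mean_t)
    return offset_now, drift


if __name__ == "__main__":
    # Made-up samples: 90 ms initial offset, 20 ppm drift, 16 s spacing.
    times = [0.0, 16.0, 32.0, 48.0, 64.0]
    offsets = [0.090 + 20e-6 * t for t in times]
    offset_now, drift = fit_offset_and_drift(times, offsets)
    print(f"offset={offset_now * 1e3:.3f} ms, drift={drift * 1e6:.1f} ppm")
```

A fit of this kind treats the whole startup transient as systematic
rather than as noise, which is the distinction drawn in the quoted text.
A real implementation would also need to weight or reject noisy samples.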
