[ntp:questions] Choice of local reference clock seems to affect synchronization on a leaf node

unruh unruh at invalid.ca
Mon Nov 7 21:58:07 UTC 2011


On 2011-11-07, Nathan Kitchen <nkitchen at aristanetworks.com> wrote:
> On Sun, Nov 6, 2011 at 2:13 PM, Danny Mayer <mayer at ntp.org> wrote:
>> On 11/4/2011 7:27 PM, Nathan Kitchen wrote:
>> > I'm curious about some behavior that I'm observing on a host running
>> > ntpd as a client. As I understand it, configuring a local reference
>> > clock--either an undisciplined local clock or orphan mode--shouldn't
>> > help me, but I see different behavior when I do have one. In
>> > particular, when I'm synchronizing after correcting a very large
>> > offset, I synchronize about 2x faster in orphan mode than with no
>> > local clock, and with an undisciplined local clock I don't even fix
>> > the offset.
>> >
>> > I'm curious about whether this difference should be expected.
>> >
>> > I'm using the following configuration in all cases:
>> >
>> >    driftfile /persist/local/ntp.drift
>> >    server 172.22.22.50 iburst
>> >
>> > My three different configurations for local clocks are the following:
>> >
>> > 1. No additional commands
>> >
>> > 2. tos orphan 10
>> >
>> > 3. server 127.127.1.0
>> >    fudge 127.127.1.0 stratum 10
>> >
>> > In all three cases, my test has these steps:
>> >
>> > 1. Stop ntpd.
>> > 2. Set the clock to 2000-1-1 00:00:00 (that is, more than 10 years ago).
>> > 3. Run ntpd -g.
>> > 4. Check that the 11-year offset is corrected.
>> > 5. Wait for synchronization to the time server.
>> >
>> > With either configuration #1 (no local clock) or #2 (orphan mode), the
>> > offset is corrected quickly: 4 and 13 seconds, respectively. With
>> > configuration #3 (undisciplined local clock), it fails to be corrected
>> > within 60 seconds.
>>
>> In case #3 that's expected if there are no servers to get the correct
>> time. What else would you expect? Where would it get its time from?
>
> In case #3, as in the other cases, the configuration includes the
> server 172.22.22.50.
>
>> > After the offset is corrected, configuration #1 takes 921 seconds to
>> > synchronize to the server. Configuration #2 takes 472.
>> >
>>
>> First, correcting the offset is the major concern. After that, the
>> frequency correction needs to be calculated from additional packets
>> as they are received, and that takes time. It needs to have enough of
>> them to do the calculation.

Actually, that is not the way that ntpd works. It has no concept of
"frequency error". All it knows is the offset. It then changes the
frequency in order to correct the offset. It does not correct the offset
directly. It never figures out what the frequency error is. All it does
is "If the offset is positive, speed up the clock; if negative, slow it
down" (where I am defining the offset as "'true' time minus system clock
time"). (There is a lot that goes into ntp's best estimate of the 'true'
time, which is irrelevant to this discussion.)
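
As a rough illustration of that rule, here is a deliberately crude toy
model in Python. It is not ntpd's discipline code, and every name in it
(simulate_offset_only_loop, gain, freq_error, poll) is made up for the
example. The only input to the loop is the offset; the clock's own
frequency error is never estimated, it just shows up in how the offset
evolves.

```python
def simulate_offset_only_loop(initial_offset, freq_error, gain, poll, steps):
    """Toy loop driven by offset alone (offset = true time - system time).

    freq_error > 0 means the uncorrected system clock runs slow, so the
    offset grows by freq_error seconds per second. All parameters are
    hypothetical illustration values, not ntpd settings.
    """
    offset = initial_offset
    history = []
    for _ in range(steps):
        # Positive offset: clock is behind, so run it faster.
        # Negative offset: clock is ahead, so run it slower.
        rate_adjust = gain * offset
        # Over one poll interval the adjustment removes offset while the
        # oscillator's own error adds it back.
        offset -= (rate_adjust - freq_error) * poll
        history.append(offset)
    return history
```

With initial_offset=0.1, freq_error=50e-6, gain=1e-3 and poll=64, the
offset in this toy settles near freq_error / gain rather than at zero,
because nothing in it remembers a frequency correction between polls.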

chrony has a different philosophy, where it has a concept of both the
frequency error and the offset, and it tries to correct both
independently. It keeps a large number of measurements and estimates
both the frequency error and the offset from them. This results in far
faster convergence and better system clock offset behaviour (by
factors of 2-20).
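
To make the idea of estimating both quantities from a set of
measurements concrete, here is a minimal sketch using an ordinary
least-squares line fit in Python. This is a generic regression, not
chrony's actual code, and the function name and sample layout are
assumptions made for the example. The slope of the fitted line is the
frequency error and the fitted value at the latest sample is the current
offset.

```python
def fit_offset_and_frequency(times, offsets):
    """Least-squares line through (time, offset) samples.

    Returns (offset_at_last_sample, frequency_error), where
    frequency_error is the slope in seconds of offset per second.
    Hypothetical helper for illustration only.
    """
    n = len(times)
    if n < 2 or len(offsets) != n:
        raise ValueError("need at least two matching samples")
    mean_t = sum(times) / n
    mean_o = sum(offsets) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("samples must span more than one instant")
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(times, offsets))
    slope = cov / var_t
    # Evaluate the fitted line at the most recent sample time.
    offset_now = mean_o + slope * (times[-1] - mean_t)
    return offset_now, slope
```

Given both numbers, the frequency can be set from the slope and the
offset removed separately, instead of inferring one from the other over
many poll intervals.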

Another approach might be to use PID concepts (in which one uses
the present offset, the derivative of the offset and the integral of the
offset to drive the correction) to control the clock, to get faster
convergence without overshoot and with high long-term accuracy. These
kinds of feedback systems are used, for example, to control the
temperature of scientific heat baths to high precision with fast,
non-ringing convergence (and have gained popular use in, for example,
sous vide cooking).
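
A minimal sketch of what such a PID loop could look like for a clock,
again in Python. This is a toy simulation under assumed conventions
(offset = true time - system time, a positive freq_error means the clock
runs slow), with hypothetical names and hand-picked gains. It is not
taken from ntpd or chrony, and the gains are not tuned for any real
system.

```python
def simulate_pid_clock(initial_offset, freq_error, kp, ki, kd, poll, steps):
    """Toy PID clock control; returns (offset history, last rate correction)."""
    offset = initial_offset
    previous = initial_offset
    integral = 0.0
    correction = 0.0
    history = []
    for _ in range(steps):
        integral += offset * poll                 # accumulated offset (I term)
        derivative = (offset - previous) / poll   # rate of change (D term)
        # Rate correction applied to the clock for the next poll interval.
        correction = kp * offset + ki * integral + kd * derivative
        previous = offset
        # The oscillator error adds offset; the correction removes it.
        offset += (freq_error - correction) * poll
        history.append(offset)
    return history, correction
```

With initial_offset=0.1, freq_error=50e-6, kp=0.03, ki=0.0002, kd=0.1
and poll=16, the offset in this toy decays towards zero and the integral
term ends up supplying a correction equal to freq_error. Whether a loop
like this overshoots or rings depends entirely on the choice of gains.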

It might be interesting to get a Masters or PhD student somewhere to
compare the various techniques for clock control to see what their
advantages and disadvantages are, especially under real-life conditions.

>
> Why would it take fewer packets with orphan mode enabled (and no
> peers) than with no local clock?
>
> -- Nathan


