[ntp:questions] Choice of local reference clock seems to affect synchronization on a leaf node

unruh unruh at invalid.ca
Tue Nov 8 16:55:55 UTC 2011


On 2011-11-08, David L. Mills <mills at udel.edu> wrote:
>    unruh, 
>                                                                              
> unhurt, 
?? Am I supposed to read something into this misspelling?


>
> 1. You have a broken interpretation on how the NTP discipline algorithm 
> works. See the online document "How NTP Works," and in particular the 
> discipline and clock state machine pages.

I have read it. Which part of my interpretation do you feel is
"broken"?

>
> 2. Your comparison between NTP and Chrony is badly conceived. Talk to 
> Miroslav; he knows the issues.

??? Exactly which issue? As Miroslav has reported, chrony, using a
refclock, is about 20 times more accurate (as measured by the offsets)
than ntpd. My own experiments with chrony and ntpd using a network
source found chrony to be 2-3 times as accurate as ntpd, primarily, I
believe, because of ntpd's very slow response to frequency changes,
which probably arise mostly from temperature changes as the computer
is used.
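
As a rough illustration of what estimating both the offset and the
frequency error from a set of measurements means, here is a small
Python sketch: an ordinary least-squares line through (time, offset)
samples, whose slope is the frequency error and whose value at the
latest sample is the current offset. This is not chrony's actual code;
the function name, the 64 s spacing and the sample values are made up
for the example.

```python
def fit_offset_and_frequency(times, offsets):
    """Least-squares line through (time, offset) samples.

    Returns (offset at the latest sample, frequency error in s/s).
    Hypothetical helper for illustration only.
    """
    n = len(times)
    if n < 2 or n != len(offsets):
        raise ValueError("need at least two matching samples")
    mean_t = sum(times) / n
    mean_o = sum(offsets) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (o - mean_o) for t, o in zip(times, offsets))
    frequency = sxy / sxx                       # slope of the fitted line
    current = mean_o + frequency * (times[-1] - mean_t)
    return current, frequency


# Made-up, noise-free samples: 2 ms offset growing at 20 ppm.
times = [64.0 * i for i in range(8)]
offsets = [0.002 + 20e-6 * t for t in times]
current, frequency = fit_offset_and_frequency(times, offsets)
print("offset now: %.6f s, frequency error: %.1f ppm"
      % (current, frequency * 1e6))
```

With real, noisy measurements the fit averages over the window, so the
frequency estimate is available as soon as a few samples are in hand
rather than only after the offset has been driven around by it.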
 
>
> 3. The PIC (sic) issues have already been carefully considered. See the 
> startup algorithms described on the "How NTP Works" pages.

?? PID is, I believe, the term used
(http://en.wikipedia.org/wiki/PID_controller).
You may well be correct, but certainly my impression is that only the
current error is used to set the frequency (the P of PID), with the
control parameter chosen to produce essentially critical damping.
Combined with the aggressive elimination of measurements by the clock
filter algorithm, this leads to a very slow response of ntpd to changes
in the clock (e.g. rate changes due to temperature changes). Attempts
are being made to use more sophisticated means to alleviate this
problem on startup, but that does not change the long-term behaviour.
I.e., startup is not the only place where ntpd suffers from slow
response. It is just the most obvious.
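
For readers unfamiliar with the P, I and D terms, here is a toy Python
model of a feedback loop steering a clock. It is only a sketch of the
generic PID idea, not ntpd's or chrony's actual algorithm; the function
name `simulate`, the gains, the 16 s poll interval and the 50 ppm
frequency error are all assumptions chosen for the example.

```python
def simulate(kp, ki=0.0, kd=0.0, freq_error=50e-6,
             initial_offset=5.0, poll=16.0, steps=400):
    """Toy feedback loop. offset = true time - system time (seconds).

    freq_error is how slow the uncorrected clock runs (s/s); the
    controller output is a rate correction held for one poll interval.
    """
    offset = initial_offset
    integral = 0.0
    previous = offset
    history = []
    for _ in range(steps):
        integral += offset * poll                # I: accumulated offset
        derivative = (offset - previous) / poll  # D: rate of change
        previous = offset
        correction = kp * offset + ki * integral + kd * derivative
        offset += (freq_error - correction) * poll
        history.append(offset)
    return history


# kd is left at 0 in both runs; only the P and I terms are exercised.
p_only = simulate(kp=0.01)
p_and_i = simulate(kp=0.01, ki=2.5e-5)
print("P only, final offset:  %.6f s" % p_only[-1])
print("P and I, final offset: %.6f s" % p_and_i[-1])
```

In this toy model the P-only loop settles at a residual offset of
freq_error/kp (5 ms with these numbers), because some offset has to
remain to hold the rate correction. The integral term takes over that
job and lets the offset go to zero.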
 



> 4. The orphan mode and local clock discipline require special provisions 
> to delay clock adjustments until the configured sources have had a 
> chance to activate. The paint isn't quite dry on some intricacies.
>
> 5. Starting NTP with an initial ten-year offset is not a frequent 
> adventure. Under these conditions, if the clock takes a little longer to 
> stabilize, I'm not going to worry a lot about it.

10 years? The problem is more 5 sec initial offsets, which are highly
possible given the drift rates of RTCs (especially the difference
between cold and hot RTCs) and even calibration errors of the CPU time
scale by the operating system on startup. And I see no difference
between the behaviour of ntpd with an initial 10 year offset and with
a 5 sec offset. Not worrying about the former seems also to imply not
worrying about the latter.
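
The arithmetic behind a multi-second startup offset is simple: a clock
that drifts at a constant rate accumulates offset in proportion to the
time it runs uncorrected. The short Python sketch below uses an assumed
drift of 50 ppm purely as an example; it is not a measured figure for
any particular RTC, and the function names are made up.

```python
def accumulated_offset(drift_ppm, elapsed_seconds):
    """Offset (s) built up by a clock drifting at drift_ppm."""
    return drift_ppm * 1e-6 * elapsed_seconds


def seconds_to_reach(offset_seconds, drift_ppm):
    """Time (s) an uncorrected clock needs to drift by offset_seconds."""
    return offset_seconds / (drift_ppm * 1e-6)


# Assumed 50 ppm drift: offset after one day off, and time to reach 5 s.
print("offset after one day: %.2f s" % accumulated_offset(50, 86400))
print("hours to reach 5 s:   %.1f" % (seconds_to_reach(5.0, 50) / 3600))
```

At that assumed rate a machine left off for a day or so is already in
the 5 sec range.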


>
> Dave
>
> unruh wrote:
>
>>On 2011-11-07, Nathan Kitchen <nkitchen at aristanetworks.com> wrote:
>>  
>>
>>>On Sun, Nov 6, 2011 at 2:13 PM, Danny Mayer <mayer at ntp.org> wrote:
>>>    
>>>
>>>>On 11/4/2011 7:27 PM, Nathan Kitchen wrote:
>>>>      
>>>>
>>>>>I'm curious about some behavior that I'm observing on a host running
>>>>>ntpd as a client. As I understand it, configuring a local reference
>>>>>clock--either an undisciplined local clock or orphan mode--shouldn't
>>>>>help me, but I see different behavior when I do have one. In
>>>>>particular, when I'm synchronizing after correcting a very large
>>>>>offset, I synchronize about 2x faster in orphan mode than with no
>>>>>local clock, and with an undisciplined local clock I don't even fix
>>>>>the offset.
>>>>>
>>>>>I'm curious about whether this difference should be expected.
>>>>>
>>>>>I'm using the following configuration in all cases:
>>>>>
>>>>>  driftfile /persist/local/ntp.drift
>>>>>  server 172.22.22.50 iburst
>>>>>
>>>>>My three different configurations for local clocks are the following:
>>>>>
>>>>>1. No additional commands
>>>>>
>>>>>2. tos orphan 10
>>>>>
>>>>>3. server 127.127.1.0
>>>>>   fudge 127.127.1.0 stratum 10
>>>>>
>>>>>In all three cases, my test has these steps:
>>>>>
>>>>>1. Stop ntpd.
>>>>>2. Set the clock to 2000-1-1 00:00:00 (that is, more than 10 years ago).
>>>>>3. Run ntpd -g.
>>>>>4. Check that the 11-year offset is corrected.
>>>>>5. Wait for synchronization to the time server.
>>>>>
>>>>>With either configuration #1 (no local clock) or #2 (orphan mode), the
>>>>>offset is corrected quickly: 4 and 13 seconds, respectively. With
>>>>>configuration #3 (undisciplined local clock), it fails to be corrected
>>>>>within 60 seconds.
>>>>>        
>>>>>
>>>>In case #3 that's expected if there are no servers to get the correct
>>>>time. What else would you expect? Where would it get its time from?
>>>>      
>>>>
>>>In case #3, as in the other cases, the configuration includes the
>>>server 172.22.22.50.
>>>
>>>    
>>>
>>>>>After the offset is corrected, configuration #1 takes 921 seconds to
>>>>>synchronize to the server. Configuration #2 takes 472.
>>>>>
>>>>>        
>>>>>
>>>>First, correcting the offset is the major concern. After that, the
>>>>frequency changes need to be calculated from additional received
>>>>packets, and that takes time. It needs to have enough of them to
>>>>do the calculation.
>>>>      
>>>>
>>
>>Actually, that is not the way that ntpd works. It has no concept of
>>"frequency error". All it knows is the offset. It then changes the
>>frequency in order to correct the offset. It does not correct the offset
>>directly. It never figures out what the frequency error is. All it does
>>is "If offset is positive, speed up the clock, if negative slow it down"
>>(where I am defining the offset as "'true' time - system clock time").
>> (There is lots that goes into ntp's best estimate of the 'true' time,
>>which is irrelevant to this discussion)
>>
>>chrony has a different philosophy, where it has a concept of both the
>>frequency error and the offset, and it tries to correct both
>>independently. It keeps a large number of measurements to estimate both
>>the frequency error and the offset from those measurements. This results
>>in a far far faster convergence, and a better system clock offset behaviour (by
>>factors of 2-20).
>>Another approach might be to use the PID concepts (in which one uses
>>the present offset, the derivative of the offset and the integral of the
>>offset to drive the correction) to control the clock to get faster
>>convergence, without overshoot and with high long-term accuracy. These
>>kinds of feedback systems are used, for example, to control the
>>temperature of scientific heat baths to high precision with fast,
>>non-ringing convergence (and have gained popular use in, for example,
>>sous vide cooking).
>>
>>It might be interesting to get a Masters or PhD student somewhere to compare the
>>various techniques for clock control to see what their advantages and
>>disadvantages are especially under real life conditions. 
>>
>>  
>>
>>>Why would it take fewer packets with orphan mode enabled (and no
>>>peers) than with no local clock?
>>>
>>>-- Nathan
>>>    
>>>
>>


