[ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

David L. Mills mills at udel.edu
Thu Jan 24 19:08:08 UTC 2008


This answers my earlier question. I can't believe this is so crude and 
dangerous. You really need to provide an analysis of the errors this 
creates when reading the clock during the slew. The problem is not the 
residual time offset but the rate at which time changes. Measuring time 
intervals is very different during the slew. The NTP design carefully 
limits this to no more than 5 microseconds per second without the kernel 
and even smaller with the kernel.
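
To make the interval-measurement concern concrete, here is a minimal sketch 
of the arithmetic, not anything taken from ntpd or chrony. It assumes a 
constant slew rate over the measured interval. The function name 
interval_error_s is hypothetical. The two rates used are the 500 ppm limit 
and the +-100000 ppm tick-size slew quoted in the thread below.

```python
def interval_error_s(true_interval_s: float, slew_ppm: float) -> float:
    """Error in a measured interval while the clock is slewed.

    Assumes the clock runs fast or slow by a constant slew_ppm
    (parts per million) for the whole interval.
    """
    return true_interval_s * slew_ppm * 1e-6


if __name__ == "__main__":
    # Rates quoted in the thread: 500 ppm limit, 100000 ppm tick-size slew.
    for ppm in (500.0, 100000.0):
        err = interval_error_s(10.0, ppm)
        print(f"10 s interval at {ppm:g} ppm: error {err * 1e3:.1f} ms")
```

Under these assumptions the error scales linearly with both the slew rate 
and the length of the interval being timed.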


Unruh wrote:

> Brian Utterback <brian.utterback at sun.com> writes:
>>Unruh wrote:
>>>Just an update: I started chrony with a 60ms offset. It had the right drift
>>>file. It took about 1 min (having collected about 4 samples from the
>>>servers at minpoll 4) to drive the offset down to about 100 usec (yes, a
>>>1000 fold improvement in about 50 sec). I.e., the time constant for
>>>correction of offset errors is enough time to collect enough samples to
>>>determine that the offset really is statistically way off. 
>>Is that supposed to be impressive? One of the design constraints of NTP
>>is to limit the clock frequency change during offset adjustments to
>>500ppm to prevent NTP network instabilities. If the offset was
>>amortized over the 50secs you stated, then that is a slew rate of
>>1200 ppm. If this happened entirely at the end of the 4 samples, then it 
>>sounds simply like a step to me. By that reasoning, ntpdate far
> NO it is NOT a step. It is done via a fast slew by a change in the tick size, which can be 10% (i.e.
> +-100000 PPM). The clock always runs forward. It does not step. It may seem
> like a step from the point of the coarse sampling done by chrony or ntp,
> but if you ran a PPS clock and looked at the time returned by gettimeofday,
> it would be continuous and positive, just like ntp. When the NTP offset
> changes by 100ms between samples spaced 500 sec apart, did it do that by
> stepping? No, it did it by increasing the frequency by 200 PPM. Chrony
> behaves the same way, only it uses the ticksize as well as the frequency to
> produce fast slews to get rid of the offsets, and it does not go unstable
> that I have ever seen. 
>>outperforms chrony. I presume that chrony cannot behave as a server and
>>only does clients right?
> Chrony is also a server. The key detraction for me is that it cannot use hardware clocks.
> It also does not act as a multicast/broadcast server, which may be a
> detraction for others, and does not do leap
> seconds. On the other hand, with its rapid response it will correct the
> leap second in less than an hour.
> Anyway, the issue here is the clock disciplining routine, not a comparison
> of the chronyd program with the ntp implementation.
> I am arguing that chrony's clock discipline routine keeps the hardware
> clock much closer to the real time (in the real world) and reacts to real
> world changes much faster than does the NTP discipline routine. 
> And chrony is, it seems, just as stable as NTP is. The offset fluctuations
> are better than NTP's are. The key question is how close to the real time
> is the time that the system clock delivers. Chrony is closer by factors of
> at least 2 and probably if run at high priority as my ntp is, much better
> than that. In particular if there are glitches in the clock drift rate,
> chrony reacts much faster, and keeps the time much much closer to the true
> time.  Instability would produce worse behaviour not better. 
>>>I also started chrony without a drift file. In this case it took about 5
>>>min to get a frequency within 10% of the long term stable frequency and
>>>that "error" disappeared within 1/2 hour.
>>I don't know about the version of ntp you are running, but recent
>>versions have a bug in the initial frequency calculations which
>>has since been fixed, but not released (ahem. Harlan?).
> The initial horrible transient was under 4.2.0. After this round I will
> try an initial transient test with 4.2.4. But the transient
> behaviour I am describing in the previous post is during the normal running
> of NTP. It is not an initial transient. It is the response of the system
> to a real world drift rate glitch.
> It is after NTP has been running for 5 days and the hardware clock on the
> machine suffered a frequency glitch. I have no idea what is causing those
> frequency glitches-- the clock suddenly changes its drift rate by 0.2 to 2 PPM.
> I have seen this both with a chrony controlled clock and an NTP controlled
> clock. It is just that the NTP response is not good. 
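
The slew-rate figures traded in the quoted exchange (60 ms over 50 s, and 
100 ms over 500 s) can be checked with a short sketch. It assumes the 
offset is amortized evenly over the stated time. The names slew_rate_ppm, 
exceeds_limit and NTP_SLEW_LIMIT_PPM are hypothetical. The 500 ppm value is 
the limit cited in the thread.

```python
NTP_SLEW_LIMIT_PPM = 500.0  # limit cited in the thread


def slew_rate_ppm(offset_s: float, duration_s: float) -> float:
    """Average frequency change (ppm) needed to remove offset_s
    when it is spread evenly over duration_s."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return offset_s / duration_s * 1e6


def exceeds_limit(offset_s: float, duration_s: float,
                  limit_ppm: float = NTP_SLEW_LIMIT_PPM) -> bool:
    """True if the implied average slew is above limit_ppm."""
    return abs(slew_rate_ppm(offset_s, duration_s)) > limit_ppm


if __name__ == "__main__":
    # 60 ms removed in about 50 s (the chrony case described above)
    print(f"{slew_rate_ppm(0.060, 50.0):.0f} ppm")
    # 100 ms removed between samples 500 s apart (the NTP case)
    print(f"{slew_rate_ppm(0.100, 500.0):.0f} ppm")
```

This only gives the average rate. If the correction is applied in a 
shorter burst at the end of the sampling period, the peak rate is 
correspondingly higher.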
