[ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
root at physics.ubc.ca
Fri Jan 25 09:11:04 UTC 2008
"David L. Mills" <mills at udel.edu> writes:
>This answers my earlier question. I can't believe this is so crude and
>dangerous. You really need to provide an analysis on the errors this
>creates when reading the clock during the slew. The problem is not the
>residual time offset but the rate at which time changes. Measuring time
>intervals is very different during the slew. The NTP design carefully
>limits this to no more than 5 microseconds per second without the kernel
>and even smaller with the kernel.
I think you meant 0.5 milliseconds per second (500 PPM);
5 microseconds per second is only 5 PPM.
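
The unit conversion behind that correction, and behind the other rates quoted in this thread, can be sketched in a few lines. This is only an illustration: the function name slew_ppm and the example labels are hypothetical, and the calculation assumes the offset is removed at a constant rate over the stated interval.

```python
def slew_ppm(offset_s: float, interval_s: float) -> float:
    """Average rate change, in parts per million, needed to remove
    offset_s seconds of error over interval_s seconds of elapsed time."""
    return offset_s / interval_s * 1e6


# Figures quoted in this thread (labels are illustrative only).
examples = [
    ("5 microseconds per second", 5e-6, 1.0),
    ("0.5 milliseconds per second", 0.5e-3, 1.0),
    ("60 ms removed over 50 s", 60e-3, 50.0),
    ("100 ms removed over 500 s", 100e-3, 500.0),
    ("10% tick size change", 0.10, 1.0),
]

for label, offset_s, interval_s in examples:
    # Round for display; the raw float may carry tiny rounding error.
    print(f"{label}: {slew_ppm(offset_s, interval_s):.0f} PPM")
```

A rate in PPM is just seconds of correction per second of elapsed time, scaled by one million, which is why 5 microseconds per second and 500 PPM differ by a factor of 100.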
Just to be more precise: if the offset is greater than 0.2 sec, chrony does a
tick size adjustment. (I think ntp does a step if it is greater than 0.128 s?)
If it is less than 0.2 sec, it uses the adjtimex ADJ_OFFSET_SINGLESHOT facility
of the kernel. Whether that clamps to 500 PPM I do not know; it at least seems not to.
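
As a rough sketch of the rule just described (not chrony's actual code): the 0.2 sec threshold is the figure stated above, the names TICK_THRESHOLD_S and choose_correction are hypothetical, and treating an offset of exactly 0.2 sec as the small-offset case is an assumption, since the description does not cover it.

```python
TICK_THRESHOLD_S = 0.2  # threshold as stated in this post, not taken from chrony source


def choose_correction(offset_s: float) -> str:
    """Return which mechanism the description above says would be used
    for a measured offset of offset_s seconds (sign is ignored)."""
    if abs(offset_s) > TICK_THRESHOLD_S:
        # Large offset: fast slew by changing the tick size.
        return "tick-size adjustment"
    # Small offset: hand the correction to the kernel via adjtimex.
    return "ADJ_OFFSET_SINGLESHOT"


for offset in (0.060, -0.150, 0.5):
    print(f"{offset:+.3f} s -> {choose_correction(offset)}")
```

Under this rule the 60 ms starting offset quoted below falls in the small-offset case.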
And my comment re leap time is that I believe ntp slews at something like
1 sec/sec during a leap (at least the discussion in the ntp docs seems to
imply that it slews at -1+epsilon sec/sec). That was what I referred to.
A step adjustment, which ntp does make, is an infinite slew rate.
>> Brian Utterback <brian.utterback at sun.com> writes:
>>>>Just an update: I started chrony with a 60ms offset. It had the right drift
>>>>file. It took about 1 min ( having collected about 4 samples from the
>>>>servers at minpoll 4) to drive the offset down to about 100 usec (Yes, a
>>>>1000 fold improvement in about 50 sec.) I.e., the time constant for
>>>>correction of offset errors is enough time to collect enough samples to
>>>>determine that the offset really is statistically way off.
>>>Is that supposed to be impressive? One of the design constraints of NTP
>>>is to limit the clock frequency change during offset adjustments to
>>>500ppm to prevent NTP network instabilities. If the offset was
>>>amortized over the 50secs you stated, then that is a slew rate of
>>>1200 ppm. If this happened entirely at the end of the 4 samples, then it
>>>sounds simply like a step to me. By that reasoning, ntpdate far
>> NO, it is NOT a step. It is done via a fast slew by a change in the tick
>> size, which can be 10% (i.e. +-100000 PPM). The clock always runs forward.
>> It does not step. It may seem like a step from the point of view of the
>> coarse sampling done by chrony or ntp, but if you ran a PPS clock and
>> looked at the time returned by gettimeofday, it would be continuous and
>> positive, just like ntp. When the NTP offset changes by 100 ms between
>> samples spaced 500 sec apart, did it do that by stepping? No, it did it by
>> increasing the frequency by 200 PPM. Chrony behaves the same way, only it
>> uses the tick size as well as the frequency to produce fast slews to get
>> rid of the offsets, and it does not go unstable that I have ever seen.
>>>outperforms chrony. I presume that chrony cannot behave as a server and
>>>only does clients right?
>> Chrony is also a server. The key detraction for me is that it cannot use hardware clocks.
>> It also does not act as a multicast/broadcast server which may be a
>> detraction for others and does not do leap
>> seconds. On the other hand with its rapid response it will correct the
>> leapsecond within less than an hour.
>> Anyway, the issue here is the clock disciplining routine, not a comparison
>> of the chronyd program with the ntp implementation.
>> I am arguing that chrony's clock discipline routine keeps the hardware
>> clock much closer to the real time (in the real world) and reacts to real
>> world changes much faster than does the NTP discipline routine.
>> And chrony is, it seems, just as stable as NTP is. The offset fluctuations
>> are better than NTP's are. The key question is how close to the real time
>> is the time that the system clock delivers. Chrony is closer by a factor of
>> at least 2 and probably, if run at high priority as my ntp is, much better
>> than that. In particular, if there are glitches in the clock drift rate,
>> chrony reacts much faster, and keeps the time much, much closer to the true
>> time. Instability would produce worse behaviour, not better.
>>>>I also started chrony without a drift file. In this case it took about 5
>>>>min to get a frequency within 10% of the long term stable frequency and
>>>>that "error" disappeared within 1/2 hour.
>>>I don't know about the version of ntp you are running, but recent
>>>versions have a bug in the initial frequency calculations which
>>>has since been fixed, but not released (ahem. Harlan?).
>> The initial horrible transient was under 4.2.0. After this round I will
>> try an initial transient test with 4.2.4. But the transient
>> behaviour I am describing in the previous post is during the normal running
>> of NTP. It is not an initial transient. It is the response of the system
>> to a real world drift rate glitch.
>> It is after NTP has been running for 5 days and the hardware clock on the
>> machine suffered a frequency glitch. I have no idea what is causing those
>> frequency glitches-- the clock suddenly changes its drift rate by 0.2 to 2 PPM.
>> I have seen this both with a chrony controlled clock and an NTP controlled
>> clock. It is just that the NTP response is not good.