[ntp:questions] very slow convergence of ntp to correct time.
david at ex.djwhome.demon.co.uk.invalid
Sun Jan 20 20:08:54 UTC 2008
In article <RRLkj.8804$vp3.6129 at edtnps90>,
Unruh <unruh-spam at physics.ubc.ca> wrote:
> david at ex.djwhome.demon.co.uk.invalid (David Woolley) writes:
> >Note that chrony seems not to have been updated for several years and its
> Actually not true. The latest version 1.23 has just been released, but it
> is true that the support has become somewhat slow of late.
There appeared to be no change significant enough to require new
documentation, and it didn't acknowledge NTPv4 in the documentation.
> It certainly does implement NTP but uses its own clock discipline
> algorithm, which is why it converges so fast to having a well disciplined
Do you mean that it will work with NTP servers or that it actually complies
with the normative parts of the NTP (v3) specification? Most versions of
W32Time do the former, but not the latter.
> clock (minutes rather than hours or days). In general it does a great job
> acting either as an NTP client or server. It does not support refclocks
I certainly have reservations about the initial lock up of ntpd, but
it has got a lot better since the version referenced in the chrony
documentation. The basic problem it has is that, if it starts close to
the correct time and with a saved frequency, it assumes that the offsets it
sees are random errors, and applies the algorithm that gives very
good performance when locked up, not one designed to get within the
actual noise levels.
> file initially. ntp really should not take a clock which had an initial
> accuracy of .01usec, and drive it away from lock to an accuracy of 60ms and
That should only happen if it starts up believing that the frequency error
differs greatly from the true value, i.e. you do a cold start with a
drift file present. It is optimizing by not re-calibrating the frequency,
because it believes it already has a correct value.
> then take hours to correct that error, never actually getting back the
> original without a restart of ntp.
I find that weird, as the control loop should converge to zero, although
it will take a long time to do so, because it is correcting at rates
matched to the rates at which new errors are introduced. I would have
thought that such a problem was so obvious that any problem would have been
One thought. If you are using Linux with a HZ value other than 100, and
the kernel discipline, the kernel discipline code will violate the
design assumptions, because the people who implemented HZ=250 and HZ=1000
didn't update the ntpd support code.
> ?? ntp also used linear regression to estimate the drift. That is then fed
ntpd uses infinite impulse response filters which make use of z and possibly
z^2 terms. A linear regression approach assumes the use of a finite impulse
response filter using relatively high order z terms (I have a feeling that the
FIR is non-linear). The overall response is, of course, IIR, because
the complete system is a feedback loop.
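
To make the FIR/IIR distinction concrete, here is a minimal Python sketch.
It is not the ntpd or chrony code; the class names, the window length and
the smoothing constant alpha are all made up for illustration. The
windowed regression only ever sees the last few samples (finite impulse
response), while the recursive filter carries a single state value forward,
so every past sample keeps some geometrically decaying weight (infinite
impulse response, with one delay term).

```python
from collections import deque


def regression_slope(times, offsets):
    """Least-squares slope of offset against time, i.e. a drift estimate."""
    n = len(times)
    mean_t = sum(times) / n
    mean_x = sum(offsets) / n
    num = sum((t - mean_t) * (x - mean_x) for t, x in zip(times, offsets))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


class WindowedRegression:
    """FIR-style estimator: only the last `window` samples matter (hypothetical)."""

    def __init__(self, window):
        self.samples = deque(maxlen=window)

    def update(self, t, offset):
        self.samples.append((t, offset))
        if len(self.samples) < 2:
            return 0.0
        times, offsets = zip(*self.samples)
        return regression_slope(times, offsets)


class FirstOrderIIR:
    """IIR-style estimator: y[n] = y[n-1] + alpha * (x[n] - y[n-1]) (hypothetical)."""

    def __init__(self, alpha):
        self.alpha = alpha
        self.value = 0.0
        self.prev = None

    def update(self, t, offset):
        if self.prev is not None:
            prev_t, prev_offset = self.prev
            rate = (offset - prev_offset) / (t - prev_t)
            self.value += self.alpha * (rate - self.value)
        self.prev = (t, offset)
        return self.value


if __name__ == "__main__":
    # Assumed test signal: a noiseless 50 ppm drift sampled every 64 s.
    fir = WindowedRegression(window=8)
    iir = FirstOrderIIR(alpha=0.25)
    for i in range(20):
        t = 64.0 * i
        fir_est = fir.update(t, 50e-6 * t)
        iir_est = iir.update(t, 50e-6 * t)
    print(fir_est, iir_est)
```

On a clean ramp the regression gives the drift as soon as it has two
points, whereas the recursive filter approaches it gradually, which is the
trade-off between fast acquisition and smoothing that this thread is about.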
> back into the frequency locked loop and the phase locked loop.
> chrony uses two mechanisms to correct errors, a fast slew (adjtimex tickval
ntpd maintains separate phase and frequency correction values, and, I seem
to remember, decays the phase correction if there are no updates.
> ) to eliminate offset errors and freq adjust to eliminate drift errors.
That assumes that you can measure the two values separately, which is not
As noted above, I do have some reservations about the use of IIR filters
in out-of-lock conditions, but I also pointed out that the chrony author
appears not to have contributed to this newsgroup to argue the case against
the ntpd approach, to the extent that regulars here have never heard of
chrony, even though it has been around for 9 or so years. In locked
conditions, the ntpd algorithm should be better.
> This is at least in part why it can adjust initially so rapidly. The
> question is whether some non-linearity or whatever in the clock algorithm
> is causing the oscillations (effectively narrow band ringing in the
> algorithm), but the time scale seems wrong. The maxpoll is 7 which is about
Note that servers tend to be dimensioned assuming the average maxpoll is
higher, and would consider something locked at maxpoll 6 or less as hostile.
> 2 min, the typical number of data points retained is about 10-20 in the
> adjustment algorithm, which would be of order 1/2-1 hr, while the
> oscillations are of order 1.5 hr.
That does seem to suggest a shorter natural oscillation period (for the
offset control loop, even shorter than you suggest), although
note that the cutoff frequency of the filter is too high to provide
optimum stability when locked. With ntpd, if you force maxpoll too
low, it will oversample, but the FIR filter here doesn't allow that.
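
The arithmetic behind "even shorter than you suggest" can be checked with
a few lines of Python. This is only a sketch: it assumes the retained
points are evenly spaced at the maximum poll interval, and the function
names are mine. Poll intervals are powers of two in seconds, so maxpoll 7
is 128 s, and 10 to 20 retained points span roughly 21 to 43 minutes
rather than 1/2 to 1 hour.

```python
def poll_seconds(poll_exponent):
    """Poll interval in seconds for a given poll exponent (2 ** exponent)."""
    return 2 ** poll_exponent


def window_minutes(poll_exponent, n_samples):
    """Time spanned by n_samples taken one poll interval apart, in minutes."""
    return n_samples * poll_seconds(poll_exponent) / 60.0


if __name__ == "__main__":
    print(poll_seconds(7))  # seconds per poll at maxpoll 7
    for n in (10, 20):
        print(n, round(window_minutes(7, n), 1))
```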
> chrony has a nice feature of being able to send an
> echo datagram to the other machine if you want (before the ntp packet), to
I think that would be considered abusive by most ntp server operators
(especially those in Australia who pay for bandwidth used).
> wake up the routers along the way.
> I thought an elimination algorithm was used to get rid of the outliers in
> the ntp algorithm (median filter).
I did correct this, but NTPv4 only uses median filters for reference
clocks. Normal sources use the sample with the lowest overall
error bounds, which may have a similar effect.
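
As a rough illustration of why the two approaches can have a similar
effect, the Python sketch below compares a median filter with picking the
sample that has the smallest error bound. It is not the ntpd code: the
Sample fields, the example numbers, and the bound used here (half the
round-trip delay plus dispersion) are assumptions chosen for the example.

```python
from collections import namedtuple
from statistics import median

# Hypothetical sample record; all values are in seconds.
Sample = namedtuple("Sample", ["offset", "delay", "dispersion"])


def error_bound(sample):
    """Assumed bound: half the round-trip delay plus the dispersion."""
    return sample.delay / 2.0 + sample.dispersion


def median_offset(samples):
    """Median filter: take the middle offset, ignoring delay entirely."""
    return median(s.offset for s in samples)


def lowest_error_sample(samples):
    """Selection by quality: keep the sample with the smallest error bound."""
    return min(samples, key=error_bound)


if __name__ == "__main__":
    # Invented data: the second sample was delayed in the network.
    samples = [
        Sample(0.0012, 0.020, 0.001),
        Sample(0.0150, 0.180, 0.001),
        Sample(0.0010, 0.018, 0.001),
        Sample(0.0014, 0.025, 0.002),
        Sample(0.0011, 0.019, 0.001),
    ]
    print(median_offset(samples))
    print(lowest_error_sample(samples).offset)
```

With this data both methods discard the delayed sample: the median because
its offset is extreme, the selection because its delay inflates the bound.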