[ntp:questions] NTP not syncing

unruh unruh at invalid.ca
Thu Dec 5 07:44:42 UTC 2013

On 2013-12-05, Harlan Stenn <stenn at ntp.org> wrote:
> unruh writes:
>> On 2013-12-05, mike cook <michael.cook at sfr.fr> wrote:
>> >
>> > On 4 Dec 2013 at 22:41, antonio.marcheselli at gmail.com wrote:
>> >
>> >> 
>> >>> 
>> >>>> I kept monitoring the drift file, but it was stable.
>> >> 
>> >>> Were you monitoring the modification times as well as the contents? As the
>> >>> logs were not being updated, maybe they were not being changed either?
>> >> 
>> >> Hi,
>> >> I'm not sure what you mean.
>> >> I was monitoring the status of the ntp using ntpd and the "o" option to see
>> >> the offset and the status of the sources.
>> >> 
>> >> I was also reading the ntp.drift file to check the drift value.
>> >
>> >   Unfortunately there is no history for the drift file contents and the times
>> > between updates seem irregular. Looking around my systems I see
>> > OSX        >1hr
>> > FreeBSD    >5hr
>> > FreeBSD    >2hr40
>> > FreeBSD    >15m
>> > Linux      >1hr50
>> > Windows 7  >2hr
>> >
>> Once it has settled down ntp polls the server every 20 min (approx). But
>> it then sends that poll through a clock filter algorithm which throws
>> away roughly 7/8 of the results. (it keeps only that poll item whose
>> delay is smaller than any of the other delays of the last 8 polls),
>> which brings you up to roughly 2.5hr. between updates. 
> Bill, you say this a lot.

I say it because that is how ntpd acts. If I have misunderstood the
source code please correct me. 
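
As a rough illustration of the behaviour described in the quoted
paragraph, here is a minimal Python sketch. It is a toy model of that
description, not ntpd's actual clock filter code. The function names are
made up for the example, and the 1024 s poll interval is an assumption
standing in for "every 20 min (approx)".

```python
from collections import deque

POLL_INTERVAL_S = 1024  # assumed: 2**10 s, about 17 min between polls
WINDOW = 8              # number of recent polls the filter looks at


def min_delay_filter(samples, window=WINDOW):
    """Yield only samples whose delay is the smallest of the last `window` polls.

    `samples` is an iterable of (offset, delay) pairs. Toy model only.
    """
    recent = deque(maxlen=window)
    for offset, delay in samples:
        recent.append(delay)
        # The newest sample survives only if nothing in the window beats it.
        if delay <= min(recent):
            yield offset, delay


def mean_update_interval_hours(poll_interval_s=POLL_INTERVAL_S, window=WINDOW):
    """Average time between surviving samples if 1 poll in `window` gets through."""
    return poll_interval_s * window / 3600.0


if __name__ == "__main__":
    delays = [5, 4, 6, 7, 3, 8, 9, 2]          # made-up round trip delays
    polls = [(0.0, d) for d in delays]
    print([d for _, d in min_delay_filter(polls)])   # [5, 4, 3, 2]
    print(round(mean_update_interval_hours(), 2))    # 2.28
```

With independent random delays the newest of 8 samples is the minimum
about 1 time in 8, which is where the "roughly 7/8 thrown away" and the
interval of a bit over 2 hours come from.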

> The experience we have is that the longer the delay the larger the
> error, and ntpd does its best to set the time based on the
> highest-quality time samples it can find.

Well, yes, but in the process it throws away data which could be used to
improve the evaluation of the time.
Is the error obtained by averaging the 8 samples smaller or larger than
the error from taking the one sample with the shortest round trip? The
answer will clearly depend on the exact type of noise. For example, if
we assume that the noise is dominated by random fluctuations in the
remote clock, then clearly averaging the 8 samples will give a better
estimate of the true ntp time than will taking the one sample with the
shortest delay. If the error is dominated by random errors in the travel
time, and the remote clock error is negligible, then the shortest round
trip might well provide the best estimate. Now, if you are concerned with
clock synchronization in the Philippines in the 1980s, and the time
is coming from the US Naval Observatory, then the latter might well be a
good assumption. If however you are using a stratum 4 server in
someone's home from the pool, then the former would almost certainly be
a far better model for the noise, and the elimination of the 7 samples
with longer delay makes the errors in the ntp time on your machine
much worse than they could be.
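
The two noise regimes can be compared with a small Monte Carlo
experiment. The sketch below is only a toy: the Gaussian model for the
remote clock error, the exponential model for the extra travel time on
each leg, and all parameter values are assumptions chosen for
illustration, and the function name is hypothetical.

```python
import math
import random


def compare_estimators(server_sigma, path_scale, n=8, trials=20000, seed=1):
    """Return (rms error of the mean of n samples, rms error of the min-delay sample).

    Toy noise model (assumed, all in the same arbitrary time unit):
      - remote clock error: Gaussian, standard deviation `server_sigma`
      - extra one-way travel time: exponential with mean `path_scale`,
        drawn independently for the outbound and return legs
    The true offset is taken to be zero, so each offset sample is pure error.
    """
    rng = random.Random(seed)
    sum_sq_mean = 0.0
    sum_sq_min = 0.0
    for _ in range(trials):
        samples = []
        for _ in range(n):
            out = rng.expovariate(1.0 / path_scale)
            back = rng.expovariate(1.0 / path_scale)
            # Path asymmetry shifts the measured offset by half the difference.
            offset = rng.gauss(0.0, server_sigma) + (out - back) / 2.0
            samples.append((out + back, offset))
        mean_offset = sum(o for _, o in samples) / n
        min_delay_offset = min(samples, key=lambda s: s[0])[1]
        sum_sq_mean += mean_offset ** 2
        sum_sq_min += min_delay_offset ** 2
    return math.sqrt(sum_sq_mean / trials), math.sqrt(sum_sq_min / trials)


if __name__ == "__main__":
    # Remote clock noise dominates (the "stratum 4 pool server" case).
    print("clock-noise dominated:", compare_estimators(1.0, 0.01))
    # Travel-time noise dominates, remote clock essentially perfect.
    print("path-noise dominated: ", compare_estimators(0.0, 1.0))
```

Under these assumed distributions the average of the 8 samples has the
smaller rms error in the first case and the shortest round trip sample
has the smaller rms error in the second. Real network delays may well be
heavier tailed than an exponential, which would favour the shortest
round trip even more in the second case.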

>>> I haven't looked at the source, but it may mean that ntpd updates the
>>> file only when there is a change in frequency, say due to temperature
>>> variations, but this is not systematic as if you check the frequency
>>> with ntpq -c rv, you get data that can differ from the value in the
>>> file. There must be some time factor as well.
>> ntpd changes the offset ONLY by changing the frequency. Thus if there is
>> a non-zero offset, the frequency is changed ( and the offset is "never"
>> zero). Ie, anytime a new poll result makes it through the filter, it
>> changes the frequency. 
> This has nothing to do with the author's question about the drift file,
> which is updated once an hour iff there has been a significant change
> since the last time it was written.  See the code around line 259 of
> ntp_util.c in ntp-dev.

Thank you. I was not sure what the time scale was for the writing of the
drift file. Do you happen to remember what the definition of
"significant change" is? Greater than the SD of the drift?

>> > example:
>> > mike at raspberrypi ~ $ sudo ls -l /var/lib/ntp/ntp.drift
>> > -rw-r--r-- 1 root root 8 Dec  5 00:51 /var/lib/ntp/ntp.drift
>> AIUI, it does not write out the drift file every time it changes the
>> frequency.
>> The drift file is there to give an approximate value for the drift of
>> the system for next bootup. Since the new drift will certainly be
>> different from the present drift (temperature, recalibration of the
>> system clock, wear on the crystal, ....) it is pointless to have the
>> file follow the current drift too closely.
> And the system clock recalibration of what used to be the "tick" value
> pretty much only happens on Linux kernels (and this may have been fixed
> recently).  The drift file value is tightly-coupled to that "tick"
> value.  That value was a constant most everywhere until [some version of
> the linux kernel] when somebody had the idea that the frequency needed
> to be calculated at each boot.  This means that from one linux boot to
> the next the tick value can change by about 200ppm (as I understand it),
> and these changes mess up the stored drift calculation.  Yes, we could
> do something about this, but nobody has volunteered to do this work.

Yes, I thought that was what I said in different words. Do you know if
this has been corrected in more recent kernels? I have been seeing much
smaller changes in the drift between boots than I used to. Maybe I have
just been lucky (I do not reboot that often so the sample size is small)
or maybe something has been fixed.  

It was not usually a 200PPM change; usually I saw more like 50PPM, but it
could be that high. It could really be a problem if that calibration sent
the system too close to the 500PPM boundary.
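
To put numbers on the boundary problem, here is a small arithmetic
sketch. The 500 PPM limit is the boundary mentioned above. The function
names and the example drift of 420 PPM are made up for illustration.

```python
MAX_FREQ_PPM = 500.0     # the 500PPM boundary mentioned above
SECONDS_PER_DAY = 86400


def ppm_to_ms_per_day(ppm):
    """Clock error accumulated per day, in ms, by an uncorrected rate error."""
    return ppm * 1e-6 * SECONDS_PER_DAY * 1000.0


def headroom_ppm(drift_ppm, limit_ppm=MAX_FREQ_PPM):
    """How much further the drift can move before it reaches the limit."""
    return limit_ppm - abs(drift_ppm)


def within_limit_after_shift(drift_ppm, shift_ppm, limit_ppm=MAX_FREQ_PPM):
    """True if the drift is still correctable after a recalibration shift."""
    return abs(drift_ppm + shift_ppm) <= limit_ppm


if __name__ == "__main__":
    drift = 420.0  # hypothetical machine with a poor crystal
    print(headroom_ppm(drift))                     # 80.0
    print(within_limit_after_shift(drift, 50.0))   # True
    print(within_limit_after_shift(drift, 200.0))  # False
    print(ppm_to_ms_per_day(50.0))                 # roughly 4320 ms per day
```

So a machine that already sits at 420 PPM survives a 50 PPM jump between
boots but not a 200 PPM one, and even the 50 PPM jump is worth over 4
seconds a day until the drift value has been corrected.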

The problem for ntp is that ntp takes a long time to recover from a bad
drift value. 
