[ntp:questions] NTP not syncing

Brian Inglis Brian.Inglis at SystematicSw.ab.ca
Sun Dec 8 10:30:53 UTC 2013


On 2013-12-06 11:05, unruh wrote:
> On 2013-12-06, Brian Inglis <Brian.Inglis at SystematicSw.ab.ca> wrote:
>> It would be better if ntpd used the drift file frequency for the first
>> two hours, instead of 15 minutes, before coming up with its (currently
>> wild assed) guesstimate, and then spending 4 hours getting back to the
>> drift frequency.
>
> I do not think you understand. ntpd has a drift value-- a rate at which
> it corrects the system clock (on linux this is using the adjtimex call).
> If it finds an offset, it changes that drift to eliminate that offset. If
> one has a system where the drift file is off, but the offset is OK, then
> ntp will take a while to even notice that the offset is increasing (part
> of that wait is the poll interval, part is the clock filter which throws
> out 80% of the poll results, and part of it is that the wrong drift will
> not show up in the offset for a while.) Once it notices there is an
> offset, it will alter the drift a little bit to try to eliminate that
> offset. For the first while that offset will be small, and the drift
> adjustment will be small. The offset continues to grow, and ntp makes
> the drift correction larger and larger, but the offset continues to be
> large. The drift finally badly overshoots and now the offset starts to
> get smaller, but the drift keeps being increased because the offset is
> still large. That drives the clock offsets finally off in the opposite
> direction.
> This behaviour is the problem: it looks only at the current offset to
> adjust the drift. With even a little bit of history, it is
> obvious that the drift rate is way off. But ntpd does not keep any
> history. This is one of the reasons why chrony performs so much better
> than ntpd does. It does keep a history (it does a linear regression on
> the past 3-64 offsets to determine the current offset and drift) and
> uses it. The number of history points kept is varied depending on how
> well the linear model (offset plus drift) fits the data.
> There are other differences as well.
>
> ntpd works fine in keeping the time in a situation where there is only
> noise-- variable network time delays, gaussian random white noise
> affecting the local clock frequency or clock reading, etc. But for large
> errors (bad drift file, sudden temperature change, ...) ntpd does poorly
> in the sense that it takes a long time to notice and fix.  Since one of
> the key sources of noise in a computer is temperature variations, this
> means that in normal use ntpd does significantly worse than chrony in
> keeping the offsets small (the offsets are at least a factor of 2 and I
> have heard reports of a factor of 20 higher standard deviation in ntpd
> than in chrony). I do not have good figures on how chrony's drift rate
> fluctuations compare to those of ntpd-- but they are probably worse.
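
As a rough illustration of the regression idea described in the quoted text
above, a straight-line fit over a short history of offsets yields both the
current offset and the frequency error. This is only a sketch, not chrony's
actual code: the function name, the units (seconds), and the sample values
in the usage note are assumptions made up for the example.

```python
def fit_offset_and_drift(times, offsets):
    """Least-squares fit of offsets ~ a + b * t over a short history.

    times   -- sample times in seconds (hypothetical input)
    offsets -- measured clock offsets in seconds at those times
    Returns (offset at the newest sample, drift as a dimensionless rate).
    """
    n = len(times)
    if n < 2 or n != len(offsets):
        raise ValueError("need at least two matching samples")
    t_mean = sum(times) / n
    o_mean = sum(offsets) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all samples share the same time")
    sxy = sum((t - t_mean) * (o - o_mean) for t, o in zip(times, offsets))
    drift = sxy / sxx                      # slope: seconds gained per second
    current = o_mean + drift * (times[-1] - t_mean)
    return current, drift
```

With four polls 64 s apart and an offset growing by 5 us per second of
elapsed time, the fitted slope times 1e6 gives the frequency error in PPM,
which is visible from the history long before any single offset looks large.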

>> This is on Windows 7, current stable ntpd, NMEA user mode pps ref clock
>> (serialpps does not work with a 64 bit PCIe serial port driver stack).
>>
>> I try to avoid any issues by copying the drift file daily, if it is
>> within limits, and copying back before startup, if it is outside limits,
>> to reduce issues with wild ntpd drift estimates.
>>
>> Could whatever was patched in Linux be required in the Windows port?
>
> What was patched in Linux was the kernel by the kernel developers. I.e.,
> way outside ntpd's purview, and impossible on Windows.
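
The drift file safeguard mentioned in the quoted text above can be sketched
roughly as follows. It assumes the drift file holds the frequency in PPM as
its first field; the function names, file paths, and limits are hypothetical
and would need adjusting for a real setup.

```python
import shutil
from pathlib import Path


def read_drift_ppm(path):
    """Return the first number in a drift file, or None if unreadable."""
    try:
        return float(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def within_limits(ppm, low, high):
    """True only for a readable value inside the expected range."""
    return ppm is not None and low <= ppm <= high


def save_if_sane(drift_file, backup_file, low, high):
    """Daily job: keep a backup copy only while the drift looks sane."""
    if within_limits(read_drift_ppm(drift_file), low, high):
        shutil.copyfile(drift_file, backup_file)
        return True
    return False


def restore_if_wild(drift_file, backup_file, low, high):
    """Before startup: put the backup back if the drift file looks wild."""
    if within_limits(read_drift_ppm(drift_file), low, high):
        return False                       # current file is fine, leave it
    if within_limits(read_drift_ppm(backup_file), low, high):
        shutil.copyfile(backup_file, drift_file)
        return True
    return False                           # nothing trustworthy to restore
```

save_if_sane() would run from a daily scheduled task and restore_if_wild()
from whatever starts the service, both with the same limits.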

AFAICT the Windows port does what the Linux kernel and adjtimex do, by
disabling the default system time adjustment, and estimating the frequency
to correct the phase offset, within the system adjustment limits.

But the drift estimate takes 15 minutes, not 11 seconds, and then is off
by many PPM, takes 2 hours to get within 1 PPM, and another 2 hours to start
oscillating around the true system drift value.

I have seen no sign of drifts diverging; the drift is slowly and consistently
driven towards the true hardware rate by the ref clock.

On my system, ignoring hours after NTP restart, drift < 1ppm, drift range
< .04ppm, wander range < .0003ppm, phase offset range < 10us, mean < .25us.
TSCs are constant/invariant, synced, and all run at the 2.6GHz CPU clock,
with skews < 3us after 3 weeks uptime, so within read error timing.
NTPD runs on the last CPU at Realtime priority, so is unlikely to be
affected by system use.
I am now seeing obvious heating system artifacts, especially when the
temperature plunged to the coldest on earth the other day.

So I take notice when offset or drift are above those expected ranges!
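
A minimal sketch of that kind of check, using the ranges quoted above as
thresholds. The function name and the idea of feeding it lists of samples
(for example, pulled from loopstats) are assumptions for illustration only.

```python
from statistics import mean

# Thresholds taken from the figures above; units are us and PPM.
MAX_OFFSET_RANGE_US = 10.0
MAX_OFFSET_MEAN_US = 0.25
MAX_DRIFT_PPM = 1.0
MAX_DRIFT_RANGE_PPM = 0.04


def out_of_range(offsets_us, drifts_ppm):
    """Return the names of the checks that the samples fail."""
    findings = []
    if max(offsets_us) - min(offsets_us) >= MAX_OFFSET_RANGE_US:
        findings.append("offset range")
    if abs(mean(offsets_us)) >= MAX_OFFSET_MEAN_US:
        findings.append("offset mean")
    if max(abs(d) for d in drifts_ppm) >= MAX_DRIFT_PPM:
        findings.append("drift")
    if max(drifts_ppm) - min(drifts_ppm) >= MAX_DRIFT_RANGE_PPM:
        findings.append("drift range")
    return findings
```

An empty list means everything is within the expected ranges; samples from
the hours after a restart would be left out before calling it.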

This is also why I would like to see the Windows ntpd support pinning,
tsc/pcc use, and other tweaks, independent of each other and the
interpolation option. That option currently has to be enabled to use those
tweaks, but appears in my tests to degrade results compared to my current
setup.

-- 
Take care. Thanks, Brian Inglis

