[ntp:questions] Bug 2341 - ntpd fails to keep up with clock drift at poll>7
martin.burnicki at meinberg.de
Wed Nov 27 17:26:33 UTC 2013
Brian Inglis wrote:
> On 2013-11-26 08:22, Martin Burnicki wrote:
>> Brian Inglis wrote:
> As I said above, on Windows stable, with only network servers, and
> normal maxpoll 10, as the poll interval increases, the FLL kicks in
> to drive the drift within PPB, and the offset stabilizes in the low
Yes, you said that. But on which Windows version?
Here's a short summary of possible variants, if I remember correctly:
1. Windows XP/Server 2003. System time increments in steps of about 16
ms. With ntpd 4.2.4 the system time was always interpolated between
clock ticks, and the performance of interpolation depended on the
implementation of QueryPerformanceCounter (QPC) which could be based on
the power management timer (PMTIMER), or the TSC. If QPC was using TSC
then interpolated times could go nuts if running on a CPU type where the
TSCs of the different CPU cores were not synchronized, or the TSC was
clocked down due to power management (Intel SpeedStep, AMD
Cool'n'Quiet, ...). SP3 for XP usually switched QPC to use the PMTIMER,
to avoid potential TSC problems in general.
2. With Windows Vista and later (Win 7, Server 2008) the system time
started to tick in 1 ms increments instead of 16 ms increments. The bug
where small time corrections are ignored by the Windows kernel was not
yet known, but experiments showed that on systems with 1 ms increments
the time adjustment was usually smoother if the Windows system clock was
So starting with 4.2.6, ntpd tries to figure out if the system clock
increments in 1 ms steps, or more coarsely. In the former case
interpolation was disabled, but in the latter case interpolation was
still used, basically the same way as in ntpd 4.2.4. These Windows
versions also "knew" which CPU types have problems with their TSCs, and
used PMTIMER or HPET instead. If these Windows versions decide to use TSC
then TSC usually works reliably. So, if ntpd figures out that QPC uses
TSC then it reads TSC directly, which is faster than using the QPC API.
Some of the default behavior can be altered by using some environment
variables. This great work was done by Dave Hart.
3. So while ntpd 4.2.6 often works great on Windows XP / Server 2003, as
well as on Windows 7 / Server 2008, we at Meinberg received a number of
reports where the system time adjustment loop didn't settle, and ntpq -p
reported a large offset and jitter. We could determine 2 cases where
other drivers messed up the system time, but there were still cases where
we were unable to find the reason. Fortunately a guy named Andrew Dixie
came up with
and brought the Windows bug
to our attention. He also provided a patch with a workaround which was
pulled into the current development version of ntpd. This patch has
fixed the loop settling problems on all systems I know of where the
system time adjustment didn't converge with an earlier version of ntpd.
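
To illustrate the idea behind variants 1 and 2 above, here is a minimal
sketch of how one could measure the step size of a clock and then decide
whether to interpolate between ticks. This is not the ntpd code. The
function names, the 2 ms threshold and the number of observed steps are
assumptions made up for this example only.

```python
import time


def tick_granularity(read_clock, changes=20, max_reads=5_000_000):
    """Return the smallest positive step seen between successive clock reads.

    read_clock is any callable returning seconds as a float. Returns None
    if the clock never advanced within max_reads reads.
    """
    smallest = None
    last = read_clock()
    seen = 0
    for _ in range(max_reads):
        now = read_clock()
        if now != last:
            step = now - last
            if step > 0 and (smallest is None or step < smallest):
                smallest = step
            last = now
            seen += 1
            if seen >= changes:
                break
    return smallest


def should_interpolate(granularity_s, threshold_s=0.002):
    """Interpolate only if the clock ticks more coarsely than the threshold.

    The 2 ms threshold is a hypothetical value chosen to separate
    1 ms ticks from roughly 16 ms ticks.
    """
    return granularity_s > threshold_s


def interpolated_time(tick_time_s, counter_at_tick, counter_now, counter_hz):
    """Add the time elapsed since the last tick, taken from a free-running
    counter (for example a performance counter), to the tick time."""
    return tick_time_s + (counter_now - counter_at_tick) / counter_hz


if __name__ == "__main__":
    g = tick_granularity(time.time)
    print("observed step:", g)
    if g is not None:
        print("interpolate:", should_interpolate(g))
```

Note that the interpolated time is only as good as the counter behind it,
which is why an unstable or unsynchronized TSC causes the problems
described in variant 1.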
>> So my advice would have been to use minpoll 4 maxpoll 4, if
>> this setting wouldn't affect the workaround implemented in -dev.
> Would probably get you kicked off most upstream servers eventually!
Maybe if you are using public/pool servers, but not if you are using
your own NTP server. Take care, I'm biased! ;-)
>>> With current stable and a ref clock with prefer or low poll, and
>>> backup servers with low or no minpoll, backup servers are polled
>>> at minpoll or the same rate as the ref clock, so would never see
>>> this issue.
>> Hm, are you really sure the polling interval for the backup server(s)
>> depends on the polling interval of a configured refclock?
>> I haven't noticed this, yet, but I also haven't checked this.
> After installing a refclock with minpoll & maxpoll 4, had to bump my
> upstream minpoll to 6, or all were polled every 16s, and I figured
> someone might notice and object!
Hm, that's strange.
I have not yet used refclocks with the Windows port of ntpd. At least on
my Linux system here this doesn't happen with ntpd 4.2.6p5. I have a
mixture of parse and SHM refclocks all clamped to minpoll 4 / maxpoll 4,
and a backup server without minpoll / maxpoll configured, which stays at
a 64 s polling interval.
I wouldn't expect the poll interval changes to be controlled by
OS-specific code, but I could imagine that it depends on the results of
subsequent pollings, which may yield different offset and jitter figures
under Windows and Linux, or even with or without refclocks under Windows.
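
For readers not used to the poll numbers in this thread: minpoll and
maxpoll are exponents of two, in seconds. A small sketch of the
arithmetic follows. The helper names are made up for this example.

```python
def poll_interval_seconds(exponent):
    """Convert an NTP poll exponent to the poll interval in seconds."""
    return 2 ** exponent


def polls_per_day(exponent):
    """Number of complete polls per day at a fixed poll exponent."""
    return 86400 // poll_interval_seconds(exponent)


if __name__ == "__main__":
    for name, exp in (("minpoll 4", 4), ("minpoll 6", 6), ("maxpoll 10", 10)):
        print(name, "=", poll_interval_seconds(exp), "s,",
              polls_per_day(exp), "polls per day")
```

So minpoll 4 / maxpoll 4 means one poll every 16 s, the default minpoll 6
gives 64 s, and the default maxpoll 10 gives 1024 s.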
> I noticed after a router restart causing minutes of unreachability,
> network servers were temporarily polling every 1024s, then dropped
> back to 64s when they again became reachable.
I think this is expected behavior.
> However, network servers only seem to log peerstats about every 5
> minutes, giving about 288-300 samples per day, every 288-300
I've also observed that the stats files are updated less frequently by
recent versions of ntpd than by earlier versions. I haven't tracked
the changes which cause this, though.
>> What if you don't have a refclock, only upstream servers?
> Poll intervals increase up to maxpoll, depending on the server and
> link quality.
Right, and that's exactly where I have seen offset increasing, at least
under Windows. Thus my suggestion to limit the poll interval, which also
speeds up synchronization, which is also appreciated by most users I've
> Appreciate the feedback and questions, and thanks very much for the
> Windows port,
> the work on it, and the GUI Monitor utility.
Thanks. Ntpd is a great program, and I'm happy to support both the
project and the community whenever I can and have time to do it.