[ntp:questions] how do I lock in average frequency correction

Dave Hart hart at ntp.org
Mon Feb 13 09:44:07 UTC 2012

On Mon, Feb 13, 2012 at 08:50, David J Taylor
<david-taylor at blueyonder.co.uk.invalid> wrote:
> "Dave Hart" <hart at ntp.org> wrote in message
> news:CAMbSiYDUXf3g4qk_ng2xcgu70=RxMqyrFj-35qMmAU6-
>> Sad fact:  No matter how well ntpd is able to discipline the clock on
>> Windows, other apps are generally stuck with a low-precision system
>> clock.  It may be sync'd to within less than 100 usec to UTC, but when
>> apps read the clock using Windows APIs, it's truncated to a precision
>> between 0.5 and 15.6 msec.  Higher resolution clock reading means
>> requiring ntpd be running and querying it in SNTP fashion.
> Another approach to obtaining more precise timestamps under Windows XP is
> given here:
>  http://www.lochan.org/2005/keith-cl/useful/win32time.html
> and can be used whether NTP is running or not.  With NTP running it can be
> both more precise and more accurate.  I played with this using both TSC and
> QPC (Query Performance Counter) implemented in Delphi and it seemed to work
> as advertised.

Good point, thanks.  I didn't mention you could try to fill in the
gaps (interpolate) using another counter, because while it can be
done, it's not easy and it's even less easy with Windows Vista and
later.  That is, of course, what the Windows port of ntpd is doing to
synthesize a higher-resolution clock (unless it reports "Using native
Windows clock directly" and shows precision of about 2**-10 rather
than about 2**-20).  I recall you pointing that page out to me some
time ago, and I've seen a similar article on the web with different
but conceptually similar code originally published in MSDN Magazine.
There are a few things that jumped out at me as I skimmed it:

"Calibrating the origin is rather harder. If you do a Sleep(), you
might hope that when you regain control you are probably pretty soon
after a clock tick. So assume that, reading the counter immediately
after a Sleep(), and repeat several times, taking the earliest counter
value recorded to be the counter value at the tick. This method will
have a small but hopefully constant offset; it's the best we can do,
at least in user space (more may be possible as a driver, using the
DDK KeQuery* functions; we haven't investigated this)."

This hope is in fact overly optimistic.  It's understandable to
assume scheduling and clock ticks are both triggered by the same tick
interrupt, but that's not always the case on Windows NT descendants.  To
do as well as it does, Windows ntpd dedicates a thread to sleeping a
magic interval determined through experimentation (43 msec with a
15.625 msec clock and, thanks to much effort on your part testing a
rare 10 msec Windows clock, 27 msec with that 10 msec clock).  At each
wakeup the thread
collects a performance counter reading and the system clock into a
64-deep history.  When interpolating, ntpd takes a brutish approach:
it interpolates from each of the 64 available correlations in turn and
uses whichever result works out to the earliest timestamp, reasoning
that sample was taken closest to the preceding OS clock tickover.
This was inspired by the minimum delay clock filter algorithm of NTP.
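
A minimal sketch of that selection step, in Python for brevity.  It is
not ntpd's code: the function name, the sample values and the 1 MHz
counter rate are made up for illustration.  Each history entry pairs a
performance counter reading with the system clock read at the same
wakeup; every entry yields one candidate for "now", and the selection
rule is passed in as `pick` so the choice is explicit (`min` matches the
description above).

```python
from collections import deque

HISTORY_DEPTH = 64  # history depth given in the text


def interpolate(history, counter_now, counter_hz, pick=min):
    """Estimate the current time in seconds from (counter, clock) samples.

    history     -- iterable of (counter_reading, system_clock_seconds) pairs
    counter_now -- performance counter reading taken just now
    counter_hz  -- counter frequency in counts per second
    pick        -- selection rule; min = earliest candidate, as in the text
    """
    candidates = [
        clock + (counter_now - counter) / counter_hz
        for counter, clock in history
    ]
    if not candidates:
        raise ValueError("no samples in history")
    return pick(candidates)


# Hypothetical samples: a 1 MHz counter and three wakeups of the sampler.
history = deque(maxlen=HISTORY_DEPTH)  # oldest entries fall off automatically
history.append((1_000_000, 100.000))
history.append((1_043_000, 100.031))
history.append((1_086_000, 100.078))

estimate = interpolate(history, counter_now=1_100_000, counter_hz=1_000_000)
print(f"interpolated time: {estimate:.6f}")
```

The bounded deque stands in for the fixed-size history; trying every
entry on each call is the "brutish" part.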

All that comes crashing down on Vista and later if the clock advances
every 0.5 or 1 msec.  Because the scheduling precision is also 0.5 or 1
msec, the oversampling the above approach needs is not possible,
causing ntpd to "use Windows clock directly" if you're lucky.
Fortunately, as you well know, it's possible with care to run a newer
version of Windows and still have the clock advance only every 10-15
msec.  The exact trigger(s) for the change are unknown to me, though
timeBeginPeriod AKA the multimedia timer (and ntpd -M on Windows) may
be involved.

I believe even better results are possible which would not be fragile
on Vista and later, but it's going to take some ugly and likely
performance and power impacting busy looping to do it, particularly
ugly on systems with a single logical processor.

"While the return value is precise to 100ns, it is only updated at
each timer tick - approximately 64 times per second on WinXP, or once
every 15.625 milliseconds*.  *Note: The precise interval is
configurable, via either the Windows multimedia support library
(timeBeginPeriod etc.) or the undocumented system calls
NTSetTimerResolution and friends. This still doesn't provide the
high-precision timing we're after, though."

I bet there are XP systems where it's 10 msec.  The interval is not
configurable; it's determined by the hardware (which determines the
HAL selected by Windows setup).  15.625 msec is the value on the vast
majority of systems, though.  What is affected by timeBeginPeriod and
the workhorse it eventually calls upon (NtSetTimerResolution) is the
scheduling precision -- it's not uncommon to have the clock ticking 64
times per second while the scheduler interrupts 1000 times per second.
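
One way to see which clock interval a given machine has is to poll the
clock and note how far it jumps each time the reading changes.  The
sketch below does that in Python; the function name is made up, and
`time.time` is only a stand-in, since what it reads underneath depends
on the platform and Python version.  A Windows program would more
likely poll GetSystemTimeAsFileTime directly.

```python
import time


def observed_clock_step(read_clock=time.time, changes=20):
    """Return the smallest positive increment seen between clock readings.

    read_clock -- callable returning the current time in seconds
    changes    -- number of distinct clock updates to observe
    """
    last = read_clock()
    smallest = None
    seen = 0
    while seen < changes:
        now = read_clock()
        if now != last:  # the clock advanced (or was stepped)
            step = now - last
            if step > 0 and (smallest is None or step < smallest):
                smallest = step
            last = now
            seen += 1
    return smallest


step = observed_clock_step()
print(f"smallest observed clock step: {step * 1e3:.3f} msec")
```

This spins while it waits, so it is a diagnostic rather than something
to leave running.  It measures only how often the clock reading
changes, which per the above is separate from how often the scheduler
interrupts.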

"There are two ways of obtaining some higher-frequency timing data:
QueryPerformanceCounter() and the Intel IA32 instruction RDTSC. In
fact, in Windows XP the former is a wrapper around the latter. RDTSC
reads the value of a 64-bit counter (the Time Stamp Counter) on the
CPU that is incremented every clock cycle. This is quite fast enough
to give us the precision we need. Windows' QueryPerformanceCounter()
simply ensures we get a stable value even on a multiprocessor system
(and possibly corrects for SpeedStep frequency changing technology;
I'm not sure)."

When the code was written it may have been nearly always true that
QueryPerformanceCounter used the TSC on x86.  As is widely known now,
that changed after TSC was fouled by a generation or three of AMD
processors.  Windows updates rolled out which steer QPC clear of TSC
or condition it across processors.  When using alternative hardware,
QPC's rate typically dropped from hundreds of MHz (TSC using processor
or memory bus clock) to 14 or even 3 MHz.
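
To put those rates in perspective, here is the arithmetic as a short
Python sketch.  The specific frequencies are assumptions chosen for
illustration: 300 MHz stands in for "hundreds of MHz", and 14.31818 MHz
and 3.579545 MHz are the rates commonly associated with the HPET and
the ACPI power management timer, which may not be the exact sources
meant above.

```python
def tick_ns(counter_hz):
    """Duration of one counter tick in nanoseconds."""
    return 1e9 / counter_hz


# Assumed example rates; the text gives only rough magnitudes.
rates = [
    ("TSC-derived (assumed 300 MHz)", 300e6),
    ("about 14 MHz (assumed 14.31818 MHz)", 14.31818e6),
    ("about 3 MHz (assumed 3.579545 MHz)", 3.579545e6),
]

for label, hz in rates:
    print(f"{label}: one tick = {tick_ns(hz):.2f} ns")
```

Even at the slowest of these rates one counter tick is well under a
microsecond, far finer than a system clock that advances every 0.5 to
15.6 msec.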

Dave Hart
