[ntp:questions] ntpd wedged again

Dave Hart davehart at gmail.com
Sat Feb 11 19:58:37 UTC 2012


On Sat, Feb 11, 2012 at 17:17, Chuck Swiger <cswiger at mac.com> wrote:
>> Have you tried to time the minimum clock reading time with RDTSC
>> or GetPerformance* counter calls?
>>
>> I wrote a tiny test program on my Win7-64 laptop, it got:
>>
>> Reading the system clock 10000000 times, minimum reading time =
>> 24 clock cycles,
>> minimum OS step = 0 ticks, maximum OS step = 10000 ticks
>>
>> The clock frequency is 2.7 GHz or so, the FileTime ticks should be
>> 100 ns each, so my OS clock is making 1 ms steps, while the clock
>> reading time is down to less than 10 ns!
>
> Well, the code above is not reading a clock; you're reading the
> stored value of what time it was when the kernel's scheduler last
> updated that value.  When the OS scheduler ticks, it reads a clock,
> then trims down the precision to the scheduler quantum (ie, 1ms
> for HZ=1000), and stores in a "commpage", which is mapped RO
> into userland processes.

Terje's code is reading the only clock available on Windows.  It may
not be what you think of as reading a clock based on your
understanding of other operating systems, but Windows isn't
necessarily the same as other operating systems.  15 years ago, most
POSIX-style OSes used a simple tick-based system clock like Windows
that was very fast to read, though typically not as fast as Windows'
because the current time wasn't mapped into unprivileged memory of
each process, so the time to read the clock was dominated by the
system call overhead of transitioning to and from kernel mode/code,
probably a couple of orders of magnitude more expensive than actually
reading the stored current clock value in the kernel.
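
For anyone who wants to repeat the kind of measurement Terje describes on another system, here is a rough sketch in Python. The function name and its parameters are my own invention for illustration, and interpreter overhead swamps the actual read time, so it only shows the step size of whatever clock the runtime exposes, not clock cycles per read.

```python
import time


def measure_clock_steps(reads=200_000, clock=time.time_ns):
    """Read `clock` back to back and summarize the steps between readings.

    Returns (min_step, max_step, min_nonzero_step) in the clock's own units.
    min_nonzero_step is None if the clock never advanced during the run.
    """
    prev = clock()
    min_step = None
    max_step = None
    min_nonzero = None
    for _ in range(reads):
        now = clock()
        step = now - prev
        prev = now
        if min_step is None or step < min_step:
            min_step = step
        if max_step is None or step > max_step:
            max_step = step
        # The smallest positive step approximates the clock's granularity.
        if step > 0 and (min_nonzero is None or step < min_nonzero):
            min_nonzero = step
    return min_step, max_step, min_nonzero


if __name__ == "__main__":
    lo, hi, granule = measure_clock_steps()
    print("min step:", lo, "max step:", hi, "smallest nonzero step:", granule)
```

A minimum step of 0 together with a large smallest nonzero step is the signature of a tick-based clock being read faster than it ticks.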

Your explanation of how Windows increments its clock is mistaken, for
both the traditional Windows clock implementation through WS 2003 and
the newer model introduced in Vista/WS 2008.  It appears you're trying
to apply knowledge of one or more POSIX-style OSes blindly to Windows,
which is sometimes going to lead to the correct understanding, but not
in this case.  In particular, while the classic Unix approach ran the
process scheduling logic once per tick coincident with updating the
tick-based clock, my understanding is Windows' (or more precisely
NT's) scheduler runs independently of and often at a different rate
than its clock ticks.  On WS 2003 and earlier, the scheduler ran as
often as every millisecond (dependent on the so-called multimedia
timer rate) while the clock ticked 64 or 100 times per second.  I am
familiar with no OS using a tick-based clock which reads a
timer/counter each tick -- rather, the system clock is simply
incremented by a fixed value.  On WS 2003 and earlier, this "fixed"
value is what GetSystemTimeAdjustment returns, and ntpd or the Windows
Time service changes it with SetSystemTimeAdjustment to accomplish the
same slewing done on other systems with the ntp_adjtime or adjtime
syscalls.  On many POSIXy systems, this "fixed" value is a global
constant, often named tick, which the NTP utility tickadj adjusts by
rewriting the kernel executable image file.  On modern POSIXy systems
with a high-precision system clock, reading the high-resolution clock
once per tick, rounding it to the scheduler quantum, and storing it
makes sense if the system also offers a fast, low-precision clock
alternative which simply reads the stored value.
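
To make the "incremented by a fixed value" model concrete, here is a toy simulation. It is not Windows code: TickClock, slew, and the delta of 16 are hypothetical names and numbers chosen for illustration. The only figure taken from the discussion is the 64 ticks per second rate, which in 100 ns units works out to 156250 per tick.

```python
NOMINAL_INCREMENT = 156_250  # 100 ns units added per tick at 64 ticks/s


class TickClock:
    """Toy tick-based clock: time only advances when tick() is called."""

    def __init__(self, increment=NOMINAL_INCREMENT):
        self.now = 0                # current time in 100 ns units
        self.increment = increment  # amount added on every tick

    def tick(self):
        # No timer/counter is read here; the stored time is just bumped.
        self.now += self.increment

    def read(self):
        # Reading is cheap: it returns the stored value.
        return self.now


def slew(clock, offset, delta, nominal=NOMINAL_INCREMENT):
    """Absorb `offset` (100 ns units, positive means the clock is behind)
    by ticking with nominal +/- delta, then restore the nominal increment.

    Returns the number of ticks used. Any remainder smaller than `delta`
    is left uncorrected in this sketch.
    """
    ticks = abs(offset) // delta
    clock.increment = nominal + delta if offset > 0 else nominal - delta
    for _ in range(ticks):
        clock.tick()
    clock.increment = nominal
    return ticks


if __name__ == "__main__":
    c = TickClock()
    used = slew(c, offset=50_000, delta=16)  # clock is 5 ms behind
    print("ticks used:", used, "gained:", c.read() - used * NOMINAL_INCREMENT)
```

The point of the sketch is that slewing never steps the clock; it only changes how much each tick is worth for a while.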

Terje or another participant who has more experience with modern
open-source OSes than I will hopefully correct any mistaken
assumptions of mine.

> From what I recall, ntpd tries to notice this limited precision, and
> adds dithering or "fuzz" to the low bits.
>
> Also please note that you can't just call rdtsc by itself and get
> good results.  At the very least, you want to put a serializing
> instruction like cpuid first, and secondly, you really want to call
> rdtsc three times, and verify that t1 <= t2 <= t3, otherwise you
> have a chance of getting bogus time results due to your thread
> switching to another processor core between calls, CPU power-
> state changes, chipset bugs, interference from SMC interrupts,
> and PHK knows what else.  :-)

Not on modern AMDs, or any Intel, as far as my admittedly sub-PHK
understanding goes.  AMD really screwed the pooch by allowing the TSC
to vary between processors and vary with power state, causing all
sorts of headaches for all sorts of software.  Even on buggy systems,
reading TSC once is enough if you've locked the thread to a single
logical processor.  On Windows, one can use the performance counter
via QueryPerformanceCounter (albeit much slower than RDTSC thanks to
syscall overhead) and safely
ignore the issue, as all the Windows HALs on machines kept up with
Windows updates avoid RDTSC on systems where it is unreliable.
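
As an illustration of the "lock the thread to one logical processor, then read once" idea, here is a sketch in Python. It is only an analogy: time.perf_counter_ns is whatever high-resolution counter the runtime picks, not a raw RDTSC, and os.sched_setaffinity exists only on some platforms (Linux among them), so the code falls back to a plain read elsewhere. The function name is made up for this example.

```python
import os
import time


def read_counter_pinned():
    """Read the high-resolution counter once while pinned to a single CPU.

    Where the platform lacks os.sched_setaffinity, this is a plain read.
    """
    if hasattr(os, "sched_setaffinity"):
        allowed = os.sched_getaffinity(0)  # CPUs this thread may run on
        cpu = min(allowed)                 # arbitrary choice: lowest one
        os.sched_setaffinity(0, {cpu})     # pid 0 means the caller
        try:
            return time.perf_counter_ns()
        finally:
            # Put the original affinity mask back afterwards.
            os.sched_setaffinity(0, allowed)
    return time.perf_counter_ns()


if __name__ == "__main__":
    print("counter while pinned:", read_counter_pinned())
```

A real program that cares would pin once at startup rather than around every read, since changing affinity costs far more than the read itself.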

> To resurrect timings of various clock-sources taken from 2005:
>
>                TSC             ACPI-fast       HPET            i8254
> dual Xeon:      580 nsec        1429 nsec       1120 nsec       3980 nsec
> dual Opteron:   212 nsec        1270 nsec       1760 nsec       4420 nsec

This table is great for those using Linux or FreeBSD or other systems
where one can choose the system clock source.  That is irrelevant to
Windows, where the clock source is whatever the HAL decides it is,
period.  Changing HALs is typically not an option.

Cheers,
Dave Hart

