[ntp:questions] ntpd wedged again
cswiger at mac.com
Sat Feb 11 17:17:51 UTC 2012
On Feb 11, 2012, at 12:11 AM, Terje Mathisen wrote:
>> In this specific case, the minimum time to read the clock was measured
>> at ntpd startup to be 114 usec, so each raw OS clock reading is
> OK, that's the problem right there: That value is obviously wrong!
> The Win* OS clock is so dead simple to read that I would expect the minimum reading time to be well below a us!
> Even though the clock is maintained by the HAL, and therefore impossible to fix properly, even with a regular driver, the read interface should be effectively a usermode function call with a single load/store operation:
> I.e. GetSystemTimeAsFTime(&LARGE_INTEGER) could look like this:
> int GetSystemTimeAsFTime(LARGE_INTEGER *tstamp)
> {
>     if (!tstamp) return 0;
>     *tstamp = _kSystemClockCurrentValue; // ??? Some kernel RO value
>     return 1;
> }
> The variable is named something else of course, but since Win* doesn't even try to report an interpolated clock value, it can in principle be almost as simple as that.
> Have you tried to time the minimum clock reading time with RDTSC or GetPerformance* counter calls?
> I wrote a tiny test program on my Win7-64 laptop, it got:
> Reading the system clock 10000000 times, minimum reading time = 24 clock cycles,
> minimum OS step = 0 ticks, maximum OS step = 10000 ticks
> The clock frequency is 2.7 GHz or so, the FileTime ticks should be 100 ns each, so my OS clock is making 1 ms steps, while the clock reading time is down to less than 10 ns!
Well, the code above is not reading a clock; it is reading the stored value of what time it was when the kernel's scheduler last updated that value. When the OS scheduler ticks, it reads a clock, trims the precision down to the scheduler quantum (i.e., 1 ms for HZ=1000), and stores the result in a "commpage", which is mapped read-only into userland processes.
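As a rough illustration of that tick-and-store pattern, here is a minimal sketch in C. The names stored_time_ns, tick_update and read_stored_time are made up for this example, and a plain static variable stands in for the read-only shared page; this is not how any particular kernel implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel page mapped read-only into userland. */
static uint64_t stored_time_ns;

/* Run on each scheduler tick: take a precise clock reading, truncate it
 * to a multiple of the scheduler quantum, and store the result. */
static void tick_update(uint64_t precise_ns, uint64_t quantum_ns)
{
    assert(quantum_ns > 0);
    stored_time_ns = precise_ns - (precise_ns % quantum_ns);
}

/* What a userland "clock read" amounts to in this model: a single load
 * of the stored value, with no hardware clock access at all. */
static uint64_t read_stored_time(void)
{
    return stored_time_ns;
}
```

With a 1 ms quantum (1000000 ns), every read between two ticks returns the same value, which is why the read is so cheap and why consecutive readings step by either 0 or a whole quantum.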
From what I recall, ntpd tries to notice this limited precision, and adds dithering or "fuzz" to the low bits.
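To make the idea of fuzzing the low bits concrete, here is a small sketch in C. It is not ntpd's actual code; fuzz_low_bits and its parameters are hypothetical, and the caller is assumed to supply the random value from whatever generator it likes.

```c
#include <assert.h>
#include <stdint.h>

/* Fill the part of a timestamp that lies below the clock's step size
 * with a caller-supplied random value, so that repeated readings of a
 * coarse clock do not all land on exactly the same value.
 * Assumption: coarse_ns is already a multiple of step_ns. */
static uint64_t fuzz_low_bits(uint64_t coarse_ns, uint64_t step_ns,
                              uint64_t random_bits)
{
    assert(step_ns > 0);
    return coarse_ns + (random_bits % step_ns);
}
```

The result always stays within one step of the coarse reading, so the fuzz never claims more accuracy than the clock has; it only spreads readings across the interval the true time could be in.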
Also please note that you can't just call rdtsc by itself and get good results. At the very least, you want to put a serializing instruction like cpuid first. Secondly, you really want to call rdtsc three times and verify that t1 <= t2 <= t3; otherwise you have a chance of getting bogus time results due to your thread switching to another processor core between calls, CPU power-state changes, chipset bugs, interference from SMC interrupts, and PHK knows what else. :-)
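A minimal sketch of that three-read check, assuming GCC or Clang inline assembly on x86. The function names read_counter and read_tsc_checked are invented for this example, and on other platforms the sketch falls back to clock_gettime(CLOCK_MONOTONIC) purely so that it still compiles.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* One counter reading: cpuid (serializing) followed by rdtsc on x86. */
static uint64_t read_counter(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    uint32_t a, b, c, d, lo, hi;
    /* cpuid forces earlier instructions to complete before rdtsc runs. */
    __asm__ __volatile__("cpuid"
                         : "=a"(a), "=b"(b), "=c"(c), "=d"(d)
                         : "a"(0), "c"(0)
                         : "memory");
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    (void)a; (void)b; (void)c; (void)d;
    return ((uint64_t)hi << 32) | lo;
#else
    /* Fallback for non-x86 builds; not a TSC, only a placeholder. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}

/* Take three readings and accept the middle one only if t1 <= t2 <= t3.
 * Returns 1 and stores t2 on success, 0 if the readings look bogus. */
static int read_tsc_checked(uint64_t *out)
{
    uint64_t t1, t2, t3;

    if (out == NULL)
        return 0;
    t1 = read_counter();
    t2 = read_counter();
    t3 = read_counter();
    if (t1 <= t2 && t2 <= t3) {
        *out = t2;
        return 1;
    }
    return 0;
}
```

A caller would normally retry a few times when the check fails. Note that the check only catches readings that go backwards; it cannot detect a core switch that happens to keep the three values in order.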
To resurrect timings of various clock-sources taken from 2005:
               TSC        ACPI-fast  HPET       i8254
dual Xeon:     580 nsec   1429 nsec  1120 nsec  3980 nsec
dual Opteron:  212 nsec   1270 nsec  1760 nsec  4420 nsec