[ntp:questions] questions Digest, Vol 114, Issue 25

Terje Mathisen terje.mathisen at tmsw.no
Tue Apr 15 09:31:37 UTC 2014


James Gibb wrote:
>
>> Date: Mon, 14 Apr 2014 18:23:56 +0200
>> From: Terje Mathisen <terje.mathisen at tmsw.no>
>> Anyway, Win8 have GetSystemTimePreciseAsFileTime() which does implement
>> exactly the same kind of sw interpolation as pretty much all other
>> server & desktop OSs have had for decades.
>
> Sorry, I was abbreviating when I put SystemTimePrecise, I was referring
> to GetSystemTimePreciseAsFileTime just as you say, as distinct from the
> old GetSystemTimeAsFileTime.  And yes, I was wrong about the precise
> version being introduced in Win7; it's new to Win8 as you say.
>
> So you think GetSystemTimeAsFileTime should always have interpolated
> anyway?  I assume when you say that bug is fixed in Win8, you're
> referring to GetSystemTimePreciseAsFileTime since
> GetSystemTimeAsFileTime still doesn't interpolate does it?

No, it does not, but it should have: The cost of doing so is so 
minuscule that it makes no sense at all to _not_ do it:

// In the HW clock interrupt handler:

   _SystemTime += _prTickTimeIncrement;

// The line above is effectively what the HAL used to do on every
// timer interrupt, now they have added one more line:

   QueryPerformanceCounter(&PerfCounterAtLastTick);

In the user mode library that returns the OS time we have:

// GetSystemTimeAsFileTime( userbuffer)

   userbuffer = _SystemTime;

The real code needs a little extra logic around the read above to avoid 
seeing a partially updated value, except in 64-bit mode, where the read 
will always be atomic.
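
One way such a guard can work on a 32-bit machine is sketched below: the high half of the time is stored twice, and the reader retries if the two copies disagree. This is only an illustration. The type and function names (split_time_t, split_time_store, split_time_load) are made up here and are not the actual Windows code, and the use of volatile assumes a CPU that keeps loads and stores in program order; a weakly ordered CPU would need explicit memory barriers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a 64-bit time published as 32-bit halves, with the
   high half stored twice so that a reader can detect a torn read. */
typedef struct {
    volatile uint32_t low;
    volatile uint32_t high1;
    volatile uint32_t high2;
} split_time_t;

/* Writer (timer tick): store high2 first, then low, then high1. */
void split_time_store(split_time_t *t, uint64_t value)
{
    assert(t != NULL);
    t->high2 = (uint32_t)(value >> 32);
    t->low   = (uint32_t)value;
    t->high1 = (uint32_t)(value >> 32);
}

/* Reader: load high1, then low, then compare against high2.
   If the two high copies differ, an update was in progress, so retry. */
uint64_t split_time_load(const split_time_t *t)
{
    uint32_t hi, lo;
    assert(t != NULL);
    do {
        hi = t->high1;
        lo = t->low;
    } while (hi != t->high2);
    return ((uint64_t)hi << 32) | lo;
}
```

The reader loads the fields in the opposite order from the one the writer stores them in, which is what lets a mismatch between the two high copies reveal an update in progress.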


For GetSystemTimePreciseAsFileTime they added a line or two:

   QueryPerformanceCounter(&now);
   userbuffer = _SystemTime +
     (now - PerfCounterAtLastTick)* _SystemTimePrPerfCount;

The interpolation is most easily performed with a double precision fp 
multiplication, even if that requires conversions between 64-bit int and fp.
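
As a rough sketch of the two pieces together, assuming times are kept in 100 ns units and that the counter value is passed in by the caller instead of being read with QueryPerformanceCounter, it could look like this. The names tick_snapshot_t, on_timer_tick and interpolate_fp are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state maintained by the timer tick handler. */
typedef struct {
    uint64_t system_time;      /* OS time at the last tick, 100 ns units */
    uint64_t counter_at_tick;  /* performance counter value at that tick */
} tick_snapshot_t;

/* Tick side: advance the OS time and remember the counter value. */
void on_timer_tick(tick_snapshot_t *s, uint64_t tick_increment,
                   uint64_t counter_now)
{
    s->system_time += tick_increment;
    s->counter_at_tick = counter_now;
}

/* Reader side: units_per_count is 1e7 divided by the counter frequency,
   i.e. how many 100 ns units one counter increment is worth. */
uint64_t interpolate_fp(const tick_snapshot_t *s, uint64_t counter_now,
                        double units_per_count)
{
    assert(units_per_count > 0.0);
    /* Unsigned subtraction stays correct even if the counter wraps. */
    uint64_t elapsed = counter_now - s->counter_at_tick;
    return s->system_time + (uint64_t)((double)elapsed * units_per_count);
}
```

With a 2.5 MHz counter, for example, units_per_count is 4.0, so 100 counts after the tick the returned time is 400 units (40 us) past the tick time.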

With a tick interval of not more than 16 ms (i.e. 64 Hz), a 32-bit 
interval can handle a counter frequency of about 256 GHz (2^32 counts 
per tick times 64 ticks per second is 2^38 counts per second), which is 
probably sufficient since the OS clock can only resolve 10 MHz (100 ns 
units). The easiest way to write the above code might still be to do a 
64-bit int subtraction, followed by a 64x64->64 integer mul and a shift 
right, before the result is added to the _SystemTime value from the last 
timer tick.
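
A minimal sketch of that integer variant follows, assuming a 32-bit fixed-point fraction and 100 ns time units. The shift width and the names SCALE_SHIFT, make_multiplier and interpolate_int are choices made for this example only. With these choices the 64-bit product cannot overflow as long as less than about 429 seconds (2^32 units) have passed since the last tick, far more than one tick interval.

```c
#include <assert.h>
#include <stdint.h>

#define SCALE_SHIFT 32  /* assumed number of fraction bits */

/* Precompute (10,000,000 << 32) / counter frequency: the number of
   100 ns units per counter increment, as a 32.32 fixed-point value. */
uint64_t make_multiplier(uint64_t counter_hz)
{
    assert(counter_hz != 0);
    return (UINT64_C(10000000) << SCALE_SHIFT) / counter_hz;
}

/* 64-bit subtract, 64x64->64 multiply, shift right, then add. */
uint64_t interpolate_int(uint64_t time_at_tick, uint64_t counter_at_tick,
                         uint64_t counter_now, uint64_t multiplier)
{
    uint64_t elapsed = counter_now - counter_at_tick;
    return time_at_tick + ((elapsed * multiplier) >> SCALE_SHIFT);
}
```

The multiplier only has to be computed once, when the counter frequency is known, so the per-call cost is the subtract, multiply, shift and add.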

The total overhead of this (bug) fix is 20-300 clock cycles on every 
timer tick, depending upon which hw device the Performance Counter is 
using. For the GetSystemTimePreciseAsFileTime function they add the same 
overhead, plus ~10-15 cycles to do the interpolation.
>
> Is there still a need to tie the timing threads in Windows 7/8 to a
> single processor?  The MSDN makes it sound as though the TSC should be
> identical across multiple processors.

On modern CPUs there is a second TSC which is independent of sleep 
states and temporary turbo mode CPU overclocking, as well as some hw 
support to allow the counters to be synchronized (within the read 
overhead time) between CPUs. At that point there is no need to go 
off-chip to get a stable interval count.
>
> It also says the QPC frequency is unaffected by power saving CPU clock
> mode changes so it would seem there's no need to have the direct _rdtsc
> option instead of just relying on QPC these days.

See above, QPC has always been intended to be stable, so on many Win* 
machines it has used some kind of low-frequency bus clock, running at 
1-3 MHz, but we are obviously moving in the direction of 
high-resolution/low-overhead clock sources.

Terje
-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"


