[ntp:questions] Packet timestamps when using Windows-7/Vista

Martin Burnicki martin.burnicki at meinberg.de
Fri Dec 11 13:33:16 UTC 2009


David,

David J Taylor wrote:
> Folks,
> 
> [Posted to NTP Hackers, but no reaction there as yet]
> 
> I've written a small program which sends some SNTP packets to various NTP
> servers on my LAN, and looks at the timestamps which the server adds as
> its RX and TX times.  With Windows-7 and Vista I'm seeing some odd
> results.  I'm measuring (server TX) - (server RX) time.
> 
> 1 - Bacchus - Windows 2000 Server - GPS/PPS ref. clock - older computer -
> 550MHz Pentium III.  Most values around 70 us (microseconds) with a tail
> up to about 120us.

This looks like the clock interpolation works pretty well here. 70 us sounds
like the "normal" execution time to handle the packet, which may be
extended to e.g. 120 us (or even more) if e.g. an IRQ occurs during the
processing.
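
For reference, a minimal sketch of how the (server TX) - (server RX) value
can be derived from a reply packet. It assumes the standard 48-byte NTP
header layout (receive timestamp at byte offset 32, transmit timestamp at
offset 40, each a 32.32 fixed-point value); the function name is made up for
illustration and this is not David's actual program.

```python
import struct

NTP_FRACTION = 1 << 32  # units per second in the 32-bit fraction field


def server_processing_us(packet: bytes) -> float:
    """Return (server TX) - (server RX) in microseconds for an NTP reply."""
    if len(packet) < 48:
        raise ValueError("an NTP packet is at least 48 bytes long")
    # Receive timestamp: bytes 32..39, transmit timestamp: bytes 40..47,
    # each as 32-bit seconds followed by a 32-bit binary fraction.
    rx_sec, rx_frac, tx_sec, tx_frac = struct.unpack("!4I", packet[32:48])
    # Subtract in integer units first, so that the microsecond-level
    # difference is not lost in floating-point rounding of large values.
    diff_units = (tx_sec - rx_sec) * NTP_FRACTION + (tx_frac - rx_frac)
    return diff_units * 1e6 / NTP_FRACTION
```

A negative return value means the transmit timestamp lies before the receive
timestamp.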

> 2 - Feenix - Windows XP Home - GPS/PPS ref. clock - 1.9GHz single core
> Pentium 4.  Most values around 20-40us, tail to 100us.  Occasional values
> out to 1000us or more.
>  
> 3 - Narvik - Windows XP Pro - LAN-synced - ~2.2GHz dual-core PC.  Most
> values 6 - 11us.  Occasionally more.

Similar to the above. Please keep in mind that timestamping is done in user
space here, so there may not only be IRQs but also task switches etc. which
extend the time between packet reception and transmission of a reply.
 
> 4 - Gemini - Windows Vista - LAN synced - ~2.2GHz dual-core PC.  A
> distribution ranging from about -1000us to +1000us, possibly triangular
> (I'm looking on a log Y-axis).
> 
> 5 - Puffin - Windows Vista - wifi-LAN-synced - ~2+ GHz dual-core PC.
> Similar results to Gemini.
> 
> 6 - Stamsund - Windows 7 - GPS/PPS ref. clock - 2.8GHz single core HT
> Pentium 4.  Most results in the range 17-25us, but with some extremes.
> 
> 7 - Hydra - Windows 7 - LAN synced - single-core AMD 3200+. Similar
> distribution to Gemini.

For Gemini, Puffin, and Hydra:

If you are running one of Dave Hart's 4.2.5 or 4.2.6 binaries then the clock
interpolation may be disabled, and the system time increases in 1 ms steps.

On the other hand, I've also seen systems where the interpolated time steps
back and forth by 1 ms, due to the time passed by Windows to the timer APC
callback stepping by 1 ms. I have *observed* this on XP, but I can imagine
this also happens on newer systems.

Please note:

Even if under Vista/Windows 7 the system time increments in 1 ms steps, the
nominal standard tick count is still ~15600 (15601 on a Vista machine
here), i.e. ~15.6 ms. Since this is not an integral multiple of 1 ms there
must be some math which converts from 1 ms steps to 15.6 ms steps, and that
math may suffer from rounding errors.
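
A small sketch of why such a conversion cannot be exact. It assumes a tick
of 15601 us (taking the value from the Vista machine above as microseconds)
and a reported time truncated to whole milliseconds. This is only an
illustration, not the way the Windows kernel actually does the conversion.

```python
TICK_US = 15601  # assumed nominal tick length in microseconds (~15.6 ms)
MS_US = 1000     # one millisecond in microseconds


def reported_ms_steps(n_ticks: int, tick_us: int = TICK_US) -> list:
    """Return how many whole milliseconds the reported time advances per tick."""
    steps = []
    previous_ms = 0
    for i in range(1, n_ticks + 1):
        # Exact elapsed time after i ticks, truncated to whole milliseconds.
        now_ms = (i * tick_us) // MS_US
        steps.append(now_ms - previous_ms)
        previous_ms = now_ms
    return steps
```

With these assumptions the per-tick advance alternates between 15 and 16 ms,
so a single tick as seen in the 1 ms domain can differ from the real tick
length by most of a millisecond.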

AFAICS this is still the same basic problem as under XP or earlier, when the MM
timer has been set: The MM timer ticks at 1 ms, but the system time ticks
at 15.625 ms, and there also needs to be a conversion from one tick rate to
the other.

The difference in Vista/7 vs. 2000/XP seems to be that
GetSystemTimeAsFileTime returns values from the 1 ms "tick domain" for the
newer systems whereas it returns values from the 15 ms "tick domain" on
older systems.
 
> So there are two things about these results which concern me:
> 
> A - if my program is correct, it seems that the timestamps on the
> Windows-7/Vista systems are being set inconsistently, in that the server
> transmit timestamp can be /before/ the receive timestamp!  From the
> distribution it seems that one timestamp is "precise" and the other a
> Windows value based on a one millisecond (approx) timer.

As I tried to explain above, this looks to me like a +/- 1 LSB (i.e. 1 ms)
problem. IIRC Dave Hart has implemented some code in the clock
interpolation routine which is supposed to reduce the potential +/- 1 ms
jitter in general. However, I'm not sure whether this routine is in effect if
clock interpolation is disabled, e.g. on Vista/7.
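
To illustrate how 1 ms granularity on one timestamp alone can yield a
negative (TX - RX), here is a sketch which assumes the receive timestamp is
precise and the transmit timestamp is truncated to the last 1 ms step. Which
of the two is the coarse one on a real system is an assumption here, and the
function name is made up.

```python
def apparent_delay_us(rx_true_us: int, processing_us: int,
                      step_us: int = 1000) -> int:
    """Return the (TX - RX) seen by a client when only TX is coarse."""
    tx_true_us = rx_true_us + processing_us
    # The transmit timestamp is taken from a clock advancing in 1 ms steps,
    # so it is truncated down to the most recent step.
    tx_reported_us = (tx_true_us // step_us) * step_us
    return tx_reported_us - rx_true_us
```

For example, a packet received 700 us after a 1 ms step and processed in
20 us gives apparent_delay_us(5_000_700, 20) == -700, even though the true
processing time is positive.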
 
> B - Why does the Windows-7 system with a GPS/PPS reference clock not
> behave in the same way i.e. it doesn't give negative (TX-RX) times?

Concerning the 1ms-to-15.6ms conversion mentioned above:
A *possible* reason I can imagine is that this depends on whether the clock
runs too fast or too slow at its nominal tick rate (i.e. the on-board xtal
is above or below its nominal frequency). In one case the frequency drift
compensation has to *add* an offset to the standard tick rate, in the other
case an offset needs to be subtracted. Depending on how the conversion has
been implemented in the Windows kernel, a positive offset may lead to
rounding errors whereas a negative one may not, or vice versa.
All the above are only assumptions.
 
> What I don't know is whether these results are to be expected, whether
> they may have any effect on the operation of NTP, and whether they might
> even be the results of coding errors.  I'm wondering whether this
> indicates that something could be done to improve NTP on Windows-7/Vista,
> and whether it might even provide a further clue as to why 4.2.5 performs
> worse on Windows-7/Vista than on 2000/XP.

IMO it will be very hard to improve things for NTP if you do not know the
exact details of why this happens. The proper solution would be if the MS
developers cared about the clock interpolation, and made the Windows system
time available at a higher resolution, especially since the available API
calls already support higher resolution.

Those guys know how the Windows timekeeping has been coded, they know when
CPU clock rates are switched to save power, etc., and they could handle this
in kernel space. So timekeeping apps like NTP would not need to care about
limitations of the underlying OS.

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany



