[ntp:questions] ntpd wedged again

Dave Hart davehart at gmail.com
Sat Feb 11 15:40:48 UTC 2012


On Sat, Feb 11, 2012 at 08:11, Terje Mathisen <"terje.mathisen at tmsw.no"@ntp.org> wrote:
> Dave Hart wrote:
>>
>> On Fri, Feb 10, 2012 at 03:07, A C <agcarver+ntp at acarver.net> wrote:
>>>
>>> So far so good, it's running with reasonable offsets and jitter (PPS
>>> not yet enabled).
>>>
>>> But this is new to me in the logs:
>>>
>>> 10 Feb 01:41:37 ntpd[242]: ts_min 1328838097.185664676 ts_prev 1328838097.185550676 ts 1328838097.189745353
>>> 10 Feb 01:41:37 ntpd[242]: sys_fuzz 114000 nsec, this fuzz 0.000099410
>>> 10 Feb 01:41:37 ntpd[242]: get_systime prev result 0xd2def451.30cccb9b is 0.000780166 later than 0xd2def451.3099aa99
>>
>>
>> This is from recently-introduced code in ntp-dev reporting a
>> hoped-to-be-impossible situation: a clock reading via ntpd's
>> get_systime() routine whose fuzzed result is earlier than the
>> previous get_systime() result.  These messages are not really
>> intended to be scrutable to anyone unfamiliar with the internals of
>> that routine, and my hope was they would not be seen at all.
>>
>> Briefly, ntpd "fuzzes" each timestamp's bits below sys_fuzz.  The
>> display is inconveniently mixed between decimal and hexadecimal
>> representations, as those were the forms at hand and I didn't want to
>> risk conversion bugs confusing the results.  Until recently, the
>> fuzzing was below the reported precision, that is, the minimum
>> increment observed between successive clock readings.  That is no
>> problem on typically high-precision unix system clocks, but led to
>> predictable "time went backwards" issues with low-precision clocks,
>> such as ntpd on Windows using the native system clock ticking every
>> 0.5 or 1.0 msec.
>>
>> The new code measures not only the minimum increment between
>> successive readings, but also the minimum time to read the clock,
>> which is not the same with a low-precision system clock.  If ntpd
>> reads the system clock at the same value 100 times in a row before it
>> increments by 1 msec, the code infers the time to read the clock is
>> 1 msec / 100, or 10 usec, and sets sys_fuzz to 10 usec.  Later, when
>> reading the system clock and converting to NTP 32:32 fixed-point
>> representation, a random value of up to 10 usec is added to the
>> value from the OS, where previously up to 1000 usec (1 msec) would
>> have been added.
>>
>> At the same time the fuzzing approach was changed, paranoid checks
>> were added to make sure both the raw OS clock and the converted,
>> fuzzed 32:32 result are always advancing.  So either those paranoid
>> checks are wrong, or the code that measures the fuzz at startup is
>> wrong, leading to the logically impossible result of the clock
>> appearing to move backward relative to the prior reading.
>>
>> In this specific case, the minimum time to read the clock was measured
>> at ntpd startup to be 114 usec, so each raw OS clock reading is
>
> OK, that's the problem right there: That value is obviously wrong!

You made the easy-to-understand mistake of assuming A C is running
ntpd on Windows.  He's not; he's running it on a museum-quality Sun
IPX.
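
For reference, the startup measurement described above boils down to
roughly the following simplified sketch (made-up names, plain POSIX
clock_gettime(), and not the actual ntp-dev code):

#include <stdio.h>
#include <time.h>

/* Read the OS clock as a double, seconds since the epoch. */
static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    double prev = now();
    double min_inc = 1.0;       /* smallest step seen: the "precision" */
    double min_read = 1.0;      /* estimated cost of a single reading */
    long reads = 0;
    int steps = 0;

    while (steps < 100) {
        double cur = now();
        reads++;
        if (cur > prev) {
            double inc = cur - prev;
            /* If the clock sat still for N reads and then stepped
             * by inc, one read costs about inc / N. */
            if (inc < min_inc)
                min_inc = inc;
            if (inc / reads < min_read)
                min_read = inc / reads;
            prev = cur;
            reads = 0;
            steps++;
        }
    }
    printf("precision %.9f s, min read time (sys_fuzz) %.9f s\n",
           min_inc, min_read);
    return 0;
}

On a high-precision clock every reading advances, so the estimated
read time collapses to the precision; on a tick-based clock it comes
out much smaller, which is the whole point.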

> The Win* OS clock is so dead simple to read that I would expect the minimum
> reading time to be well below a us!
>
> Even though the clock is maintained by the HAL, and therefore impossible to
> fix properly, even with a regular driver, the read interface should be
> effectively a usermode function call with a single load/store operation:
>
> I.e. GetSystemTimeAsFTime(LARGE_INTEGER *) could look like this:
>
> int GetSystemTimeAsFTime(LARGE_INTEGER *tstamp)
> {
>     if (!tstamp) return 0;
>     *tstamp = _kSystemClockCurrentValue; // ??? some kernel RO value
>     return 1;
> }

Right.  NT maps some information into every process, including the
64-bit clock returned by GSTAFT().  A straightforward switch to a real
interpolated system clock would require GSTAFT() to suddenly become an
order of magnitude more expensive, which I suspect is part of the
reason it hasn't happened yet.  I would advocate adding new routines
to query a new interpolated system clock, leaving the existing ones
furiously fast for maximum application compatibility.  Those routines
would be pretty straightforward to implement using the existing
performance counter facility, changing the tick-time code to not only
update the 64-bit value in shared memory but also snag a performance
counter value at tickover.  Then reading the interpolated clock would
involve reading the "old" clock and calculating how far into the tick
we are based on the current perf counter value.
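
In rough C, the read side of such a hypothetical interpolated clock
could look like the sketch below; g_tick_time and g_tick_qpc are
invented names for the pair of values the modified tick handler would
publish at each tickover:

#include <windows.h>

/* Invented names: the 64-bit clock value at the last tickover and
 * the performance counter captured at the same instant, published
 * by the (hypothetical) modified tick-time code. */
extern volatile ULONGLONG g_tick_time;   /* 100 ns FILETIME units */
extern volatile LONGLONG  g_tick_qpc;

/* Sketch of a new interpolated-clock query: start from the coarse
 * per-tick clock and add how far into the current tick we are,
 * measured with the performance counter. */
void GetInterpolatedSystemTime(ULONGLONG *now)
{
    LARGE_INTEGER qpc, freq;

    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&qpc);

    /* perf counter ticks elapsed since tickover, converted to
     * 100 ns units; elapsed stays small within one tick, so the
     * multiply won't overflow */
    LONGLONG elapsed = qpc.QuadPart - g_tick_qpc;
    *now = g_tick_time + (ULONGLONG)(elapsed * 10000000 / freq.QuadPart);
}

A real implementation would also need to read the published pair
atomically, or retry on a sequence counter, so a tickover landing
between the two loads couldn't produce a torn result.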

> The variable is named something else of course, but since Win* doesn't even
> try to report an interpolated clock value, it can in principle be almost as
> simple as that.
>
> Have you tried to measure the minimum clock reading time with RDTSC
> or QueryPerformance* calls?
>
> I wrote a tiny test program on my Win7-64 laptop, it got:
>
> Reading the system clock 10000000 times, minimum reading time = 24 clock
> cycles,
>  minimum OS step = 0 ticks, maximum OS step = 10000 ticks
>
> The clock frequency is 2.7 GHz or so, the FileTime ticks should be 100 ns
> each, so my OS clock is making 1 ms steps, while the clock reading time is
> down to less than 10 ns!

Yes, this is not news to me.  If you ran recent ntp-dev on that
machine you'd likely see "proto: fuzz beneath 0.100 usec" or so logged
at startup, presuming ntpd didn't decide to use the native clock
directly (disabling app-level interpolation).  The message is logged
only when sys_fuzz is less than the measured precision, which is to
say basically only on systems without what portable ntpd code sees as
a high-precision clock.
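
If anyone wants to repeat Terje's experiment, a minimal version of
that kind of test loop might look like this (a sketch using the MSVC
__rdtsc intrinsic; his actual test program no doubt differs):

#include <stdio.h>
#include <windows.h>
#include <intrin.h>

/* Minimal sketch of the measurement Terje describes: time a pile of
 * GetSystemTimeAsFileTime() calls with RDTSC and track the smallest
 * read cost plus the OS clock's step sizes in 100 ns FILETIME ticks. */
int main(void)
{
    const long iters = 10000000;
    unsigned __int64 min_cycles = (unsigned __int64)-1;
    ULONGLONG min_step = (ULONGLONG)-1;
    ULONGLONG max_step = 0;
    ULONGLONG prev, cur, step;
    FILETIME ft;
    long i;

    GetSystemTimeAsFileTime(&ft);
    prev = ((ULONGLONG)ft.dwHighDateTime << 32) | ft.dwLowDateTime;

    for (i = 0; i < iters; i++) {
        unsigned __int64 t0 = __rdtsc();
        GetSystemTimeAsFileTime(&ft);
        unsigned __int64 cost = __rdtsc() - t0;

        if (cost < min_cycles)
            min_cycles = cost;
        cur = ((ULONGLONG)ft.dwHighDateTime << 32) | ft.dwLowDateTime;
        step = cur - prev;
        if (step < min_step)
            min_step = step;
        if (step > max_step)
            max_step = step;
        prev = cur;
    }
    printf("min read %llu cycles, OS steps %llu..%llu ticks\n",
           min_cycles, min_step, max_step);
    return 0;
}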

>> This code is green and could well be wrong.  I encourage others seeing
>> such messages in syslog or a configured ntpd log file to share a few
>> examples here or in a bug report.
>>
>> Those who followed the introduction of a high-precision clock into
>> FreeBSD many years ago might recall similar OS-level warnings
>> reporting that time appeared to run backwards, seen for example when
>> using processor cycle counters on SMP systems where the counters are
>> not synchronized across processors.  I hope these messages will be
>> tamed by a bug fix and the new paranoid checks can continue to be
>> used, with more comprehensible warning text when the checks fail.
>>
> Good plan!

Thanks.  I knew when adding the code what a nightmare the FreeBSD
"time went backwards" warnings had been in practice, so seeing several
different people already report the paranoid checks firing in the wild
has me filled with dread.

Take care,
Dave Hart

