[ntp:questions] making sense of stats offset values [or trying to...]

David Woolley david at ex.djwhome.demon.co.uk.invalid
Tue Apr 28 06:57:26 UTC 2009


Bruce Lilly wrote:
> Running ntp
> version="ntpd 4.2.4p5 at 1.1541-o Mon Jan 19 15:18:44 UTC 2009 (1)"
> as reported by ntpq, on opensuse 11.1 Linux, if that matters.
> 
> I'm trying to make sense of the time offset numbers reported in
> loopstats and peerstats files and by ntptrace.
> The documentation is unclear on a few points, and ntptrace appears to
> be broken:
> 
> 1. peerstats:
>   the sign is unspecified in the documentation, but has been described
> here as such that adding

For statistical purposes, the sign shouldn't matter.  The only time it 
might matter is if you were trying to retrospectively correct the time 
scale.

>   the offset to the local clock should give time equivalent to the
> remote peer; i.e. a positive offset
>   means that the local clock is early compared to the remote clock,
> and a negative offset means

In normal operation, offset should tell you more about random
measurement errors than about the peer.  There are strong arguments
that this is often not the case in real life, but that represents a 
failure of ntpd to work well in the real world.  If a clock discipline 
can be sure that a certain proportion of the offset represents a real 
local clock error, it should be attempting to remove that offset 
promptly.  ntpd's position is that it is more likely to represent 
measurement error.

>   that the local clock is late.
> 
>   Is that correct?  If so, a clarification to the description in the
> "monopt" documentation might be
>   helpful to others.
> 
> 2. loopstats:
>   the distributed documentation is totally unclear.  I have found a
> Sun Microsystems document that
>   describes the offset as "how much time (in seconds) the clock will
> be adjusted by in the loop cycle".
>   a. Awkward wording notwithstanding, is that correct?

Definitely not correct.  It is the input to a combining and weighting
process, whose output in turn feeds a low pass filter with a time
constant much longer than the poll interval, so only a small part of any
particular offset measurement gets applied in any one interval.
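
As a rough illustration of that last point (a minimal sketch, not ntpd's
actual loop filter): with a simple first-order filter, the fraction of a
measured offset applied per poll is roughly the poll interval divided by
the time constant.  The function name, the 64 s poll interval and the
1024 s time constant are assumptions chosen for the example.

```python
def discipline(initial_error, poll_interval, time_constant, polls):
    """Toy first-order loop: apply a small fraction of the offset per poll.

    Illustration only; ntpd's real discipline is more elaborate.
    Returns a list of (correction_applied, remaining_error) per poll.
    """
    gain = poll_interval / time_constant  # fraction applied each interval
    error = initial_error
    history = []
    for _ in range(polls):
        correction = gain * error  # only part of the offset is removed
        error -= correction
        history.append((correction, error))
    return history


# Hypothetical numbers: 10 ms offset, 64 s polls, 1024 s time constant.
for correction, remaining in discipline(0.010, 64, 1024, 3):
    print(f"applied {correction:.6f} s, remaining {remaining:.6f} s")
```

With these made-up numbers only one sixteenth of the 10 ms offset is
removed in the first interval.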

>   b. is the adjustment intended to remove the entire offset between
> the local clock and the best-guess

No.  As above.

>       estimate of UTC, i.e. can the loopstats offset field be
> interpreted as the offset between the local
>       clock and the best-guess estimate of UTC?  Or something else?

Yes, but with the qualification that the best guess has an error band 
which is comparable with the offset.

>   c. what about the sign in this case?
> 
> 3. ntptrace output:
>    The man page (oddly enough, with version in the lower left as
> 4.1.1b-r5) gives an example:

I believe ntptrace is unsupported.

> 
>     % ntptrace localhost: stratum 4, offset 0.0019529, synch distance
> 0.144135
>     server2ozo.com: stratum 2, offset 0.0124263, synch distance
> 0.115784
>     usndh.edu: stratum 1, offset 0.0019298, synch  distance  0.011993,
> refid
> 
>     [let's ignore the missing stratum 3 and the disappearing refid
> value]
> 
>  and text
>     On  each  line,  the  fields  are (left to right): the host name,
> the host stratum, the time offset between
>     that host and the local host (as measured by ntptrace ; this is
> why it is not always zero for "localhost ")...

It is (simplifying slightly) the time offset between the local clock
when the response is received and the server's own clock as it read one
actual return propagation time earlier, plus half the round trip time.
Either time may have large reading errors due to clock resolution, e.g.
W32time has a reading error that can exceed 10ms.
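
In terms of the four standard NTP timestamps, that description works out
as follows.  This is a sketch of the usual on-wire calculation, not code
taken from ntptrace; the function name and the sample timestamps are
invented for illustration.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: request sent      (client clock)
    t2: request received  (server clock)
    t3: response sent     (server clock)
    t4: response received (client clock)
    """
    delay = (t4 - t1) - (t3 - t2)            # round trip, minus server hold time
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # server clock minus client clock
    return offset, delay


# Hypothetical exchange: client 0.25 s behind the server, 0.125 s each way.
offset, delay = ntp_offset_and_delay(10.0, 10.375, 10.375, 10.25)

# Same offset, phrased as "server send time minus local receive time,
# plus half the round trip time".
alternative = (10.375 - 10.25) + delay / 2.0
print(offset, delay, alternative)
```

The half round trip term assumes the outbound and return paths take
equal time; any asymmetry shows up directly as offset error.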

> 
>   This is completely baffling:
>    a. what does it mean for the local host to have a time offset from
> itself?

It means that it takes a finite time for IP messages to propagate from 
one process to another through the networking layer, and for process 
scheduling to switch between processes.  On a machine with poor clock 
resolution, there could be a large measurement difference for a small 
propagation delay.

>    b. are the offset values cached or determined from cached data [if
> I run ntptrace twice a couple of

No.

>        seconds apart, I get offset values identical from one run to
> the next down to the last reported digit,
>        while the synchronization distances vary significantly]?
>    c. is it intended that the offset reported by ntptrace bear no
> resemblance to that reported by ntpq -p
>        and in peerstats?:

They would generally be larger, because ntpq offsets represent the
lowest delay values from the last eight polls, spread over anything from
several minutes to over an hour, whereas ntptrace represents a one-off,
immediate measurement.
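
A simplified sketch of the difference: keep the last eight (offset,
delay) samples and report the offset that came with the lowest delay.
The class name and sample values are hypothetical, and ntpd's real clock
filter does more than this.

```python
from collections import deque


class MinDelayFilter:
    """Toy clock filter: remember the last `size` samples, prefer low delay."""

    def __init__(self, size=8):
        self.samples = deque(maxlen=size)  # oldest sample drops off automatically

    def add(self, offset_ms, delay_ms):
        self.samples.append((offset_ms, delay_ms))

    def best_offset(self):
        # The sample with the shortest round trip is assumed least disturbed.
        return min(self.samples, key=lambda s: s[1])[0]


# Made-up samples: one poll was delayed in the network and reads badly.
clock_filter = MinDelayFilter()
for offset_ms, delay_ms in [(0.31, 2.9), (2.10, 9.5), (0.28, 2.8), (0.45, 3.4)]:
    clock_filter.add(offset_ms, delay_ms)

print(clock_filter.best_offset())
```

A single immediate reading, as ntptrace takes, could just as easily be
the 2.10 sample as the 0.28 one.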

ntptrace is also probably running at normal priority and without any
memory locked into physical memory.

ntptrace offsets are in seconds, whereas ntpq offsets are in milli-seconds.

>        # ntpq -p
>             remote           refid      st t when poll reach   delay
> offset  jitter
>         *megatron.blilly 18.26.4.105      2 u   27   64  377
> 2.927    0.296   0.122
>         # ntptrace
>         megatron.blilly.net: stratum 2, offset 0.002120, synch
> distance 0.024161
> 
>        Note that ntpq reports an offset of 0.296 milliseconds from the
> local host to its system peer, while
>        ntptrace reports an order of magnitude larger offset!

ntptrace's 0.002120 seconds is 2.120 milli-seconds, which is not an
order of magnitude different from 0.296 milli-seconds.  I think you are
expecting milliseconds to have three zeros after the decimal point; they
don't!
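
Putting both figures into the same unit makes the comparison explicit.
The variable names are illustrative; the two values are the ones quoted
above.

```python
import math

ntptrace_offset_s = 0.002120   # ntptrace reports seconds
ntpq_offset_ms = 0.296         # ntpq -p reports milliseconds

ntptrace_offset_ms = ntptrace_offset_s * 1000.0
ratio = ntptrace_offset_ms / ntpq_offset_ms
orders_of_magnitude = math.log10(ratio)  # 1.0 would be a full factor of ten

print(f"ntptrace: {ntptrace_offset_ms:.3f} ms, ntpq: {ntpq_offset_ms:.3f} ms")
print(f"ratio: {ratio:.1f}, log10(ratio): {orders_of_magnitude:.2f}")
```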

> 
>         Should I really believe what ntptrace says, viz. that the
> local host is offset from a remote
>         stratum 1 server by a mere 3 microseconds in spite of orders
> of magnitude larger values of
>         jitter (and that from a program that says the local host is
> offset from itself by hundreds of
>         microseconds!)?

An instantaneous offset reading can be anywhere within the error band 
that jitter is trying to estimate, so you can believe it as much as any 
other reading that is within the error band.
> 
> Ultimately I'm trying to do a couple of things:
> 1. determine if the loopstats offset value can be correlated to
> something informative about the
>     system time of the local host, such as an estimate of the local
> clock offset from UTC.
> 2. determine the best-guess estimate of the offset of a given peer
> from UTC.

If you operate ntpd in the environment for which Dave Mills designed it, 
the best guess, in real time, should always be zero.  Some people, 
including myself, believe that there are real life cases, e.g. for 
start-up and temperature induced frequency transients, where this 
assumption breaks down.



