[ntp:questions] offset is correlated with delay

Andrew Schulman andrex at deadspam.com
Sun Jan 18 05:23:27 UTC 2004

When I run 'ntpq -c peers', I've noticed that offset is correlated with
delay.  Here's a recent set of output:

# ntpq -nc peers
 remote          refid     st t when poll reach   delay   offset  jitter
+x.x.x.x        x.x.x.x     3 u   9h  36h  377   13.893  -13.756   6.267
+x.x.x.x        x.x.x.x     2 u   8h  36h  377  201.608   81.170  90.862
*x.x.x.x        x.x.x.x     2 u   8h  36h  377   23.361  -13.657   5.336

(Server addresses obscured.)  This is typical of what I see, in that the
server with the large delay also shows a large positive offset.

If I understand correctly (the docs don't ever say, as far as I can find),
"delay" is the round-trip time (in ms) of the query to the server, and
"offset" is the estimated difference between my clock and the server's (or
vice versa-- whichever).

Here's my hypothesis about how this might happen:

Delay is the sum of three smaller delays:

  query delay  = travel time of my NTP query packet to the server
  server delay = time spent waiting for the server to answer once it's
                 received the query
  return delay = travel time back from the server

Large delays are usually (according to my hypothesis) caused by server
overload, so that they mostly consist of server delay.

Now, to estimate the true time based on data from the server, ntpd takes
the time in the server's response and adds half of the total delay, which
is its best available estimate of the return delay.  But (assuming that
query delay and return delay are about the same) half the total delay
equals the return delay plus half of the server delay, which is too large.
Hence the offset is too large, by about half of the server delay.
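
To make the arithmetic concrete, here is a minimal Python sketch of the
hypothesis only.  It is not how ntpd actually computes offset; it assumes
the server stamps its reply at the moment it sends it, and the function
name and the sample figures (10 ms each way, 180 ms on the server) are
made up for illustration.

```python
def hypothesized_offset_error(query_ms, server_ms, return_ms):
    """Error in the time estimate under the hypothesis above.

    Assumption (hypothetical model, not ntpd's real algorithm): the client
    adds half of the total round-trip delay to the server's timestamp,
    when it should have added only the return delay.
    """
    total_delay = query_ms + server_ms + return_ms
    assumed_return = total_delay / 2      # what the client adds
    return assumed_return - return_ms     # how far off that is


# Symmetric network paths, overloaded server (illustrative numbers).
error = hypothesized_offset_error(10.0, 180.0, 10.0)
print(error)  # half of the 180 ms server delay
```

With equal query and return delays, the error reduces to exactly half of
the server delay, whatever the network delay is.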

If my hypothesis is true, then the increase in offset should be about half
of the increase in delay (server delay).  And that does seem to be the
case-- it's about right in the example above.
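
As a quick check of "about half", this short Python sketch compares the
high-delay peer with each of the other two, using only the delay and
offset columns from the ntpq output above.  The variable and function
names are my own, and it assumes the two low-delay peers are a fair
baseline.

```python
# (delay_ms, offset_ms) for the three peers, in the order listed above
peers = [
    (13.893, -13.756),
    (201.608, 81.170),
    (23.361, -13.657),
]


def offset_delay_ratio(slow, fast):
    """Increase in offset divided by increase in delay between two peers."""
    d_delay = slow[0] - fast[0]
    d_offset = slow[1] - fast[1]
    return d_offset / d_delay


slow = peers[1]  # the peer with the 201.608 ms delay
ratios = [offset_delay_ratio(slow, fast) for fast in (peers[0], peers[2])]
for r in ratios:
    print(f"{r:.3f}")
```

Both ratios come out slightly above 0.5, which is consistent with the
hypothesis but, with only three peers, hardly proves it.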

I know that the real method of estimation is more complicated than this,
that there's a statistical model involved.  Right now I'm just trying to
get the basic idea right.  But another possibility is that the statistical
model introduces this correlation.

What do you think?  Am I on the right track here?  Is this a well-known
problem?  Feel free to set me straight if I've completely missed the
boat :)


To reply by email, change "deadspam.com" to "alumni.utexas.net"
