[ntp:questions] Re: offset is correlated with delay
Richard B. Gilbert
rgilbert88 at comcast.net
Sun Jan 18 13:41:43 UTC 2004
The delay bounds the error; that is, we know that the server's timestamps
occurred between the time the packet left our system and the time it
returned. There are four different timestamps in there: two from our
clock and two from the server's clock. Those four timestamps are used
to determine both the delay and the offset. See RFC1305, page 100 et
seq., for the math.
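For anyone who doesn't want to dig through the RFC, here is a minimal sketch of that calculation in Python. The function name and the example numbers are made up for illustration, and it ignores everything ntpd does afterwards (filtering, selection, clock discipline):

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Return (offset, delay) for one client/server exchange.

    t1: our clock when the request left
    t2: server's clock when the request arrived
    t3: server's clock when the reply left
    t4: our clock when the reply arrived
    """
    a = t2 - t1            # outbound transit time plus the clock offset
    b = t3 - t4            # clock offset minus the return transit time
    delay = a - b          # round trip, with the server's hold time (t3 - t2) removed
    offset = (a + b) / 2   # server minus client, exact only if both paths are equal
    return offset, delay


# Made-up example in milliseconds: server 5 ms ahead, 10 ms each way,
# 2 ms spent inside the server.
offset, delay = ntp_offset_delay(1000, 1015, 1017, 1022)
print(offset, delay)
```

Note that the true offset can't be further from the computed one than half the delay, which is what "the delay bounds the error" means.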
If I haven't made a stupid mistake somewhere, the transit delays are at
least 5.4usec per mile (that is the speed of light in a vacuum; signals
in copper or fiber are slower), plus the packet size in bits divided by
the bit rate, plus the delays introduced by every network router along
the way. I
did a traceroute to one of the servers I was using at the time and found
there were something like seventeen such devices between me and the
server! Each one has to decode enough of the packet to find out where
it's going and then figure out the next step in getting it there. This
takes a little time and these times add up.
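A back-of-the-envelope version of that sum, as a sketch only. The function name is hypothetical, the per-router figure is a pure guess you would have to supply yourself, and the packet-size term is counted once, as in the paragraph above, although a store-and-forward network really pays it on every link:

```python
USEC_PER_MILE = 5.4  # figure quoted above; treat it as a lower bound


def transit_delay_usec(miles, packet_bits, bits_per_sec, hops, per_hop_usec):
    """Rough one-way transit delay in microseconds."""
    propagation = miles * USEC_PER_MILE                # time on the wire
    serialization = packet_bits / bits_per_sec * 1e6   # time to clock the bits out
    forwarding = hops * per_hop_usec                   # assumed cost per router
    return propagation + serialization + forwarding


# Assumed numbers: 1000 route miles, a 76-byte packet (48-byte NTP message
# plus UDP and IP headers), a T1 link, 17 routers at a guessed 50 usec each.
print(transit_delay_usec(1000, 76 * 8, 1.544e6, 17, 50.0))
```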
The distance between you and the server is not necessarily even close to
the distance shown on a map. Did you ever fly from New York to Los
Angeles direct and return with a stop in Dallas-Fort Worth? Network
routers will use whatever path is "best" at the moment! The "best" path
is not necessarily the shortest.
The network delays usually dominate according to page 101 of RFC1305.
Remember that the average RISC processor (Alpha, Sparc, etc.) can swamp
a 100 megabit/second ethernet! Most WAN connections handle far less
than that. T1 is about 1.5 megabits/second and T3 is 45 megabits/second. Cable
modem and ADSL handle less than that. I suspect that "server overload"
really means "server network overload".
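To illustrate why I suspect the network rather than the server, here is a small simulation of one exchange. This is my own sketch with made-up numbers, not code from ntpd, and the function name is hypothetical. Because the calculation subtracts the time the packet sits in the server, a slow server by itself should not move the offset; unequal outbound and return delays do, by half their difference:

```python
def exchange(true_offset, outbound, hold, inbound, t1=0.0):
    """Simulate one exchange (all values in ms); return (offset, delay).

    true_offset: server clock minus client clock
    outbound:    network delay, client to server
    hold:        time the packet waits inside the server
    inbound:     network delay, server to client
    """
    t2 = t1 + outbound + true_offset   # server clock when the request arrives
    t3 = t2 + hold                     # server clock when the reply leaves
    t4 = t3 - true_offset + inbound    # client clock when the reply arrives
    delay = (t4 - t1) - (t3 - t2)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay


# Clocks actually agree (true_offset = 0) in all three cases.
print(exchange(0.0, 10.0, 0.0, 10.0))     # quiet server, symmetric path
print(exchange(0.0, 10.0, 180.0, 10.0))   # server holds the packet 180 ms
print(exchange(0.0, 190.0, 0.0, 10.0))    # 180 ms of extra outbound delay
```

Only the third case shows both a large delay and a large offset, and the offset error there is half of the extra delay.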
Would anybody running an "overloaded" server care to comment? Is the
machine really working up a sweat or is network congestion the problem?
Andrew Schulman wrote:
>When I run 'ntpq -c peers', I've noticed that offset is correlated with
>delay. Here's a recent set of output:
># ntpq -nc peers
>     remote           refid      st t when poll reach   delay   offset  jitter
>+x.x.x.x         x.x.x.x          3 u   9h  36h  377   13.893  -13.756   6.267
>+x.x.x.x         x.x.x.x          2 u   8h  36h  377  201.608   81.170  90.862
>*x.x.x.x         x.x.x.x          2 u   8h  36h  377   23.361  -13.657   5.336
>(server addresses obscured). This is typical of what I see, in that the
>server with the large delay also shows a large positive offset.
>If I understand correctly (the docs don't ever say, as far as I can find),
>"delay" is the round-trip time (in ms) of the query to the server, and
>"offset" is the estimated difference between my clock and the server's (or
>vice versa-- whichever).
>Here's my hypothesis about how this might happen:
>Delay is the sum of three smaller delays:
>query delay = travel time of my NTP query packet to the server
>server delay = time spent waiting for the server to answer once it's
>received the query
>return delay = travel time back from the server.
>Large delays are usually (according to my hypothesis) caused by server
>overload, so that they mostly consist of server delay.
>Now to estimate the true time based on data from the server, ntpd adds the
>time in the server's response to half of the total delay, which is its best
>available estimate of the return delay. But half the total delay equals
>(assuming that query delay and return delay are about the same) the return
>delay plus half of the server delay, which is too large. Hence the offset
>is too large, by about half of the server delay.
>If my hypothesis is true, then the increase in offset should be about half
>of the increase in delay (server delay). And that does seem to be the
>case-- it's about right in the example above.
>I know that the real method of estimation is more complicated than this,
>that there's a statistical model involved. Right now I'm just trying to
>get the basic idea right. But another possibility is that the statistical
>model introduces this correlation.
>What do you think? Am I on the right track here? Is this a well-known
>problem? Feel free to set me straight if I've completely missed the