[ntp:questions] strange behaviour of ntp peerstats entries.

Unruh unruh-spam at physics.ubc.ca
Wed Jan 30 20:02:15 UTC 2008


Brian Utterback <brian.utterback at sun.com> writes:

>Unruh wrote:
>> "David L. Mills" <mills at udel.edu> writes:
>> 
>>> Unruh,
>> 
>>> It would seem self evident from the equations that minimizing the delay 
>>> variance truly does minimize the offset variance. Further evidence of 
>>> that is in the raw versus filtered offset graphs in the architecture 
>>> briefings. If nothing else, the filter reduces the variance by some 10 
>>> dB. More to the point, emphasis added, the wedge scattergrams show just 
>> 
>> I guess then I am confused because my data does not support that. While the
>> delay variance IS reduced, the offset variance is not. The correlation
>> between delay and offset IS reduced by a factor of 10, but the clock
>> variance is not reduced at all.
>> 
>> Here are the results from one day gathered from one clock. (I had ntp not
>> only print out the peer->offset and peer->delay as it does in
>> record_peer_stats, but also the p_offset and p_del, the offset and delays
>> calculated for each packet.) I also throw out the outliers (for some reason
>> the system would all of a sudden have packets which were 4ms round trip,
>> rather than 160usec). These "popcorn" spikes are clearly bad. The difference
>> between the variance as calculated from the peer->offset values, and the
>> p_offset values was
>> 

>I do not know your network configuration at all, so I am just guessing,
>but my guess is that you are talking about a client connected on the
>same subnet with one or more servers, right? Connected by Ethernet?

Yes, it is on the same subnet. Typical round trip is 160usec. I agree that
a 4ms spike in 40 or 400 would not be as noticeable.

Those spikes appear to be caused by network switches forgetting the ARP
entries when the system is at a max poll of 10 or more.

 

>In that case, you are talking about a situation where the error
>introduced by factors that increase in correlation with the round
>trip time is minimal at best. When they do kick in, you see what
>looks like huge jumps and filter them. A 4ms increase is just what
>you would expect when Ethernet timers kick in. Now imagine a RTT of
>60-70ms. A 9ms delay from a collision introduces a 4ms change in the
>delay value and a 2ms change in the offset, but at that RTT might not
>perturb the delay value enough to make it obviously an outlier.
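
To make the arithmetic in that paragraph concrete, here is a minimal sketch
using the standard NTP on-wire formulas for offset and delay from the four
packet timestamps. The helper exchange(), the 4ms spike on the outbound
path only, and the one-way delays (80usec for the LAN, 35ms for the longer
path) are my assumptions for illustration, not measured data.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in seconds).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def exchange(true_offset, out_delay, back_delay, server_hold=0.0, t1=0.0):
    """Hypothetical helper: build the four timestamps of one exchange.

    true_offset is server clock minus client clock; delays are one-way.
    """
    t2 = t1 + out_delay + true_offset      # server clock at arrival
    t3 = t2 + server_hold                  # server clock at reply
    t4 = t3 - true_offset + back_delay     # client clock at arrival
    return t1, t2, t3, t4


extra = 0.004  # assumed 4 ms spike, added to the outbound path only
results = {}
for name, one_way in (("LAN", 80e-6), ("WAN", 0.035)):
    off0, del0 = ntp_offset_delay(*exchange(0.0, one_way, one_way))
    off1, del1 = ntp_offset_delay(*exchange(0.0, one_way + extra, one_way))
    results[name] = (del0, del1, off1 - off0)
    print(f"{name}: delay {del0 * 1e3:.3f} ms -> {del1 * 1e3:.3f} ms, "
          f"offset shift {(off1 - off0) * 1e3:.3f} ms")
```

In both cases a one-sided spike adds its full size to the delay and half
its size to the offset. On the LAN the delay goes from 0.16ms to 4.16ms,
which is an obvious outlier; on the longer path it goes from 70ms to 74ms,
which is much harder to tell apart from ordinary jitter.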

Again, while I understand the desire for robustness, designing ntp so it
always assumes the very worst case scenario seems like overkill.
Very few people are 160msec away from servers.
Also the use of "pre" packets (whether ping or ntp) really would be useful
in ntp precisely to wake up switches and reduce these kinds of switch
delays. I.e., solutions which try to get the best data are better than
solutions which throw away most of the data. (I had a stretch of 15
consecutive data thrown out by the clock filter, so my hypothetical is
absolutely not hypothetical.)
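
As a rough illustration of how a minimum-delay filter ends up discarding
samples, here is a toy model. It is a simplified sketch, not ntpd's actual
clock filter code: the class name MinDelayFilter, the eight-sample window,
the rule "use the lowest-delay sample only if it is newer than the last one
used", and the made-up delays are all assumptions for illustration.

```python
from collections import deque


class MinDelayFilter:
    """Toy minimum-delay filter over the most recent samples."""

    def __init__(self, stages=8):
        self.window = deque(maxlen=stages)  # (seq, offset, delay), oldest first
        self.last_used_seq = -1
        self.seq = 0

    def add(self, offset, delay):
        """Add a sample; return the chosen offset, or None if discarded."""
        self.window.append((self.seq, offset, delay))
        self.seq += 1
        # Pick the sample with the smallest delay (oldest wins on ties).
        best_seq, best_offset, _ = min(self.window, key=lambda s: s[2])
        if best_seq <= self.last_used_seq:
            return None  # best sample was already used: nothing new to report
        self.last_used_seq = best_seq
        return best_offset


# One unusually fast packet followed by 15 ordinary ones (assumed values).
delays = [100e-6] + [160e-6] * 15
f = MinDelayFilter()
used = [f.add(0.0, d) is not None for d in delays]
print(sum(used), "of", len(used), "samples used")
```

In this toy model a single low-delay sample suppresses every sample after
it until it drops out of the window, even though the later samples are
perfectly good. The real filter differs in detail, so this does not
reproduce the stretch of 15 I saw, only the mechanism.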





