[ntp:questions] Flash 400 on all peers; can't get ntpd to be happy

Ralph ralph at depth.net
Sat Mar 12 19:10:30 UTC 2011



On Friday, March 11, 2011 11:49:39 PM UTC-8, unruh wrote:
> 
> No, that is not the problem. The problem is that the computer has an
> internal clock that depends on things like counting processor cycles. If
> suddenly the processor disappears for a while with no processor cycles,
> the timing will be messed up. ntpd cannot do anything about that. It
> just looks as if the local clock has suddenly slewed backwards. 
> 
> > the local ticks simply can't be trusted?  Keep track of how far off the local 
> > clock is from the ntp sources (averaged over numerous queries) and adjust the 
> 
> How?
> 
> > clock based on the average adjustment that is needed.  Don't mess with trying 
> 
> That is what ntpd does. It does it by adjusting the rate of the clock.
> 
> > to calculate the time taken for the round trip and all that, if the replies
> 
> And how does it know what the ntp sources say the time is then?
> 

I think your response demonstrates where the thinking is 'stuck inside the box'.
Stop being so concerned with the internal clock ticks - assume they are wrong,
assume they are variable, and don't try to use them for any measurement of
time. Simply try to figure out, on average, how far off they are. I know this
is what ntpd does, but it does it in a way that requires the ticks to be
consistent, because it is trying to compensate for trip time.

What I'm saying is that you don't bother compensating for trip time, because
the distance between (org) and (dst) is not linear and consistent the way the
time between (rec) and (xmt) is. Instead, if you just use (xmt) as the
'correct' time and use the difference between (dst) and (xmt) as the
adjustment that needs to be made, then you can get to a point where you know
how much to slow down or speed up the ticks, on average, over time.
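
A rough sketch of that idea in Python (the names are mine, purely for
illustration - this is the proposal, not what ntpd actually does):

    # Treat the server's transmit timestamp (xmt) as truth and just
    # record how far the local arrival timestamp (dst) is from it,
    # with no attempt to compensate for the network round trip.
    def raw_offset(xmt: float, dst: float) -> float:
        # Positive means the local clock is behind the server.
        return xmt - dst

    # Average many raw offsets so the jitter in individual replies
    # washes out over time.
    def average_offset(samples: list[float]) -> float:
        return sum(samples) / len(samples)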

So take a VM whose ticks occur the following number of nanoseconds apart...
1,3,5,5,2,1,1,5,5,2,2,1,1,5,5,1,1,2,2,1 - and let's say that for time to run
'properly' you need one tick every 3 nanos. So what you need to do is add
0.45 nanos to every tick. At the core I don't think this is any different
from what ntpd does today; but the difference is that if ntpd sends out an
NTP packet and gets nothing but 5 ns ticks, and then sends out another and
gets nothing but 1 ns ticks, its calculation of the round trip is totally
inconsistent, right?
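
Checking the arithmetic on that example (the numbers are of course made up):

    # The hypothetical VM tick intervals from above, in nanoseconds.
    ticks = [1, 3, 5, 5, 2, 1, 1, 5, 5, 2, 2, 1, 1, 5, 5, 1, 1, 2, 2, 1]

    avg = sum(ticks) / len(ticks)  # 51 / 20 = 2.55 ns per tick
    correction = 3.0 - avg         # 0.45 ns to add to every tick
    print(avg, correction)         # -> 2.55 0.45 (modulo float rounding)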

So instead you just say: the difference between the (dst) time and the (xmt)
time shows the local clock is 300s slow, so I need to speed up to get in
sync. And hopefully you have multiple time servers, so you can average the
differences to get a figure with less variation - call this avg(xmt). And if
you keep track, for each server, of the difference between the (xmt) it gave
you and avg(xmt), then you should be able to calculate how far a given source
is off on average over time, so that you can detect bad round trips vs. good
ones.
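
Sketched out, the per-server bookkeeping might look something like this (the
data structures and names are mine, just to make the idea concrete):

    from collections import defaultdict

    # offsets[server] holds the (xmt - dst) samples seen from that server.
    offsets: dict[str, list[float]] = defaultdict(list)

    def record(server: str, xmt: float, dst: float) -> None:
        offsets[server].append(xmt - dst)

    # avg(xmt): average the latest offset from every server, so one
    # noisy reply carries less weight than the group.
    def consensus_offset() -> float:
        latest = [s[-1] for s in offsets.values() if s]
        return sum(latest) / len(latest)

    # How far a server's latest sample sits from the consensus; a
    # sudden jump here would flag a bad round trip for that server.
    def server_deviation(server: str) -> float:
        return offsets[server][-1] - consensus_offset()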

The net result of all this would be a clock that is generally consistent but
potentially susceptible to sustained periods of high network latency (vs. the
'normal' latency), and that runs behind 'real' time by the average number of
nanoseconds a packet takes to get from a time server to the client.

So instead of trying to find a correction to apply to each internal tick of
the clock, you are trying to find the correction to make to the average
amount of time that the ticks are off by. This means looking at it over a
much larger period of ticks, and it probably takes longer to find the
'proper' adjustment.
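
Looking at the offset over a larger window could be as simple as an
exponentially weighted average (again just a sketch; the gain value here is
an arbitrary illustration):

    class SlowOffsetEstimator:
        def __init__(self, gain: float = 0.01):
            self.gain = gain       # small gain = long averaging window
            self.estimate = 0.0    # running estimate of the offset

        # Fold in one raw (xmt - dst) sample; individual noisy samples
        # barely move the estimate, so it converges slowly but steadily.
        def update(self, raw_offset: float) -> float:
            self.estimate += self.gain * (raw_offset - self.estimate)
            return self.estimate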



