[ntp:questions] NTP Drifts +ve and -ve

Unruh unruh-spam at physics.ubc.ca
Wed Aug 20 15:05:02 UTC 2008


David Woolley <david at ex.djwhome.demon.co.uk.invalid> writes:

>Arul Murugan wrote:
>> Hi,We are using NTP4, when CPU is very busy some of the UDP packets
>+ dropped by the kernel, so the local clock drifts 60 milliseconds from

>The problem is not dropped packets, but delayed packets.

>+ the time server. From that point NTP keeps drifts +ve and -ve for 2 to 3

>Especially given the long recovery times you describe, it is more likely 
>that the measured time server time is drifting from its true time and 
>that the local clock is actually rather more accurately tracking it than 
>the offsets imply.

>However, if the errors always start as negative slips, your problem is 
>not CPU overload but a device driver, often IDE running in non-DMA mode, 
>with poor interrupt latency.

>+ days to become stable. The graph looks like a sine wave
>+ oscillating and reaching zero after 3 days. My questions are: 1. Why is NTP
>+ drifting +ve and -ve? 2. Why should NTP take 3 days to correct 60
>+ milliseconds? 3. Is this a problem or is it expected? Regards, Arul

You do not state what you regard as "correcting 60 ms". Is it getting the
error down to 1 ms? 10 ms?

>ntpd isn't designed to cope well with systematic changes; it assumes 
>random perturbations until convinced otherwise. The best way of dealing 
>with those involves running a low pass filter with a time constant 
>reflecting typical crystal frequency variation times.

>I am surprised that it has not been convinced otherwise in this case, so 
>could you confirm that you are using the recommended minpoll of 6 (64 
>seconds) and the recommended maxpoll of 10 (1024 seconds).  Using high 
>values for these will compromise recovery times.

>Normally this problem is caused by network congestion, not CPU overload. 
>  To get round asymmetric link delay problems, you should configure your 
>routers and the corresponding ISP routers to give priority to NTP 
>traffic.  In default of that, you should use the tinker huffpuff option.

>If you really are having effects from CPU loading, you need to find 
>network hardware with better drivers.  If you are losing clock 
>interrupts, you need to investigate the drivers with poor latency.

>Quite a few people believe that ntpd's assumptions about measurement 
>error statistics are not valid in the world in which NTP is used by most 
>system admins, and, if you are using Linux, I'm sure Unruh will suggest 
>one alternative.

Sure, why not: chrony. It responds to change much faster than ntp does
while still maintaining good long-term stability. ntp is designed as a
simple feedback loop. To keep the loop stable, the time scale of the loop
is set very long (about 8-16 poll intervals, because ntp tends to throw
away about 7/8 of the incoming data in order to eliminate network delay
errors as much as possible). Since ntp tends to operate at poll 10, which
is 1024 s, or a bit under 20 min, this gives a feedback loop time scale of
about 5-10 hrs. I.e., the error falls to 1/e (about 37%) of its value
every 5-10 hrs. (Actually, since it is a second-order critically damped
system, this is not really accurate. The correction goes to zero faster
than that, overshoots by something like 20%, and then comes back to
zero.) But 3 days sounds like a very long time unless you are using very
long poll intervals.
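
To put rough numbers on that: if the loop is treated as a plain
exponential decay (a simplification, since as noted it is really second
order), the time to get from 60 ms down to some target follows from the
time scale alone. A minimal Python sketch; the function name and the 1 ms
and 10 ms targets are just my illustration, not anything ntp reports:

```python
import math

def hours_to_settle(initial_ms, target_ms, tau_hours):
    """Time for an error decaying as exp(-t/tau) to fall from initial to target."""
    return tau_hours * math.log(initial_ms / target_ms)

# Loop time scales of 5 and 10 hrs, as estimated above for poll 10.
for tau in (5.0, 10.0):
    for target in (10.0, 1.0):
        t = hours_to_settle(60.0, target, tau)
        print(f"tau = {tau:4.1f} h, 60 ms -> {target:4.1f} ms: {t:5.1f} h")
```

Even the slowest case here (10 hr time scale, 1 ms target) comes out at
about 41 hrs, well short of 3 days.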
 

chrony does a linear fit to the past data (corrected for the clock
corrections), testing to see whether the errors are random or
consistently changing, and lowering the time scale over which the slope
and offset are determined in the latter case -- i.e., it tracks a
constantly adjusted Allan variance minimum. When the noise is random, long
times are used to beat down the statistical noise. When the clock is
consistently off, chrony shortens the time scale so that it can respond
rapidly to clock frequency drifts. It also tries to eliminate the offset,
as determined by the fit, much, much faster than ntp does. Very different
philosophies. ntp tends to have larger offset variances and maybe slightly
smaller frequency variances.
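
The idea can be sketched in a few lines of Python. This is NOT chrony's
code, only an illustration of the principle: fit a line to the offsets,
count the runs of same-sign residuals, and if there are too few runs for
the residuals to be random, drop the oldest sample and fit again. The
function names, the min_samples default and the run-count threshold are
all my own assumptions.

```python
import math

def fit_line(ts, ys):
    """Ordinary least-squares fit ys ~ intercept + slope * ts."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean

def count_runs(residuals):
    """Number of runs of consecutive residuals sharing the same sign."""
    signs = [r >= 0 for r in residuals]
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def adaptive_fit(ts, ys, min_samples=4):
    """Drop the oldest samples until the residuals look random.

    Returns (slope, intercept, number of samples used).
    """
    start = 0
    while True:
        w_ts, w_ys = ts[start:], ys[start:]
        n = len(w_ts)
        slope, intercept = fit_line(w_ts, w_ys)
        residuals = [y - (intercept + slope * t) for t, y in zip(w_ts, w_ys)]
        # Assumed, crude threshold: about two standard deviations below the
        # number of runs expected from n random signs.
        min_runs = (n + 1) / 2 - math.sqrt(n - 1)
        if n <= min_samples or count_runs(residuals) >= min_runs:
            return slope, intercept, n
        start += 1

# Made-up data: 28 samples with small alternating "measurement noise".
ts = list(range(28))
noise = [0.01 * (-1) ** i for i in ts]
steady = [0.5 * t + e for t, e in zip(ts, noise)]         # constant rate
kinked = [max(0, t - 19) + e for t, e in zip(ts, noise)]  # rate changes at t = 19

print(adaptive_fit(ts, steady))
print(adaptive_fit(ts, kinked))
```

With the steady series the whole history is kept, which is what beats
down the noise. With the kinked series the window shrinks and the fitted
slope moves toward the new rate, although a threshold as crude as this
one may still leave a few old samples in the fit.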



