[ntp:questions] Too high steps in time reset

David Woolley david at ex.djwhome.demon.co.uk.invalid
Tue Apr 22 22:06:26 UTC 2008

massimo.musso at gmail.com wrote:

> [root@gecssrv1 log]# ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> xdcf77           LOCAL(0)        11 u  130 1024  377    5.676  1307.74 320.974
> x193.204.114.232 .UTCI.           1 u  137 1024  377   20.074  511.544 152.824
> *LOCAL(0)        LOCAL(0)        10 l   44   64  377    0.000    0.000   0.008

You have serious problems.  It looks like both of your proper sources of 
time are being rejected as having a false time.  Also the difference 
between them is so high that at least one of them has to be broken.  
(Hand tuned clocks will usually track to about 30 seconds a year, so 
getting out by 600ms in a quarter of a day, or so, is totally 
unreasonable.)

I'm going to guess that the DCF system isn't a real NTP server.  I 
suspect it is a machine synchronised to its local clock and having that 
local clock stepped to DCF on each update.  A real DCF based ntp server 
would correct for the frequency error.  NTP assumes that time errors 
accumulate smoothly, e.g. as the result of temperature changes or 
crystal aging.  It is not optimised to handle time that jumps by half a 
second, without warning.

Actually, looking back at the DCF machine, it is openly admitting that 
it is using the local clock.  One of the problems with the local clock 
is that it reports an error band consistent with a real, locally 
attached, reference clock, so it is very easy for other machines to go 
outside of the error band.  In this case, all three machines will have 
irreconcilable times.

Assuming this is six hours since the last DCF read, we are talking 27 
ppm.  That's the drift you expect from a completely uncorrected 
motherboard of slightly below average quality.  You should be expecting 
uncorrected frequency errors of more like 0.1ppm, ranging to 1-2ppm if 
there have been violent temperature swings.
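
As a rough check of the arithmetic above, here is a minimal sketch that 
converts a time error over an interval into a frequency error in ppm.  
The function name drift_ppm is made up for the illustration; the 600ms 
and six hour figures are the guesses used in the text, not measurements.

```python
# Convert an observed time error over an interval into a frequency error in ppm.
def drift_ppm(error_seconds: float, interval_seconds: float) -> float:
    return error_seconds / interval_seconds * 1e6

SIX_HOURS = 6 * 3600
YEAR = 365 * 24 * 3600

# 600 ms of error after an assumed six hours since the last DCF read.
print(round(drift_ppm(0.600, SIX_HOURS), 1))  # 27.8

# A hand tuned clock tracking to about 30 seconds a year.
print(round(drift_ppm(30.0, YEAR), 2))        # 0.95
```

So the observed error is nearly 30 times worse than a hand tuned clock 
would manage.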

You need to install a proper DCF driver on the DCF machine, and delete 
its local clock line.  You should probably also delete the local clock 
line on the other machine.  Finally you need to add properly 
synchronised servers sufficient that you can reliably outvote any broken 
clock.  The problem here is that all three are voting for incompatible 
times, so no time can have a majority.
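
To illustrate why nothing can win the vote, here is a much simplified 
sketch of an overlap test on the three sources.  It is not ntpd's actual 
selection algorithm.  The offsets are taken from the ntpq output quoted 
above, but the error half-widths (delay/2 + jitter) are an assumption 
made purely for the example; ntpd derives its own error bounds.

```python
from itertools import combinations

# (name, offset_ms, half_width_ms). Offsets are from the quoted ntpq output.
# The half-widths are ASSUMED to be delay/2 + jitter, for illustration only.
sources = [
    ("dcf77",           1307.74, 5.676 / 2 + 320.974),
    ("193.204.114.232", 511.544, 20.074 / 2 + 152.824),
    ("LOCAL(0)",        0.0,     0.0 / 2 + 0.008),
]

def overlaps(a, b):
    """True if the two error intervals share at least one point."""
    _, off_a, half_a = a
    _, off_b, half_b = b
    return abs(off_a - off_b) <= half_a + half_b

# Collect every pair of sources whose intervals are compatible.
agreeing_pairs = [(a[0], b[0]) for a, b in combinations(sources, 2)
                  if overlaps(a, b)]
print(agreeing_pairs)  # []
```

No pair of intervals overlaps, so under these assumed bounds no two 
sources can agree with each other, let alone form a majority.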

Note.  This doesn't solve your large step problem, but you need to get a 
valid configuration before you start worrying about that.  One of the 
things that seems to have confused things is that you have finally 
introduced a well behaved NTP time source into the system.

If you really can't use a proper DCF driver, you should still delete the 
local clock on the non-DCF machine and you should hand calibrate the 
drift file on the DCF machine.  Properly calibrated, it shouldn't drift 
by more than about 100ms a day.  However, because it is using the local 
clock driver, other systems will only think it can have drifted over the 
time since they last polled it, not over the whole day, so for most of 
the day it is still likely to give a time that is incompatible with that 
from any other time server.  So on balance, if you can't use a DCF ntpd 
driver, don't use the DCF hardware.
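
A rough sketch of that mismatch, under stated assumptions: the 1024 
second poll interval is the one shown in the ntpq output, and the 15 ppm 
figure is assumed here as the nominal rate at which a client lets a 
server's possible error grow between polls.  Neither is a measurement 
from this system.

```python
SECONDS_PER_DAY = 24 * 3600

# From the text: a hand calibrated clock drifts by up to about 100 ms a day.
residual_ppm = 0.100 / SECONDS_PER_DAY * 1e6

# ASSUMPTIONS for illustration: clients poll every 1024 s and allow for
# error growth at a nominal 15 ppm since their last poll.
POLL_SECONDS = 1024
ASSUMED_TOLERANCE_PPM = 15.0

# Error a client is prepared to believe has built up since it last polled.
allowed_ms = ASSUMED_TOLERANCE_PPM * 1e-6 * POLL_SECONDS * 1000

# Error that may really have built up since the clock was last set from DCF.
worst_case_ms = 100.0

print(round(residual_ppm, 2))      # 1.16
print(round(allowed_ms, 2))        # 15.36
print(worst_case_ms > allowed_ms)  # True
```

Even a well calibrated clock can therefore be several times further out 
than a client polling it would allow for.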
