[ntp:questions] Clock jumps when refclock used
agcarver+ntp at acarver.net
Mon May 7 18:55:10 UTC 2012
New update, this one is even more interesting.
I reconfigured to use the two refclocks (SHM and ATOM), set SHM as
prefer and let ATOM run as normal. The other five Internet servers that
were there are still present (reminder: 4.2.7p270).
The only global configuration option was "disable kernel". SHM had a
time1 fudge of 0.6, which put its offset at about zero +/- 50 ms. ATOM had
flag3 disabled (so no kernel PPS discipline). Everything else was default,
and the Internet servers used iburst.
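
For readers following along, the setup described above would look roughly like the ntp.conf fragment below. This is only a sketch, not the actual file: the unit numbers (.0) and the server hostname are placeholders, and just one of the five Internet servers is shown. Driver types 28 (SHM) and 22 (ATOM/PPS) are the standard ntpd refclock numbers.

```
# Sketch only: unit numbers and hostname are placeholders, not the real config
disable kernel

# SHM refclock (driver type 28), marked as the preferred peer
server 127.127.28.0 prefer
fudge  127.127.28.0 time1 0.6     # offset ends up around zero, +/- 50 ms

# ATOM (PPS) refclock (driver type 22)
server 127.127.22.0
fudge  127.127.22.0 flag3 0       # kernel PPS discipline off

# One of the five Internet servers (hypothetical hostname)
server ntp1.example.net iburst
```
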
The end result: the system synced fine to ATOM and ran happily, and then,
in less than 24 hours, ntpd suddenly went out of control. By the end the
system had an offset of over 10 seconds to all servers, with very
frequent sys_fuzz messages and constant stepping of the clock.
I tried increasing mindist significantly (up to 5) to see if that helped,
but no luck: it would still go haywire in under a day.
Now here is where it gets interesting. On a whim I changed time1 of SHM
so that it was no longer centered on zero but instead presented a
single-sided offset to ntpd at all times. In this case I dropped time1
from 0.6 to 0.55 so that the offsets stayed on one side of zero. Now
the offsets go from 0 (actually just slightly above zero) to 100 ms
instead of -50 to +50 ms. I kept mindist set high because of this larger
offset, to prevent clock hop. I also increased the minpoll on ATOM and
SHM to 5, so they are polled once every 32 seconds instead of every 16.
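
In ntp.conf terms, the changes described above amount to something like the fragment below. It is a sketch with placeholder unit numbers (.0). The mindist value that was actually kept is not stated, so 5 (the highest value mentioned) is used purely as a stand-in.

```
# Sketch only: unit numbers are placeholders
tos mindist 5                          # stand-in value, actual setting not stated

# SHM refclock (driver type 28)
server 127.127.28.0 prefer minpoll 5   # 2^5 = 32 s poll interval
fudge  127.127.28.0 time1 0.55         # was 0.6; offsets now roughly 0 to 100 ms

# ATOM (PPS) refclock (driver type 22)
server 127.127.22.0 minpoll 5          # 2^5 = 32 s poll interval
fudge  127.127.22.0 flag3 0            # kernel PPS discipline still off
```

Lowering time1 by 0.05 s moves every SHM offset by the same 50 ms, which is why a band that was centered on zero ends up entirely on one side of it.
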
It has been ten days straight without failure, without huge offsets,
without random stepping or any other strange behavior. Even the number
of sys_fuzz messages has dropped. The only time I get any sys_fuzz
messages now is when the heat or air conditioner starts to alter the
room temperature (I haven't thermally isolated the machine yet). In
those cases it sometimes does clock hop back to SHM for a little bit and
then switches back to ATOM after the clock is adjusted (slewing only, no
steps). When the temperature is relatively stable I'm getting offsets
of less than 200 us from ATOM and the system frequency holds reasonably
steady, changing by less than 0.001 PPM over several polling periods. The
offsets at their worst are about 5 ms during major temperature swings
(heat or A/C blows directly at the machine in its current location and I
define "major swing" as a change in ambient by +/- 15F in ten minutes
according to my thermometer -- overall the room changes +/- 25F over the
course of a day according to the same thermometer).
For whatever reason, if the offset was allowed to swing on both sides of
zero, it eventually caused the whole thing to spin out of control with
wild oscillations (almost as if the PID loop was not quite damped enough
and allowed to oscillate with the right amount of initial force
applied). Keeping the offsets single-sided quieted everything down.