[ntp:questions] ntpd wedged again

A C agcarver+ntp at acarver.net
Sat Feb 11 20:33:53 UTC 2012

On 2/11/2012 12:09, unruh wrote:

> 16 sec in 5 min is 50,000 PPM. It is hard to see how ntpd could do that,
> unless it was stepping like mad (one of the problems with the highly
> non-linear stepping that ntp likes to do). It is possible to make the
> clock slew at that rate by using adjtimex, the tickvalue adjustment, but
> ntpd does not do that. (chrony does use it).
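
For reference, the arithmetic behind that figure, as a quick Python
sketch. The function name is just illustrative and not part of any
ntp tool; it only turns an accumulated offset and an interval into
an average rate.

```python
def drift_ppm(offset_seconds, interval_seconds):
    """Average frequency error in parts per million implied by an
    offset that accumulated over the given interval."""
    return offset_seconds / interval_seconds * 1e6

# 16 seconds accumulated over 5 minutes (the numbers quoted above)
print(round(drift_ppm(16, 5 * 60)))   # prints 53333
```

So the quoted 50,000 PPM is a round-number version of roughly
53,000 PPM.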

I would agree it seems very unusual but at this point I'm not willing to 
rule anything out.  What I know is that the system was ticking along 
fine, it had pretty much converged on an adjustment to the tick rate 
(about -77.9 PPM), offsets were 1 ms or better, and then suddenly the 
offsets grow to seconds within about two polling periods (at the time 
they were 128 seconds).  It never recovered after that: all the clocks 
were deselected and there was no longer a system peer.
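
To put those numbers side by side, here is a rough sketch of how much
offset the -77.9 PPM correction could account for over two polls if it
were somehow lost entirely.  That scenario is purely an assumption for
the sake of comparison, and the names below are illustrative only.

```python
POLL_SECONDS = 128      # polling period at the time
CORRECTION_PPM = 77.9   # magnitude of the converged tick-rate adjustment

def offset_accumulated(rate_ppm, seconds):
    """Offset in seconds built up by a constant rate error."""
    return rate_ppm * 1e-6 * seconds

# Hypothetical worst case: the whole correction vanishes for two polls.
two_polls = 2 * POLL_SECONDS
print(f"{offset_accumulated(CORRECTION_PPM, two_polls) * 1000:.1f} ms")
```

That works out to about 20 ms, so losing the frequency correction alone
would not explain offsets of seconds in the same time.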

The only thing that I've been able to see is a popcorn warning just 
before it happens and then sudden clock steps.  The popcorn warning 
happens on one of the internet servers (not the same one each time) and 
then everything explodes as the clock gets stepped a couple of times 
(usually by a second or less).  After that the clock is slewing madly, 
judging by the PPS offset, which ramps quickly and then phase wraps. 
I'm basing that on the fact that I've verified my PPS independently as 
stable.  So the internal clock is slewing relative to the PPS ticks at a 
rapid rate causing the offset to shift and wrap when the phase flips.
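
What I mean by the wrap, as a small sketch.  It assumes that a bare PPS
only marks the second boundary, so the measured offset is the clock
error folded into a range of plus or minus half a second.  The function
name and the 4000 PPM rate are made up for illustration; they are not
measurements from my system.

```python
def pps_offset(clock_error_seconds):
    """Fold a clock error into the [-0.5, 0.5) second range that a
    PPS signal by itself can resolve."""
    return (clock_error_seconds + 0.5) % 1.0 - 0.5

RATE_PPM = 4000  # hypothetical runaway slew rate, not a measured value

# Watch the apparent offset ramp up, flip sign, and ramp again.
for t in range(0, 321, 64):
    error = RATE_PPM * 1e-6 * t
    print(f"t={t:3d}s  error={error:+.3f}s  pps offset={pps_offset(error):+.3f}s")
```

With a steady rate error the true error keeps growing, but the PPS
offset looks like a sawtooth that flips sign each time the error
passes half a second.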

I've tried this with and without flag 3 (kernel discipline) enabled. 
Both result in the same failure, and both times it happened without the 
SHM clock in use, so only ATOM and internet servers were configured.

Right now the PPS is labeled a falseticker, even after a couple more 
restarts to try to step the clock back to where it belongs.  Until now 
I wasn't using any flags on ntpd, but I've just added -g so that I can 
get the clock back in line on each of the frequent restarts of ntpd.
