[ntp:questions] Spike detection and clock runaway
A C
agcarver+ntp at acarver.net
Thu Feb 23 21:14:56 UTC 2012
I had ntpd running OK for the past six days with what appeared to be no
more problems. The libc issues did not recur (ntpq was polling
regularly with no lockups of ntpd), and PPS seemed to be working with
kernel discipline enabled (though I have an outstanding question about
PPS and the PPM adjustments). It was giving me offsets within +/-5 ms.
However, very suddenly it went off track and I don't understand exactly
what happened. The log file is available at http://acarver.net/ntpd/
Near the bottom of that log (at the time stamp 23 Feb 12:35:36) spike
detections started to show up. There were no spike detections anywhere
in the log before that point. After the spike detection, ntpd started
to step the clock around, but it has not been able to recover. It's
currently still stuck:
 remote          refid           st t when poll reach    delay   offset   jitter
================================================================================
 127.127.22.0    .PPS.            0 l  66m   16     0    0.000    0.000    0.000
 127.127.28.0    .GPSD.           4 l   39  128    37    0.000  -2584.2  1237.15
 97.107.134.213  128.4.1.1        2 u   53   64     7   91.841  -222.24  1639.19
 173.193.227.67  10.0.77.54       4 u   49   64     7   98.229  -1431.9  842.750
 72.18.205.157   .INIT.          16 u   35  256     0    0.000    0.000    0.000
 130.207.165.28  130.207.244.240  2 u   15   64    17   78.384  -1576.1  797.335
*131.144.4.10    130.207.244.240  2 u   28   64    17   85.010  -1209.3  929.398
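
As an aid for reading the reach column in the output above: it is an
8-bit shift register printed in octal, with the most recent poll in the
lowest bit. The sketch below is only an illustration (decode_reach is a
made-up helper name, not part of ntpq). It shows that 37 means the last
five polls succeeded and 7 means the last three did. Short runs like
these may mean the associations were reset recently, for example after
a step, but that is a guess.

```python
def decode_reach(reach_octal):
    """Decode an ntpq 'reach' value (octal string) into eight booleans.

    Index 0 is the most recent poll, index 7 the oldest.
    decode_reach is a hypothetical helper for illustration only.
    """
    value = int(reach_octal, 8)
    return [bool((value >> bit) & 1) for bit in range(8)]


def recent_successes(reach_octal):
    """Count consecutive successful polls, starting from the most recent."""
    count = 0
    for ok in decode_reach(reach_octal):
        if not ok:
            break
        count += 1
    return count


# Reach values copied from the peer listing above.
for reach in ("0", "7", "17", "37"):
    print(reach, recent_successes(reach))
```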
The loopstats and clockstats files for that time period are also at the
same link. Clockstats on the PPS source shows the shift occurred very
quickly (see file clockstats.20120223 near timestamp 55980 45286.524).
It was completely fine up until that moment and then it fell apart.
There were no missing pulses at that point in time, and according to the
time stamps they were arriving very regularly.
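
To line up the clockstats time stamp with the log, note that the first
two fields of ntpd statistics files are the Modified Julian Day and the
seconds past midnight UTC. A minimal sketch of the conversion follows
(stats_time_to_utc is a made-up helper name). It puts 55980 45286.524 at
2012-02-23 12:34:46 UTC, about 50 seconds before the first spike
detection in the log, assuming the log time stamps are also in UTC.

```python
from datetime import datetime, timedelta

# MJD 0 corresponds to 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17)


def stats_time_to_utc(mjd, seconds):
    """Convert the first two fields of a stats line to a naive UTC datetime.

    stats_time_to_utc is a hypothetical helper for illustration only.
    """
    return MJD_EPOCH + timedelta(days=mjd, seconds=seconds)


when = stats_time_to_utc(55980, 45286.524)
print(when.strftime("%Y-%m-%d %H:%M:%S"))  # 2012-02-23 12:34:46
```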