[ntp:questions] NTPd loses sync regularly / 12 hour intervals.
hael at tv2.dk
Wed Sep 8 06:40:43 UTC 2010
This probably isn't the place to post this; if so, I apologise in advance.
I have a problem that really baffles me, having run ntp servers for
I have two ntp servers set up in the local network, from which another
40 odd servers synchronise.
Some of the servers function as expected and keep time quite
accurately; however, the
majority of servers report an error that I cannot pin down, and it's
currently baffling me...
(It's probably something obvious I'm overlooking, or so I hope.)
The logs on the systems that have problems usually show the following
Sep 7 17:43:48 sn ntpd: synchronized to 10.7.100.28, stratum 2
Sep 7 17:57:51 sn ntpd: time reset +71.784598 s
Sep 7 17:59:10 sn ntpd: synchronized to 10.7.100.27, stratum 2
Sep 8 05:29:11 sn ntpd: no servers reachable
Sep 8 05:34:37 sn ntpd: synchronized to 10.7.100.28, stratum 2
Sep 8 05:35:52 sn ntpd: time reset +74.977115 s
Sep 8 05:36:41 sn ntpd: synchronized to 10.7.100.28, stratum 2
It appears to have an issue almost exactly every 12 hours, and the time
difference is getting worse:
it started at around 1-2 seconds, has steadily increased, and now
deviates by about 74 seconds.
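To check the "every 12 hours" impression against the logs, the gap between
successive "time reset" entries can be computed directly. The following is only
a rough sketch: it uses the log excerpt quoted above as input, assumes the year
is 2010 (syslog lines carry no year) and an English locale for month names, and
the names LOG, RESET and reset_events are made up for illustration.

```python
import re
from datetime import datetime

# Log excerpt copied from the affected host (see above).
LOG = """\
Sep 7 17:43:48 sn ntpd: synchronized to 10.7.100.28, stratum 2
Sep 7 17:57:51 sn ntpd: time reset +71.784598 s
Sep 7 17:59:10 sn ntpd: synchronized to 10.7.100.27, stratum 2
Sep 8 05:29:11 sn ntpd: no servers reachable
Sep 8 05:34:37 sn ntpd: synchronized to 10.7.100.28, stratum 2
Sep 8 05:35:52 sn ntpd: time reset +74.977115 s
Sep 8 05:36:41 sn ntpd: synchronized to 10.7.100.28, stratum 2
"""

# Matches syslog lines such as "Sep 7 17:57:51 sn ntpd: time reset +71.784598 s".
# An optional "[pid]" after "ntpd" is tolerated.
RESET = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+ntpd(?:\[\d+\])?:"
    r"\s+time reset\s+([+-]?\d+\.\d+)\s+s"
)

def reset_events(text, year=2010):
    """Return (timestamp, step_in_seconds) for every 'time reset' line."""
    events = []
    for line in text.splitlines():
        m = RESET.match(line)
        if m:
            # Assumption: all lines fall in the given year.
            when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            events.append((when, float(m.group(2))))
    return events

events = reset_events(LOG)
# Seconds between each pair of consecutive resets.
gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]

for (when, step), gap in zip(events[1:], gaps):
    # Rate the clock would need if the step had built up evenly over the gap.
    ppm = step / gap * 1e6
    print(f"{when}: step {step:+.3f} s after {gap / 3600:.2f} h ({ppm:.0f} ppm)")
```

For the excerpt above the gap is 41881 s (about 11.6 hours). If the 75 s had
accumulated evenly over that time it would correspond to roughly 1790 ppm, well
beyond the 500 ppm that ntpd can slew, which fits the observation that the
offset appears suddenly rather than drifting in.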
One day I was watching the event, and saw the machine go from being
within +/- 4 ms to suddenly
being +/- 36000+ out, so it appears to be something specific that
causes the problem.
The strange thing is that the servers appear to be fully synchronised
for most of that time, but
suddenly the jitter increases dramatically and the clock's offset changes
to something very
large, which ntp then resets a few cycles later. I've decreased the
maximum poll interval
to 256 seconds, as that seems to correct the problem faster.
The configuration of all servers (ntp.conf) is
restrict default nomodify notrap noquery
server ntp1.i.tv2.dk minpoll 1 maxpoll 8 iburst
server ntp2.i.tv2.dk minpoll 1 maxpoll 8 iburst
server ntp.i.tv2.dk iburst
tinker huffpuff 7200
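
For reference, minpoll and maxpoll in ntp.conf are base-2 exponents of seconds,
not seconds. A small sketch of the conversion follows; poll_seconds is just an
illustrative helper name, not part of ntp.

```python
def poll_seconds(exponent: int) -> int:
    """Convert an ntp.conf minpoll/maxpoll exponent to seconds (2**exponent)."""
    return 2 ** exponent

# Values from the configuration above, plus ntpd's defaults (minpoll 6,
# maxpoll 10), which apply to the third server line.
for name, exp in (("minpoll", 1), ("maxpoll", 8), ("minpoll", 6), ("maxpoll", 10)):
    print(f"{name} {exp} -> {poll_seconds(exp)} s")
```

So maxpoll 8 gives the 256 second ceiling, and the defaults are 64 s and 1024 s.
Note that ntpd documents a lower limit for minpoll (3 or 4 depending on the
release, as far as I know), so minpoll 1 may not take effect as written; that
is worth checking against the version on each machine.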
The versions of ntp in use vary a lot with the machine ages,
and so forth; strangely enough it seems that most machines have the same
Similarly, the systems are running SUSE and RH 4/5, on different kernel
versions. My initial
instinct was that it might have been network related.
NB: the tinker huffpuff 7200 was added to see if it
had any effect
or not; the minpoll and maxpoll values were added as they seem to improve
recovery from the
problem, but that only treats the symptom.
As far as I've ascertained, the ntp servers are at no time unreachable,
and do not appear
to ever lose sync with their clock sources. Their logs indicate that
- every hour or so, that they do change between different time servers.
Any ideas are most welcome.