[ntp:questions] Toggling time sync between two servers
mayer at ntp.isc.org
Sun Mar 26 06:11:56 UTC 2006
> I'm new to ntp, but have been asked to debug an issue with our NTP
> configuration. First, here is our setup. We have two processors which
> each have a third party application reading time off a serial line from
> two different sources and then disciplining the internal clock on each
> processor with the "adjtime" system call. xntpd runs on each of these
> processors with only internal time as a source. One processor is
> fudged at stratum 0 and the other at stratum 2. Currently, xntp is
> configured with "enable pll". My first change will be to change
> "enable pll" to "disable pll" as I believe that we shouldn't have two
> different applications (our third party app and xntp) disciplining the
> internal clock at the same time.
> These two processors act as the NTP servers to all other processors on
> our network. I've read here that having two servers is a bad situation
> and that it is better to have four, but that just about
> anything (including just 1) is better than having two. I believe we are
> finding that out first hand, so I will work to resolve that as well.
> When I checked the ntp logs for some client processors on our network,
> I see that occasionally there are large (up to 1 second) differences
> between our two servers. On top of this, the clients cannot decide
> which one to sync with and we get stuck in a loop as follows.
> restart min polling
> step time/sync to server1 at stratum 0
> restart min polling due to step
> step time/sync to server2 at stratum 2
> This exact sequence repeats over and over at approximately 5 minute intervals
> due to the min polling time. I realize that I have many problems here,
> but I would like to know why ntp toggles between syncing/stepping to
> server1 and server2 after each polling period. I would have thought
> that the resultant action would be the same after each polling period.
> But instead it is consistent that the client steps time to the opposite
> server that it stepped to last time, like so.
> 25 Mar 16:36:58 xntpd: time reset (step) 0.421856 s
> 25 Mar 16:41:55 xntpd: synchronized to 126.96.36.199, stratum=1
> 25 Mar 16:41:54 xntpd: time reset (step) -0.421930 s
> 25 Mar 16:46:33 xntpd: synchronized to 188.8.131.52, stratum=3
> 25 Mar 16:46:34 xntpd: time reset (step) 0.419369 s
> 25 Mar 16:51:31 xntpd: synchronized to 184.108.40.206, stratum=1
> 25 Mar 16:51:30 xntpd: time reset (step) -0.419446 s
> 25 Mar 16:56:09 xntpd: synchronized to 220.127.116.11, stratum=3
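The pattern in the log above can be checked mechanically. The following is a minimal Python sketch, not part of the original post: it pulls the "time reset (step)" values out of the quoted lines and tests whether their signs alternate. The names LOG, step_offsets and signs_alternate are made up for illustration. Alternating steps of nearly equal size are consistent with the two servers disagreeing by a roughly constant amount, with the client stepping back and forth between them.

```python
import re

# The four step lines quoted in the log above.
LOG = """\
25 Mar 16:36:58 xntpd: time reset (step) 0.421856 s
25 Mar 16:41:54 xntpd: time reset (step) -0.421930 s
25 Mar 16:46:34 xntpd: time reset (step) 0.419369 s
25 Mar 16:51:30 xntpd: time reset (step) -0.419446 s
"""

STEP_RE = re.compile(r"time reset \(step\) (-?\d+\.\d+) s")


def step_offsets(log_text):
    """Return the step sizes (in seconds) in the order they were logged."""
    return [float(m.group(1)) for m in STEP_RE.finditer(log_text)]


def signs_alternate(offsets):
    """True if every step has the opposite sign of the step before it."""
    return all(a * b < 0 for a, b in zip(offsets, offsets[1:]))


offsets = step_offsets(LOG)
mean_abs_step = sum(abs(x) for x in offsets) / len(offsets)

print("steps:", offsets)
print("alternating:", signs_alternate(offsets))
print("mean |step|: %.6f s" % mean_abs_step)
```

Run against a longer client log, the same check would show whether the toggling is strictly periodic or only occasional.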
Your description sounds like your setup breaks the cardinal rule: you
must not have more than one application disciplining the clock. It sounds
like you do have more than one, but I can't be sure. The application
should have been implemented as a refclock, leaving ntpd to figure out
the rest. Fudging to stratum 0 is illegal; that's reserved for the
refclock itself. Everything else is derived from there. At best your
server would be stratum 1 if your time was being set up as a refclock.
You'd need to understand what the third-party application is doing to
figure out how well it's disciplining the clock. It's probably not doing
a good job, and clock-hopping like this is not unlikely.
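The stratum numbers in the quoted client log (stratum=1 and stratum=3) differ from the fudged values (0 and 2). A small Python sketch of the relationship, assuming the usual NTP rule that a server advertises a stratum one higher than the source it is synchronized to; the function name advertised_stratum and the server labels are hypothetical:

```python
def advertised_stratum(source_stratum):
    """Stratum a server reports when synchronized to a source at source_stratum.

    Assumption: the server is one hop further from the reference than its
    source, so it reports source_stratum + 1.
    """
    return source_stratum + 1


# Fudged source strata as described in the original question.
fudged = {"server1": 0, "server2": 2}

for name, stratum in fudged.items():
    print(name, "source stratum", stratum,
          "-> advertised stratum", advertised_stratum(stratum))
```

Under that assumption, a source at stratum 0 yields a stratum 1 server, which matches the statement that stratum 1 is the best a server can be.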