[ntp:questions] Re: Toggling time sync between two servers
Richard B. Gilbert
rgilbert88 at comcast.net
Sat Mar 25 20:07:07 UTC 2006
> I'm new to ntp, but have been asked to debug an issue with our NTP
> configuration. First, here is our setup. We have two processors which
> each have a third party application reading time off a serial line from
> two different sources and then disciplining the internal clock on each
> processor with the "adjtime" system call. xntpd runs on each of these
> processors with only internal time as a source. One processor is
> fudged at stratum 0 and the other at stratum 2. Currently, xntp is
> configured with "enable pll". My first change will be to change
> "enable pll" to "disable pll" as I believe that we shouldn't have two
> different applications (our third party app and xntp) disciplining the
> internal clock at the same time.
You are absolutely right about that.
> These two processors act as the NTP servers to all other processors on
> our network. I've read here that having two servers is a bad situation
> and that it is better to have four, but that just about
> anything (including just 1) is better than having two. I believe we are
> finding that out first hand, so I will work to resolve that as well.
> When I checked the ntp logs for some client processors on our network,
> I see that occasionally there are large (up to 1 second) differences
> between our two servers. On top of this, the clients cannot decide
> which one to sync with and we get stuck in a loop as follows.
> restart min polling
> step time/sync to server1 at stratum 0
> restart min polling due to step
> step time/sync to server2 at stratum 2
There, in a nutshell, you see why two servers is the worst possible
configuration.
> This exact sequence repeats over and over at approx. 5 minute intervals
> due to the min polling time. I realize that I have many problems here,
> but I would like to know why ntp toggles between syncing/stepping to
> server1 and server2 after each polling period. I would have thought
> that the resultant action would be the same after each polling period.
> But instead it is consistent that the client steps time to the opposite
> server that it stepped to last time, like so.
> 25 Mar 16:36:58 xntpd: time reset (step) 0.421856 s
> 25 Mar 16:41:55 xntpd: synchronized to 188.8.131.52, stratum=1
> 25 Mar 16:41:54 xntpd: time reset (step) -0.421930 s
> 25 Mar 16:46:33 xntpd: synchronized to 184.108.40.206, stratum=3
> 25 Mar 16:46:34 xntpd: time reset (step) 0.419369 s
> 25 Mar 16:51:31 xntpd: synchronized to 220.127.116.11, stratum=1
> 25 Mar 16:51:30 xntpd: time reset (step) -0.419446 s
> 25 Mar 16:56:09 xntpd: synchronized to 18.104.22.168, stratum=3
I'd say that this is probably an artifact of the two server
configuration. There is no way to decide between the two so it's just
"flipping a coin".
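To illustrate why two servers give a client nothing to vote with, here is a rough Python sketch of a majority test over offset intervals. It is a simplified stand-in for the idea behind NTP's source selection, not the code xntpd actually runs. The 0.42 s disagreement comes from the log above; the 0.05 s error margins, the extra servers, and the function names are assumptions made up for the example.

```python
def largest_agreeing_group(sources):
    """Return the largest number of sources whose intervals share a point.

    Each source is (offset_seconds, margin_seconds) and is treated as the
    interval [offset - margin, offset + margin].
    """
    events = []
    for offset, margin in sources:
        events.append((offset - margin, 0))  # interval opens
        events.append((offset + margin, 1))  # interval closes
    # Sorting puts "open" before "close" at equal positions, so intervals
    # that merely touch still count as agreeing.
    events.sort()
    best = depth = 0
    for _, kind in events:
        if kind == 0:
            depth += 1
            best = max(best, depth)
        else:
            depth -= 1
    return best


def has_majority(sources):
    """True if more than half of the sources agree with one another."""
    return largest_agreeing_group(sources) > len(sources) / 2


# Hypothetical margins of 0.05 s; the 0.42 s gap is taken from the log.
one_server = [(0.00, 0.05)]
two_servers = [(0.00, 0.05), (0.42, 0.05)]
four_servers = [(0.00, 0.05), (0.01, 0.05), (0.02, 0.05), (0.42, 0.05)]

for name, group in [("one", one_server), ("two", two_servers),
                    ("four", four_servers)]:
    print(name, "server(s): majority agrees =", has_majority(group))
```

With two servers that disagree, each is a "group" of one and neither is a majority. With four, the three that agree outvote the one that is off by 0.42 s.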
If I understand what you've said, you have only one source of time, the
"serial line". There is evidently some sort of problem with it or the
"third party" software that is synchronizing your two servers to it;
else why the disagreement between servers?
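The size of that disagreement can be read straight off the client log. The following Python sketch pulls the step sizes out of the four "time reset" lines quoted above and checks that they alternate in sign and roughly cancel. The regular expression and the variable names are my own choices for the example, and it assumes the log format is exactly as quoted.

```python
import re

# The four step lines from the client log quoted above.
LOG = """\
25 Mar 16:36:58 xntpd: time reset (step) 0.421856 s
25 Mar 16:41:54 xntpd: time reset (step) -0.421930 s
25 Mar 16:46:34 xntpd: time reset (step) 0.419369 s
25 Mar 16:51:30 xntpd: time reset (step) -0.419446 s
"""

STEP_RE = re.compile(r"time reset \(step\) (-?\d+\.\d+) s")


def step_sizes(log_text):
    """Extract the signed step sizes, in seconds, from xntpd log text."""
    return [float(m.group(1)) for m in STEP_RE.finditer(log_text)]


steps = step_sizes(LOG)

# Each step should undo the previous one if the client is bouncing
# between two servers that disagree by a fixed amount.
alternating = all(a * b < 0 for a, b in zip(steps, steps[1:]))
net_change = sum(steps)
mean_gap = sum(abs(s) for s in steps) / len(steps)

print("steps alternate in sign:", alternating)
print("net change over the four steps: %.6f s" % net_change)
print("mean step size: %.6f s" % mean_gap)
```

A mean step size of about 0.42 s with almost no net change is what you would expect if the two servers differ by about 0.42 s and the client simply jumps back and forth between them.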
You can set up an NTP server with a hardware reference clock (a GPS
timing receiver) for a minimal cash outlay. Like so:
1. A Sun Ultra 10 (used, <$100 on eBay). I'd recommend a minimum of
256MB of RAM. I have a 440MHz processor but it's not required; any
supported processor should do the job.
2. An ATA/EIDE disk drive >=10GB. You may well have something suitable
lying around. $80 will buy you a new 80GB drive.
3. A Motorola Oncore M12+T or M12MT GPS receiver with "evaluation board"
for around $200 from Synergy Systems.
4. Solaris 10 Media kit ($40 from Sun).
5. NTP 4.2.0 or 4.2.1. Free.
Plug it together, install software, configure, start, and enjoy. I have
such a configuration running in my home. Two other Suns (Solaris 8 and
Solaris 9), two PCs with Windows XP and W32TIME, one PC with W2K and
W32TIME, and one DEC Alpha running OpenVMS V6.2 synchronize to it. The
Suns hold synch to within a few microseconds. The Alpha to within a few
milliseconds. It's a little hard to tell what the PCs are doing but
they all display the correct time to the second (they may do better
internally but I don't know how I could see that).