[ntp:questions] Isolated Network Drift Problem

Unruh unruh-spam at physics.ubc.ca
Tue Nov 25 17:12:58 UTC 2008

"Richard B. Gilbert" <rgilbert88 at comcast.net> writes:

>Cal Webster wrote:
>> Gentlemen,
>> If there is any way I can help shed light on this spirited discussion
>> please let me know. I have 4 NTP servers running in isolation (versions
>> 4.2.0.a.20040617-6.el4, ntp-4.2.4p2-6.fc8, and ntp-4.1.2-0.rc1.2 ) and
>> I'm willing to play around with them as long as I don't end up choking
>> my client machines to death.
>> -------- Anyway, here's where I'm at with the original problem --------
>> I found that adjtimex(8) is not available for RHEL (or CentOS) 3 or 4.
>> It was available in all Red Hat distros prior to and since. I may be
>> able to use one from a Fedora version based on gcc-3.4 (looks like fc6)
>> or try building from a RHEL 5 SRPM.
>> Although ntptime(1) is not a replacement for adjtimex(8), it's available
>> for all our NTP servers and it looks like I can set parameters such as
>> frequency offset, clock offset, estimated error and others.
>> What's the best way to determine which of our NTP servers provides the
>> best local clock?

>Note the difference between each clock under test and the correct time.
>Wait seven to ten days and compare these clocks again.  The clock with 
>the least change in the original offset is the "best clock".  Note that 
>this is not fool proof; you might find a situation in which the "best" 
>clock varies wildly from hour to hour and is only the best "on average".

1 sec in 7-10 days is roughly the 1-2 PPM level. And yes, clock rates can
vary a lot, especially with temperature. One can compensate for temperature
variations, but usually it is not considered worth it.
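
To make that arithmetic explicit, here is a minimal sketch (the function
name drift_ppm is just something I made up for illustration):

```python
SECONDS_PER_DAY = 86_400


def drift_ppm(offset_change_s, elapsed_days):
    """Average frequency error, in parts per million, implied by a
    change in clock offset accumulated over the given number of days."""
    return offset_change_s / (elapsed_days * SECONDS_PER_DAY) * 1e6


# One second of accumulated error over a 7 day and a 10 day test.
for days in (7, 10):
    print(f"1 s over {days} days = {drift_ppm(1.0, days):.2f} PPM")
```

This only gives the average rate over the interval; it says nothing about
how much the rate wandered in between.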

>A more adequate test would be to set up a server with a GPS receiver and 
>compare the offset of all the local clocks under test every thirty 
>minutes over a period of several days.  But, if you are going to do 
>that, you might as well make it permanent and use it for your master clock!

Or connect your system to the net for a while, allowing only ntp packets in
or out, and use one of the time servers on the net. Since this is good to
1ms or so, the tests you could do with your wristwatch in 10 days can now
be done in a couple of hours. Plus ntp will tell you what the rate
fluctuations are.
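
A rough sketch of why the better time source shortens the test, assuming
the only limit is how well you can read the offset (the function name and
the 1 PPM target are my own choices for illustration):

```python
def seconds_to_resolve(timing_error_s, rate_ppm):
    """Time needed for a clock running fast or slow by rate_ppm to
    accumulate an offset as large as the timing uncertainty."""
    return timing_error_s / (rate_ppm * 1e-6)


# Assumed uncertainties: about 1 s reading a wristwatch by eye,
# about 1 ms from a network time server.
for label, err in (("wristwatch", 1.0), ("ntp over the net", 1e-3)):
    t = seconds_to_resolve(err, 1.0)
    print(f"{label}: {t / 3600:.2f} hours to see a 1 PPM rate error")
```

With these assumed numbers the wristwatch needs roughly 11.6 days and ntp
roughly 17 minutes, so a couple of hours of ntp leaves a comfortable margin
over the noise.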

>Consider that the Garmin GPS18LVC has a pulse per second output and 
>costs less than $100 US.  If you can site an antenna with a good view of 
>the sky, you can have a stratum 1 server of your very own and have the 
>time accurate to within a millisecond or less.  Note that while the GPS 
>is accurate to 50ns or better the process of getting the time into your 
>computer may introduce several hundred microseconds of uncertainty.

The Garmin 18LVC only claims a PPS accuracy of about 1 microsecond.
However, getting the time into the computer can be done to the usec level.
My system, which has a Garmin 18LVC triggering a parallel port interrupt,
has fluctuations of about 2usec rms. And tests I ran, in which I looped a
signal from a parallel port output line back to the interrupt input and
timed both when I sent it and when the interrupt was triggered, showed
about a 1-2usec difference.
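
For anyone wanting to compute the same kind of rms figure from their own
offset samples, a minimal sketch follows. The sample values are made up for
illustration and are NOT measurements from my system:

```python
import math


def rms(samples):
    """Root mean square of a sequence of offset samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


# Hypothetical PPS offsets in microseconds (invented numbers).
offsets_us = [1.5, -2.0, 2.5, -1.0, 2.0, -2.5]
print(f"rms fluctuation: {rms(offsets_us):.2f} usec")
```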

>> I've changed my server ntp.conf files so that one machine (jato) is
>> designated as a server with its undisciplined clock set at stratum 5.
>> The other three are peers to each other, each pointing to the one
>> stratum 5 server. The peers still have the undisciplined clock
>> configured but at stratum 8. I guess I'll see how this goes.

I see absolutely no advantage to having the local clock as a possible source
for the clients. It does nothing for you.

>> Before restarting the ntp daemons I zeroed out the drift files then set
>> the system times to the exact second showing on
>> "http://wwp.greenwichmeantime.com/time-zone/usa/eastern-time/". I then

If you can get onto the net, why in the world are you not using ntp from a
server on the net?

>> set the hardware clocks to match system time with "hwclock --systohc".

Hardware clocks tend to have worse drift behaviour than does the CPU clock,
and can only be read on the second boundary. And recent Linux kernels have
almost completely broken the hardware clock as a time source (the hardware
clock rollover interrupt is simulated by a routine which gives only about
10ms accuracy).

>> Although the network has no Internet connection, I have a KVM to an

What is a KVM?

>> Internet connected machine I can use. I started the master server daemon
>> first, followed by the others. As I understand it, all NTP servers will
>> now use the master server unless it becomes unreachable when they'll
>> fall back to orphan mode until the master is again reachable. So far, it
>> looks like they've all selected the master server.
