[ntp:questions] Re: Unexpected ntpd behavior

Richard B. Gilbert rgilbert88 at comcast.net
Wed Mar 9 13:48:37 UTC 2005


Pete Buelow wrote:

>Some quick background. Trying to get ntpd running on some IA64 hardware in a
>pretty simple environment. Two machines in a pair relationship, the first 
>machine in the pairing talks to a known good NTP server, the other talks to
>its paired buddy. OS is Debian Sarge stable, ntp is 4.1.0-8. Ntp is
>started with -n -c /path/to/conf -x. Conf is simple, and is below.
>
>server 11.0.0.1 prefer
>server 127.127.1.1
>fudge 127.127.1.1 stratum 14 refid LCL
>  
>
The above two lines are in error!   The local clock should be 
127.127.1.0!!!!!

>driftfile /etc/ntp.drift
>pidfile /etc/ntp.pid
>disable stats
>authenticate no
>
>Problem is, if time is slow compared to 11.0.0.1 (which works just fine,
>it's a timeserver for several hundred lab machines), it will catch up quite
>rapidly (much faster than the 2000s/s rate), and run past. If the time is
>ahead of the server, it will just continue ahead. I found a post below
>which states that it should then turn around eventually, and head the other
>direction, bouncing like a bungee, but I've never run the test that long. I
>have no idea why this behavior is happening. And it is the same behavior on
>both machines.
>
>A sample ntpq -p output. Clock was set 6 and a half seconds behind 11.0.0.1.
>
>Node2# ntpq -p
>     remote           refid      st t when poll reach   delay   offset  jitter
>==============================================================================
>*11.0.0.1        192.168.31.253   4 u   55   64  377    0.308  6418.55   1.565
> LOCAL(1)        LOCAL(1)        14 l   21   64  377    0.000    0.000   0.004
>
>Two notes of interest based on other posts I've read
>1. Our tick rate is 1ms instead of 10ms.
>2. On almost all of the test machines, the drift file is populated with the
>value 500. On one it's ~450. According to another poster, that could be the
>source of some issues.
>
>Thoughts? Ideas? I'm assuming right now that it's either a config or a HW
>issue. I'm running a test now with this config and command line options,
>but am adding "disable kernel" to the config file. Wondering if that will
>change the behavior.
>
>Thanks in advance if anyone has any help to offer at all.
> 
>  
>
If almost all of your drift files are populated with 500, something is
very wrong!  500 ppm is the largest frequency error ntpd can correct, so
a drift file that reads 500 means the correction is pinned at its limit.
If your clock frequencies are all in error by 500 ppm or more, I would
suspect the clock you are trying to synchronize with.  If I had a
hundred machines synchronized with a known good clock, I would expect
ninety percent or more of them to have drift values in the range from
-200 to +200.  Checking the machines running ntp in my home, I find:
two Sun Ultra 10 workstations running Solaris 8 and Solaris 9 have
6.400 and -3.172 respectively, a DEC Alphastation 200 running VMS V7.2-1
has 35.488, and a Compaq Deskpro EN running RedHat has -4.908.  A very
small sample, but indicative of what is "normal".
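
To put those numbers in perspective, here is a rough Python sketch that
converts a drift-file value (in ppm) into seconds gained or lost per day
and flags values sitting at the 500 ppm limit.  The function names and
the 1 ppm tolerance are my own choices for illustration, not anything
taken from ntpd.

```python
MAX_FREQ_PPM = 500.0      # ntpd's limit for frequency correction
SECONDS_PER_DAY = 86400


def ppm_to_seconds_per_day(ppm):
    """Clock error accumulated per day at the given frequency offset."""
    return ppm * 1e-6 * SECONDS_PER_DAY


def is_pinned(ppm, tolerance=1.0):
    """True if the drift value is within `tolerance` ppm of the limit.

    The tolerance is an arbitrary illustrative choice.
    """
    return abs(ppm) >= MAX_FREQ_PPM - tolerance


# Drift values quoted in this thread; the labels are descriptive only.
samples = {
    "most lab machines": 500.0,
    "one lab machine (approx.)": 450.0,
    "Sun Ultra 10, Solaris 8": 6.400,
    "Sun Ultra 10, Solaris 9": -3.172,
    "DEC Alphastation 200, VMS": 35.488,
    "Compaq Deskpro EN, RedHat": -4.908,
}

for label, ppm in samples.items():
    note = "  <-- at the limit" if is_pinned(ppm) else ""
    print(f"{label:28s} {ppm:9.3f} ppm "
          f"{ppm_to_seconds_per_day(ppm):8.3f} s/day{note}")
```

At 500 ppm a clock is off by about 43 seconds a day.  The values from my
own machines work out to roughly 3 seconds a day for the Alphastation
and well under a second a day for the rest.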

I also note that the machine you are trying to synchronize with is at
stratum 4, which is pretty near the bottom of the food chain!  While
stratum can range from 1 to 15, I'd consider serving time from any
stratum higher than 3 a little bit odd.

Stratum 1 servers get their time directly from a hardware reference
clock traceable to NIST or some other national standards organization.
Stratum 2 servers get their time from stratum 1.  Small organizations
would operate stratum 3 servers and have their leaf nodes at stratum
4.  Larger organizations would operate stratum 1 or stratum 2
servers with leaf nodes at stratum 2 or 3.
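
The arithmetic behind this is simple: a host reports a stratum one
higher than whatever it synchronizes to.  A small Python sketch, applied
to the paired setup described in the original post (the function name is
hypothetical, chosen for illustration):

```python
MAX_STRATUM = 15  # highest usable stratum; 16 means unsynchronized


def client_stratum(upstream_stratum):
    """Stratum a host reports when synchronized to the given upstream.

    Stratum 0 stands for a hardware reference clock.
    """
    if not 0 <= upstream_stratum <= MAX_STRATUM:
        raise ValueError("upstream is not a usable time source")
    return upstream_stratum + 1


server = 4                        # 11.0.0.1, per the ntpq output above
first = client_stratum(server)    # machine that talks to 11.0.0.1
second = client_stratum(first)    # its paired buddy
print(server, first, second)
```

So with 11.0.0.1 at stratum 4, the first machine in the pair ends up at
stratum 5 and its buddy at stratum 6.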



