[ntp:questions] Re: Unexpected ntpd behavior

Pete Buelow nospam at putzin.net
Wed Mar 9 18:56:35 UTC 2005


Richard B. Gilbert wrote:

> Pete Buelow wrote:
> 
>>Some quick background. Trying to get ntpd running on some IA64 hardware in
>>a pretty simple environment. Two machines in a pair relationship, the
>>first machine in the pairing talks to a known good NTP server, the other
>>talks to its paired buddy. OS is Debian Sarge stable, ntp is 4.1.0-8. ntpd
>>is started with -n -c /path/to/conf -x. Conf is simple, and is below.
>>
>>server 11.0.0.1 prefer
>>server 127.127.1.1
>>fudge 127.127.1.1 stratum 14 refid LCL
>>  
>>
> The above two lines are in error!   The local clock should be
> 127.127.1.0!!!!!
> 

Interesting, I'll look into this.

>>driftfile /etc/ntp.drift
>>pidfile /etc/ntp.pid
>>disable stats
>>authenticate no
>>
>>Problem is, if time is slow compared to 11.0.0.1 (which works just fine,
>>it's a timeserver for several hundred lab machines), it will catch up
>>quite rapidly (much faster than the slew limit of 2000 s per second of
>>offset), and run past. If the
>>time is ahead of the server, it will just continue ahead. I found a post
>>below which states that it should then turn around eventually, and head
>>the other direction, bouncing like a bungee, but I've never run the test
>>that long. I have no idea why this behavior is happening. And it is the
>>same behavior on both machines.
>>
>>A sample ntpq -p output. Clock was set 6 and a half seconds behind
>>11.0.0.1.
>>
>>Node2# ntpq -p
>>     remote           refid      st t when poll reach   delay   offset  jitter
>>==============================================================================
>>*11.0.0.1        192.168.31.253   4 u   55   64  377    0.308  6418.55   1.565
>> LOCAL(1)        LOCAL(1)        14 l   21   64  377    0.000    0.000   0.004
>>
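
For reference, a rough sketch of how long a pure slew ought to take for the
offset shown above. It assumes ntpd's slew limit is 500 ppm (the 2000 s per
second of offset mentioned earlier) and that the offset column of ntpq -p is
in milliseconds; the function name is only illustrative.

```python
# Assumption: ntpd slews the clock at no more than 500 ppm, which is the
# same thing as 2000 s of wall time per second of offset removed.
MAX_SLEW_PPM = 500.0


def slew_seconds(offset_ms, slew_ppm=MAX_SLEW_PPM):
    """Return the minimum time, in seconds, needed to slew away offset_ms."""
    offset_s = abs(offset_ms) / 1000.0   # ntpq -p reports offset in ms
    return offset_s / (slew_ppm * 1e-6)  # ppm -> seconds per second


if __name__ == "__main__":
    t = slew_seconds(6418.55)            # offset from the ntpq -p sample
    print(f"{t:.0f} s (about {t / 3600:.1f} h)")
```

Under those assumptions the 6418.55 ms offset needs roughly 12,800 s, a bit
over three and a half hours, so a clock that catches up in minutes is moving
much faster than a 500 ppm slew would allow.
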
>>Two notes of interest based on other posts I've read
>>1. Our tick rate is 1ms instead of 10ms.
>>2. On almost all of the test machines, the drift file is populated with
>>the value 500. On one it's ~450. According to another poster, that could
>>be the source of some issues.
>>
>>Thoughts? Ideas? I'm assuming right now that it's either a config or a HW
>>issue. I'm running a test now with this config and command line options,
>>but am adding "disable kernel" to the config file. Wondering if that will
>>change the behavior.
>>
>>Thanks in advance if anyone has any help to offer at all.
>> 
>>  
>>
> If almost all of your drift files are populated with 500, something is
> very wrong! 500 ppm is the limit for correctable frequency errors. If
> your clock frequencies are all in error by 500 ppm or more, I would
> suspect the clock you are trying to synchronize with. If I had a
> hundred machines synchronized with a known good clock, I would expect
> ninety percent or more of them to have drift values in the range from
> -200 to +200 ppm. Checking the machines running ntp in my home I find:
> two Sun Ultra 10 workstations running Solaris 8 and Solaris 9 have
> 6.400 and -3.172 respectively. A DEC Alphastation 200
> running VMS V7.2-1 has 35.488 while a Compaq Deskpro EN running RedHat
> has -4.908. A very small sample, but indicative of what is "normal".
> 
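
The drift figures above can be sanity-checked with something like the sketch
below. The 500 ppm limit is the one quoted above; the 200 ppm "typical" band
is only the rule of thumb from this post, not an ntpd constant, and the
function and label names are made up for illustration.

```python
# Assumption: a drift file holds one frequency offset in ppm, and ntpd
# cannot correct a frequency error beyond +/-500 ppm.
DRIFT_LIMIT_PPM = 500.0
TYPICAL_PPM = 200.0  # rule of thumb from the post, not an ntpd constant


def classify_drift(ppm):
    """Label a drift-file value as 'pinned', 'suspicious' or 'typical'."""
    magnitude = abs(ppm)
    if magnitude >= DRIFT_LIMIT_PPM:
        return "pinned"        # sitting at the correction limit
    if magnitude > TYPICAL_PPM:
        return "suspicious"    # outside the expected -200..+200 band
    return "typical"


if __name__ == "__main__":
    # Values mentioned in this thread.
    for value in (500.0, 450.0, 6.400, -3.172, 35.488, -4.908):
        print(f"{value:9.3f} ppm -> {classify_drift(value)}")
```
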
> I also note that the machine you are trying to synchronize with is at
> stratum 4 which is pretty near the bottom of the food chain!!  While
> stratum can range from 1 to 15, I'd consider serving time from any
> stratum higher than 3 as a little bit odd.
> 
> Stratum 1 servers get their time directly from a hardware reference
> clock traceable to NIST or some other national standards organization.
> Stratum 2 servers get their time from stratum 1. Small organizations
> would operate stratum 3 servers and have their leaf nodes at stratum
> 4. Larger organizations would operate stratum 2 or stratum 1
> servers with leaf nodes at stratum 2 or 3.

Well, that particular server is in fact a stratum 4. It syncs with an org
server that syncs with a corp-wide server, which is connected to a stratum 1
server in the outside world. Network issues keep us from all syncing with
the stratum 2. It's a big corporation, and there are subnet issues, so it
makes sense to break this out some.
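
As a small illustration of how the strata stack up along that chain, assuming
each client simply reports its server's stratum plus one and that 16 means
"unsynchronized" (helper names are illustrative only):

```python
MAX_STRATUM = 16  # conventionally means "unsynchronized"


def client_stratum(server_stratum):
    """Stratum a client reports when synced to a server at server_stratum."""
    return min(server_stratum + 1, MAX_STRATUM)


def chain(root_stratum, hops):
    """Return the strata along a chain of `hops` clients below the root."""
    strata = [root_stratum]
    for _ in range(hops):
        strata.append(client_stratum(strata[-1]))
    return strata


if __name__ == "__main__":
    # outside stratum 1 -> corp server -> org server -> lab server (11.0.0.1)
    print(chain(1, 3))
    # lab server -> first machine in the pair -> its paired buddy
    print(chain(4, 2))
```

On that reckoning the first machine in each pair runs at stratum 5 and its
buddy at stratum 6 in this test setup.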

However, this is a test environment. In the field, the actual ntp server for
these machines will be a stratum 2, or possibly a stratum 1 connected to a
couple of independent GPS clocks. Time in this application is very
important.
-- 
Pete Buelow
replace nospam with putzin if you feel the urge to reply to me directly.


