[ntp:questions] Re: Unexpected ntpd behavior
Richard B. Gilbert
rgilbert88 at comcast.net
Wed Mar 9 13:48:37 UTC 2005
Pete Buelow wrote:
>Some quick background. Trying to get ntpd running on some IA64 hardware in a
>pretty simple environment. Two machines in a pair relationship, the first
>machine in the pairing talks to a known good NTP server, the other talks to
>its paired buddy. OS is Debian Sarge stable, ntp is 4.1.0-8. Ntp is
>started with -n -c /path/to/conf -x. Conf is simple, and is below.
>
>server 11.0.0.1 prefer
>server 127.127.1.1
>fudge 127.127.1.1 stratum 14 refid LCL
>
>
The above two lines are in error! The local clock should be
127.127.1.0!!!!!
>driftfile /etc/ntp.drift
>pidfile /etc/ntp.pid
>disable stats
>authenticate no
>
>Problem is, if time is slow compared to 11.0.0.1 (which works just fine,
>it's a timeserver for several hundred lab machines), it will catch up quite
>rapidly (much faster than the 2000s/s rate), and run past. If the time is
>ahead of the server, it will just continue ahead. I found a post below
>which states that it should then turn around eventually, and head the other
>direction, bouncing like a bungee, but I've never run the test that long. I
>have no idea why this behavior is happening. And it is the same behavior on
>both machines.
>
>A sample ntpq -p output. Clock was set 6 and a half seconds behind 11.0.0.1.
>
>Node2# ntpq -p
>     remote           refid      st t when poll reach   delay   offset  jitter
>==============================================================================
>*11.0.0.1        192.168.31.253   4 u   55   64  377    0.308  6418.55   1.565
> LOCAL(1)        LOCAL(1)        14 l   21   64  377    0.000    0.000   0.004
>
>Two notes of interest based on other posts I've read
>1. Our tick rate is 1ms instead of 10ms.
>2. On almost all of the test machines, the drift file is populated with the
>value 500. On one it's ~450. According to another poster, that could be the
>source of some issues.
>
>Thoughts? Ideas? I'm assuming right now that it's either a config or a HW
>issue. I'm running a test now with this config and command line options,
>but am adding "disable kernel" to the config file. Wondering if that will
>change the behavior.
>
>Thanks in advance if anyone has any help to offer at all.
>
>
>
If almost all of your drift files are populated with 500, something is
very wrong!! 500 ppm is the limit for correctable frequency errors! If
your clock frequencies are all in error by 500 ppm or more, I would
suspect the clock you are trying to synchronize with. If I had a
hundred machines synchronized with a known good clock, I would expect
ninety percent or more of them to have drift values in the range from
-200 to +200 ppm.

Checking the machines running ntp in my home, I find: two Sun Ultra 10
workstations running Solaris 8 and Solaris 9 have 6.400 and -3.172
respectively; a DEC AlphaStation 200 running VMS V7.2-1 has 35.488;
and a Compaq Deskpro EN running Red Hat has -4.908. A very small
sample, but indicative of what is "normal".
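
As a rough sketch of the check described above, assuming the drift file
holds the frequency offset in ppm as its first value, something like the
following Python would flag a machine that is pinned at the limit. The
function names are made up for illustration, and the 200 ppm threshold is
only the rule of thumb from this message, not anything ntpd itself uses.

```python
MAX_FREQ_PPM = 500.0   # limit for correctable frequency errors
TYPICAL_PPM = 200.0    # rule-of-thumb range from this message (assumption)


def parse_drift(text):
    """Return the frequency offset in ppm from drift-file contents."""
    # Assumes the first whitespace-separated token is the value in ppm.
    return float(text.split()[0])


def classify_drift(ppm):
    """Classify a drift value as 'normal', 'high', or 'at-limit'."""
    magnitude = abs(ppm)
    if magnitude >= MAX_FREQ_PPM:
        return "at-limit"
    if magnitude > TYPICAL_PPM:
        return "high"
    return "normal"


def seconds_per_day(ppm):
    """Clock error accumulated over one day at the given frequency error."""
    return ppm * 1e-6 * 86400


if __name__ == "__main__":
    # Values taken from this thread: 500 and ~450 from the problem
    # machines, 35.488 and -4.908 from the home machines.
    for sample in ("500.000\n", "450.000\n", "35.488\n", "-4.908\n"):
        ppm = parse_drift(sample)
        print(f"{ppm:9.3f} ppm  {classify_drift(ppm):8s}  "
              f"{seconds_per_day(ppm):7.3f} s/day")
```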

I also note that the machine you are trying to synchronize with is at
stratum 4, which is pretty near the bottom of the food chain!! While
stratum can range from 1 to 15, I'd consider serving time from any
stratum higher than 3 a little bit odd.

Stratum 1 servers get their time directly from a hardware reference
clock traceable to NIST or some other national standards organization.
Stratum 2 servers get their time from stratum 1. Small organizations
would operate stratum 3 servers and have their leaf nodes at stratum
4. Larger organizations would operate stratum 2 or stratum 1
servers with leaf nodes at stratum 2 or 3.
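
To make the hierarchy concrete, here is a small Python sketch (the
function name is hypothetical) of how a machine's stratum follows from
the server it synchronizes with: it sits one stratum below its source,
and 16 is the value used to mean "unsynchronized". It treats stratum 0
as the reference clock itself.

```python
UNSYNCHRONIZED = 16  # stratum value meaning "not synchronized"


def client_stratum(server_stratum):
    """Stratum a machine would report when synchronized to the given source."""
    # Stratum 0 is the reference clock itself; anything outside 0..15
    # cannot serve as a valid source.
    if not 0 <= server_stratum < UNSYNCHRONIZED:
        return UNSYNCHRONIZED
    return min(server_stratum + 1, UNSYNCHRONIZED)


if __name__ == "__main__":
    first = client_stratum(4)       # machine talking to 11.0.0.1
    buddy = client_stratum(first)   # machine talking to its paired buddy
    print(first, buddy)
```

With 11.0.0.1 at stratum 4, the first machine in the pair ends up at
stratum 5 and its buddy at stratum 6.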