[ntp:questions] NTP High Jitter and Reject Condition

ter310 at gmail.com
Tue Aug 1 09:44:46 UTC 2006


Hello all,

I am completely stumped by an NTP problem that I am having on a server.
I've googled and tried everything I can think of with no luck, and am
hoping that someone here can provide some advice.  Here is the situation:

I have a server running CentOS 4.3 2.6.9-34.0.2.ELsmp on a PDSMi
Supermicro motherboard with a Celeron D 3.06GHz processor.  Note, I'm
running an SMP kernel so that APIC can handle the interrupts properly.

I have /etc/ntp.conf configured to the CentOS 4.3 defaults, which means
it is polling 0.pool.ntp.org through 2.pool.ntp.org.  Upon boot,
everything looks okay, but within a minute or two, the jitter skyrockets:

[root@neon ~]# date
Tue Aug  1 05:30:09 EDT 2006
[root@neon ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ouvaton.info    134.157.254.19   2 u   31   64    1   87.432  -164.78   0.001
 bq.serverrack.n 65.111.164.223   3 u   30   64    1    0.636  -177.35   0.001
 msb.significant 64.142.72.248    3 u   29   64    1    2.278  -166.81   0.001
 LOCAL(0)        LOCAL(0)        10 l   28   64    1    0.000    0.000   0.001
[root@neon ~]# date
Tue Aug  1 05:31:59 EDT 2006
[root@neon ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ouvaton.info    134.157.254.19   2 u   12   64    7   87.308  -748.55 578.177
 bq.serverrack.n 65.111.164.223   3 u   10   64    7    0.604  -752.02 587.210
 msb.significant 64.142.72.248    3 u   10   64    7    1.921  -750.65 582.576
 LOCAL(0)        LOCAL(0)        10 l   10   64    7    0.000    0.000   0.001
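
For what it's worth, the two samples above can be turned into a rough
drift estimate.  The sketch below is only back-of-the-envelope: it
assumes the offsets are in milliseconds (as ntpq reports them), that the
interval between samples is simply the gap between the two `date`
outputs, and it ignores whatever slewing ntpd was doing in between.  The
variable names are made up for illustration.

```python
from datetime import datetime

# Offsets (ms) of the three pool servers from the two "ntpq -p" runs.
first_offsets_ms = [-164.78, -177.35, -166.81]
second_offsets_ms = [-748.55, -752.02, -750.65]

# Wall-clock times printed by "date" just before each run (assumed to
# approximate when the offsets were measured).
t1 = datetime(2006, 8, 1, 5, 30, 9)
t2 = datetime(2006, 8, 1, 5, 31, 59)
elapsed_s = (t2 - t1).total_seconds()


def mean(values):
    return sum(values) / len(values)


# Change in mean offset over the interval, in milliseconds.
delta_ms = mean(second_offsets_ms) - mean(first_offsets_ms)

# ms of drift per second, times 1000, gives microseconds per second (PPM).
drift_ppm = delta_ms / elapsed_s * 1000.0

# The same rate expressed as minutes gained per day.
minutes_per_day = abs(drift_ppm) * 1e-6 * 86400 / 60

# ntpd will only correct frequency errors up to 500 PPM.
NTPD_MAX_PPM = 500.0

print("elapsed: %.0f s" % elapsed_s)
print("drift: %.0f PPM" % drift_ppm)
print("about %.1f minutes/day" % minutes_per_day)
print("beyond ntpd's range: %s" % (abs(drift_ppm) > NTPD_MAX_PPM))
```

On these numbers the drift comes out at roughly -5300 PPM, or several
minutes per day, which is about ten times the 500 PPM that ntpd is able
to correct.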

If I check the associations, they always show 'reject':

[root@neon ~]# ntpq
ntpq> associations

ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 34972  9014   yes   yes  none    reject   reachable  1
  2 34973  9014   yes   yes  none    reject   reachable  1
  3 34974  9014   yes   yes  none    reject   reachable  1
  4 34975  9014   yes   yes  none    reject   reachable  1
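
The status column is a hexadecimal word that packs the same fields shown
in the other columns.  A minimal sketch of decoding it follows, assuming
the 16-bit peer status layout as I understand it from the ntpq
documentation (flag bits in the top five bits, then a 3-bit selection
code, a 4-bit event count, and a 4-bit event code).  The function and
key names are my own, not part of any NTP tool.

```python
def decode_peer_status(word):
    """Split a 16-bit ntpq peer status word into its fields.

    The bit layout is an assumption based on the ntpq documentation.
    """
    return {
        "configured": bool(word & 0x8000),    # "conf" column
        "auth_enabled": bool(word & 0x4000),  # part of "auth" column
        "auth_ok": bool(word & 0x2000),       # part of "auth" column
        "reachable": bool(word & 0x1000),     # "reach" column
        "select": (word >> 8) & 0x7,          # "condition"; 0 is reject
        "event_count": (word >> 4) & 0xF,     # "cnt" column
        "event_code": word & 0xF,             # "last_event" column
    }


status = decode_peer_status(0x9014)
for name, value in status.items():
    print("%-12s %s" % (name, value))
```

For 9014 this gives configured and reachable set, no authentication, a
selection code of 0 (reject), an event count of 1, and event code 4,
which lines up with the yes / yes / none / reject / reachable / 1 shown
in the table.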

I'm just baffled as to why this is occurring; within an hour the jitter
will be > 1000.  I have run ntpdate several times and set the hwclock to
the system time, but inevitably the system clock runs way too fast,
gaining several minutes per day.  Note, if I do not run NTP at all, the
hwclock stays perfect, but the system clock still runs too fast.  So, I
cannot be sure if this is a HW problem, network problem, or
configuration issue.  I have tried adding noapic, nolapic, and noacpi to
the kernel boot options to no avail.  I have also tried booting into a
uniprocessor kernel, with the same behavior.  The server is hosted at
Equinix in Ashburn, VA and the network appears okay:

[root@neon ~]# mii-tool
eth0: negotiated 100baseTx-FD, link ok

I have set up NTP logging, but really do not know enough about the
protocol to troubleshoot effectively.  I've also tried pointing at
other NTP servers and get the same behavior.  I do not currently have
iptables running (the server is not in production yet), so UDP port 123
traffic is going through, and the provider has verified that there is
nothing on their side blocking it.

Any help at all would be appreciated.  I will be glad to supply log
files, trace files, anything to get this resolved.  Thanks so much.

Best Regards,
Tom
