[ntp:questions] Frequent time reset messages

Bob Robison bob.robison at swri.org
Thu Dec 1 22:12:13 UTC 2005

I'm running a moderate number (around 50) of dual-Opteron machines that
boot diskless into a Linux 2.6.12 SMP kernel and try to sync with a
Symmetricom XLI-GPS stratum-1 NTP server on an isolated network.

The problem I have is that when I run "ntpq -c peers" on a number of
these machines to check the status of the NTP synchronization, I see
offsets spread over a range of almost 1000 ms.  If I grep through the
/var/log/messages file, I see messages like these, often about every
20 minutes:

Dec  1 20:30:28 (none) ntpd[27203]: time reset 0.613771 s
Dec  1 20:30:28 (none) ntpd[27203]: synchronisation lost
Dec  1 20:50:45 (none) ntpd[27203]: time reset 0.931388 s
Dec  1 20:50:45 (none) ntpd[27203]: synchronisation lost
Dec  1 21:19:23 (none) ntpd[27203]: time reset 0.451491 s
Dec  1 21:19:23 (none) ntpd[27203]: synchronisation lost
Dec  1 21:36:24 (none) ntpd[27203]: time reset 0.391510 s
Dec  1 21:36:24 (none) ntpd[27203]: synchronisation lost
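
As a rough back-of-the-envelope check on the log excerpt above, the sketch below parses the "time reset" lines and divides each step by the time elapsed since the previous reset. It assumes the whole step accumulated over that interval, which is only approximately true because ntpd is also slewing the clock in between. The helper names (parse_resets, implied_rates) and the year argument are illustrative, not part of any NTP tool.

```python
import re
from datetime import datetime

# Log lines copied from the excerpt above.
LOG = """\
Dec  1 20:30:28 (none) ntpd[27203]: time reset 0.613771 s
Dec  1 20:30:28 (none) ntpd[27203]: synchronisation lost
Dec  1 20:50:45 (none) ntpd[27203]: time reset 0.931388 s
Dec  1 20:50:45 (none) ntpd[27203]: synchronisation lost
Dec  1 21:19:23 (none) ntpd[27203]: time reset 0.451491 s
Dec  1 21:19:23 (none) ntpd[27203]: synchronisation lost
Dec  1 21:36:24 (none) ntpd[27203]: time reset 0.391510 s
Dec  1 21:36:24 (none) ntpd[27203]: synchronisation lost
"""

# Matches syslog lines of the form "<Mon> <day> <hh:mm:ss> ... time reset <x> s".
PATTERN = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s.*ntpd\[\d+\]: time reset ([-+]?\d+\.\d+) s"
)


def parse_resets(text, year=2005):
    """Return (timestamp, step_seconds) for each 'time reset' line.

    syslog timestamps carry no year, so one is supplied by the caller.
    """
    resets = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m:
            stamp_text = " ".join(m.group(1).split())  # collapse "Dec  1"
            stamp = datetime.strptime(f"{year} {stamp_text}", "%Y %b %d %H:%M:%S")
            resets.append((stamp, float(m.group(2))))
    return resets


def implied_rates(resets):
    """Return (interval_s, step_s, ppm) for each pair of consecutive resets."""
    rates = []
    for (t0, _), (t1, step) in zip(resets, resets[1:]):
        seconds = (t1 - t0).total_seconds()
        rates.append((seconds, step, step / seconds * 1e6))
    return rates


for seconds, step, ppm in implied_rates(parse_resets(LOG)):
    print(f"{step:.6f} s over {seconds:.0f} s -> about {ppm:.0f} ppm")
```

For these four resets the intervals are 1217, 1718 and 1021 seconds, giving roughly 765, 263 and 383 ppm.  For comparison, ntpd limits its frequency correction to 500 ppm, so an apparent rate above that cannot be absorbed by slewing alone.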

These seem like large (and frequent) steps to be occurring.  I have a
fairly simple ntp.conf file:
restrict default ignore
restrict mask nomodify notrap noquery

server   iburst
server     iburst # local clock
fudge stratum 5 # default was 10

driftfile /var/lib/ntp/drift

These machines each have a Gigabit network connection to a high-end
network switch.  I believe the NTP server probably has only a 100 Mbit
link, and it carries all the NTP traffic, but I don't think that is the

Probably the main issue is the CPU and I/O loading on these Opteron
machines.  They are each handling streaming data from a FireWire card
(IEEE-1394a) and the CPUs stay fairly busy handling that data -- though
they are not pegged at 100% or anything.

Here is a typical ntpq output:
ntpq> as
ind assID status  conf reach auth condition  last_event cnt
  1 48644  9634   yes   yes  none  sys.peer   reachable  3
  2 48645  9034   yes   yes  none    reject   reachable  3
ntpq> rv 48644
status=9634 reach, conf, sel_sys.peer, 3 events, event_reach,
srcadr=ntpserv, srcport=123, dstadr=, dstport=123, leap=00,
stratum=1, precision=-9, rootdelay=0.000, rootdispersion=5.554,
refid=GPSM, reach=377, unreach=0, hmode=3, pmode=4, hpoll=7, ppoll=7,
flash=00 ok, keyid=0, offset=360.879, delay=2.544, dispersion=3.803,
jitter=6.636, reftime=c739efcd.cf993b0f  Thu, Dec  1 2005 21:55:25.810,
org=c739efde.6ea22848  Thu, Dec  1 2005 21:55:42.432,
rec=c739efde.1292f6e8  Thu, Dec  1 2005 21:55:42.072,
xmt=c739efde.0c8ede54  Thu, Dec  1 2005 21:55:42.049,
filtdelay=     2.54    4.42    2.50    2.98    2.55    2.61    2.44
filtoffset=  360.88  354.24  412.02  412.20  464.11  -95.25  -78.39  -56.90,
filtdisp=      1.96    3.90    5.82    7.77    9.70   11.62   12.61   13.57
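
One way to read the filtoffset values above is to look at how much the measured offset changed from one sample to the next. The minimal sketch below does that, assuming ntpq lists the filter samples newest first (the steadily increasing filtdisp values are consistent with this). The variable names are illustrative only.

```python
# Offsets in milliseconds, copied from the filtoffset line above.
# Assumption: ntpq prints the clock filter samples newest first.
filtoffset_ms = [360.88, 354.24, 412.02, 412.20, 464.11, -95.25, -78.39, -56.90]

# Reverse so the list runs oldest -> newest, then difference neighbours.
oldest_first = filtoffset_ms[::-1]
deltas = [round(b - a, 2) for a, b in zip(oldest_first, oldest_first[1:])]

# The largest change between two consecutive samples, by magnitude.
largest = max(deltas, key=abs)

print("sample-to-sample changes (ms):", deltas)
print("largest single change (ms):", largest)
```

Under that ordering assumption, most changes are a few tens of milliseconds, but one consecutive pair differs by about 559 ms (from -95.25 to 464.11), which looks more like a sudden jump in the local clock than gradual drift.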

If anyone has any suggestions about what might be happening, or how to
keep these machines synced more tightly, I would certainly appreciate
it.  I've dug around through FAQs, wikis, docs, etc., but I'm not sure
exactly why my time is bouncing around so much.

thanks in advance,
Bob Robison                        bob.robison at swri.org
Staff Engineer                     210-522-3935
Southwest Research Institute       San Antonio, TX
