[ntp:questions] NTP.log interpretation

GregL greg.leibfried at gmail.com
Fri Apr 18 03:53:22 UTC 2014


I'm trying to determine what this section of an ntp.log is telling me.

This is from a default xntpd instance on AIX 5.3.

ntp.conf has two severs listed, 34 and 97, with the '34' server preferred.

The first couple of lines are the last in a *long* list of hourly logged
offset messages.

It appears to me that *something* caused the system to sync to the
non-preferred server.  My guess is a lack of response from the '34' server?

Q#1)  Is that a safe assumption?  How many times, and how long would it
wait to get a response from '34'?

It turns out that due to a configuration issue, the '34' server was
actually out of time sync, and the second, non preferred server,  '97', was
actually the correct time. They were seconds off.   ('34' ended up not
being configured to an outside time server pool... only to sync with
'97'... but clearly there was a problem with that 'sync' happening.).


After this first reset, then the system went into this cycle of resetting
back and forth.  I'm trying to understand the logic here.  Is it telling me
that the was that the response from the preferred server was sporadic
enough that it would regularly flip to sync with secondary server...thus
the flip flopping back and forth with time.   This continued for hours
until the configuration issue with the two time servers were fixed, so they
were both "in sync" with the correct time... then back to regular old
boring offset log entries.

Q#2)  does "sychonisation lost" mean a lack of response from the preferred
server, or does it mean a dramatic difference in time, such that it needs
to be reset (as another log message indicates).

Thanks for any help in learning how to read this.  I have no indications of
any network issues, but given NTP uses UDP, it could be hard to track.   It
seems suspicious that once the two time servers were in sync...the issues
went away.   I.e.. meaning a *real* network issue did not exist... perhaps
just a small hiccup that caused the flip/flopping to occur and continue
until fixed.

Right now, my plan is to add the "-x" option to the xntpd startup;
hopefully that would avoid setting the clock backwards.   Additionally,
actions are taken to make sure the two time servers never get that far out
of sync without throwing out some alerts.

Any advice/counsel concerning this scenario would be greatly appreciated.
 I would like to feel a better understanding of what the log is telling
me.... which is why I am here ;-)

Thanks!



10 Apr 00:57:22 xntpd[245888]: offset -0.000638 freq 1.908 poll 6

10 Apr 01:57:22 xntpd[245888]: offset -0.000978 freq 2.094 poll 6

*10 Apr 02:19:40 xntpd[245888]: synchronized to 172.16.32.34, stratum=3

*10 Apr 02:21:05 xntpd[245888]: synchronisation lost

*10 Apr 02:21:48 xntpd[245888]: synchronized to 172.16.56.97, stratum=1

*10 Apr 02:27:48 xntpd[245888]: synchronized to 172.16.32.34, stratum=2

10 Apr 02:35:18 xntpd[245888]: time reset (step) -2.580356 s

10 Apr 02:35:18 xntpd[245888]: synchronized to 172.16.56.97, stratum=3

10 Apr 02:35:18 xntpd[245888]: synchronisation lost

10 Apr 02:35:18 xntpd[245888]: system event 'event_clock_reset' (0x05)
status 'sync_alarm, sync_unspec, 15 events, event_peer/strat_chg' (0xc0f4)

10 Apr 02:35:18 xntpd[245888]: system event 'event_sync_chg' (0x03) status
'sync_alarm, sync_unspec, 15 events, event_clock_reset' (0xc0f5)

10 Apr 02:35:18 xntpd[245888]: system event 'event_peer/strat_chg' (0x04)
status 'sync_alarm, sync_unspec, 15 events, event_sync_chg' (0xc0f3)

10 Apr 02:35:50 xntpd[245888]: peer 172.16.56.97 event 'event_reach' (0x84)
status 'reach, conf, 15 events, event_reach' (0x90f4)

10 Apr 02:36:22 xntpd[245888]: peer 172.16.32.34 event 'event_reach' (0x84)
status 'reach, conf, 15 events, event_reach' (0x90f4)

10 Apr 02:40:06 xntpd[245888]: synchronized to 172.16.56.97, stratum=3

10 Apr 02:40:09 xntpd[245888]: time reset (step) 2.569861 s

10 Apr 02:40:09 xntpd[245888]: synchronisation lost

10 Apr 02:40:09 xntpd[245888]: system event 'event_clock_reset' (0x05)
status 'sync_alarm, sync_unspec, 15 events, event_peer/strat_chg' (0xc0f4)

10 Apr 02:40:41 xntpd[245888]: peer 172.16.32.34 event 'event_reach' (0x84)
status 'reach, conf, 15 events, event_reach' (0x90f4)

10 Apr 02:41:13 xntpd[245888]: peer 172.16.56.97 event 'event_reach' (0x84)
status 'reach, conf, 15 events, event_reach' (0x90f4)

10 Apr 02:44:57 xntpd[245888]: system event 'event_peer/strat_chg' (0x04)
status 'sync_alarm, sync_ntp, 15 events, event_clock_reset' (0xc6f5)

10 Apr 02:44:57 xntpd[245888]: synchronized to 172.16.32.34, stratum=2

10 Apr 02:44:54 xntpd[245888]: time reset (step) -2.877219 s

10 Apr 02:44:54 xntpd[245888]: synchronisation lost

10 Apr 02:44:54 xntpd[245888]: system event 'event_clock_reset' (0x05)
status 'sync_alarm, sync_unspec, 15 events, event_peer/strat_chg' (0xc0f4)

10 Apr 02:45:26 xntpd[245888]: peer 172.16.56.97 event 'event_reach' (0x84)
status 'reach, conf, 15 events, event_reach' (0x90f4)

10 Apr 02:45:58 xntpd[245888]: peer 172.16.32.34 event 'event_reach' (0x84)
status 'reach, conf, 15 events, event_reach' (0x90f4)

10 Apr 02:49:42 xntpd[245888]: system event 'event_peer/strat_chg' (0x04)
status 'sync_alarm, sync_ntp, 15 events, event_clock_reset' (0xc6f5)

10 Apr 02:49:42 xntpd[245888]: synchronized to 172.16.56.97, stratum=3

10 Apr 02:49:45 xntpd[245888]: time reset (step) 2.859654 s

10 Apr 02:49:45 xntpd[245888]: synchronisation lost

10 Apr 02:49:45 xntpd[245888]: system event 'event_clock_reset' (0x05)
status 'sync_alarm, sync_unspec, 15 events, event_peer/strat_chg' (0xc0f4)

10 Apr 02:50:17 xntpd[245888]: peer 172.16.32.34 event 'event_reach' (0x84)
status 'reach, conf, 15 events, event_reach' (0x90f4)

10 Apr 02:50:49 xntpd[245888]: peer 172.16.56.97 event 'event_reach' (0x84)
status 'reach, conf, 15 events, event_reach' (0x90f4)

10 Apr 02:54:33 xntpd[245888]: system event 'event_peer/strat_chg' (0x04)
status 'sync_alarm, sync_ntp, 15 events, event_clock_reset' (0xc6f5)

10 Apr 02:54:33 xntpd[245888]: synchronized to 172.16.32.34, stratum=2

10 Apr 02:54:29 xntpd[245888]: time reset (step) -3.167034 s

10 Apr 02:54:29 xntpd[245888]: synchronisation lost

10 Apr 02:54:29 xntpd[245888]: system event 'event_clock_reset' (0x05)
status 'sync_alarm, sync_unspec, 15 events, event_peer/strat_chg' (0xc0f4)

<snip...  hours of the same... with the time reset step growing to 48 and
-48 seconds>

Thanks!


More information about the questions mailing list