[ntp:questions] How should an NTP server fail?

Miroslav Lichvar mlichvar at redhat.com
Thu Jun 10 08:07:50 UTC 2010


On Wed, Jun 09, 2010 at 11:48:00PM +0000, David L. Mills wrote:
> Seriously untrue. If the server is unreachable from the client, the
> tally code shown at the client is blank and can never be the system
> peer. To prove your case, show evidence the reach register is zero
> and the tally code is *,

David,

have you tried blocking the connection after receiving four replies
from the peer, as I wrote in the bug report? That seems to be the
easiest way to reproduce the problem.

It happens in other situations too. Here is an ntpd log where the
outgoing packets were blocked in the following pattern: 100 unblocked,
6 blocked, 4 unblocked, 2890 blocked. As you can see, for about 52
hours the peer was unreachable yet still marked as the system peer.

31 May 08:55:47 ntpd[31411]: 192.168.1.1 8011 81 mobilize assoc 20984
31 May 08:55:47 ntpd[31411]: 0.0.0.0 c016 06 restart
31 May 08:55:47 ntpd[31411]: 0.0.0.0 c012 02 freq_set kernel -16.405 PPM
31 May 08:55:48 ntpd[31411]: 192.168.1.1 8024 84 reachable
31 May 08:59:05 ntpd[31411]: 192.168.1.1 963a 8a sys_peer
31 May 08:59:05 ntpd[31411]: 0.0.0.0 c615 05 clock_sync
31 May 11:03:39 ntpd[31411]: 192.168.1.1 8643 83 unreachable
 2 Jun 15:31:22 ntpd[31411]: 192.168.1.1 8654 84 reachable
 2 Jun 15:31:22 ntpd[31411]: 0.0.0.0 0618 08 no_sys_peer
 2 Jun 15:41:17 ntpd[31411]: 192.168.1.1 966a 8a sys_peer
 2 Jun 15:54:25 ntpd[31411]: 0.0.0.0 0613 03 spike_detect -0.239374 s
 2 Jun 16:08:34 ntpd[31411]: 0.0.0.0 061c 0c clock_step -0.536514 s
 2 Jun 16:08:34 ntpd[31411]: 0.0.0.0 0615 05 clock_sync
 2 Jun 16:08:35 ntpd[31411]: 192.168.1.1 8074 84 reachable
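
The 52-hour figure can be checked from the "unreachable" and the
following "reachable" timestamps in the log above. A small Python
sketch, assuming the year is 2010 (the log lines carry no year; 2010 is
taken from the date of this message):

```python
from datetime import datetime

# Timestamps copied from the "unreachable" line and the next "reachable" line.
# The year is an assumption; the log itself does not record it.
unreachable = datetime(2010, 5, 31, 11, 3, 39)
reachable = datetime(2010, 6, 2, 15, 31, 22)

gap = reachable - unreachable
hours = gap.total_seconds() / 3600
print(f"{hours:.1f} hours")  # prints "52.5 hours"
```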


The problem seems to occur when the reach register was not full at the
moment the server stopped responding, and clock_filter() did not call
clock_select(), which would have unselected the system peer.

When the source stops responding, the first few MAXDISPERSE samples
pushed from the transmit() timeout won't pass the epoch check in
clock_filter(); the function then returns early on the condition that
checks whether there are any acceptable samples.
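
To illustrate the "register not full" condition, here is a minimal
Python sketch of an 8-bit reach shift register. It is a simplified
model, not ntpd's code: step() and simulate() are hypothetical names,
and it assumes one poll per packet and that every unblocked poll is
answered.

```python
def step(reach: int, got_reply: bool) -> int:
    """Advance an 8-bit reach register by one poll interval."""
    reach = (reach << 1) & 0xFF  # shift left at each poll, dropping the oldest bit
    if got_reply:
        reach |= 1               # a reply sets the newest bit
    return reach


def simulate(pattern, reach=0):
    """Apply runs of (count, got_reply) polls and return the final register."""
    for count, got_reply in pattern:
        for _ in range(count):
            reach = step(reach, got_reply)
    return reach


# Runs from the blocking pattern, up to the moment the server goes silent.
before_outage = simulate([(100, True), (6, False), (4, True)])
print(f"{before_outage:08b}")  # prints "00001111"

# Eight further unanswered polls clear the register completely.
after_outage = simulate([(8, False)], reach=before_outage)
print(f"{after_outage:08b}")   # prints "00000000"
```

Under these assumptions, the register holds only four set bits when the
long outage begins, which is the same state as in the "four replies"
reproduction from the bug report.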

-- 
Miroslav Lichvar



