[ntp:questions] EXTERNAL: Re: Issue with peering and orphan mode

Danny Mayer mayer at ntp.org
Mon Oct 10 13:56:27 UTC 2011


On 10/10/2011 8:01 AM, Conner, Matthew wrote:
> Danny,
> 
> Thank you for the response.
> 
> We specify minpoll/maxpoll because of specific time requirements on
> our system. The system needs to stay within a 1 ms offset and must poll
> no less than every 32 seconds. The system is completely closed off with
> the exception of the TFDS servers which are basically GPS modules. The
> only servers that query the TFDSs are these "timehosts". Since peers are
> only used in case of a stratum 1 failure, I may try removing
> minpoll/maxpoll from the peers to see if it helps in resolving our issue.
> 

It's a common mistake to assume that using minpoll/maxpoll will improve
alignment of the clocks; it doesn't. Those options only control the
polling interval, and you want ntpd to poll less often as it settles down
to a stable state. Once it settles down you want the clock changes to be
more gradual and to react to longer-term changes rather than to
shorter-term spikes that can occur at any time.
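
For reference, the minpoll and maxpoll values are exponents of two, in
seconds, so the values in your config work out as in the short sketch
below. The helper name poll_interval_seconds is made up for illustration;
the defaults of 6 and 10 are ntpd's documented defaults.

```python
def poll_interval_seconds(exponent: int) -> int:
    """Convert an ntpd poll exponent to seconds (interval = 2**exponent)."""
    return 2 ** exponent


# Values from the posted ntp.conf, next to ntpd's default minpoll/maxpoll.
settings = [
    ("minpoll 4 (posted config)", 4),
    ("maxpoll 5 (posted config)", 5),
    ("minpoll 6 (ntpd default)", 6),
    ("maxpoll 10 (ntpd default)", 10),
]

for label, exponent in settings:
    print(f"{label}: {poll_interval_seconds(exponent)} s")
```

So "minpoll 4 maxpoll 5" pins ntpd between 16 and 32 seconds and never
lets it back off toward the longer intervals it would normally reach once
the clock is stable.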


> As for the prefer statement, it only comes into play when the Stratum 1
> servers are available. Would that really still affect Orphan mode?
> 

If the tfds1 server is the only stratum 1 server it will automatically
be selected as a result. Even if it is only a stratum 2 it will be
automatically selected.

> Any idea why this issue wasn't seen in 4.2.4p7?
> 

No, there have been a lot of changes in this area.


Danny

> Thanks,
> 
> Matt Conner
> 
> 
> -----Original Message-----
> From: Danny Mayer [mailto:mayer at ntp.org] 
> Sent: Sunday, October 09, 2011 4:44 PM
> To: Conner, Matthew
> Cc: questions at lists.ntp.org
> Subject: EXTERNAL: Re: [ntp:questions] Issue with peering and orphan mode
> 
> On 10/6/2011 3:11 PM, Conner, Matthew wrote:
>> We are experiencing an issue using Orphan mode and peering in our ntpd 4.2.6p4 set-up. With the loss of our stratum 1 time hosts, the stratum 2 are not properly choosing a primary time provider. Below is our ntp.conf for all 4 of the stratum 2 servers:
>>
>>                 tinker step .010 stepout 60 panic 0
>>                 server tfds1 prefer minpoll 4 maxpoll 5 burst iburst
>>                 server tfds2 minpoll 4 maxpoll 5 burst iburst
>>                 server tfds3 minpoll 4 maxpoll 5 burst iburst
>>
>>                 peer timehost1 minpoll 4 maxpoll 5 burst iburst
>>                 peer timehost2 minpoll 4 maxpoll 5 burst iburst
>>                 peer timehost3 minpoll 4 maxpoll 5 burst iburst
>>                 peer timehost4 minpoll 4 maxpoll 5 burst iburst
>>
>>                 tos orphan 4
>>
>>                 driftfile /etc/ntp/drift
>>
> 
> Why did you think it was a good idea to set minpoll to 4 and maxpoll to
> 5? Unless you have a very good understanding of how NTP operates you
> should never be specifying minpoll and maxpoll. It is rarely necessary
> and is usually detrimental to your NTP server. Also the prefer keyword
> is not beneficial to Orphan mode. The peers select from among each other.
> 
> Danny
> 
>> The stratum 2 (timehost[1-4]) attempt to peer with the loss of the stratum 1 (tfds[1-3]). However, instead of them all staying at stratum 4 as was seen when using ntpd 4.2.4p7 (have other issues with 4.2.4p7 and need to update), the peers are dropping down 1 stratum from the peer they are locking to. Since they are peering to one another, this results in the timehosts slowly dropping in stratum as they attempt to stay 1 stratum below the locked to host. They continue to drop in stratum until reaching a stratum 16. Once they hit stratum 16, all other hosts disconnect and the peers previously locking to the now stratum 16 host will unlock and jump back to a stratum 4. Once at least 1 peer jumps back to 4, the others will begin jumping to stratum 4-5. This process will repeat itself until the stratum 1 hosts are reconnected or the timehosts choose a primary. We have only once seen it stabilize with all 4 hosts and it took almost a full 24 hours to do so. With only 3 timehosts running, they will stabilize within minutes.
>>
>> From what we are able to tell, a primary peer is chosen when 3 of the 4 timehosts lock to the same peer. When the 4th peer sees that the others are all connected to it, it syncs to its internal clock and remains a stratum 4. Is this correct, or is something else going on here?
>>
>> Further questions:
>> Are the peers intentionally dropping below the orphan mode set stratum, or is that a bug?
>> Are we missing anything in ntp.conf to make orphan mode work properly?
>> Is this possibly just a limitation on the number of peers?
>> If working as intended, is there a way to force a primary peer quicker?
>>
>> Note: We have tested without burst/iburst on the peer declarations as well as the removal of the timehost declaration of the host itself. None of these modifications had an impact.
>>
>> Thanks,
>>
>>
>> Matt
>>
>> _______________________________________________
>> questions mailing list
>> questions at lists.ntp.org
>> http://lists.ntp.org/listinfo/questions
>>
> 
> 


