[ntp:questions] Re: Sufficient # servers to sync to

Richard B. Gilbert rgilbert88 at comcast.net
Thu Mar 17 16:30:49 UTC 2005


John Sasso wrote:

>Richard,
>
>Thanks for the response!  The data centers are roughly 20 mi apart, so
>latency won't be an issue.
>
>Question regarding 3 NTP servers and sanity-checking.  I can understand that
>if one out of the 3 failed such that it could no longer be contacted then
>the other two would pose a dilemma to the client, as you noted.  However,
>suppose the one failed in such a manner that it could still be contacted and
>give out time but it provided clearly erroneous time.  In this situation,
>wouldn't the other two provide a "sanity check" against the falseticker?  My
>logic here is that if this is the case, and since we will monitor the
>health/uptime of the Stratum-1's (since we own them), then 3 would be
>sufficient since if one of the Stratum-1's fails then its outage time would
>be 1-3 days (depending on how fast we can get a replacement shipped to us).
>The remaining two would still provide time.
>
>Thanks!
>
>--john
>
>"Richard B. Gilbert" <> wrote in message
>news:88SdnTK4KtYVAaTfRVn-qg at comcast.com...
>
>>John Sasso wrote:
>>
>>>I am working on a design for the NTP infrastructure for our company.  We
>>>purchased 6 Stratum-1, GPS-sync'd NTP servers, three for each of our two
>>>data centers located at remote sites.  We have a number of subnets at each
>>>of our secured sites, each secured by a firewall.
>>>
>>>According to
>>>http://ntp.isc.org/bin/view/Support/SelectingOffsiteNTPServers#Section_5.3.3.
>>>it suggests NTP clients should sync to a minimum of 4 NTP servers.
>>>Specifically, it states:
>>>
>>>"While the general rule is for 2n+1 to protect against "n" falsetickers,
>>>this actually isn't true for the case where n=1. It actually takes 2 servers
>>>to produce a "candidate" time, which is really an interval. The winner is
>>>the shortest interval for which more than half (counting the two that define
>>>the interval) have an offset (+/- the dispersion) that lies on the interval
>>>and that contains the point of greatest overlap."
>>>
>>>In the past, I've had NTP clients sync to up to 3 [out of 4] Stratum-2 NTP
>>>servers.  The 4 NTP servers each sync'd to 4 off-site Stratum-1 NTP servers,
>>>as well as off one-another for additional sanity checking.
>>>
>>>For the design, is it overkill for me to require NTP clients to sync to 4
>>>NTP servers?  How about just 3?  The NTP clients consist of Cisco routers
>>>and firewalls, Windows, Sun, and Linux systems.  Part of the environment
>>>uses Windows AD w/ Kerberos as well as SSL, which I think require accurate
>>>time.
>>>
>>>--john
>>>
>>Many people would be satisfied with one "good" server.  If the
>>consequences of that one "good" server being wrong someday are
>>sufficiently serious to justify the expense, then four servers is the
>>way to go.   Those four servers don't all have to be on-site and running
>>GPS reference clocks, but you do need four.  The problem with three is
>>that if one fails you have two left and no way to determine which, if
>>either, is correct when they disagree.
>>
>>If your two data centers are not unreasonably far apart it might make
>>sense to have each serve as a backup to the other.   Everybody
>>configures six servers.   In each data center, one of the local servers
>>will probably be selected but five others are available as a sanity
>>check and "advisory committee".  For sites more than two or three
>>hundred miles apart, the network delays may add enough uncertainty to
>>make this choice undesirable.
>
Take an extreme case:

Server A says it's 11:53
Server B says it's 11:55
Server C says it's 23:52

Server C has clearly lost its mind, but which of the two remaining 
servers do you believe?  Flip a coin and you have something like a 50% 
probability that you are off by as much as two minutes.

Add Server D which thinks it's 11:52 and the algorithm will pick server 
A.  You may still not have the exact time but you can have a little 
confidence that you've got the best available guess.

In the real world the differences will probably be in tens or hundreds 
of milliseconds (using network servers) rather than minutes but the 
example illustrates the principles involved.
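
The example above can be sketched in a few lines of Python.  This is only
a toy illustration of the principle, not the intersection and clustering
algorithm that ntpd actually runs.  The function name pick_server, the
5-minute tolerance, and the "take the middle survivor" rule are all
assumptions made for the illustration.  Times are minutes since midnight.

```python
from statistics import median_low, median_high


def pick_server(readings, tolerance):
    """Toy majority vote over clock readings (NOT ntpd's real algorithm).

    readings:  dict mapping a server name to its time in minutes.
    tolerance: largest spread (minutes) within which servers "agree".
    Returns (survivors, choice); choice is None when no single server
    can be singled out.
    """
    names = sorted(readings, key=readings.get)

    # Find the largest group of sorted readings that fits in `tolerance`.
    best = []
    for i, first in enumerate(names):
        group = [n for n in names[i:]
                 if readings[n] - readings[first] <= tolerance]
        if len(group) > len(best):
            best = group

    # The agreeing group must be a strict majority of all servers.
    if 2 * len(best) <= len(names):
        return best, None

    # Pick the middle survivor.  An even number of survivors that
    # disagree has no unique middle, so the choice stays open.
    values = [readings[n] for n in best]
    low, high = median_low(values), median_high(values)
    if low != high:
        return best, None
    return best, next(n for n in best if readings[n] == low)


three = {"A": 11 * 60 + 53, "B": 11 * 60 + 55, "C": 23 * 60 + 52}
four = dict(three, D=11 * 60 + 52)

print(pick_server(three, 5))  # (['A', 'B'], None)
print(pick_server(four, 5))   # (['D', 'A', 'B'], 'A')
```

With three servers, C is voted out but nothing separates A from B.  Adding
D gives three survivors, and A sits in the middle.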



