[ntp:questions] Re: tinker step 0 (always slew) and kernel time discipline

Joseph Harvell jharvell at dogpad.net
Fri Sep 22 02:46:14 UTC 2006


Richard B. Gilbert wrote:

> How about designing your NTP subnet in such a way as to prevent these failures in the first place?
> Since you say, elsewhere, that you are more concerned that time be strictly monotonically increasing than that it be accurate, perhaps you don't need NTP at all; set your local clock from your wrist watch once a week while the application is not running.
> Your original problem, IIRC, resulted from an extremely poor design of your NTP subnet; two servers each serving its unsynchronized local clock and drifting apart.
> If you really do need NTP the easiest configuration is for your client to use from four to seven servers.  Those servers should be stratum 2 internet servers (rules of engagement prohibit use of public stratum 1 servers unless you are serving 100 or more clients).  This requires that you study the list of public stratum two servers at http://ntp.isc.org/bin/view/Servers/StratumTwoTimeServers to find four to seven servers within, say, 300 miles of your site and add these servers to your ntp.conf file.  It also requires a connection to the internet that allows port 123 in both directions.  If you specify the numeric IP address of each server, you need not open any other port in the firewall.  If you wish to use domain names, then you will have to open the port(s) necessary to allow DNS to work (don't know which ones offhand).
> The simplest configuration is to make the machine running the application a stratum 1 server by installing ntpd and a GPS timing receiver as a hardware reference clock.  The weakness of this configuration is that the GPS receiver becomes a single point of failure; if it dies, you rapidly lose any claim to accuracy.  Since you don't insist on accuracy, perhaps this would not be a problem.  Actually, ntpd would continue to discipline the clock using the last known frequency correction, so you would have several hours of "holdover" before your clock drifted significantly (assuming a controlled temperature in your data center).
> You can increase the reliability by using four GPS timing receivers to synchronize four NTP servers and configuring your client to use those four servers.


I really appreciate the advice.  I think you are getting the wrong idea
about my approach to the problem because I don't seem concerned about
the glaring problems in my configuration.  The reason is that the
original problem manifested in a testbed for one of our products.  I am
concurrently tracking this down internally to determine whether the two
servers are actually synced to a stratum 1 clock (or whether they are
part of the same synchronization subnet at all).  And I plan to correct
the problem.

Also, I completely agree that we should configure 4+ peers for each NTP
client to avoid this failure scenario altogether.
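
As a rough sketch of what I mean, an ntp.conf along these lines would
give each client four sources to compare.  The host names are
placeholders made up for illustration, not real servers, and the
iburst option is just an assumption about what we would want:

```conf
# Hypothetical servers; replace with four real, independent sources.
server ntp1.example.com iburst
server ntp2.example.com iburst
server ntp3.example.com iburst
server ntp4.example.com iburst
```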

But keep in mind that it may not be practical for our customers to have
4+ NTP servers in their synchronization subnet.  And arguably, they
deserve what they get if they fail to follow our recommendation to have
more servers.

Nevertheless, I am still very interested in preventing step corrections
in these scenarios.  And I think this is a legitimate concern.  So I
would really appreciate it if you could also address the questions in my


Joe Harvell
