[ntp:questions] Query about NTP accuracy

Terje Mathisen "terje.mathisen at tmsw.no" at ntp.org
Sat May 23 20:02:37 UTC 2009

Andy Yates wrote:
> Unruh wrote:
>> Andy Yates <andyy1234 at gmail.com> writes:
>>> Hal Murray wrote:
>>>> In article <4a15e001$0$18238$da0feed9 at news.zen.co.uk>,
>>>>  Andy Yates <andyy1234 at gmail.com> writes:
>>>>> Does anybody have any figures that shows the effect on accuracy of an
>>>>> NTP v3 client using a stratum 1 server rather than a stratum 2 or 3
>>>>> server? It's all in a GE LAN based scenario, commercial stratum 1
>>>>> servers connected to GPS and stratum 2 and 3 servers are typically
>>>>> dedicated Linux boxes.
>>>>> However I'm being pressed to supply an SLA for accuracy. My argument is
>>>>> that although you can get your stratum one server to synchronize to
>>>>> microseconds of UTC, as soon as the client uses NTP v3 over the LAN,
>>>>> even a GE LAN, then the accuracy degrades and putting well designed well
>>>>> specified stratum between the boxes is not going to decrease accuracy
>>>>> sufficiently to warrant purchasing many stratum one appliances.
>>>> What sort of accuracy are you interested in?  1 ms?  10 ms?  100 ms?
>>> Hi Hal
>>> It's up to us to specify what we think the SLA should be - the guide is
>>> "as accurate as possible"!
>> That is of course completely idiotic. They do NOT want it as accurate as
>> possible. That would cost them millions of dollars and is not in fact
>> needed. What are they using the time for? what kind of machines are they
>> ( Windows, linux, BSD, special home grown OS?)
>>>> How stable is your temperature?  (Both the room and the CPU load.)
>> ntp is terrible if anything varies ( absurdly long settling times).
>>> Temperature will be very stable, the DC is very well specified and
>>> scrupulously engineered - no cables blocking air flow etc. Generally
>>> speaking the CPU is over specified.
>>>> What is the load on the LAN between the clients and servers?
>>>> (Delay is not a problem.  Variation in delay is a problem.)
>>> The NTP will be on a separate management LAN to the production traffic
>>> so not subject to the variances that application load has on the network.
>>>> I suggest you measure it.  Start with your current system.
>>>> Setup a box as a ntpd system and tell it to use several target boxes
>>>> as servers and turn on logging.  peerstats will tell you the difference
>>>> between your local clock and the target system.
>>> I'll look at the current NTP infrastructure however it's completely
>>> different to the new. The old has 3 geographically diverse GPS receivers
>>> plus a GPS and radio source on the roof of the data centre and we use
>>> network components to provide the intermediate strata between stratum
>>> one and client - and have not really had many issues after almost 10
>>> years of use. However, requirements are changing and we will probably be
>>> using dedicated stratum 2/3 servers as required.
>> Does the GPS have PPS (pulse per second) or are they just using the NMEA
>> time signal? Radio is about as bad as GPS with just NMEA for timing.
> Hi Unruh
> it uses IRIG-B but can use PPS amongst others - However I should have
> said "as accurate as possible using a well designed NTP system". Our
> stratum zero sources and stratum 1 sources will be very accurate - it's the
> distribution via NTP that we're trying to get our heads around.

Andy, you probably know that several dedicated NTP servers have puny 
CPUs, so they can't handle too many clients?

Any PC manufactured within the last 15 years, i.e. Pentium and up, is 
capable of handling at least 100K clients, so the way I set up the ntp 
service at Norway's largest corporation was to have three pairs of 
FreeBSD boxes, in three widely separated locations.

Each pair has one Oncore UT+ timing receiver (capable of 15-35 ns 
accuracy) connected to one of the servers, but already configured into 
the other server as well, so if we have a server crash, the GPS serial 
cable can be moved to the alternate box.

I also have 3-4 NTP appliance servers, as well as a couple of home-made 
units (one TAPR-based Oncore eval board and one Garmin 18 which uses USB 
power). Finally there's an Endrun CDMA receiver in the US.

These "extra" resources are used as backup servers for the six primary 
boxes, along with a number of ntp pool servers.

The key idea is that all six official servers are more than strong 
enough to handle the maximum possible load, and with so many independent 
clock sources, and 6-way redundancy for every single client machine, the 
likelihood of serving bogus time is sufficiently small, even though the 
total cost of the setup was very reasonable.
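
To illustrate why the redundancy helps, here is a rough Python sketch of
how one server that has "failed wrong" gets outvoted by the others. This
is my own simplification (a plain median with a fixed tolerance), not
what ntpd actually does; its real selection algorithm is more elaborate.
The server names, offsets and the 10 ms tolerance are all made up.

```python
from statistics import median


def flag_outliers(offsets_ms, tolerance_ms=10.0):
    """Split servers into (agreeing, suspect) by distance from the median offset.

    offsets_ms maps a server name to the offset (in ms) a client measured
    against it. tolerance_ms is an arbitrary illustrative threshold.
    """
    mid = median(offsets_ms.values())
    agreeing = {name for name, off in offsets_ms.items()
                if abs(off - mid) <= tolerance_ms}
    suspect = set(offsets_ms) - agreeing
    return agreeing, suspect


# Hypothetical offsets seen by one client; ntp4 is an hour off.
offsets = {
    "ntp1": 0.12, "ntp2": -0.08, "ntp3": 0.05,
    "ntp4": 3600000.0, "ntp5": 0.20, "ntp6": -0.15,
}
agreeing, suspect = flag_outliers(offsets)
print(sorted(suspect))  # prints ['ntp4']
```

With six sources the median stays with the five good servers, so the bad
one stands out. With a single source there is nothing to compare against.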


PS. Previous to this setup, all three NTP appliance servers failed at 
least once, and failed wrong, i.e. they kept responding but with the 
wrong time! Each time this happened, we found one or two critical 
servers that had been rebooted, then used ntpdate to pick the bogus 
server to step the time, and then refused to listen to the two others 
which at that point were "clearly wrong" :-(

Today we never use ntpdate of course, just regular ntp allowing a single 
(large) startup step.
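
One way to spot a server that keeps responding with the wrong time is to
watch the per-peer offsets in the peerstats log that Hal mentions above.
A minimal Python sketch follows. It assumes the usual ntpd 4.x peerstats
layout (MJD, seconds past midnight, peer address, status, offset, delay,
dispersion, jitter, with offset in seconds); check your own files before
relying on that. The sample lines, addresses and the 1 ms threshold are
invented for illustration.

```python
from collections import defaultdict


def worst_offsets(lines):
    """Return {peer_address: largest absolute offset in seconds}."""
    worst = defaultdict(float)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        peer = fields[2]            # third field: peer address (assumed layout)
        offset = float(fields[4])   # fifth field: clock offset in seconds
        worst[peer] = max(worst[peer], abs(offset))
    return dict(worst)


# Hypothetical peerstats lines, not real measurements.
sample = [
    "54974 72000.000 192.0.2.1 9614 0.000021 0.000310 0.000950 0.000015",
    "54974 72064.000 192.0.2.1 9614 -0.000034 0.000305 0.000940 0.000018",
    "54974 72000.000 192.0.2.2 9414 0.412000 0.000420 0.001200 0.000300",
]

THRESHOLD_S = 0.001  # illustrative alarm level, pick your own
worst = worst_offsets(sample)
bad_peers = sorted(p for p, off in worst.items() if off > THRESHOLD_S)
print(bad_peers)  # prints ['192.0.2.2']
```

In practice you would feed it the lines of the real peerstats file
instead of the sample list.
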
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
