[ntp:questions] Re: ntpd, boot time, and hot plugging

David L. Mills mills at udel.edu
Thu Feb 3 19:50:06 UTC 2005


It is true that ntpd is specifically engineered for Internet badlands 
where popcorn spikes, evil masqueraders, misbehaving clocks and other 
vermin might poison your DNS cache. A client using the pool servers 
scheme really needs this ammunition.

The current version tries hard to strike a compromise between verifiable 
assertions and acquisition speed. However, with the twinkle I described 
earlier you can engineer any compromise you wish, including setting the 
clock on the first response received or, in principle, on the first two 
responses from at least two servers, and so on.

In very many simulation runs here I found it hard to get into a true 
lockup condition where the daemon did not recover from large initial time 
or frequency offsets, even with the -x option. However, there are some 
things the simulator can't pick up, like a large frequency offset 
pre-programmed in the kernel. The recommended repair procedure, should 
that somehow happen, is to run "ntptime -f 0" to kill the kernel 
frequency offset and then remove the ntp.drift file. Upon restart ntpd 
measures the 
intrinsic frequency offset over about fifteen minutes, sets the clock 
and resumes normal operation. I would think this a good way to determine 
if a motherboard is or is not acceptable. I've seen lots of motherboards 
and found most of them within 100 PPM and all of them within 500 PPM. 
Even if over 500 PPM the clock is still disciplined but the offset 
cannot be forced to zero.
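
To put those PPM figures in perspective, here is a minimal Python sketch 
(not part of ntpd) that converts a frequency offset to drift per day and 
flags a drift-file value sitting at the 500 PPM limit mentioned above. 
The function names and the 1 PPM margin are illustrative assumptions, and 
the drift file is assumed to hold the frequency offset in PPM as its 
first number.

```python
SECONDS_PER_DAY = 86400
MAX_FREQ_PPM = 500.0  # discipline limit cited in the post


def drift_seconds_per_day(ppm):
    """Time gained or lost per day by a clock that is ppm parts per million off."""
    return ppm * SECONDS_PER_DAY / 1e6


def drift_is_pinned(drift_text, margin_ppm=1.0):
    """Return True if the drift-file value is at or near the limit.

    drift_text is the content of a drift file; its first whitespace-separated
    token is assumed to be the frequency offset in PPM. margin_ppm is a
    hypothetical tolerance chosen for this sketch, not an ntpd parameter.
    """
    ppm = float(drift_text.split()[0])
    return abs(ppm) >= MAX_FREQ_PPM - margin_ppm


if __name__ == "__main__":
    for ppm in (100, 500):
        print(f"{ppm} PPM is about {drift_seconds_per_day(ppm):.2f} s/day")
    print(drift_is_pinned("500.000\n"))
    print(drift_is_pinned("-37.412\n"))
```

By this arithmetic an undisciplined clock that is 100 PPM off drifts 
about 8.64 seconds per day, and one that is 500 PPM off about 43.2 
seconds per day.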

So, best advice is to run ntpd with -g and "tos maxdist 16" in the 
configuration file. I assume the version with this command will soon 
appear as a snapshot. Note that the only thing this does is admit 
servers to the selection algorithm no matter what the synchronization 
distance is. Ordinarily the distance starts from 16 and reduces by half 
for each response received. Other than this criterion, the algorithms 
operate without change. You can do your own security analysis.
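
The effect of the threshold can be sketched with a deliberately 
simplified model taken from the description above: the distance starts 
at 16 and halves with each response, and a server is admitted once the 
distance falls below the threshold. This is not ntpd's actual distance 
computation. The function name and the 1.5 threshold used in the demo 
are illustrative assumptions.

```python
def rounds_until_acceptable(maxdist, initial_distance=16.0):
    """Count responses needed before the distance drops below maxdist.

    Simplified model only: the distance starts at initial_distance and is
    halved for each response received, as described in the post.
    """
    if maxdist <= 0:
        raise ValueError("maxdist must be positive")
    distance = initial_distance
    rounds = 0
    while distance >= maxdist:
        distance /= 2.0  # each response halves the distance
        rounds += 1
    return rounds


if __name__ == "__main__":
    # 1.5 is an assumed, illustrative threshold; 16 is the value from the post.
    for threshold in (1.5, 16.0):
        print(threshold, rounds_until_acceptable(threshold))
```

In this model a threshold of 16 admits a server on its very first 
response, while a threshold around 1.5 needs four responses, which is in 
line with the three/four rounds mentioned in the quoted message below.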


Tom Smith wrote:
> David L. Mills wrote:
>> Kenneth,
>> This is the single most persistent issue in the engineering design of 
>> NTP. There must be tradeoffs between security, robustness, accuracy 
>> and initial delay. In the current design compromise, a server is 
>> acceptable only after three/four rounds of messages and the ensemble 
>> time is acceptable with at least one of possibly several acceptable 
>> servers. With IBURST mode, this takes 6-8 seconds.
>> For better robustness use "tos minclock N", where at least N 
>> (default 1) servers must be acceptable to set the clock. Tonight I put 
>> in a "tos maxdist M", where M is the distance threshold below which 
>> the server is acceptable. Set "tos maxdist 16" and the first sample 
>> received from any server will set the clock lickety-split. Of course, 
>> essentially all the mitigation algorithms using multiple-sample 
>> redundancy and multiple-server diversity are systematically defeated. 
>> You might as well use SNTP.
> David,
> I know the subject has been workstations, but let's talk for a moment
> about this religion as it concerns servers - like the ones that run
> telephone companies, stock exchanges, and banks inside heavily
> defended firewalls. It's the same issue, it's just that the stakes
> are higher. The issue is how quickly can you get these
> systems back up at boot. 15-30 seconds is a long time to wait.
> Too long.
> We're not talking about one-shot sampling for maintaining the time,
> so comparisons to SNTP are not helpful. We're talking about speed of
> acquisition of an initial "good enough" time, keeping in mind that the
> perfect is often the enemy of the good.
> You might argue that if boot time is critical, just let the server come
> up with whatever random time it comes up with and let ntpd fix
> it up later. Give it a "-g" so it doesn't complain. A lot of folks
> have tried this in the past inadvertently (and continue to do so)
> by neglecting to put ntpdate into their boot sequence ahead of ntpd.
> I've fixed a lot of systems whose drift files were pinned
> at 500 ppm and whose systems ran perpetually fast or slow as
> a result. We've also spent a lot of money fruitlessly replacing
> motherboards on those systems. Turning a large initial offset over
> to ntpd is decidedly NOT a Good Idea.
> The reason why so many of your constituency keep bringing this
> subject up is that they know that ntpd needs a good (not perfect)
> estimate of the time before it starts and that critical systems
> can't wait for perfection to get that estimate.
> -Tom
> ________________________________________________________________________
> Tom Smith                       smith at alum.mit.edu,smith at cag.lkg.hp.com
> Hewlett-Packard Company                          Tel: +1 (603) 884-6329
> 110 Spit Brook Road ZKO1-3/H42                   FAX: +1 (603) 884-6484
> Nashua, New Hampshire 03062-2698, USA           Mobile: +1 978 397 3411
