[ntp:questions] 500ppm - is it too small?

Unruh unruh-spam at physics.ubc.ca
Sat Aug 15 14:52:18 UTC 2009


David Woolley <david at ex.djwhome.demon.co.uk.invalid> writes:

>nemo_outis wrote:

>> ...
>>> Instead I see what looks like a religion, where questions are treated as 
>> apostasy or treason.

>Sycophancy is common on internet forums.

>In this particular case, my view is that any clock that is out by more 
>than 500ppm has something fundamentally wrong with it that should be 
>addressed before trying to run time discipline software on it.  If the 
>actual hardware clock is out by this much, there is a good chance that 
>it is not being effectively disciplined by its crystal and will be very 
>unstable.  If the machine is losing clock interrupts, that needs to be 
>fixed, avoided (e.g. by using a sustainable clock interrupt frequency), 
>or compensated for by OS specific code.  If the clock frequency 
>calibration is unreliable, it needs to be fixed (e.g. remember from 
>startup to startup, use a longer baseline for the calibration, or ensure 
>that the calibration is done on a quiet system).

I certainly agree that if the rate is varying by 500PPM (as it is if
the clock is losing interrupts or is undisciplined by the crystal),
almost nothing can properly discipline it. If it is an operating system
thing, then there is little most people can do, since first
understanding and then hacking the kernel is not something most either
can or should do. As long as that calibration is off by a stable amount,
however, ntp should be able to compensate for it.
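To illustrate why a stable error is harmless, here is a minimal sketch (not
ntpd's actual loop; the drift value and gain are made up for illustration) of
an integral frequency correction absorbing a constant calibration error:

```python
# A constant frequency error can be folded into a frequency-correction
# term, much as ntpd accumulates a drift compensation. Illustrative only.

TRUE_DRIFT_PPM = 300.0   # hypothetical stable calibration error
GAIN = 0.3               # illustrative loop gain

def residual_after(steps):
    """Residual rate error (PPM) after `steps` poll intervals."""
    correction = 0.0
    for _ in range(steps):
        residual = TRUE_DRIFT_PPM - correction  # what the loop would measure
        correction += GAIN * residual           # fold it into the drift term
    return TRUE_DRIFT_PPM - correction

print(residual_after(1))    # 210.0 PPM still uncorrected after one step
print(residual_after(50))   # essentially zero: the stable error is absorbed
```

The point is only that the residual shrinks geometrically when the error is
constant; a drift that *varies* by hundreds of PPM defeats any such loop.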



>As Unruh says, some of the things that need doing to ntpd to improve its 
>real world performance are so radical, that the result would not be a 
>conforming implementation of NTP.  One of the key issues is that NTP 
>clients can also be NTP servers, so behaviour differences have network 
>wide implications.

>When using conforming NTP algorithms, I believe that certain parameters 
>have been set on the basis that nowhere in the trail back to stratum 0 
>will you find a machine that is slewing at faster than 500ppm 
>(presumably plus an allowance for a reasonable static clock error).  One 

Actually there is no such allowance. The 500PPM limit is a hard limit,
so that if the clock's drift rate is 490PPM, ntp has only 10PPM left to
use in slewing the clock (well, +10PPM or -990PPM) as I read the code.
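The arithmetic behind that reading, assuming the frequency word is clamped at
an absolute ±500PPM rather than relative to the measured drift:

```python
# Sketch of the slew budget left once a stable drift eats into the clamp.
# Assumes an absolute +/-500 PPM frequency limit, as in the reference code.

MAX_FREQ_PPM = 500.0

def slew_budget(drift_ppm):
    """Return (slowest, fastest) usable slew rates in PPM."""
    fastest = MAX_FREQ_PPM - drift_ppm    # headroom to run the clock faster
    slowest = -MAX_FREQ_PPM - drift_ppm   # headroom to run the clock slower
    return slowest, fastest

print(slew_budget(490.0))   # (-990.0, 10.0): only 10 PPM to gain time
```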


>could probably keep the standard NTP algorithms and parameters and 
>permit a faster slew at start up, but one would need to refuse to act as 
>a peer or server until confidence had been built that one had a good 
>estimate of the static error.  From then on one would have to report the 
>frequency offset from that value, not from the boot time value provided 
>by the kernel, i.e one would do ones own once per session frequency 
>calibration.

Unfortunately, one of the downsides of the slew rate limit is that it
takes the clock a very long time to settle down if it has, for one
reason or another, suffered a timing or a rate glitch. At the same time,
NTP DOES allow steps (infinite slew rates). I have real trouble
reconciling the limit on the slew rate with allowing stepping. And the
"trail back" requirement is strange given that the clocks can be "way
out" (hundreds of milliseconds) for a long time because of the ntp
algorithm and still be compliant. Surely being way out on the time is
far worse than slewing rapidly. It is, after all, the time that they are
supposed to deliver (as recognized by the willingness to step the time).
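To put rough numbers on the settling time (taking ntpd's default 128 ms step
threshold as the largest offset that would be slewed rather than stepped):

```python
# Back-of-the-envelope: seconds needed to slew out an offset at a given
# PPM budget. 128 ms is ntpd's default step threshold.

def slew_seconds(offset_ms, budget_ppm):
    """Seconds to remove offset_ms at budget_ppm (1 PPM = 1 us/s)."""
    return (offset_ms / 1000.0) / (budget_ppm * 1e-6)

# Just under the step threshold, with the full 500 PPM available:
print(slew_seconds(128, 500))   # 256 seconds
# With only 10 PPM of headroom (drift already at 490 PPM):
print(slew_seconds(128, 10))    # 12800 seconds, over three and a half hours
```

This is only the linear-slew lower bound; the actual loop dynamics make the
settling slower still.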



>As I suspect many people with bad clocks only want leaf node client 
>operation, in spite of the contra-indication of having a local clock 
>configured, for many people having a leaf-node-only mode which removed 
>slew rate restrictions might be acceptable.  Technically, such 
>implementations are SNTP, rather than NTP, even if they retain some of 
>the NTP algorithms.




More information about the questions mailing list