[ntp:hackers] Does ntpd need to whine more ?

Poul-Henning Kamp phk at phk.freebsd.dk
Mon Oct 3 17:33:27 UTC 2005


In message <43415774.5000808 at udel.edu>, "David L. Mills" writes:

>We are talking past each other. The issue of poll interval is related to 
>time constant; indeed, that is the fundamental scaling assumption.

Do you recall my proposal to change the shift register to take not
the median, but the newest measurement that survives the selection ?

That does wonders for the time constant.
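
A rough sketch of why picking the newest sample shortens the response, under loud assumptions: an 8-stage shift register, every sample surviving selection (the selection step is not modeled), and an offset that jumps from one constant value to another. The function names are hypothetical and this is not ntpd's actual filter code.

```python
from collections import deque
from statistics import median

def lag_polls(pick, stages=8, old=0.0, new=10.0):
    """Count polls until pick(register) reports the new offset.

    The register starts full of `old` samples; each poll shifts in
    one `new` sample. Returns None if the pick never reaches `new`.
    """
    reg = deque([old] * stages, maxlen=stages)
    for n in range(1, stages + 1):
        reg.append(new)          # oldest sample falls off the left
        if pick(reg) == new:
            return n
    return None

def pick_median(reg):
    # Median over all stages (assumed behaviour, per the message).
    return median(reg)

def pick_newest(reg):
    # Newest sample in the register (selection assumed to pass it).
    return reg[-1]

if __name__ == "__main__":
    print("median:", lag_polls(pick_median), "polls")
    print("newest:", lag_polls(pick_newest), "polls")
```

Under these assumptions the median needs 5 polls to report the new offset and the newest-sample pick needs 1. At a 1024 s poll interval that is 5120 s versus 1024 s.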

>The design intent is that the best recovery after an outage is using 
>iburst mode, so it takes no more than 16 s to refresh the clock filter 
>in that case, once the first response from the server arrives.

I may be seeing bugs in the implementation, but it looks to me like
the polling interval stays at 1024 even in case of iburst.

>It does not seem prudent to get more aggressive than this; otherwise
>something like the Wisconsin incident might happen again.

As long as we stay firmly at 64s or above Wisconsin should be safe.

>Your comment about rapid temperature excursions leading to steps is very 
>relevant. I don't see that here in room temperature controlled
>environments, but laptops could be another story.

It depends wildly on the kind of hardware, with some correlation to
price, but also a lot on the A/C at the installation.

As I see it, the NTP clock is tuned for the good boys and makes
a really rough ride for the regular boys because of it.  Most of
the hosts that poll my NTP server with a poll interval > 256 seconds
do not send me timestamps back that justify it.

>I don't understand your scenarios with cold rock behavior after a 
>lengthy outage. My experience here with typical systems is within 50 ms 
>after 36 h between ACTS updates and 10 ms after several hours of WWV 
>signal loss, and this with cold rock frequency compensation up to a 
>couple hundred PPM. I've run the ACTS and WWV drivers for several months 
>without ever stepping.

Thanks to refid="STEP" and refid="INIT" it is possible to monitor this
in the client population.  Steps correlate with poll interval which
to me indicates that the increase in poll interval was unwarranted.
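
One way such a correlation could be tallied, as a minimal sketch: it assumes the server-side observations have already been reduced to (poll interval in seconds, refid string) pairs. That input format, the function name, and any sample values fed to it are hypothetical, not the tooling actually used here.

```python
from collections import Counter

def step_rate_by_poll(observations, marker="STEP"):
    """Fraction of observations carrying `marker`, per poll interval.

    observations: iterable of (poll_seconds, refid) pairs
                  (hypothetical format).
    Returns a dict {poll_seconds: fraction}, ordered by poll interval.
    """
    total = Counter()
    marked = Counter()
    for poll_s, refid in observations:
        total[poll_s] += 1
        if refid == marker:
            marked[poll_s] += 1
    return {p: marked[p] / total[p] for p in sorted(total)}
```

A fraction that rises with the poll interval would be consistent with the claim above; the same tally with marker="INIT" covers the other refid.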

>As for outage notification, the daemon does log reachability events now. 

Yes, although the message could be improved.

>It could be that it should do this every hour or something like that; I 
>have no problem with that.

I think we should do that as a first step.  When I contact servers in
my client population and tell them that their NTPD isn't doing anything
the response is invariably: "Why didn't it tell me?!!"
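
The "every hour" idea could look roughly like the following sketch. The function name, its arguments, and the 3600 s default are assumptions for illustration; nothing here is taken from ntpd's source.

```python
def should_log_unreachable(now, unreachable_since, last_logged,
                           interval=3600):
    """Decide whether to emit an 'unreachable' log line.

    now, unreachable_since, last_logged: times in seconds;
    unreachable_since is None while the server is reachable,
    last_logged is None if nothing was logged for this outage.
    """
    if unreachable_since is None:
        return False                      # reachable: nothing to whine about
    if last_logged is None:
        return True                       # first message of this outage
    return now - last_logged >= interval  # repeat once per interval
```

The caller would reset last_logged to None when reachability returns, so that the next outage is reported immediately.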

>The NIST folks use the filegen facility and 
>call the helpdesk beeper if something goes wrong in any of their servers.

As soon as you get into that level of operational service, custom
methods are called for.


-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

