[ntp:hackers] Does ntpd need to whine more ?

David L. Mills mills at udel.edu
Tue Oct 4 03:06:31 UTC 2005


I'm not surprised; I have seen the same thing. The obvious remedy is to 
clamp maxpoll to something lower. See rackety.udel.edu, which has the 
same temperature wiggle, but keeps to low microseconds. The data for 
rackety and its brother pogo in the same machine room are dominated by 
small diurnal variations, but curiously show no artifact due to A/C cycling.

I'm not sure where we are going here. I have a bias toward increasing the 
poll interval, even if it does result in a few tens of milliseconds of 
residual. The behavior is easy to change by revising the hysteresis 
counter limits and the jitter gate in the clock discipline algorithm. There 
is a compromise between rapid adaptation and network overhead. I 
consistently argue for keeping the latter low, on the expectation that 
smart guys like you know how to optimize your particular scenario. If you 
can figure out which combination is optimum for you, I will put in a tinker 
for it.
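
To put rough numbers on that compromise, here is a minimal sketch, not taken from ntpd itself. It assumes a constant, uncorrected frequency error and ignores the discipline loop entirely, so it only gives a first-order feel for what each poll exponent costs in packets and in drift accumulated between polls. The function name and the 1 ppm figure are illustrative assumptions.

```python
# First-order illustration only: assumes a constant frequency error that is
# left uncorrected between polls. The real clock discipline loop is ignored.

def poll_tradeoff(freq_error_ppm, poll_exponents=range(6, 11)):
    """Return (exponent, interval_s, polls_per_day, drift_ms) for each exponent."""
    rows = []
    for exp in poll_exponents:
        interval_s = 2 ** exp                      # poll interval is 2^exp seconds
        polls_per_day = 86400 / interval_s         # network overhead per server
        drift_ms = freq_error_ppm * 1e-6 * interval_s * 1e3
        rows.append((exp, interval_s, polls_per_day, drift_ms))
    return rows


if __name__ == "__main__":
    # 1 ppm is an assumed residual frequency error, chosen for illustration.
    for exp, interval_s, polls, drift_ms in poll_tradeoff(1.0):
        print(f"poll exponent {exp:2d}: {interval_s:5d} s, "
              f"{polls:7.1f} polls/day, {drift_ms:.3f} ms drift per interval")
```

Going from 64 s to 1024 s cuts the packet count by a factor of 16 and raises the drift accumulated between polls by the same factor, which is the trade described above.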


Mark Martinec wrote:

> From Dave:
>>> Your comment about rapid temperature excursions leading to steps is very
>>> relevant. I don't see that here in room temperature controlled
>>> environments, but laptops could be another story.
> From Poul-Henning Kamp:
>> It depends wildly on the kind of hardware, with some correlation to
>> price, but also a lot on the A/C at the installation.
>> As I see it, the NTP clock is tuned for the good boys and makes
>> a really rough ride for the regular boys because of it. Most of
>> the hosts that poll my NTP server with a > 256 second poll rate
>> do not send me timestamps back that justify it.
> The A/C in a computer room with its 2 degC temperature fluctuation
> can wreak havoc in the ms range. As Dave is asking for specific
> proof, I have just the right example for the case.
> See Fig.5 in my document: http://www.ijs.si/time/
> The accompanying text says:
> Example: The following diagram shows the behaviour of two similar NTP V4
> stratum-2 servers located close together in the same air-conditioned room.
> Their reference stratum-1 servers are accessed over WAN (about 50 ms
> round-trip delay), resulting in a typical polling interval of 1024 seconds
> and consequently large PLL time constant. The quartz oscillator of host P
> has a relatively large temperature dependency: judging from the diagram the
> temperature coefficient is about +1.2 ppm / K. The other host K has almost
> 20 times lower temperature coefficient (perhaps only apparently due to a
> better chassis design) and is shown as a reference.
> Despite air-conditioning the temperature changes near the computer cabinets
> show daily peak-to-peak range of almost three degrees centigrade, the
> temperature excursion on day 3 at noon (time = 3.5) was due to other
> reasons.
> Fortunately the temperature changes are gradual and quite smooth -- no
> direct airflow from the air-conditioning equipment was hitting the computer
> cabinets.
> [...]
> The example clearly demonstrates that from +/- 5 ms up to 30 ms of time
> offset error in this example is a direct consequence of relatively large
> temperature coefficient of a quartz crystal coupled with temperature
> fluctuations in the computer room, augmented by a relatively large PLL time
> constant, typical for a WAN-synchronized stratum-2 server.
> Mark
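
As a rough cross-check of the figures quoted above, the sketch below multiplies the quoted temperature coefficient by the quoted temperature swing and asks how much offset that frequency shift would build up if it went uncorrected. It assumes the full swing is applied as a constant frequency error, which overstates the effect of a gradual daily change, and it rounds "almost three degrees" up to 3 K. The names are illustrative, not from any NTP code.

```python
# Back-of-envelope check using numbers from the quoted text. Assumes the whole
# temperature swing acts as a constant, uncorrected frequency error.

TEMP_COEFF_PPM_PER_K = 1.2   # host P, from the quoted text
TEMP_SWING_K = 3.0           # "almost three degrees", rounded up (assumption)
POLL_INTERVAL_S = 1024       # typical polling interval in the quoted text


def freq_shift_ppm(coeff_ppm_per_k, delta_t_k):
    """Frequency shift caused by a temperature change."""
    return coeff_ppm_per_k * delta_t_k


def offset_ms(freq_ppm, seconds):
    """Time offset accumulated over `seconds` at a constant frequency error."""
    return freq_ppm * 1e-6 * seconds * 1e3


shift = freq_shift_ppm(TEMP_COEFF_PPM_PER_K, TEMP_SWING_K)
per_poll = offset_ms(shift, POLL_INTERVAL_S)
per_hour = offset_ms(shift, 3600)

if __name__ == "__main__":
    print(f"frequency shift: {shift:.1f} ppm")
    print(f"offset per {POLL_INTERVAL_S} s poll interval: {per_poll:.2f} ms")
    print(f"offset per hour: {per_hour:.2f} ms")
```

Under these assumptions the shift is about 3.6 ppm, or a few milliseconds per poll interval, which is at least the same order of magnitude as the offsets reported for host P.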
