[ntp:questions] ntpdate.c unsafe buffer write

Tom Smith smith at cag.zko.hp.com
Mon Feb 11 19:12:25 UTC 2008


"ntpdate -b" steps the clock. That's the function under discussion.
The one that's used nearly universally in boot sequences.

-Tom

David L. Mills wrote:
> Guys,
> 
> There seems to be some misinformation here.
> 
> Both ntpdate and ntpd -q set the offset with adjtime() and then exit. 
> After that, stock Unix adjtime() slews the clock at rate 500 PPM, which 
> indeed could take 256 s for an initial offset of 128 ms. A prudent 
> response would be to measure the initial offset and compute the time to 
> wait. The ntp-wait script waits for ntpd to enter state 4, which could 
> happen with an initial offset as high as 128 ms.
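
The wait-time arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration: it assumes a constant slew rate of 500 PPM as quoted, the function name slew_seconds is made up for this example, and actual adjtime() implementations may slew at other rates.

```python
def slew_seconds(offset_s, slew_ppm=500.0):
    """Seconds needed to slew away offset_s at a constant rate of slew_ppm.

    Assumption: the slew rate is constant for the whole correction.
    """
    if slew_ppm <= 0:
        raise ValueError("slew_ppm must be positive")
    # 1 PPM removes 1 microsecond of offset per second of elapsed time.
    return abs(offset_s) * 1e6 / slew_ppm


# Initial offset of 128 ms, as in the quoted example.
print(slew_seconds(0.128))
```

For the 128 ms case this comes out at about 256 s, matching the figure quoted above. A boot script could sleep for that long after measuring the initial offset.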
> 
> The ntpd time constant is purposely set somewhat large at 2000 s, which 
> results in a risetime of about 3000 s. This is a compromise for stable 
> acquisition for herky-jerky Internet paths and speed of convergence for 
> LANs. For typical Internet paths the Allan intercept is about 2000 s. 
> For fast LANs with nanosecond clock resolution, the Allan intercept can 
> be as low as 250 s, which is what the kernel PPS loop is designed for.
> 
> Both the daemon and kernel loops are engineered so that the time 
> constant is directly proportional to the poll interval and the risetime 
> scales directly. If the poll exponent is set to the minimum 4 (16 s) the 
> risetime is 500 s. While not admitted in public, the latest snapshot 
> can set the poll interval to 3 (8 s), so the risetime is 250 s. This 
> works just fine on a LAN, but I would never do this on an outside circuit.
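
The scaling described above can be written down as a small sketch. It uses only the two data points given in the paragraph (500 s at poll exponent 4, 250 s at poll exponent 3) and assumes strict proportionality between them. The function name risetime_seconds and its parameters are hypothetical, and nothing here is taken from the ntpd source.

```python
def risetime_seconds(poll_exponent, ref_exponent=4, ref_risetime_s=500.0):
    """Scale the quoted 500 s risetime at poll exponent 4 with the poll interval.

    Assumption: risetime is directly proportional to the poll interval.
    """
    poll_interval_s = 2 ** poll_exponent
    ref_interval_s = 2 ** ref_exponent
    return ref_risetime_s * poll_interval_s / ref_interval_s


for exponent in (3, 4):
    print(exponent, 2 ** exponent, risetime_seconds(exponent))
```

This reproduces the 250 s and 500 s figures for 8 s and 16 s poll intervals. Whether the same proportionality holds at other poll exponents is not something this sketch establishes.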
> 
> Dave
> 
> Unruh wrote:
>> Harlan Stenn <stenn at ntp.org> writes:
>>
>>
>>>>>> In article <47af716b$0$513$5a6aecb4 at news.aaisp.net.uk>, David 
>>>>>> Woolley <david at ex.djwhome.demon.co.uk.invalid> writes:
>>
>>
>>> David> Harlan Stenn wrote:
>>>
>>>>> Why would ntpd be exiting during a warm start?
>>
>>
>>> David> Because we are discussing using it with the -q option.  If you
>>> David> just use -g, it will take a lot longer to converge within a few
>>> David> milliseconds, as it will not slew at the maximum rate.  If you
>>> David> use -q, you need to force a step if you want fast convergence.
>>
>>
>>> I still maintain you are barking up the wrong tree.
>>
>>
>>> In terms of the behavior model of ntp, "state 4" is as good as it gets.
>>> You are in the right ballpark.
>>
>>
>> And as has been commented on numerous times, ntp in state 4 is very
>> slow to converge to the best possible time control. This was a
>> deliberate design decision, as I understand it, so that in steady state
>> the time is averaged over a large number of samples (not helped by the
>> fact that 85% of samples are thrown away), to reduce the statistical
>> error in the clock control. Note that at poll 7 the number of actual
>> samples averaged over in the time scale of the ntp feedback loop is
>> only about 3, so the statistical averaging, even with such a long time
>> constant, is not very good.
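
A rough back-of-envelope check of the "about 3" figure, as a hedged sketch. The paragraph above does not say which loop timescale is meant, so this tries both the 2000 s time constant and the 3000 s risetime quoted in Dave's message; that choice is an assumption, as is the function name samples_in_loop. The 0.15 retention fraction is simply the complement of the 85% discard figure.

```python
def samples_in_loop(timescale_s, poll_exponent, keep_fraction):
    """Estimate how many retained samples fall within one loop timescale."""
    poll_interval_s = 2 ** poll_exponent
    return timescale_s / poll_interval_s * keep_fraction


# Poll exponent 7 is a 128 s poll interval; 0.15 = 1 - 0.85 discarded.
for timescale_s in (2000.0, 3000.0):
    print(timescale_s, round(samples_in_loop(timescale_s, 7, 0.15), 1))
```

Under either assumed timescale the estimate lands at a handful of samples, roughly between 2 and 4, which is the order of magnitude claimed above.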
>>
>>
>>
>>> If you want something else, something you consider "better" than state 4,
>>> please make a case for this and lobby for it.
>>
>>
>> I think many people have lobbied for faster response. In the discussion
>> of the chrony/ntp comparison, chrony is much faster to correct errors,
>> and at least on a local network, better at disciplining the clock as
>> well (in part, I think, because on such a minimal round trip network the
>> frequency fluctuations dominate over the offset measurement errors,
>> i.e., the Allan intercept is much lower than the assumed 1500 sec. in
>> that kind of situation; also, the drift model on real systems is not
>> well modeled by 1/f noise). So I think the point is that using ntpdate,
>> one can rapidly bring the clock to within a few msec of the correct
>> time, rather than waiting for the feedback loop to finally eliminate
>> that last 128 msec of offset.
>>
>>
>>>>> For the case I'm describing the startup script sequence is to fire up
>>>>> 'ntpd -g' early.  If there are applications that need the system clock
>>>>> to be on-track stable (even if a wiggle is being dealt with), that's
>>>>> 'state 4', and running 'ntp-wait' before starting those services is,
>>>>> to the best of my knowledge, all that is required.
>>
>>
>>> David> State 4 means within 128ms and using the normal control loop,
>>> David> which has a time constant of around an hour.
>>
>>
>>> OK, and so what?
>>
>>
>>> Is State 4 insufficient for your needs, or are you just splitting hairs?
>>
>>
>>> David> For a cold start, it won't reach state 4 for a further 900
>>> David> seconds after first priming the clock filter.
>>
>>
>>>>> If the system has a good drift file, I disagree with you.
>>
>>
>>> David> The definition of cold start is that there is no drift file.
>>
>>
>>> OK, now I know what the definitions are.
>>
>>
>>> I don't recall offhand the expected time to hit state 4 without a drift
>>> file.
>>
>>
>>> 1) This should not be the ordinary case
>>> 2) How does this have any bearing on the ntpdate -b discussion?
>>
>>
>>>>> And what is the big deal with using different config files?  The
>>>>> config file mechanism has "include" capability so it is trivial to
>>>>> maintain a common 'base' configuration with customizations for
>>>>> separate start/run phases.
>>
>>
>>> David> You are now talking about using -q.  The difficulty is that
>>> David> people have enough trouble getting the run phase config file right.
>>
>>
>>> I mention it because it's what you seem to be insisting on talking about.
>>
>>
>>> I was providing a way to address the problems you describe with the (IMO
>>> bad) mechanism (-q) under discussion.
>>
>>
>>>>> But the bigger problem is why are you insisting on separate start/run
>>>>> phases?  This has not been "best practice" for quite a while, and if
>>>>> you insist on using this method you will be running into the exact
>>>>> problems you are describing.
>>
>>
>>>>> No, the best advice is to understand why you have been using ntpdate -b
>>>>> so far and understand the pros/cons of the new choices.
>>
>>
>>> David> We are talking about system managers and package creators,
>>> David> neither of which have much time to study the details.
>>
>>
>>> Blessed are those who get what they deserve.
>>
>>
>>> These are the same folks who must get ssh configurations and various
>>> other network configurations working.
>>
>>
>>> If the stock things work well enough for folks, great.
>>
>>
>>> If folks have suggestions for improvements I welcome them.
>>
>>
>>> If folks want something different I invite them to make a case for it.
>>> Please remember the scope and complexity of the problem case.  It's much
>>> easier to have a simpler solution if one is prepared to ignore certain
>>> problems.  Another case in this point is Maildir.
>>
>>
>>> If somebody is in the situation where they know they have specific
>>> requirements for time, they are in the situation where they have enough
>>> altitude on their requirements to know the costs/benefits of what is
>>> involved in getting there.
>>
>>
>> Well, I disagree. The sign of a good piece of software is that it does
>> what it needs to do despite the user having a bad idea of how to
>> accomplish the task. The use of software should not be an esoteric
>> exercise. Let me again bring up chrony. It manages to get the system to
>> within msec of the right time on a timescale of minutes, not hours. It
>> has a very different model for the clock control mechanism from ntp.
>> From what I have seen now, both in a local net system (with .2ms round
>> trip times) and an adsl connection (20ms round trip times), chrony also
>> does as good a job or better than ntp at disciplining the clock. I have
>> just ordered another Garmin 18LVC so I can make measurements as to how
>> well chrony and ntp actually discipline the adsl system's time to true
>> time despite all of the noise that adsl adds to the measurement process.
>> (Both ntp and chrony seem to have about the same standard deviation in
>> the measured offset, so that gives no information as to how well the
>> clock is actually disciplined: one could discipline it to 5usec and the
>> other to 100usec and you could not tell the difference from the
>> measured times, which have a variance of 500usec due to round trip
>> problems.)
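
The closing point about measurement noise masking the true clock error can be illustrated numerically. This sketch assumes the clock error and the round-trip measurement noise are independent, so their standard deviations add in quadrature, and it treats the quoted 500 usec as a standard deviation. Both are assumptions of this example, and the name observed_sd_us is hypothetical.

```python
import math


def observed_sd_us(clock_sd_us, measurement_sd_us=500.0):
    """Standard deviation of measured offsets, in microseconds.

    Assumption: clock error and measurement noise are independent,
    so the two standard deviations combine in quadrature.
    """
    return math.hypot(clock_sd_us, measurement_sd_us)


for clock_sd in (5.0, 100.0):
    print(clock_sd, round(observed_sd_us(clock_sd), 1))
```

Under these assumptions a clock held to 5 usec and one held to 100 usec would show measured scatter of about 500.0 and 509.9 usec respectively, a difference of roughly 2%, which would be hard to resolve from the network measurements alone.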
>>
>>
>>
>>
>>
>>
>>> -- 
>>> Harlan Stenn <stenn at ntp.org>
>>> http://ntpforum.isc.org  - be a member!



