[ntp:questions] ntpdate.c unsafe buffer write

David L. Mills mills at udel.edu
Mon Feb 11 19:03:36 UTC 2008


Guys,

There seems to be some misinformation here.

Both ntpdate and ntpd -q set the offset with adjtime() and then exit. 
After that, stock Unix adjtime() slews the clock at rate 500 PPM, which 
indeed could take 256 s for an initial offset of 128 ms. A prudent 
response would be to measure the initial offset and compute the time to 
wait. The ntp-wait script waits for ntpd to enter state 4, which could 
happen with an initial offset as high as 128 ms.
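
The wait computation can be sketched as below. This is only an illustration: it assumes the stock 500 PPM adjtime() slew rate quoted above, and the function name slew_wait_seconds is made up for the example.

```python
def slew_wait_seconds(offset_s, slew_ppm=500.0):
    """Estimate how long adjtime() needs to slew away an offset.

    offset_s: measured initial offset in seconds (sign is ignored).
    slew_ppm: slew rate in parts per million; 500 PPM is the stock
              Unix figure quoted in the message (an assumption for
              any particular system).
    """
    if slew_ppm <= 0:
        raise ValueError("slew rate must be positive")
    # 1 PPM corrects 1 microsecond of offset per second of elapsed time.
    return abs(offset_s) / (slew_ppm * 1e-6)


# A 128 ms offset at 500 PPM needs roughly 256 s.
print(round(slew_wait_seconds(0.128)))
```

A startup script could measure the offset once, sleep for about this long, and only then start time-sensitive services.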

The ntpd time constant is purposely set somewhat large at 2000 s, which 
results in a risetime of about 3000 s. This is a compromise for stable 
acquisition for herky-jerky Internet paths and speed of convergence for 
LANs. For typical Internet paths the Allan intercept is about 2000 s. 
For fast LANs with nanosecond clock resolution, the Allan intercept can 
be as low as 250 s, which is what the kernel PPS loop is designed for.

Both the daemon and kernel loops are engineered so that the time 
constant is directly proportional to the poll interval and the risetime 
scales directly. If the poll exponent is set to the minimum 4 (16 s) the 
risetime is 500 s. While not admitted in public, the latest snapshot 
can set the poll interval to 3 (8 s), so the risetime is 250 s. This 
works just fine on a LAN, but I would never do this on an outside circuit.
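
The scaling can be sketched as below. This is a rough illustration only: it assumes strict proportionality anchored at the 500 s figure for poll exponent 4, reproduces the two values quoted above, and says nothing reliable about other exponents. The function name risetime_seconds is invented for the example.

```python
def risetime_seconds(poll_exponent, base_exponent=4, base_risetime_s=500.0):
    """Scale the loop risetime in proportion to the poll interval.

    The poll interval is 2**poll_exponent seconds. The anchor point
    (exponent 4, 500 s) is taken from the message; extending the same
    proportionality to other exponents is an assumption of this sketch.
    """
    poll_interval_s = 2 ** poll_exponent
    base_interval_s = 2 ** base_exponent
    return base_risetime_s * poll_interval_s / base_interval_s


for exponent in (4, 3):
    print(exponent, 2 ** exponent, risetime_seconds(exponent))
```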

Dave

Unruh wrote:
> Harlan Stenn <stenn at ntp.org> writes:
> 
> 
>>>>>In article <47af716b$0$513$5a6aecb4 at news.aaisp.net.uk>, David Woolley <david at ex.djwhome.demon.co.uk.invalid> writes:
> 
> 
>>David> Harlan Stenn wrote:
>>
>>>>Why would ntpd be exiting during a warm start?
> 
> 
>>David> Because we are discussing using it with the -q option.  If you just
>>David> use -g, it will take a lot longer to converge within a few
>>David> milliseconds, as it will not slew at the maximum rate.  If you use
>>David> -q, you need to force a step if you want fast convergence.
> 
> 
>>I still maintain you are barking up the wrong tree.
> 
> 
>>In terms of the behavior model of ntp, "state 4" is as good as it gets.  You
>>are in the right ballpark.
> 
> 
> And as has been commented on numerous times, ntp in state 4 is very slow to
> converge to the best possible time control. This was a deliberate design
> decision, as I understand it, so that in steady state the time is averaged
> over a large number of samples ( not helped by the fact that 85% of samples
> are thrown away), to reduce the statistical error in the clock control.
> Note that at poll 7 the number of actual samples averaged over in the time
> scale of the ntp feedback loop is only about 3, so the statistical
> averaging even with such a long time constant, is not very good.
> 
> 
> 
>>If you want something else, something you consider "better" than state 4,
>>please make a case for this and lobby for it.
> 
> 
> I think many people have lobbied for faster response. In the discussion of
> the chrony/ntp comparison, chrony is much faster to correct errors, and at
> least on a local network, better at disciplining the clock as well ( in
> part I think because on such a minimal round trip network, the frequency
> fluctuations dominate over the offset measurement errors-- i.e., the Allan
> intercept is much lower than the assumed 1500 sec. in that kind of
> situation-- also the drift model on real systems is not well modeled by 1/f
> noise.) So, what I think the point is that using ntpdate, one can rapidly
> bring the clock into a few msec of the correct time, rather than waiting
> for the feedback loop to finally eliminate that last 128msec of offset.
> 
> 
>>>>For the case I'm describing the startup script sequence is to fire up
>>>>'ntpd -g' early.  If there are applications that need the system clock to
>>>>be on-track stable (even if a wiggle is being dealt with), that's 'state
>>>>4', and running 'ntp-wait' before starting those services is, to the best
>>>>of my knowledge, all that is required.
> 
> 
>>David> State 4 means within 128ms and using the normal control loop, which
>>David> has a time constant of around an hour.
> 
> 
>>OK, and so what?
> 
> 
>>Is State 4 insufficient for your needs, or are you just splitting hairs?
> 
> 
>>David> For a cold start, it won't reach state 4 for a further 900 seconds
>>David> after first priming the clock filter.
> 
> 
>>>>If the system has a good drift file, I disagree with you.
> 
> 
>>David> The definition of cold start is that there is no drift file.
> 
> 
>>OK, now I know what the definitions are.
> 
> 
>>I don't recall offhand the expected time to hit state 4 without a drift
>>file.
> 
> 
>>1) This should not be the ordinary case
>>2) How does this have any bearing on the ntpdate -b discussion?
> 
> 
>>>>And what is the big deal with using different config files?  The config
>>>>file mechanism has "include" capability so it is trivial to easily
>>>>maintain common 'base' configuration with customizations for separate
>>>>start/run phases.
> 
> 
>>David> You are now talking about using -q.  The difficulty is that people
>>David> have enough trouble getting the run phase config file right.
> 
> 
>>I mention it because it's what you seem to be insisting on talking about.
> 
> 
>>I was providing a way to address the problems you describe with the (IMO
>>bad) mechanism (-q) under discussion.
> 
> 
>>>>But the bigger problem is why are you insisting on separate start/run
>>>>phases?  This has not been "best practice" for quite a while, and if you
>>>>insist on using this method you will be running in to the exact problems
>>>>you are describing.
> 
> 
>>>>No, the best advice is to understand why you have been using ntpdate -b
>>>>so far and understand the pros/cons of the new choices.
> 
> 
>>David> We are talking about system managers and package creators, neither of
>>David> which have much time to study the details.
> 
> 
>>Blessed are those who get what they deserve.
> 
> 
>>These are the same folks who must get ssh configurations and various other
>>network configurations working.
> 
> 
>>If the stock things work well enough for folks, great.
> 
> 
>>If folks have suggestions for improvements I welcome them.
> 
> 
>>If folks want something different I invite them to make a case for it.
>>Please remember the scope and complexity of the problem case.  It's much
>>easier to have a simpler solution if one is prepared to ignore certain
>>problems.  Another case in this point is Maildir.
> 
> 
>>If somebody is in the situation where they know they have specific
>>requirements for time, they are in the situation where they have enough
>>altitude on their requirements to know the costs/benefits of what is
>>involved in getting there.
> 
> 
> Well, I disagree. The sign of a good piece of software is that it does what
> it needs to do despite the user having a bad idea of how to accomplish the
> task. The use of software should not be an esoteric exercise. Let me again
> bring up chrony. It manages to get the system into msec of the right time
> on a timescale of minutes, not hours. It had a very different model for the
> clock control mechanism from ntp. From what I have seen now, both in a
> local net system ( with .2ms roundtrip times) and an adsl connection (20ms
> round trip times) chrony also does as good a job or better than ntp at
> disciplining the clock. I have just ordered another Garmin 18LVC so I can
> make measurements as to how well chrony and ntp actually discipline the adsl
> system's time to true time despite all of the noise that adsl adds to the
> measurement process. (both ntp and chrony seem to have about the same
> standard deviation in the measured offset, so that gives no information as
> to how well the clock is actually disciplined-- one could discipline it to
> 5usec and the other to 100usec and you could not tell the difference from
> the measured times which have a variance of 500usec due to round trip
> problems).
> 
> 
> 
> 
> 
> 
>>-- 
>>Harlan Stenn <stenn at ntp.org>
>>http://ntpforum.isc.org  - be a member!



