[ntp:hackers] Does ntpd need to whine more ?
mayer at ntp.isc.org
Mon Oct 3 19:13:13 UTC 2005
Poul-Henning Kamp wrote:
> In message <43413DB6.308 at udel.edu>, "David L. Mills" writes:
>>There is a fundamental misunderstanding here.
> Agreed, but we may not agree what the misunderstanding is.
>>There is a fundamental misunderstanding here. The clock discipline is in
>>fact a flywheel which is nudged at each poll update to correct the time
>>and update the frequency estimate. If you stop nudging it for awhile it
>>may accumulate error, but not much. How long should you wait before
> I don't think it is unreasonable to expect people to have a plain
> XO (unless they tell NTPD otherwise) and therefore few systems
> actually have a recoverable offset after one day on the island.
> And expecting to recapture with a poll of 1024 after free-wheeling
> for a day is waaaay more optimistic than 25 cent XO's deserve.
> I would say that once the shift register runs dry, we should
> reduce the poll rate (if minpoll allows) for every empty shift
> register we see:
> That way you should have a scenario like:
>     0  poll = 1024  shift=11111111
>  1024  poll = 1024  shift=11111110
>  2048  poll = 1024  shift=11111100
>  3072  poll = 1024  shift=11111000
>  4096  poll = 1024  shift=11110000
>  5120  poll = 1024  shift=11100000
>  6144  poll = 1024  shift=11000000
>  7168  poll = 1024  shift=10000000
>  8192  poll = 1024  shift=00000000, reduce poll, start timer 512 * 8
> 12288  poll =  512  shift=00000000, reduce poll, start timer 256 * 8
> 14336  poll =  256  shift=00000000, reduce poll, start timer 128 * 8
> 15360  poll =  128  shift=00000000, reduce poll, start timer 64 * 8
> 15872  poll =   64  shift=00000000 at minpoll, do nothing
> That way we are back to 64s poll rate after 4h24m and that sounds
> very compatible with typical XO performance.
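
For reference, the quoted schedule can be reproduced with a few lines of
Python. This is only a sketch of the arithmetic in the table, not code from
ntpd: the function name backoff_schedule and its parameters are made up
here, and it assumes the peer never answers, the register is 8 bits wide,
and each reduced poll interval is held for 8 polls before halving again.

```python
def backoff_schedule(start_poll=1024, minpoll=64, register_bits=8):
    """Return (time_s, poll_s, register) rows for a peer that never answers.

    Hypothetical helper: it mirrors the table in the quoted proposal only.
    """
    rows = []
    t = 0
    poll = start_poll
    mask = (1 << register_bits) - 1
    reg = mask  # all ones: the last polls were all answered

    # Phase 1: each unanswered poll shifts a zero into the register.
    while True:
        rows.append((t, poll, format(reg, f"0{register_bits}b")))
        if reg == 0:
            break
        t += poll
        reg = (reg << 1) & mask

    # Phase 2: the register is empty. Halve the poll interval and wait
    # one full register's worth of polls at the new interval, until the
    # interval reaches minpoll.
    while poll > minpoll:
        poll //= 2
        t += poll * register_bits
        rows.append((t, poll, format(0, f"0{register_bits}b")))
    return rows


if __name__ == "__main__":
    for t, poll, reg in backoff_schedule():
        print(f"{t:6d}  poll = {poll:4d}  shift={reg}")
```

The last row lands at 15872 seconds, which is the 4h24m quoted above.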
I disagree with this, unless I have misunderstood what you are suggesting.
The poll interval should not change as long as the server does not
respond to an NTP packet. When the first packet is returned, you should
look at how long the server went without responding and then decide by
how much, if at all, to change the poll interval. If it has never
responded, we are presumably at minpoll; otherwise the system is probably
stable anyway and doesn't need to change the poll frequency much in order
to assure itself that it has good statistics for that server. We have far
too many misimplementations of NTP that actually increase their poll
frequency when they don't get a response.
The other related issue is KOD packets. A KOD packet should cause the
client to immediately back off to maxpoll, and if there are 3 KODs in
succession it should unceremoniously drop that server as a source of
Chimes. Kind of like a "3 death chimes and you're out".
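
A minimal sketch of that policy, to make the three cases concrete. None of
this is ntpd code: the Peer class, its method names and its defaults (64
and 1024 seconds, taken from the numbers used in this thread) are
hypothetical. It also assumes that only a valid reply resets the KOD count,
which the text above does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical per-server state, for illustration only."""
    poll: int = 64         # current poll interval in seconds
    maxpoll: int = 1024    # assumed upper bound on the poll interval
    kod_streak: int = 0    # KOD packets seen in succession
    dropped: bool = False  # True once the server has been discarded

    def on_timeout(self):
        # No response: leave the poll interval alone. In particular,
        # never poll faster because the server is silent.
        pass

    def on_reply(self):
        # Assumption: a valid reply ends the run of KODs.
        self.kod_streak = 0

    def on_kod(self):
        # Back off to maxpoll immediately; drop the server on the third
        # KOD in a row.
        self.poll = self.maxpoll
        self.kod_streak += 1
        if self.kod_streak >= 3:
            self.dropped = True
```

With these assumptions, three calls to on_kod() in a row set dropped, while
a reply in between starts the count over.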
> My first and primary beef is that we do not whine loudly when we
> have lost reachability, no matter how long this has been going on.
> Can't we at least agree that after being unreachable for N hours
> we should syslog something rather severe ?
> I'd propose 24 for N, but even 168 will improve on the current
> situation where people have no inkling that their system has
> wandered off into the sunset.
Yes, we should log this, but the main question is how often. We don't
want to do it too often; otherwise we clog up the syslog with a lot of
unnecessary messages.
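
One possible answer to "how often", sketched under stated assumptions: warn
first after N hours of unreachability (24, as proposed above) and then
double the wait before each repeat, so a long outage produces only a
handful of messages. The doubling rule, the UnreachableWarner class and the
message text are all made up for illustration; nobody in this thread has
proposed them. The sketch returns the message instead of calling syslog so
that it has no side effects.

```python
class UnreachableWarner:
    """Hypothetical rate limiter for "unreachable" warnings."""

    def __init__(self, first_after_s=24 * 3600):
        self.first_after_s = first_after_s  # N hours, in seconds
        self.next_at_s = first_after_s      # when the next warning is due

    def reset(self):
        # Call when a server becomes reachable again.
        self.next_at_s = self.first_after_s

    def check(self, unreachable_s):
        """Return a warning string if one is due, else None."""
        if unreachable_s < self.next_at_s:
            return None
        # Assumption: back off exponentially. Skip any thresholds that
        # have already passed so one late check logs only once.
        while self.next_at_s <= unreachable_s:
            self.next_at_s *= 2
        hours = unreachable_s // 3600
        return f"no server reachable for {hours} hours"
```

Checked once an hour, this warns at 24, 48, 96 and 192 hours, so a week
without any server costs three syslog lines.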