[ntp:questions] Re: What went wrong with the leap second
David J Taylor
david-taylor at blueyonder.co.not-this-bit.nor-this-part.uk.invalid
Thu Jan 5 08:36:38 UTC 2006
David L. Mills wrote:
> In all the testing with this thing, I never got unexpected behavior,
> but since lots of others did experience pinball machine behavior,
> some wee thing must have been overlooked. Workin' on it.
My observation is that a single system on its own seems to work OK.
The problem comes when about half the systems don't step at the same time
as the rest, leaving a client without a reference clock (or sometimes even
one with a reference clock) seeing two clusters about one second apart. As
the leap second propagated through, the population of these clusters
changed, leaving clients confused as to the correct time.
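
To illustrate the two-cluster situation, here is a toy sketch in Python. It is not ntpd's actual selection algorithm; the function names, the 0.5 s gap threshold and the offset values are all made up for illustration. It only shows why a client has no obvious answer when its sources split evenly into two groups one second apart.

```python
def cluster_offsets(offsets, gap=0.5):
    """Group offsets (seconds) into clusters, splitting wherever
    two neighbouring sorted offsets differ by more than `gap`."""
    clusters = []
    for off in sorted(offsets):
        if clusters and off - clusters[-1][-1] <= gap:
            clusters[-1].append(off)   # close to previous offset: same cluster
        else:
            clusters.append([off])     # large jump: start a new cluster
    return clusters


def has_clear_majority(offsets, gap=0.5):
    """True if one cluster holds more than half of all sources."""
    clusters = cluster_offsets(offsets, gap)
    largest = max(len(c) for c in clusters)
    return largest * 2 > len(offsets)


# Hypothetical offsets (seconds) reported by four sources.
before_leap = [0.002, -0.001, 0.003, 0.000]   # all sources agree
during_leap = [0.002, -0.001, 1.003, 1.000]   # half have stepped, half have not

print(has_clear_majority(before_leap))   # one cluster of four
print(has_clear_majority(during_leap))   # two clusters of two each
```

With the second set of offsets neither cluster outvotes the other, and as sources move from one cluster to the other the apparent "correct" time flips.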
The one thing which stands out as making things much worse was the
willingness of ntpd to discard, too quickly, a drift value which had been
built up over many days. Within an hour of the leap second, drift values
had shot to the limits (+/- 500 ppm) on some of my systems, and it was the
recovery (in two cases, a failure to recover in a reasonable time) which
was the major issue.
I'd suggest that drift should not be changed by a large amount, unless
there is a good reason. No, I'm not an ntp expert!
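
As a rough sketch of the sort of thing I mean (again in Python, and again not how ntpd actually works): limit how far the drift estimate may move in a single update, so that a value built up over days cannot be thrown away in one go. The function name and the 1 ppm per-update limit are assumptions picked purely for illustration; only the +/- 500 figure comes from what I saw.

```python
def limit_drift_step(current_ppm, proposed_ppm,
                     max_step_ppm=1.0, limit_ppm=500.0):
    """Move the drift estimate towards `proposed_ppm`, but by no more
    than `max_step_ppm` per update, and never beyond +/- `limit_ppm`."""
    step = proposed_ppm - current_ppm
    # Clamp the size of this single adjustment.
    step = max(-max_step_ppm, min(max_step_ppm, step))
    new_ppm = current_ppm + step
    # Clamp the overall result to the hard limits.
    return max(-limit_ppm, min(limit_ppm, new_ppm))


# Hypothetical case: a long-established drift of 12.5 ppm, and a
# disturbed loop suddenly asking for the +500 ppm limit.
print(limit_drift_step(12.5, 500.0))   # moves by at most 1 ppm
print(limit_drift_step(12.5, 12.75))   # small changes pass through unchanged
```

A real implementation would need some way of recognising the "good reason" for a genuinely large change, which this sketch does not attempt.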