[ntp:questions] Can a clock drift be too big for ntpd?
pln at glast2.Stanford.EDU
Fri Oct 19 22:22:44 UTC 2007
On 2007-10-19, Richard B. Gilbert <rgilbert88 at comcast.net> wrote:
> Some Linux systems have known problems with losing timer interrupts!
> During periods of heavy I/O load disk drivers may mask or disable
> interrupts for a little too long a time. . . . Some Windows systems
> have also been known to exhibit similar behavior.
I would like to know more about this. How can I monitor the interrupts?
After my original post, I remembered that this machine has a unique
feature. I compiled a new kernel to add the Reiser file system.
I don't think I changed anything else, but I don't have any previous
experience with custom kernels.
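One possible way to watch the timer interrupt on Linux is to take two snapshots of /proc/interrupts and compare the counts. The sketch below assumes the timer shows up as a line whose last field is "timer" (that label can differ between kernels and architectures), and that the kernel uses a periodic tick. The helper names are made up for illustration. The elapsed time has to come from an independent reference such as the wall clock, because the system clock is the thing under suspicion.

```python
"""Sketch: estimate the timer interrupt rate from two /proc/interrupts snapshots."""


def timer_count(snapshot: str) -> int:
    """Sum the per-CPU counts on the line whose last field is 'timer'."""
    for line in snapshot.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[-1] == "timer":
            # fields[0] is the IRQ label (e.g. "0:"); the per-CPU counts follow.
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the interrupt-controller name
                counts.append(int(field))
            return sum(counts)
    raise ValueError("no line ending in 'timer' found")


def ticks_per_second(before: str, after: str, elapsed_s: float) -> float:
    """Timer interrupts per second between two snapshots.

    elapsed_s must be measured against an independent reference
    (stopwatch, wall clock), not the system clock being diagnosed.
    """
    return (timer_count(after) - timer_count(before)) / elapsed_s


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as f:
            print("timer interrupts so far:", timer_count(f.read()))
    except (OSError, ValueError) as exc:
        print("could not read a timer line:", exc)
```

If the rate measured this way comes out noticeably below the kernel's configured tick rate, that would be consistent with lost timer interrupts.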
> What's the value stored in your drift file?
Currently it's 74.080. This morning it started out around 30.
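For scale, the drift file value is a frequency offset in parts per million (PPM). A minimal sketch of the conversion to seconds per day follows; the function and constant names are illustrative only, and 500 PPM is the largest frequency error ntpd will try to correct.

```python
SECONDS_PER_DAY = 86_400
NTPD_MAX_PPM = 500.0  # largest frequency error ntpd will try to correct


def ppm_to_seconds_per_day(ppm: float) -> float:
    """Convert a frequency offset in PPM to accumulated error per day."""
    return ppm * 1e-6 * SECONDS_PER_DAY


if __name__ == "__main__":
    # 30 and 74.080 are the drift file values quoted above.
    for ppm in (30.0, 74.080, NTPD_MAX_PPM):
        print(f"{ppm:8.3f} PPM -> {ppm_to_seconds_per_day(ppm):.3f} s/day")
```

A drift value of 74.080 works out to about 6.4 seconds per day.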
> DON'T use burst! The burst keyword was intended for situations where
> ntpd has to make a phone call to NIST (or similar service) to get the
> time. It is NOT suitable for general use over the internet.
Without burst, it just drifts freely. The size of the drift is even
worse than I thought. With burst, here are some lines from the log:
Oct 19 13:37:23 client ntpd: time reset +13.151972 s
Oct 19 13:55:58 client ntpd: time reset +8.779090 s
Oct 19 14:08:09 client ntpd: time reset +8.712040 s
Oct 19 14:28:21 client ntpd: time reset +11.494533 s
Oct 19 14:44:53 client ntpd: time reset +9.450835 s
If I ever get this situation under control I'll turn off burst.
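For reference, the size of the drift can be estimated from the log lines above. This is only a rough sketch: it assumes each reset corrects the error accumulated since the previous reset and ignores any slewing in between. The year 2007 is taken from the date of this post, and the function names are illustrative.

```python
import re
from datetime import datetime

LOG = """\
Oct 19 13:37:23 client ntpd: time reset +13.151972 s
Oct 19 13:55:58 client ntpd: time reset +8.779090 s
Oct 19 14:08:09 client ntpd: time reset +8.712040 s
Oct 19 14:28:21 client ntpd: time reset +11.494533 s
Oct 19 14:44:53 client ntpd: time reset +9.450835 s
"""

RESET = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) .*time reset ([+-]\d+\.\d+) s$"
)


def parse_resets(log: str, year: int = 2007):
    """Return a list of (timestamp, step_seconds) for each reset line."""
    resets = []
    for line in log.splitlines():
        match = RESET.match(line.strip())
        if match:
            # syslog timestamps carry no year, so one is supplied explicitly.
            when = datetime.strptime(
                f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
            )
            resets.append((when, float(match.group(2))))
    return resets


def implied_ppm(resets):
    """Step size divided by the time since the previous reset, in PPM."""
    rates = []
    for (t0, _), (t1, step) in zip(resets, resets[1:]):
        elapsed = (t1 - t0).total_seconds()
        rates.append(step / elapsed * 1e6)
    return rates


if __name__ == "__main__":
    for rate in implied_ppm(parse_resets(LOG)):
        print(f"{rate:8.0f} PPM")
```

Under those assumptions the four intervals come out at roughly 7,900 to 11,900 PPM, i.e. the clock is losing about 1 percent. That is far beyond the 500 PPM that ntpd can correct by adjusting frequency, and far larger than the value in the drift file.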
> Iburst is good. It gets you a fast startup and then lets your system
> poll the server at normal intervals.
> Check the value of a Kernel variable called "HERTZ". Some Linux systems
> set it to 1000 which is not good for NTP. If yours is set to 1000 (or
> 250) try changing it to 100.
More ignorance on my part. Where would I look for this? I searched
the kernel source code and didn't find it.
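A possible place to look: in the kernel source the tick rate is normally spelled HZ rather than HERTZ, and on 2.6 kernels it is usually set by the CONFIG_HZ option in the build's .config file. The sketch below searches a config file for that option. The candidate paths are assumptions (they vary by distribution and by where the kernel was built), and the function names are illustrative.

```python
import os
import re


def find_config_hz(config_text: str):
    """Return the CONFIG_HZ value from kernel config text, or None."""
    # Matches "CONFIG_HZ=250" but not "CONFIG_HZ_250=y".
    match = re.search(r"^CONFIG_HZ=(\d+)\s*$", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def candidate_paths():
    """Assumed locations of the kernel config; adjust for the actual build."""
    return [
        "/usr/src/linux/.config",
        f"/boot/config-{os.uname().release}",
    ]


if __name__ == "__main__":
    for path in candidate_paths():
        try:
            with open(path) as f:
                text = f.read()
        except OSError:
            continue  # file missing or unreadable; try the next candidate
        print(path, "->", find_config_hz(text))
```

With a custom-built kernel, the .config in the directory where the kernel was compiled is the one that matters.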
> Using a single server is not usually a good idea. Two servers are the
> worst possible configuration; ntpd has no good way to decide which one
> to believe. Three are good but four are better. Try to select servers
> that are close to you in network space (low values of Delay).
Again, I'll fix this if I ever get things running properly. The one
server I chose is our master campus server, which is quite close.
Here's another issue. I just learned about the distinction between
the kernel clock and the hardware (TOY) clock. I have tried running
hwclock from time to time, comparing it to my WWV-controlled wall
clock. It never seems to be more than 1 or 2 seconds off. Is there
any way to exploit this?