[ntp:questions] Flash 400 on all peers; can't get ntpd to be happy
cswiger at mac.com
Tue Mar 8 23:26:34 UTC 2011
On Mar 8, 2011, at 1:18 PM, Steve Kostecke wrote:
> On 2011-03-08, Chuck Swiger <cswiger at mac.com> wrote:
>> Seriously, each physical machine only has one RTC and crystal
>> oscillator. It's useful to run one instance of ntpd in the Dom0 (or
>> host ESX) context where it can actually work and keep this real
>> hardware clock in sync.
> NTP disciplines the system (i.e. kernel) clock, not the hardware clock
> on the mother board.
That's right, although it's reasonably common for platforms to periodically write the system clock time back to the hardware clock, variously called the RTC/TOD/TOY clock, which lives in the BIOS/EFI/firmware and keeps time when the system is off.
The kernel/system clock is typically based on a timer source like the ACPI timer or HPET, which in turn uses a crystal oscillator running at some fairly rapid rate (the HPET counter runs at 10 MHz or faster, for example), rather than the ~32 kHz frequency of a classic RTC. The timer generates interrupts at kern.hz (or perhaps a multiple, if you're running a separate profile or stats clock for profiling or process accounting), which invoke the scheduler and call hardclock or equivalent.
Anyway, there isn't a separate RTC *or* timer crystal driving ACPI/HPET/etc for each VM.
>> Running ntpd's in the other DomUs/guest VMs is almost entirely
>> pointless; it might be useful only if Dom0->DomU time is busted,
>> and even in that case, ntpd is unlikely to ever obtain good time
>> synchronization running in a DomU.
> That's debatable.
> I have a Debian 6.0 system running as a VMWare guest. ntpd on this
> system has no problem disciplining the clock.
OK. Does it do any better than using VMWare's "tools.syncTime = true"?
> Recent peer billboard snapshot:
> steve at www:/var/log/ntpstats$ ntpq -p
> remote refid st t when poll reach delay offset jitter
> +ntp.my.isp .GPS. 1 u 34 1024 377 60.665 1.623 1.617
> -enob... .PPS. 1 u 1041 1024 377 39.552 -8.220 2.120
> *emit... .PPS. 1 u 184 1024 377 27.404 3.936 1.347
> +yamo... [snip] 2 u 768 1024 377 33.565 -1.757 2.256
> -3snd... [snip] 2 u 102 1024 377 26.294 7.261 1.179
Your jitter values are well over an order of magnitude worse than that of ntpd running on a non-virtualized machine, and your offsets are nearly an order of magnitude worse:
% ntpq -p -c rv
remote refid st t when poll reach delay offset jitter
-ntp.pbx.org 22.214.171.124 2 u 119 256 377 22.076 0.946 0.027
*bonehed.lcs.mit .PPS. 1 u 183 256 377 23.741 -0.079 0.027
+hickory.cc.colu 126.96.36.199 2 u 138 256 377 22.427 -0.210 0.049
+time1.apple.com 188.8.131.52 2 u 168 256 377 55.828 0.315 0.202
[ ... ]
associd=0 status=0694 leap_none, sync_ntp, 9 events, freq_mode,
version="ntpd 4.2.4p5-a Wed Feb 16 17:12:20 EST 2011 (1)",
processor="i386", system="FreeBSD/7.4-PRERELEASE", leap=00, stratum=2,
precision=-19, rootdelay=23.741, rootdispersion=25.764, peer=5314,
reftime=d1212f3d.75251aea Tue, Mar 8 2011 17:42:05.457, poll=8,
clock=d1213495.8f71f337 Tue, Mar 8 2011 18:04:53.560, state=4,
offset=-0.079, frequency=19.348, jitter=0.167, noise=0.032,
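To put rough numbers on that comparison, here is a small Python sketch that averages the absolute offset and the jitter columns of the two billboards quoted in this thread. The variable and function names are made up for illustration, and a plain mean over a handful of peers is only a crude summary, not how ntpd itself weighs its sources.

```python
from statistics import mean

# Peer lines copied from the two "ntpq -p" billboards quoted in this thread.
# The last two columns are offset and jitter, both in milliseconds.
VM_GUEST = """\
+ntp.my.isp .GPS. 1 u 34 1024 377 60.665 1.623 1.617
-enob... .PPS. 1 u 1041 1024 377 39.552 -8.220 2.120
*emit... .PPS. 1 u 184 1024 377 27.404 3.936 1.347
+yamo... [snip] 2 u 768 1024 377 33.565 -1.757 2.256
-3snd... [snip] 2 u 102 1024 377 26.294 7.261 1.179
"""

BARE_METAL = """\
-ntp.pbx.org 22.214.171.124 2 u 119 256 377 22.076 0.946 0.027
*bonehed.lcs.mit .PPS. 1 u 183 256 377 23.741 -0.079 0.027
+hickory.cc.colu 126.96.36.199 2 u 138 256 377 22.427 -0.210 0.049
+time1.apple.com 188.8.131.52 2 u 168 256 377 55.828 0.315 0.202
"""


def mean_offset_and_jitter(billboard):
    """Return (mean |offset|, mean jitter) in ms for a block of peer lines."""
    rows = [line.split() for line in billboard.splitlines() if line.strip()]
    offsets = [abs(float(row[-2])) for row in rows]
    jitters = [float(row[-1]) for row in rows]
    return mean(offsets), mean(jitters)


vm_offset, vm_jitter = mean_offset_and_jitter(VM_GUEST)
hw_offset, hw_jitter = mean_offset_and_jitter(BARE_METAL)

offset_ratio = vm_offset / hw_offset
jitter_ratio = vm_jitter / hw_jitter

print(f"mean |offset|: VM {vm_offset:.3f} ms, bare metal {hw_offset:.3f} ms")
print(f"mean jitter:   VM {vm_jitter:.3f} ms, bare metal {hw_jitter:.3f} ms")
print(f"ratios: offset x{offset_ratio:.1f}, jitter x{jitter_ratio:.1f}")
```

By this measure both ratios come out above ten, with the jitter ratio the larger of the two.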
For all of that, your VM is doing pretty well running ntpd compared to others I've seen. I'd imagine the host running the VM isn't especially busy; if it were, I wouldn't be surprised if ntpd couldn't manage to discipline the clock without "tinker panic 0".
Seriously, even VMware documents this, for example see http://kb.vmware.com/kb/1006427:
"The configuration directive tinker panic 0 instructs NTP not to give up if it sees a large jump in time. This is important for coping with large time drifts and also resuming virtual machines from their suspended state.
Note: The directive tinker panic 0 must be at the top of the ntp.conf file.
It is also important not to use the local clock as a time source, often referred to as the Undisciplined Local Clock. NTP has a tendency to fall back to this in preference to the remote servers when there is a large amount of time drift."
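As a rough illustration of the two points in that KB excerpt, the following Python sketch checks an ntp.conf for them: that "tinker panic 0" is the first directive, and that the Undisciplined Local Clock driver (refclock address 127.127.1.x) is not configured. The function name and the sample server names are hypothetical, and the parsing is deliberately simplistic; treat it as a sketch, not a validator for every ntp.conf syntax.

```python
def check_ntp_conf(text):
    """Return a list of problems found in the given ntp.conf contents."""
    # Strip comments and blank lines, then split each directive into words.
    directives = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            directives.append(line.split())

    problems = []

    # The KB excerpt says "tinker panic 0" must be at the top of the file.
    if not directives or directives[0][:3] != ["tinker", "panic", "0"]:
        problems.append("'tinker panic 0' is not the first directive")

    # 127.127.1.x is the Undisciplined Local Clock reference clock driver.
    for words in directives:
        if (words[0] in ("server", "fudge") and len(words) > 1
                and words[1].startswith("127.127.1.")):
            problems.append("local clock configured: " + " ".join(words))

    return problems


# Hypothetical guest configuration following the KB advice.
GOOD_CONF = """\
# ntp.conf for a VM guest
tinker panic 0
server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
driftfile /var/db/ntp.drift
"""

# Hypothetical configuration that ignores both recommendations.
BAD_CONF = """\
server 0.pool.ntp.org
server 127.127.1.0
fudge 127.127.1.0 stratum 10
"""

print(check_ntp_conf(GOOD_CONF))
print(check_ntp_conf(BAD_CONF))
```

The first call should report no problems; the second should flag the missing tinker line and both local-clock lines.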
>> You are better off running ntpdate (or sntp) periodically via cron in
>> the DomUs.
> Perhaps in certain cases, but not across the board.
I'd be happy to review counterexamples to my generalization....
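For anyone curious what the periodic one-shot approach boils down to, here is a minimal Python sketch of a single SNTP exchange that measures the local clock's offset from a server. It is not a replacement for ntpdate or sntp: it only reports the offset and never sets the clock, it ignores the 2036 NTP era rollover, and the function names are invented for illustration.

```python
import socket
import struct
import time

NTP_UNIX_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def to_ntp(unix_time):
    """Encode a Unix time as an 8-byte NTP timestamp (seconds, fraction)."""
    secs = int(unix_time)
    frac = int((unix_time - secs) * 2**32)
    return struct.pack("!II", (secs + NTP_UNIX_DELTA) & 0xFFFFFFFF, frac)


def from_ntp(data):
    """Decode an 8-byte NTP timestamp into a Unix time."""
    secs, frac = struct.unpack("!II", data)
    return secs - NTP_UNIX_DELTA + frac / 2**32


def build_request(t1):
    """Build a 48-byte client request: LI=0, VN=4, Mode=3, transmit time t1."""
    packet = bytearray(48)
    packet[0] = 0x23
    packet[40:48] = to_ntp(t1)
    return bytes(packet)


def clock_offset(response, t1, t4):
    """Offset of the server clock relative to ours, in seconds.

    t1 = our send time, t4 = our receive time; the server's receive (t2)
    and transmit (t3) timestamps sit at bytes 32-39 and 40-47 of the reply.
    """
    t2 = from_ntp(response[32:40])
    t3 = from_ntp(response[40:48])
    return ((t2 - t1) + (t3 - t4)) / 2


def query_offset(server, port=123, timeout=5.0):
    """Send one request to the server and return the measured offset."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_request(t1), (server, port))
        response, _ = sock.recvfrom(512)
        t4 = time.time()
    if len(response) < 48:
        raise ValueError("short NTP response")
    return clock_offset(response, t1, t4)
```

Something like query_offset("pool.ntp.org") run from cron would give a periodic reading of how far a guest has drifted; the server name there is just a placeholder.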
PS: I'd just updated this system two weeks ago, but it's running the system-provided /usr/sbin/ntpd. At least this thread has reminded me to switch to the 4.2.6p2 in /usr/local. :-)