[ntp:questions] orphan mode, manycast, and virtualization

unruh unruh at invalid.ca
Mon Sep 9 17:24:58 UTC 2013

On 2013-09-09, Horvath Bob-BHORVAT1 <Bob.Horvath at motorolasolutions.com> wrote:
> Another question if you guys have the time :),
> We have situations in which almost everything is deployed as virtualized servers running inside of VMware ESXi. It seems like the recommendation on time synchronization with ESXi has changed from release to release. It seems to boil down to one of the following:
> A) Run ntpd on the bare iron at the ESXi layer and let it change the virtual hardware clock (without ntpd running on VMs)
> B) Run ntpd on the bare iron at the ESXi layer and let ntpd on the VMs use it as a time source
> C) Run ntpd on all the VMs and point them to as many "good" ntp servers as you can find, whether in its own ESXi or another physical server's ntpd.

Use A. C is horrible, and it is very easy for the VMs to exceed the
500 PPM ntpd threshold. And ntpd does a really horrible job of
disciplining a clock that keeps changing and losing time on a short
timescale. It is designed for a clock with a bad, but consistent, rate.
By design it takes a long time to settle down. And having something like
your VM clock going to sleep for random amounts of time will drive ntpd
crazy. That rules out B and C.
Any virtual machine should get its time from the underlying system.
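
To put the 500 PPM figure in perspective, here is a rough Python sketch of
the arithmetic. The function names and the example pause lengths are made up
for illustration; only the 500 PPM limit comes from the discussion above.

```python
# Sketch: relate a frequency error in PPM to accumulated time error.
# All names and the sample numbers below are illustrative assumptions.

NTPD_MAX_SLEW_PPM = 500.0  # the ntpd threshold mentioned above


def drift_seconds(ppm: float, interval_s: float = 86400.0) -> float:
    """Seconds of error accumulated over interval_s at a rate error of ppm."""
    return ppm * 1e-6 * interval_s


def required_ppm(lost_s: float, interval_s: float) -> float:
    """Average rate correction (PPM) needed to make up lost_s within interval_s."""
    return lost_s / interval_s * 1e6


def within_slew_limit(lost_s: float, interval_s: float,
                      limit_ppm: float = NTPD_MAX_SLEW_PPM) -> bool:
    """True if the loss could be corrected without exceeding the limit."""
    return abs(required_ppm(lost_s, interval_s)) <= limit_ppm


if __name__ == "__main__":
    # 500 PPM corresponds to roughly 43.2 seconds per day.
    print(f"{drift_seconds(NTPD_MAX_SLEW_PPM):.1f} s/day at 500 PPM")
    # Hypothetical VM that is descheduled for 100 ms in every 64 s.
    print(within_slew_limit(0.1, 64.0))
```

Under these assumptions, a guest that loses as little as 100 ms per 64
seconds would need about 1560 PPM of correction, which is well past the limit.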

> One of the other constraints we have is sometimes we don't have a good time source, and the requirement is that they all be consistent, if not correct.  In other words, there is no stratum one source, and no connectivity to a network that has one.

By "they all", do you mean all of the VMs running on one machine, or the
VMs running on a bunch of different machines?

> Other times there is a good time source, but we have to survive being unable to reach it.  We are looking to take advantage of orphan mode but the version of ESXi we are using has the version of ntpd that has the orphan mode bug where the stratum starts climbing. I am not sure we can change that, so the thinking is that we have orphan modes in "pairs" of stratum.   Of course, a man with two watches is never quite sure what time it is.

Change the version of ntpd. 

> One thought I had was using manycast to let the various servers, physical and virtual, find the best set of time sources (again physical or virtual) they can find. I am assuming over time they will mark inaccurate virtualized sources as falsetickers. Even if all the instances of ntpd were virtualized (which I know is crazy), would ntp eventually find the best source of time, or would it thrash about dealing with the best of a lot of bad clocks?
> Is there any wisdom out there to deal with such cases?
