[ntp:questions] Clock and Network Simulator

Miroslav Lichvar mlichvar at redhat.com
Thu Jul 1 11:27:49 UTC 2010

On Wed, Jun 30, 2010 at 10:00:06PM +0000, David L. Mills wrote:
> Is there somebody around here that understands feedback control
> theory? You are doing extreme violence to determine a really simple
> thing, the discipline loop impulse response. There is a much simpler
> way.

It was a demonstration of what clknetsim can do. You may be able
to predict the result, but I can't. I think being able to verify a
theory with simulations is always a good thing.

> Of particular importance is the damping factor, which is evident
> from the overshoot. If SHIFT_PLL is radically changed, I would
> expect the overshoot to be replaced by an exponentially decaying
> ring characteristic.

That's not what I see in tests on real hw and simulations with

> The change in SHIFT_PLL would result
> in unstable behavior below 5 (32 s), as well as serious transients
> if the discipline shifts from the daemon to the kernel and back. All
> feedback loops become unstable unless the time constant is at least
> several times the frequency update interval, which in this case is
> one second. If you do want to explore how stability may be affected,
> restore the original design and recompile the distribution with
> NTP_MINPOLL changed from 3 to 1.

Is poll 1 SHIFT_PLL 4 really equal to poll 3 SHIFT_PLL 2 in this
respect? If you can provide information on how to demonstrate the
instability with SHIFT_PLL 2 and normal polls, it'll be much easier to
convince the kernel folks to change it back to 4.

With polls 3-10 and SHIFT_PLL 2, the only instability I've seen occurs
with very long update intervals (e.g. when the network connection
repeatedly goes up and down): the frequency will eventually start
jumping between +500 and -500 ppm. But the kernel loop with SHIFT_PLL 4
and the daemon loop with small poll intervals have the same problem;
the threshold is just 4 times higher for them.

clknetsim has a pll_clamp option which can be enabled to avoid this
instability. It clamps the PLL update interval to
tc * (1 << (ntp_shift_pll + 1)), where tc is the time constant in
seconds. I will be doing more testing with it and will possibly
propose including similar code in the kernel.
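
To make the clamp formula above concrete, here is a minimal sketch in
Python. It is not the clknetsim code; the function names are made up
for illustration, and tc is simply taken as a number of seconds. It
only evaluates the stated expression, and the ratio between the limits
for SHIFT_PLL 4 and SHIFT_PLL 2 appears consistent with the factor of 4
mentioned earlier.

```python
def pll_clamp_limit(tc_seconds, shift_pll):
    """Largest PLL update interval allowed by the clamp:
    tc * (1 << (shift_pll + 1)). Hypothetical helper, not clknetsim's API."""
    return tc_seconds * (1 << (shift_pll + 1))


def clamp_update_interval(interval, tc_seconds, shift_pll):
    """Return the update interval limited to the clamp value."""
    return min(interval, pll_clamp_limit(tc_seconds, shift_pll))


if __name__ == "__main__":
    tc = 64  # assumed time constant in seconds, chosen only for the example
    for shift in (2, 4):
        print(shift, pll_clamp_limit(tc, shift))
    # A very long gap between updates is cut down to the limit.
    print(clamp_update_interval(100000, tc, 2))
```

Intervals shorter than the limit pass through unchanged, so the clamp
only matters after long gaps between updates.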

As for runtime switching between daemon and kernel discipline, I
haven't tried that. I didn't even know it is supported by ntpd.

> To fix the original problem reported to me, change the frequency
> gain (only) by the square of 100 divided by the new clock frequency
> in Hz. For instance, to preserve the loop dynamics with a 1000-Hz
> clock, divide the frequency gain parameter by 100. In the original
> nanokernel routine ktime.c at line 60 there is a line
> SHIFT_PLL * 2 + time_constant.
> Replacing it by SHIFT_PLL * 20 + time_constant would fix the
> problem for 1000-Hz clocks.

I'm not a kernel developer, but I think this is already fixed. Current
kernels can be configured to use a dynamic HZ (CONFIG_NO_HZ aka
tickless mode), so the ntp code had to be rewritten to allow such
operation. With SHIFT_PLL 4, the response and the overshoot are exactly
as you describe.
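
For reference, the scaling rule quoted above (frequency gain multiplied
by the square of 100 divided by the new clock frequency in Hz) works
out as in the sketch below. The function name and the reference_hz
parameter are only illustrative; this is plain arithmetic, not kernel
code.

```python
from fractions import Fraction


def freq_gain_scale(new_hz, reference_hz=100):
    """Factor applied to the frequency gain: (reference_hz / new_hz) squared.
    Hypothetical helper that only restates the quoted rule."""
    return Fraction(reference_hz, new_hz) ** 2


if __name__ == "__main__":
    # For a 1000-Hz clock the gain is divided by 100, as in the quoted text.
    print(freq_gain_scale(1000))  # 1/100
```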

BTW, the effect of changing SHIFT_PLL to 2 on clock accuracy in
various network conditions is shown here:


With poll 6 and 10ppb/s wander, the crossover is around 10ms jitter.
With larger jitters SHIFT_PLL 2 can be up to 2 times worse (it seems
this can't be improved by lowering the poll interval) and with very
small jitters it can be about 50 times better.

Miroslav Lichvar
