[ntp:questions] Clock and Network Simulator

Miroslav Lichvar mlichvar at redhat.com
Thu Jul 1 11:27:49 UTC 2010


On Wed, Jun 30, 2010 at 10:00:06PM +0000, David L. Mills wrote:
> Is there somebody around here that understands feedback control
> theory? You are doing extreme violence to determine a really simple
> thing, the discipline loop impulse response. There is a much simpler
> way.

It was a demonstration of what clknetsim can do. You may be able
to predict the result, but I can't. I think being able to verify a
theory with simulations is always a good thing.

> Of particular importance is the damping factor, which is evident
> from the overshoot. If SHIFT_PLL is radically changed, I would
> expect the overshoot to be replaced by an exponentially decaying
> ring characteristic.

That's not what I see in tests on real hw and simulations with
SHIFT_PLL 2.

> The change in SHIFT_PLL would result
> in unstable behavior below 5 (32 s), as well as serious transients
> if the discipline shifts from the daemon to the kernel and back. All
> feedback loops become unstable unless the time constant is at least
> several times the frequency update interval, which in this case is
> one second. If you do want to explore how stability may be affected,
> restore the original design and recompile the distribution with
> NTP_MINPOLL changed from 3 to 1.

Is poll 1 SHIFT_PLL 4 really equal to poll 3 SHIFT_PLL 2 in this
respect? If you can provide information on how to demonstrate the
instability with SHIFT_PLL 2 and normal polls, it'll be much easier to
convince the kernel folks to change it back to 4.
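
The arithmetic behind that comparison can be sketched as follows. NTP
poll values are base-2 exponents of the interval in seconds, and
SHIFT_PLL is a bit shift, so the two settings differ by the same
factor of 4 in each parameter. The function names are hypothetical,
and the sketch says nothing about whether the two configurations
really behave the same, which is the open question here.

```python
def poll_seconds(poll):
    """Poll interval in seconds for an NTP poll exponent (2**poll)."""
    return 1 << poll


def shift_ratio(shift_a, shift_b):
    """Ratio of the power-of-two factors implied by two shift values."""
    return (1 << shift_a) // (1 << shift_b)


# poll 3 vs. poll 1: ratio of the poll intervals in seconds
poll_ratio = poll_seconds(3) // poll_seconds(1)

# SHIFT_PLL 4 vs. SHIFT_PLL 2: ratio of the shift factors
pll_ratio = shift_ratio(4, 2)

print(poll_seconds(1), poll_seconds(3), poll_ratio, pll_ratio)
```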

With polls 3-10 and SHIFT_PLL 2, the only instability I've seen occurs
with very long update intervals (e.g. when the network connection
repeatedly goes up and down): the frequency eventually starts
jumping between +500 and -500 ppm. But the kernel loop with SHIFT_PLL 4
and the daemon loop with small poll intervals have the same problem;
the threshold is just 4 times higher for them.

clknetsim has a pll_clamp option which can be enabled to avoid this
instability. It clamps the PLL update interval to
tc * (1 << (ntp_shift_pll + 1)), where tc is the time constant in
seconds. I will be doing more testing with it and will possibly
propose including similar code in the kernel.
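
A minimal sketch of that clamp, using only the formula given above.
This is not clknetsim's actual code: the function names and the
example value tc = 16 are assumptions made for illustration.

```python
def pll_clamp_limit(tc, ntp_shift_pll):
    """Upper bound on the PLL update interval, in seconds.

    tc is the time constant in seconds, as in the formula
    tc * (1 << (ntp_shift_pll + 1)).
    """
    return tc * (1 << (ntp_shift_pll + 1))


def clamp_update_interval(interval, tc, ntp_shift_pll):
    """Return the update interval, limited to the clamp bound."""
    return min(interval, pll_clamp_limit(tc, ntp_shift_pll))


tc = 16  # hypothetical time constant in seconds
limit_shift2 = pll_clamp_limit(tc, 2)
limit_shift4 = pll_clamp_limit(tc, 4)

# A very long gap between updates (e.g. after a network outage)
clamped = clamp_update_interval(100000, tc, 2)

print(limit_shift2, limit_shift4, clamped)
```

For the same tc, the bound with SHIFT_PLL 4 is 4 times the bound with
SHIFT_PLL 2, the same factor as the difference in instability
thresholds mentioned above.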

As for runtime switching between daemon and kernel discipline, I
haven't tried that. I didn't even know it is supported by ntpd.

> To fix the original problem reported to me, change the frequency
> gain (only) by the square of 100 divided by the new clock frequency
> in Hz. For instance, to preserve the loop dynamics with a 1000-Hz
> clock, divide the frequency gain parameter by 100. In the original
> nanokernel routine ktime.c at line 60 there is a line
> 
> SHIFT_PLL * 2 + time_constant.
> 
> Replacing it by SHIFT_PLL * 20 + time_constant would fix the
> problem for 1000-Hz clocks.

I'm not a kernel developer, but I think this is already fixed. Current
kernels can be configured to use a dynamic HZ (CONFIG_NO_HZ aka
tickless mode), so the ntp code had to be rewritten to allow such
operation. With SHIFT_PLL 4, the response and the overshoot are exactly
as you describe they should be.
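
For reference, the scaling rule in the quoted text (the square of 100
divided by the new clock frequency in Hz) works out as below. This is
only the arithmetic of that rule, not kernel code; the function name
and the reference_hz parameter are hypothetical.

```python
from fractions import Fraction


def frequency_gain_scale(hz, reference_hz=100):
    """Factor applied to the frequency gain for a clock running at hz.

    Implements (reference_hz / hz) ** 2 with exact rational arithmetic.
    """
    return Fraction(reference_hz, hz) ** 2


scale_1000 = frequency_gain_scale(1000)
scale_100 = frequency_gain_scale(100)

print(scale_1000, scale_100)
```

For a 1000-Hz clock the factor is 1/100, which matches the quoted
advice to divide the frequency gain parameter by 100.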

BTW, the effect of changing SHIFT_PLL to 2 on clock accuracy in
various network conditions is shown here:

http://fedorapeople.org/~mlichvar/clknetsim/test1_exp.png

With poll 6 and 10 ppb/s wander, the crossover is around 10 ms jitter.
With larger jitter, SHIFT_PLL 2 can be up to 2 times worse (it seems
this can't be improved by lowering the poll interval), and with very
small jitter it can be about 50 times better.

-- 
Miroslav Lichvar


