[ntp:questions] Clock and Network Simulator

David L. Mills mills at udel.edu
Wed Jun 30 22:00:06 UTC 2010


Miroslav,

Is there somebody around here who understands feedback control theory? 
You are doing extreme violence to determine a really simple thing: the 
discipline loop impulse response. There is a much simpler way.

Forget everything except the tools that come with the NTP distribution. 
Find a good, stable server and light up a selected client. Make sure the 
kernel discipline is enabled on the client. Set minpoll and maxpoll to 
6. Configure the loopstats monitoring function. Run the client until 
operation stabilizes, as determined by the loopstats data.
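For concreteness, a minimal client configuration along these lines might
look something like the following sketch. The server name and file paths
are placeholders, not taken from anything above:

    # sketch of /etc/ntp.conf for the experiment; names are placeholders
    server ntp1.example.com minpoll 6 maxpoll 6 iburst
    enable kernel                       # use the kernel discipline
    driftfile /var/lib/ntp/drift
    statsdir /var/log/ntpstats/
    statistics loopstats
    filegen loopstats file loopstats type day enable

Each loopstats line then records (roughly) the MJD, seconds past
midnight, clock offset in seconds, frequency in ppm, RMS jitter, wander
and the current time constant, which is all that is needed to trace the
transient.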

While the daemon is running, use ntptime to set the clock offset to 100 
ms. Go away and do something useful for a couple of hours.
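If you would rather inject the offset programmatically, the same thing
can be done through the ntp_adjtime()/adjtimex() interface that ntptime
wraps. The following is only a sketch of that idea for Linux (offset in
microseconds, root privileges required), not part of the recipe above:

    #include <stdio.h>
    #include <sys/timex.h>

    int main(void)
    {
        struct timex tx = { 0 };

        tx.modes  = ADJ_OFFSET;  /* hand the kernel discipline a phase offset */
        tx.offset = 100000;      /* 100 ms, in microseconds unless STA_NANO is set */
        if (adjtimex(&tx) == -1) {
            perror("adjtimex");
            return 1;
        }
        printf("status 0x%x, time constant %ld\n", tx.status, tx.constant);
        return 0;
    }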

Inspect the loopstats data. It should start at 100 ms, exponentially 
decay to zero in about 3000 s, overshoot by about six percent of the 
initial offset, then slowly decrease to zero over a period of hours. 
This is the intended nominal behavior for a poll interval (same as time 
constant) of 6. If you increase (decrease) the poll interval by one, the 
impulse response will look the same, but at double (half) the time 
scale. This should hold true for poll intervals from 3 to 10.
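To get a feel for where that shape comes from without touching a live
machine, the loop can be iterated as a toy model. The sketch below is a
deliberate simplification (no FLL, no clamping, no clock filter,
idealized one-second ticks) built around nominal SHIFT_PLL = 2 phase and
frequency gains; it is meant to reproduce the qualitative response and
the time-constant scaling, not the exact numbers from a live client:

    #include <stdio.h>

    /* Toy model of the kernel PLL response to a 100-ms offset. */
    #define SHIFT_PLL 2

    int main(void)
    {
        const int  tc   = 6;          /* time constant == poll exponent */
        const long poll = 1L << tc;   /* 64-s update interval */
        double x    = -0.100;         /* clock starts 100 ms behind */
        double freq = 0.0;            /* accumulated frequency correction */
        double residual = 0.0;        /* phase still to be amortized */

        for (long t = 0; t < 20000; t++) {
            if (t % poll == 0) {
                double theta = -x;    /* offset the daemon would measure */
                residual = theta;
                freq += theta * poll /
                        (double)(1L << (2 * (SHIFT_PLL + 2 + tc)));
                printf("%ld %.6f\n", t, theta);  /* loopstats-style trace */
            }
            double adj = residual / (double)(1L << (SHIFT_PLL + tc));
            residual -= adj;
            x += adj + freq;          /* kernel slews the clock each second */
        }
        return 0;
    }

Plotting the printed offsets shows the decay followed by a single small
overshoot that dies out slowly; raising or lowering tc by one stretches
or compresses the trace by roughly a factor of two, which is the scaling
property described above.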

Of particular importance is the damping factor, which is evident from 
the overshoot. If SHIFT_PLL is radically changed, I would expect the 
overshoot to be replaced by an exponentially decaying ring 
characteristic. With the intended loop constants the behavior should 
have a single overshoot characteristic on the order of a few percent. 
From a mathematical and engineering point of view, the intended behavior 
provides the fastest convergence, relative to the chosen time constant, 
with only nominal overshoot.

If the intended effect of the SHIFT_PLL change was to decrease the 
convergence time, that is the absolute worst thing to do. The nanokernel 
allows the time constant to range from zero to ten and carefully scales 
the state variables to match, although it (and the daemon discipline) 
starts to become unstable at values below 3, the minimum enforced by the 
daemon. The change in SHIFT_PLL would result in unstable behavior below 
5 (32 s), as well as serious transients if the discipline shifts from 
the daemon to the kernel and back. All feedback loops become unstable 
unless the time constant is at least several times the frequency update 
interval, which in this case is one second. If you do want to explore 
how stability may be affected, restore the original design and recompile 
the distribution with NTP_MINPOLL changed from 3 to 1.

Now to the issue of multiple tandem server/clients. You don't need to 
explore the behavior; it can be reliably predicted. Assume the server 
and all downstream clients are started at the same time. The impulse 
response of the first downstream client of the original client 
operating as a server is the convolution of the original impulse 
response with itself. Roughly speaking, the offset decay is slower, 
reaching zero in twice the original time, or about 6000 s. The behavior 
of the next downstream client is the convolution of this convolution 
with the original impulse response, and so on.
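For what it's worth, that prediction is easy to compute numerically from
a sampled response. A hypothetical helper along these lines, where h
holds offsets sampled every dt seconds (for instance the trace from the
toy model above), would be:

    #include <stddef.h>

    /* Discrete approximation of (h * h)(t): convolve a sampled response
     * with itself to predict the first downstream client.  Convolving
     * the result with h again predicts the next one, and so on. */
    static void convolve_self(const double *h, double *out, size_t n, double dt)
    {
        for (size_t i = 0; i < n; i++) {
            double acc = 0.0;
            for (size_t k = 0; k <= i; k++)
                acc += h[k] * h[i - k];
            out[i] = acc * dt;
        }
    }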

To fix the original problem reported to me, scale the frequency gain 
(only) by the square of 100 divided by the new clock frequency in Hz. 
For instance, to preserve the loop dynamics with a 1000-Hz clock, divide 
the frequency gain parameter by 100. In the original nanokernel source 
file ktime.c, at line 60, there is a line

SHIFT_PLL * 2 + time_constant

Replacing it with SHIFT_PLL * 20 + time_constant would fix the problem 
for 1000-Hz clocks.

Dave

Miroslav Lichvar wrote:

>On Tue, Jun 29, 2010 at 06:31:01PM +0000, David L. Mills wrote:
>
>>From your description, your simulator is designed to do something
>>else, but what else is not clear from your messages. It might help
>>to describe an experiment using your simulator and show what results
>>it produces.
>
>It's designed to test NTP implementations, but it uses a more general
>approach.
>
>Ntpdsim tests ntpd as an NTP client with simulated NTP servers in a
>simulated network. Clknetsim doesn't simulate NTP servers; it
>simulates only a network to which real NTP clients and servers are
>connected.
>
>The difference is that ntpdsim tests one NTP client and clknetsim
>tests a whole NTP network.
>
>Say we want to test how the Linux SHIFT_PLL change affects an NTP
>network. There is a chain of seven ntpd daemons configured, all using
>poll 6. Strata 1, 3, 5, 7 have SHIFT_PLL 2 and strata 2, 4, 6 have
>SHIFT_PLL 4. Stratum 1 has a clock with zero wander and frequency
>offset and uses the LOCAL driver; the rest have clocks with 1 ppb/s
>wander. Between all nodes there is network delay with an exponential
>distribution and constant jitter. The simulations are repeated with
>the jitter starting at 10 microseconds and increasing to 0.1 second
>in 28 steps. Each simulation is 4000000 seconds long and the result
>is a list of RMS offsets, one for each stratum.
>
>After finishing all iterations, we'll make an RMS offset/jitter plot:
>
>http://fedorapeople.org/~mlichvar/clknetsim/test5_ntp2.png
>
>And the same experiment with all strata using SHIFT_PLL 4:
>
>http://fedorapeople.org/~mlichvar/clknetsim/test5_ntp.png
>
>You probably know what to expect here, but I was surprised to see that
>with high jitter the SHIFT_PLL 4 strata are actually better than their
>SHIFT_PLL 2 sources.