[ntp:questions] Re: tinker step 0 (always slew) and kernel time discipline

Joe Harvell harvell at nortel.com
Thu Sep 21 21:31:14 UTC 2006


Okay, I'll stick with this thread.

I plan to use "tinker step 0" to prevent step corrections on an NTP client only host.  I am only expecting large offsets in certain failure scenarios.  In these scenarios, operator intervention is required and the clock will be stepped manually as part of a much larger procedure.
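
To give a rough sense of scale for "large offsets" when stepping is off, here is a minimal sketch of how long a pure slew would take. It assumes the commonly documented 500 PPM maximum slew rate; the function name and the sample offsets are only illustrative and do not come from ntpd.

```python
def slew_duration_seconds(offset_s, max_slew_ppm=500.0):
    """Seconds needed to remove offset_s by slewing at a constant rate.

    max_slew_ppm is an assumed limit (500 PPM), not a value read from ntpd.
    """
    if max_slew_ppm <= 0:
        raise ValueError("max_slew_ppm must be positive")
    return abs(offset_s) / (max_slew_ppm * 1e-6)


if __name__ == "__main__":
    # Illustrative offsets only: a small error, the 0.5 s limit, a large error.
    for offset in (0.128, 0.5, 10.0):
        secs = slew_duration_seconds(offset)
        print(f"{offset:7.3f} s offset: {secs:8.0f} s ({secs / 3600:.2f} h) of slewing")
```

Under that assumption each second of offset costs about 2000 seconds of slewing, which is why I only expect this to matter in the failure scenarios where an operator steps the clock anyway.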

What I want to understand is how using "tinker step 0" interacts with the kernel time discipline.  Consider the 2005 post by David Mills (DM) below:

> In truth, the kernel is disabled only if the actual correction is
> greater than 0.5 s no matter what the tinker. However, there could be 
> grief if the discipline switches from one to the other. There are a 
> rather large number of untested and probably dangerous things that
> might happen with uncharted combinations of tinkers and I'm not
> thrilled when spending lots of time fixing the broken combinations.
> You are on your own, but a "disable kernel" command might be appropriate.

When I first read this, I thought the kernel time discipline had some sort of dependency on step corrections being used for large errors.  However, after reading David's "NTP Clock Discipline Principles" slides describing the NHPFL, it seems like this algorithm would be unaffected by whether or not step corrections are made.  Is the concern here that if step corrections are not made (i.e. tinker step 0), then you have untested behavior resulting from a switch from the daemon discipline to the kernel discipline once the daemon discipline brings the offset to <0.5s?

Also, DM made the following post later in the same thread:

> As I said in my response to your message in my inbox, all claims on 
> stability are cancelled in the configuration you selected. I am not 
> surprised that you see large overshoots, instability and incorrect 
> behavior. There are all kinds of reasons for this, including violations 
> of the expected Unix behavior, unexpected transients between the kernel 
> and daemon feedback loops and who knows what else. The engineered 
> parameters are not designed for your application. The best choice, as I 
> said in my reply to your message, is to forget ntpd altogether and use 
> your wristwatch. You will have to analyze each and every case on your
> own.

Unfortunately, I can't tell from the context here whether the "selected configuration" is the one with or without "disable kernel" as part of it.  It seems like DM is condemning the use of "tinker step 0" altogether.  Is this in fact the case?  Also, I get the "unexpected transients..." comment, but not the "violations of the expected Unix behavior."  What expected Unix behavior? What violations? How would using "tinker step 0" in either configuration contribute to this?

I would also like to point out a more recent thread (http://groups.google.com/group/comp.protocols.time.ntp/browse_frm/thread/f575d212fb84384c/c4b7a8e20feff49f?lnk=st&q=tinker+step+0&rnum=7&hl=en#c4b7a8e20feff49f), in which it seems that a bug causing the behavior originally described by Nikolaev in the Feb 2005 thread was discovered and fixed.  But there is no indication that the fix was ever committed to the source.  Does anyone know whether this was in fact a bug and whether it was fixed?

Despite DM's assertion that "all claims on stability are cancelled...", I do not understand why there is a fundamental problem with using the kernel time discipline and "tinker step 0" together.  What I'd really like is a detailed description of why this is the case.

For starters, why is the kernel time discipline disabled when the offset is greater than 0.5 seconds?  I do appreciate your speculative response to that question.  Could someone verify whether this is in fact true?

Now on to the "unexpected transients."  If a large offset is being corrected via the daemon discipline, and then the offset gets to within 0.5 s, the kernel loop would be started, right?  This should keep moving the offset towards zero, but with the NHPFL instead, right?  Is there a concern that the offset could jump back and forth across the 0.5 s threshold *as a result* of switching between the daemon and kernel disciplines?  Is that what DM is suggesting?
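
To make the question concrete, here is a minimal sketch of the handoff as I currently understand it from DM's post. It is my reading of the thread, not ntpd's actual code: the 0.5 s limit is taken from the quote above, and the function names and the sample offsets are hypothetical.

```python
KERNEL_LIMIT_S = 0.5  # threshold quoted in the thread; assumed, not verified


def active_discipline(offset_s, kernel_enabled=True, limit_s=KERNEL_LIMIT_S):
    """Return which loop would be in charge for a given offset (my assumption).

    The kernel loop is used only when it is enabled and the correction
    is no larger than limit_s; otherwise the daemon loop is used.
    """
    if kernel_enabled and abs(offset_s) <= limit_s:
        return "kernel"
    return "daemon"


def count_switches(offsets_s, **kwargs):
    """Count how often the active loop changes along a series of offsets."""
    modes = [active_discipline(o, **kwargs) for o in offsets_s]
    return sum(1 for a, b in zip(modes, modes[1:]) if a != b)


if __name__ == "__main__":
    # Made-up offsets that wander across the threshold on the way down.
    samples = [0.8, 0.6, 0.45, 0.55, 0.4]
    for o in samples:
        print(f"offset {o:+.2f} s: {active_discipline(o)}")
    print("switches:", count_switches(samples))
```

If an overshoot right after a handoff can push the offset back over the limit, a series like the made-up one above would flip between the two loops several times, and that is the scenario I am asking about.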

David Woolley wrote:
> In article <eeucua$5q9$1 at zcars129.ca.nortel.com>,
> Joe Harvell <harvell at nortel.com> wrote:
> 
>> I have an application that is sensitive to step corrections and am considering using 'tinker step 0' to disable them altogether.  However, I noticed a thread on this topic in February 2005 (http://lists.ntp.isc.org/pipermail/questions/2005-February/004468.html) that suggested setting 'tinker step 0' without explicitly using 'disable kernel' will essentially yield unpredictable behavior.
> 
>> So when "disable kernel" has been used, how is the clock frequency
>> adjusted?  Also, why is the kernel time discipline disabled when a
> 
> By doing all the calculations in user space and periodically (every 4 seconds
> on ntpd v3) calling adjtime to apply the slew correction for that second.
> The result is a sawtoothing of the phase.
> 
> The kernel mode does the calculations every tick.
> 
>> correction of > 0.5 seconds is required?
> 
> I suspect the limit is imposed because something in the kernel overflows.
> 
> PS.  Please don't keep starting new threads for what is clearly part of
> a single thread.
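
For reference, a rough sketch of the sawtooth David Woolley describes for the daemon ("disable kernel") case. The model is my own simplification: the clock drifts at a constant frequency error, and once per interval the daemon asks adjtime() to take back one interval's worth of drift, which is then applied at a fixed slew rate. The 500 PPM slew rate, the function name, and the example numbers are all assumptions.

```python
def sawtooth_peak_to_peak(freq_error_ppm, interval_s, slew_ppm=500.0):
    """Peak-to-peak phase wobble in seconds under the simplified model.

    freq_error_ppm: uncorrected clock frequency error (assumed constant).
    interval_s:     how often the daemon calls adjtime().
    slew_ppm:       assumed fixed rate at which the adjustment is applied.
    """
    drift = abs(freq_error_ppm) * 1e-6
    slew = slew_ppm * 1e-6
    if drift >= slew:
        raise ValueError("frequency error must be smaller than the slew rate")
    correction = drift * interval_s    # amount requested on each call
    slewing_time = correction / slew   # time spent applying it
    # While slewing, the net rate is (drift - slew); afterwards the phase
    # climbs back at +drift until the next call, giving a sawtooth.
    return (slew - drift) * slewing_time


if __name__ == "__main__":
    for interval in (1.0, 4.0):
        p2p = sawtooth_peak_to_peak(50.0, interval)
        print(f"interval {interval:.0f} s: about {p2p * 1e6:.0f} us peak-to-peak")
```

In this model the wobble grows in proportion to the adjustment interval, which fits the point that the kernel mode, doing its calculations every tick, avoids the sawtooth.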



