[ntp:questions] drift value very large and very unstable

Richard B. Gilbert rgilbert88 at comcast.net
Fri Mar 7 19:58:53 UTC 2008


Andy Helten wrote:
> Fran Horan wrote:
> 
>><snip lots of detail>
>>
>>  
>>
>>>So, the summary is that drift goes to 500ppm when stepping is disabled
>>>but runs normally when stepping is enabled, and neither situation ever
>>>requires a time step.  This makes no sense to me.  By the way, as
>>>mentioned previously, we require that time does not step backward due to
>>>a problem in some commercial software that cannot currently tolerate
>>>time moving backwards.
>>>
>>>Quite frankly, I don't think it's unreasonable that a system require
>>>time to monotonically increase.
>>>    
>>
>>Forgive me if this answer misses a point in the earlier details, or shows my
>>ignorance of NTP, but here are a few thoughts.
>>
>>Oscillators and drift can go in either direction, fast or slow; it's a
>>physics-based situation. You can't write code around that and provide a
>>software solution that is monotonic at all times. However, a single negative
>>step just at the start may be required before going monotonic after that
>>event. (I'm not an expert, but that is my understanding.)
>>
>>With this ref clock and a GPS-driven IRIG source, you may only see a single
>>negative step when NTP first begins running on a new system with no drift
>>file, or a system that has been powered off a long time with a
>>battery-driven clock drifting over that long time. Once NTP is humming along
>>after the initial step and some updates, you shouldn't see a step again.
>>This makes me think that you should insert a delay in launching your
>>sensitive application, or block the application at some point, so it does
>>not see the (possible) first time step.
>>
>>Fran Horan
>>JHU/APL
>>
>>  
> 
> Hey Fran,
> 
> Yes, exactly, we do perform an initial time sync with stepping enabled. 
> This is done prior to initializing the commercial software and so it
> does not cause problems if time moves backwards.  And, yes, if we are
> below the step threshold after the initial sync (which should always be
> the case), then we should stay below that threshold until the end of
> time.  Following this logic, we should allow time steps and be comforted
> knowing they will never occur in a normally functioning system.  I agree
> this is reasonable and does not conflict with my own rant that "if we
> have an offset of more than 10ms in this system, then something isn't
> working correctly".
> 
> This approach is definitely worth considering and I'll bring it up with
> the decision makers.  However, there is always concern that months or
> years from now someone will say -- "Hey, some dumbass left time stepping
> enabled, let's disable it on all systems immediately".  Surely this
> wouldn't be done without some regression testing, but then again such a
> mundane change shouldn't need exhaustive testing, right?  Riiiiight.
> 
> I guess I was just hoping someone would say, "Oh, right, that's a known
> problem.  You need to do 'X' to fix it."
> 
> Andy

A comment in ntp.conf and/or the startup file, explaining WHY stepping
is enabled, should go a long way toward solving the "dumbass" problem.
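
For illustration only, such a comment might look something like the
fragment below. The wording is made up, and it assumes that stepping
would otherwise be disabled with "tinker step 0"; the thread does not
say which mechanism is actually used on these systems (the -x option
to ntpd, which raises the step threshold, is another possibility).

```
# ntp.conf (sketch; adapt the wording to the actual system)
#
# NOTE: time stepping is deliberately left ENABLED here.
#
# With stepping disabled, the drift was observed to go to 500 ppm
# and become unstable. With stepping enabled it runs normally, and
# no step is expected after the initial sync, which completes before
# the commercial software (which cannot tolerate time moving
# backwards) is started.
#
# Do NOT add the following line without regression testing
# (assumed mechanism for disabling steps, see note above):
#   tinker step 0
```

The same note could be repeated in the startup file next to the ntpd
command line, so that whoever edits either place sees the reason.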



