[ntp:questions] Re: NTP stepping issue

Philip Homburg philip at pch.home.cs.vu.nl
Mon Oct 25 08:08:49 UTC 2004


In article <clep4d$mmo$1 at dewey.udel.edu>,
David L. Mills <mills at udel.edu> wrote:
>Philip Homburg wrote:
>> In article <cldtmo$gde$1 at dewey.udel.edu>,
>> David L. Mills <mills at udel.edu> wrote:
>>>It is 
>>>useful primarily at long poll intervals where errors are dominated by 
>>>the intrinsic frequency stability (wander) of the clock oscillator. At 
>>>shorter poll intervals the errors are dominated by phase errors due to 
>>>network and operating system latencies. The trick is to combine them in 
>>>an intelligent hybrid loop, as described on the NTP project page.
>> 
>> 
>> The strange thing is that the 500 ppm / 128 ms limit keeps popping up. 
>> I can understand strange limits in closed-source / broken operating systems.
>> But somehow that limit is also present on open source operating systems.
>>
>I don't understand your comment. Are you saying the 500-PPM limit and/or 
>128-ms step threshold are strange? 

Yes. Round-trip latencies on the Internet can easily exceed 128 ms. That
means that, due to asymmetry on overloaded links, measured offsets of more
than 128 ms can occur.
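
For illustration, here is a rough sketch of how a badly asymmetric path
produces a spurious offset even when the two clocks agree perfectly; the
300 ms / 10 ms delay split is just an assumed example of a congested
outbound link:

/* Apparent offset from the standard NTP offset estimate when the two
 * clocks are actually identical and only the path is asymmetric. */
#include <stdio.h>

int main(void)
{
    double t1 = 0.000;           /* client transmit                 */
    double d_out = 0.300;        /* outbound delay (assumed)        */
    double d_ret = 0.010;        /* return delay (assumed)          */
    double t2 = t1 + d_out;      /* server receive                  */
    double t3 = t2;              /* server transmit (instant reply) */
    double t4 = t3 + d_ret;      /* client receive                  */

    double offset = ((t2 - t1) + (t3 - t4)) / 2.0;

    printf("apparent offset: %.0f ms\n", offset * 1000.0);  /* 145 ms */
    return 0;
}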

Furthermore, a clock that is disconnected from the net for more than a
day and whose frequency is off by 2 ppm or more will have accumulated an
error of more than 128 ms by the time the network connection is restored.
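
Back-of-the-envelope, with the same numbers as above (an assumed 2 ppm
frequency error and one day offline), the accumulated drift already
overshoots the step threshold:

/* Drift accumulated by a free-running clock with a small frequency
 * error: 2 ppm over one day is already more than 128 ms. */
#include <stdio.h>

int main(void)
{
    double freq_error_ppm = 2.0;     /* assumed frequency error */
    double elapsed_s = 86400.0;      /* one day without network */
    double drift_ms = freq_error_ppm * 1e-6 * elapsed_s * 1000.0;

    printf("accumulated drift: %.1f ms\n", drift_ms);   /* 172.8 ms */
    return 0;
}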

In my opinion, NTP should handle those situations gracefully instead of
stepping the clock.

When large offsets are to be corrected quickly, slew rates of more than
500 ppm are required, so NTP and the kernel interface should be
prepared to handle them.
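
As a rough sketch of why, with assumed figures of a 1 s offset and a
five-minute correction deadline:

/* Slew rate needed to remove an offset within a given time budget:
 * removing 1 s in five minutes already needs several thousand ppm,
 * far above the 500 ppm limit. */
#include <stdio.h>

int main(void)
{
    double offset_s = 1.0;        /* assumed offset to correct   */
    double budget_s = 300.0;      /* assumed correction deadline */
    double slew_ppm = offset_s / budget_s * 1e6;

    printf("required slew: %.0f ppm\n", slew_ppm);   /* ~3333 ppm */
    return 0;
}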

>> The problem is of course time. The main experiment I want to do is 
>> black box testing of NTP implementations: create a reference clock with
>> a known distortion, feed it to the ntp implementation that is to be tested 
>> and then poll that implementation to compare the filtered time to
>> the input signal.
>>
>Yes, the problem is time, your time and mine. That's exactly why the 
>simulator was built. Testing things in vivo takes lots and lots of time, 
>especially when testing for stability at long poll intervals. With the 
>simulator, testing over a week takes a few seconds. You can even turn on 
>debugging and file statistics, which is really useful in finding little 
>warts like you will be looking for.

As far as I understand, the simulator is built by linking the simulator
code against an implementation of a clock discipline algorithm.

I can see two serious problems with that approach:
1) The internal interfaces in my code are completely different from the
   interfaces in NTP. I can either create a version of my software with
   interfaces that match those in NTP, or I can adapt the simulator to
   my interfaces.

   Both cases are undesirable. It takes time to create and maintain two
   sets of interfaces.

   It is also not clear whether the simulated results have any value. If
   I change the simulator, it would be necessary to verify that the
   results are comparable to a simulation of NTP on an unchanged
   simulator. If I change my code, I would have to verify that the
   simulated code corresponds to the real code.

2) I would have to verify that the model behind the simulator is
   accurate enough. In a world where Windows does weird things, Linux
   kernels lose interrupts, and so on, modeling the kernel/hardware is
   not trivial.

>> As far as I know, such a blackbox test setup does not exist.
>
>Again, the black box test setup is the simulator. I make the case it is 
>a rigorous test, since the black box code really and truly is the same 
>code as runs in the daemon itself. 

That may be the case for NTP, but it doesn't apply to other time
synchronization implementations that use the NTP packet format.

I understand the value of a discrete event simulator when developing NTP.
But to answer the question of whether a particular installation is accurate
and stable enough, it is in my opinion necessary to do 'in vivo' black-box
testing.
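
To make the 'in vivo' idea a bit more concrete: the polling half of such a
black-box test could be as small as the sketch below, which sends a
client-mode (SNTP-style) request to the implementation under test and
extracts its transmit timestamp for comparison against the known input
signal. The 127.0.0.1 address and the bare-bones query are assumptions for
illustration, not an existing test harness.

/* Minimal poll of an NTP implementation under test: send a client-mode
 * request and read back the server's transmit timestamp, which can then
 * be compared against the known (distorted) reference signal. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

#define NTP_EPOCH_OFFSET 2208988800UL  /* seconds from 1900 to 1970 */

int main(void)
{
    unsigned char pkt[48];
    struct sockaddr_in srv;
    unsigned long secs, frac;
    double t;
    int s;

    memset(pkt, 0, sizeof(pkt));
    pkt[0] = 0x1b;                     /* LI=0, VN=3, Mode=3 (client) */

    s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0) { perror("socket"); return 1; }

    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_port = htons(123);
    srv.sin_addr.s_addr = inet_addr("127.0.0.1");  /* server under test */

    if (sendto(s, pkt, sizeof(pkt), 0,
               (struct sockaddr *)&srv, sizeof(srv)) < 0) {
        perror("sendto"); return 1;
    }
    if (recvfrom(s, pkt, sizeof(pkt), 0, NULL, NULL) < 48) {
        perror("recvfrom"); return 1;
    }

    /* transmit timestamp: seconds in bytes 40-43, fraction in 44-47 */
    secs = ((unsigned long)pkt[40] << 24) | ((unsigned long)pkt[41] << 16) |
           ((unsigned long)pkt[42] << 8)  |  (unsigned long)pkt[43];
    frac = ((unsigned long)pkt[44] << 24) | ((unsigned long)pkt[45] << 16) |
           ((unsigned long)pkt[46] << 8)  |  (unsigned long)pkt[47];

    t = (double)(secs - NTP_EPOCH_OFFSET) + (double)frac / 4294967296.0;
    printf("server transmit time: %.6f (Unix seconds)\n", t);

    close(s);
    return 0;
}

Repeating that comparison against the distorted reference over many polls
and poll intervals would give the filtered-versus-input comparison
described above.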


-- 
This Monk had first gone wrong when it was [...] cross-connected to a video
recorder that was watching eleven TV channels simultaneously, [...] The video
recorder only had to watch them, of course. It didn't have to believe them all
as well. This is why instruction manuals are so important    -- Douglas Adams


