[ntp:questions] NTP slow to start correction after a drift

Mike K Smith mks-usenet at dsl.pipex.com
Mon May 12 07:54:26 UTC 2008


On 9 May, 16:46, Unruh <unruh-s... at physics.ubc.ca> wrote:

> Why would you use a Solaris system? AFAIK its kernel timing routines are
> primitive. Use a Linux/BSD system.
This is an existing system which I don't have the means to change even
if I felt that Solaris were somehow intrinsically inferior to Linux or
BSD. I have worked with Solaris for a long time.

> >Normally all clocks show times within +/- 4ms, but every 7-8 days I
> >see an event where all 7 clocks drift out by about 10-18 ms over a
> >period of 2-3 hours before they are corrected.
>
> Yee gads. With GPS time you should be within usec, not msec.
The median time for each clock measured over the course of a week has
an offset within microseconds. The 1% and 99% centiles are around -4ms
and +4ms, again measured over a week.

I'll try to look into the causes of dispersion later; slow drift
correction is a bigger and more immediate problem.

>
> >I am interpreting this as being due to drift in the local clock on the
> >Solaris box which is doing the monitoring; I would expect the stratum
> >2 servers to lag the stratum 1s if the time on the stratum 1 servers
> >was drifting due to some common-mode problem with their time
> >reference.
> >I am concerned about the length of time it takes before NTP starts
> >correcting the local clock on the Solaris server.
> >I have a graph which you can see at
> ><http://www.flickr.com/photos/36096832@N00/2477948892/sizes/o/in/set-72157604959850048/>
> >The above graph shows offset against time for all seven clocks. An
> >hour of steady state operation is shown before the beginning of the
> >drift event; the system has been in steady state for some days prior
> >to the drift event.
> >The poll interval is initially 1024 seconds.
>
> So nothing can be corrected in times less than many times 1024 sec (i.e.
> hours).
> ntp is designed to make sure that nothing happens on time scales shorter
> than many times the poll interval, to maintain stability.

I knew that NTP is biased towards long-term stability, but I hadn't
realised that it was quite that inflexible; I had expected that the
poll interval would decrease more rapidly in the event of drift.

> >The drift event starts about an hour into the graph, the offset
> >increases by about 15ms in about 2 hours (roughly 2ppm) then a
> >correction is applied and the clock drifts back to zero offset at
> >about the 3.5 hour mark.
> >I am concerned that the drift went uncorrected for so long, and am
> >trying to understand the cause.
>
> ntp design.
>
> >Is the clock-filter algorithm rejecting updated timestamps which are
> >not the lowest of the most recent eight? From my reading of the book
> >and the RFCs, this is what should happen, but that means that the
> >clock can drift significantly before a new timestamp passes through
> >the clock filter algorithm.
>
> Yes. ntp only uses about 1/8 of the data, i.e. your actual time span is
> about 3 hours, and ntp can only correct on time scales longer than that.
> Design decision.

Thanks for the comments. As with the use of Solaris, I don't have the
option to throw out NTP and replace it with something else, so I have
to try to make the best use of it.
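
To put rough numbers on the figures quoted above, here is a minimal
back-of-envelope sketch. It assumes the clock filter holds the eight
most recent samples and that polls arrive exactly at the nominal
interval; the function names are made up for illustration and are not
part of ntpd.

```python
# Rough arithmetic only. Assumes an 8-sample clock-filter window and
# polls spaced exactly 2**poll_exponent seconds apart.

def drift_ppm(offset_s, elapsed_s):
    """Average frequency error, in parts per million."""
    return offset_s / elapsed_s * 1e6

def filter_span_s(poll_exponent, samples=8):
    """Seconds covered by `samples` polls at 2**poll_exponent seconds each."""
    return samples * 2 ** poll_exponent

# 15 ms accumulated over roughly 2 hours, as read off the graph
print(f"drift: {drift_ppm(0.015, 2 * 3600):.2f} ppm")

# Window covered by eight samples at a few poll exponents
for exp in (6, 8, 10):
    span = filter_span_s(exp)
    print(f"poll {2 ** exp:5d} s -> 8 samples span {span / 3600:.2f} h")
```

At a 1024 s poll interval the eight samples span about 2.3 hours, which
is of the same order as the 2-3 hours the drift went uncorrected.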

Looks like I should be reducing maxpoll. I guess the design of NTP is
optimised for clocks with predictable drift rates, and a sudden
variation in drift rate takes longer to correct.
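
For reference, maxpoll is set per server line in ntp.conf. The
fragment below is only a sketch: the server name is a placeholder, and
the value 8 is illustrative rather than a recommendation. minpoll and
maxpoll are exponents of two in seconds, so 6 means 64 s and 8 means
256 s; the usual default for maxpoll is 10 (1024 s).

```
# ntp.conf fragment (sketch; ntp1.example.com is a placeholder)
server ntp1.example.com minpoll 6 maxpoll 8
```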

I would appreciate comments from other regulars who are more closely
linked with the development and maintenance of NTP, too.

Thanks,

Mike






