[ntp:questions] NTP slow to start correction after a drift

Unruh unruh-spam at physics.ubc.ca
Fri May 9 15:46:57 UTC 2008


Mike K Smith <mks-usenet at dsl.pipex.com> writes:

>Apologies for a long post, but I was unable to make it shorter.

>I have been monitoring timekeeping performance on an environment which
>contains 3 stratum 1 clocks and 4 Cisco routers running as stratum 2.
>The stratum 1s use time which is derived originally from GPS, but fed
>to the stratum 1 clocks via IRIG.

>The monitoring is carried out from a single Solaris system which takes
>time from all seven servers.

Why would you use a Solaris system? AFAIK its kernel timing routines are
primitive. Use a Linux/BSD system.


>Normally all clocks show times within +/- 4ms, but every 7-8 days I
>see an event where all 7 clocks drift out by about 10-18 ms over a
>period of 2-3 hours before they are corrected.

Ye gads. With GPS time you should be within usec, not msec.


>I am interpreting this as being due to drift in the local clock on the
>Solaris box which is doing the monitoring, I would expect the stratum
>2 servers to lag the stratum 1s if the time on the stratum 1 servers
>was drifting due to some common-mode problem with their time
>reference.

>I am concerned about the length of time it takes before NTP starts
>correcting the local clock on the Solaris server.

>I have a graph which you can see at
><http://www.flickr.com/photos/36096832@N00/2477948892/sizes/o/in/
>set-72157604959850048/>

>The above graph shows offset against time for all seven clocks. An
>hour of steady state operation is shown before the beginning of the
>drift event, the system has been in steady state for some days prior
>to the drift event.

>The poll interval is initially 1024 seconds.

So nothing can be corrected on time scales shorter than many times 1024 sec
(i.e. hours). ntp is designed to make sure that nothing happens on time
scales shorter than many times the poll interval, in order to maintain
stability.



>The drift event starts about an hour into the graph, the offset
>increases by about 15ms in about 2 hours (roughly 2ppm) then a
>correction is applied and the clock drifts back to zero offset at
>about the 3.5 hour mark.

>I am concerned that the drift went uncorrected for so long, and am
>trying to understand the cause.

ntp design.
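
A rough check of the numbers quoted above may help show why the drift is
hard to spot quickly. The sketch below uses only the figures from the post
(about 15 ms over about 2 hours, poll interval 1024 seconds); the variable
names are made up for illustration.

```python
# Figures taken from the original post; the rest is plain arithmetic.
offset_growth_s = 15e-3       # about 15 ms of accumulated offset
drift_duration_s = 2 * 3600   # built up over about 2 hours
poll_interval_s = 1024        # poll interval quoted in the post

# Frequency error expressed in parts per million.
drift_ppm = offset_growth_s / drift_duration_s * 1e6

# Extra offset that accumulates between two consecutive polls at that rate.
per_poll_ms = drift_ppm * 1e-6 * poll_interval_s * 1e3

print(f"drift rate: {drift_ppm:.2f} ppm")
print(f"offset gained per poll: {per_poll_ms:.2f} ms")
```

This gives roughly 2.1 ppm, i.e. a bit over 2 ms of extra offset per poll.
That is smaller than the +/- 4 ms spread described as normal, so a single
new sample would hardly stand out from the noise.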


>Is the clock-filter algorithm rejecting updated timestamps which are
>not the lowest of the most recent eight? From my reading of the book
>and the RFCs, this is what should happen, but that means that the
>clock can drift significantly before a new timestamp passes through
>the clock filter algorithm.

Yes. ntp only uses about 1/8 of the data, i.e. your actual time span is about
3 hours, and ntp can only correct on time scales longer than that. It is a
design decision.
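
To make the "1/8 of the data" point concrete, here is a minimal sketch of
the idea behind the clock filter: keep the last eight samples, pick the one
with the lowest round-trip delay, and use it only if it is newer than the
last sample already used. It is a simplification, not ntpd's actual code: it
ignores dispersion and spike suppression, and the names Sample and
ClockFilter are invented for this example.

```python
from collections import deque
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    time: float    # when the sample was taken, in seconds
    offset: float  # measured clock offset, in seconds
    delay: float   # measured round-trip delay, in seconds


class ClockFilter:
    """Simplified filter: remember 8 samples, trust the lowest-delay one."""

    def __init__(self, stages: int = 8) -> None:
        self.register = deque(maxlen=stages)   # oldest sample drops out
        self.last_used_time = float("-inf")

    def add(self, sample: Sample) -> Optional[Sample]:
        """Store a sample; return one to act on, or None if nothing new."""
        self.register.append(sample)
        best = min(self.register, key=lambda s: s.delay)
        if best.time <= self.last_used_time:
            return None                        # best sample was already used
        self.last_used_time = best.time
        return best


if __name__ == "__main__":
    poll = 1024.0
    clock_filter = ClockFilter()
    # One low-delay sample followed by progressively worse ones.
    for i in range(9):
        sample = Sample(time=i * poll, offset=0.0, delay=0.010 + 0.010 * i)
        used = clock_filter.add(sample)
        print(i, "used" if used is not None else "held back")
```

In this run the first sample is used and the next seven are held back,
because the first one still has the lowest delay. It only leaves the
register when the ninth sample arrives, 8 x 1024 = 8192 seconds (a bit over
2 hours) later, which is the same order as the time span mentioned above.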




