[ntp:questions] Re: OS recomendations for stratum 2 clocks
Joseph Gwinn
JoeGwinn at comcast.net
Tue Sep 13 02:53:15 UTC 2005
In article <8c-dnZ2dnZ39fdy0nZ2dnb_puN6dnZ2dRVn-yZ2dnZ0 at comcast.com>,
"Richard B. Gilbert" <rgilbert88 at comcast.net> wrote:
> Joseph Gwinn wrote:
>
> >In article <p06200715bf49e0aff7e0@[10.0.1.210]>,
> > brad at stop.mail-abuse.org (Brad Knowles) wrote:
[snip]
> >Probability of necessity varies with application.
> >
> >Right now I have a problem with a closed network where the computer
> >clocks sometimes get ten or twenty milliseconds out of synch, even
> >though they usually stay within a millisecond or so. The LANs are very
> >lightly loaded, and the whole system would fit into a sphere 35 meters
> >in diameter, so transport delay isn't the issue.
> >
> >The problem is that other realtime activities (application code) in the
> >various servers are kicking the NTP daemons sidewise during heavy system
> >load. The daemons are at default priority. NTP cannot tell this from
> >real transport delay, randomly asymmetrical delay at that, so a lot of
> >really bad samples eventually leak through the median filter and corrupt
> >NTP's notion of the time offset to the master clocks. NTP is actually
> >fairly resistant to this kind of abuse, but the application code is
> >sufficiently overloaded that the necessary abuse is often arranged.
> >
> >The immediate solution will have to be to promote the daemons to higher
> >realtime priority than that of those interfering other activities, but
> >the people responsible for those activities are likely to object (more
> >from fear than from thought, but ... the pressure is on). Or, just live
> >with it.
> >
> >Joe Gwinn
> >
> >
> If these servers are running Windows, there's little hope!
True enough.
But no; they are running SGI IRIX on the servers where the sideways
kicking happens. No Windows in this drama.
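
To put rough numbers on the sideways kicking, here is a minimal sketch of the standard NTP on-wire arithmetic (offset and round-trip delay from the four timestamps T1..T4). The exchange() helper, its parameter names, and the figures fed to it are assumptions made up for illustration, not measurements from the system described. It suggests that a daemon stalled for 20 ms before it timestamps a reply sees an offset error of about half that, and that the stall looks just like extra path delay.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation (offset = server minus client)."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def exchange(true_offset, wire_delay, server_hold,
             client_rx_stall=0.0, server_rx_stall=0.0):
    """Hypothetical model of one client/server exchange, all values in seconds.

    The *_rx_stall arguments model a daemon that is kept off the CPU
    between packet arrival and the moment it reads its clock.
    """
    t1 = 0.0                                              # client clock at send
    t2 = t1 + wire_delay + server_rx_stall + true_offset  # server clock at receive
    t3 = t2 + server_hold                                 # server clock at reply
    t4 = (t3 - true_offset) + wire_delay + client_rx_stall  # client clock at receive
    return ntp_offset_and_delay(t1, t2, t3, t4)


if __name__ == "__main__":
    # Assumed numbers: clocks actually in step, 100 us one-way LAN delay.
    quiet = exchange(0.0, 100e-6, 50e-6)
    kicked = exchange(0.0, 100e-6, 50e-6, client_rx_stall=20e-3)
    print("quiet  offset %.6f s  delay %.6f s" % quiet)
    print("kicked offset %.6f s  delay %.6f s" % kicked)
```

In this model the stall shows up as one-sided delay, so the computed offset moves by half the stall while the measured round trip grows by the whole stall.
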
> If they are running some flavor of Linux and the clock tick rate is set
> to 1000 Hz, it can be changed to 100 Hz and the kernel rebuilt. This
> cuts the opportunity to lose interrupts by a factor of ten.
Which helps, but isn't close to a solution. There should be *no* lost
timer interrupts.
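
A rough sketch of the tick arithmetic, under a simplifying assumption of my own: while interrupts are held off, one pending timer interrupt is latched and any further ones arriving in that window are dropped. The lost_ticks() function and the blackout figures are hypothetical, chosen only to illustrate the tradeoff.

```python
def lost_ticks(hz, blackout_start_us, blackout_len_us):
    """Return (ticks lost, clock lag in microseconds) for one blackout.

    Assumed model: ticks fire at integer multiples of the tick period,
    the first tick arriving during the blackout stays pending and is
    delivered afterwards, and every later one in the window is lost.
    """
    period_us = 1_000_000 // hz
    end_us = blackout_start_us + blackout_len_us
    first = -(-blackout_start_us // period_us)   # ceiling division
    last = (end_us - 1) // period_us             # last tick strictly before end
    arrivals = max(0, last - first + 1)
    lost = max(0, arrivals - 1)
    return lost, lost * period_us


if __name__ == "__main__":
    for hz in (1000, 100):
        for blackout_us in (2500, 25000):
            lost, lag = lost_ticks(hz, 100, blackout_us)
            print("%4d Hz, %5d us blackout: %2d ticks lost, %5d us behind"
                  % (hz, blackout_us, lost, lag))
```

Under this model a 2.5 ms blackout costs a tick at 1000 Hz and nothing at 100 Hz, but a 25 ms blackout still loses a tick at 100 Hz, and that one tick is worth 10 ms.
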
Joe Gwinn