[ntp:questions] Re: OS recommendations for stratum 2 clocks
Joseph Gwinn
JoeGwinn at comcast.net
Fri Sep 16 23:03:56 UTC 2005
In article <5M-dnfDyh5qKubreRVn-rA at comcast.com>,
"Richard B. Gilbert" <rgilbert88 at comcast.net> wrote:
> Joseph Gwinn wrote:
>
> >In article <8c-dnZ2dnZ39fdy0nZ2dnb_puN6dnZ2dRVn-yZ2dnZ0 at comcast.com>,
> > "Richard B. Gilbert" <rgilbert88 at comcast.net> wrote:
> >
> >
> >
> >>Joseph Gwinn wrote:
> >>
> >>
> >>
> >>>In article <p06200715bf49e0aff7e0@[10.0.1.210]>,
> >>>brad at stop.mail-abuse.org (Brad Knowles) wrote:
> >>>
> >>>
> >[snip]
> >
> >
> >>>Probability of necessity varies with application.
> >>>
> >>>Right now I have a problem with a closed network where the computer
> >>>clocks sometimes get ten or twenty milliseconds out of synch, even
> >>>though they usually stay within a millisecond or so. The LANs are very
> >>>lightly loaded, and the whole system would fit into a sphere 35 meters
> >>>in diameter, so transport delay isn't the issue.
> >>>
> >>>The problem is that other realtime activities (application code) in the
> >>>various servers are kicking the NTP daemons sidewise during heavy system
> >>>load. The daemons are at default priority. NTP cannot tell this from
> >>>real transport delay, randomly asymmetrical delay at that, so a lot of
> >>>really bad samples eventually leak through the median filter and corrupt
> >>>NTP's notion of the time offset to the master clocks. NTP is actually
> >>>fairly resistant to this kind of abuse, but the application code is
> >>>sufficiently overloaded that the necessary abuse is often arranged.
> >>>
> >>>The immediate solution will have to be to promote the daemons to higher
> >>>realtime priority than that of those interfering other activities, but
> >>>the people responsible for those activities are likely to object (more
> >>>from fear than from thought, but ... the pressure is on). Or, just live
> >>>with it.
> >>>
> >>>Joe Gwinn
> >>>
> >>>
> >>>
> >>>
> >>If these servers are running Windows, there's little hope!
> >>
> >>
> >
> >True enough.
> >
> >But no; they are running SGI IRIX on the servers where the sideways
> >kicking happens. No Windows in this drama.
> >
> >
> >
> >
> >>If they are running some flavor of Linux and the clock tick rate is set
> >>to 1000 Hz, it can be changed to 100 Hz and the kernel rebuilt. This
> >>cuts the opportunity to lose interrupts by a factor of ten.
> >>
> >>
> >
> >Which helps, but isn't close to a solution. There should be *no* lost
> >timer interrupts.
> >
> >
> >Joe Gwinn
> >
> >
> Then I think you need to talk to Silicon Graphics about it. If it's a
> bug they may be able to patch it. If, as seems likely, it's an O/S
> design issue, the fix may require a lot of time and resources.
The lost interrupts were in Linux, not IRIX (which appears to have a
1024-Hz RT timer interrupt). I was reacting to the proposal that one
drop the timer interrupt rate in Linux: while this will certainly reduce
the number of lost interrupts, the root problem remains uncorrected.
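
To put rough numbers on that, here is a minimal sketch in Python. It assumes a deliberately simple model that is not taken from any kernel source: while interrupts are masked, the first timer tick stays pending and is merely delivered late, and every further tick falling in the same window is lost. The function name, the 25 ms window, and the model itself are illustrative assumptions, not measurements from any system discussed here.

```python
# Toy model (an assumption for illustration, not actual kernel behaviour):
# while interrupts are masked, one timer tick can stay pending and is
# delivered late; every additional tick that falls in the window is lost.

def lost_time_us(masked_us, hz):
    """Return (lost_ticks, clock_loss_us) for one masked window."""
    ticks_in_window = masked_us * hz // 1_000_000  # integer math avoids float rounding
    lost_ticks = max(0, ticks_in_window - 1)       # the first tick is only delayed
    clock_loss_us = lost_ticks * 1_000_000 // hz   # each lost tick costs one tick period
    return lost_ticks, clock_loss_us

if __name__ == "__main__":
    for hz in (1000, 100):
        ticks, loss = lost_time_us(25_000, hz)     # hypothetical 25 ms masked window
        print(f"{hz:5d} Hz: {ticks:2d} ticks lost, clock falls behind by {loss} us")
```

Under these assumptions the 100-Hz kernel loses 1 tick where the 1000-Hz kernel loses 24, yet its clock still ends up 10 ms behind (versus 24 ms). Fewer interrupts are lost, but whatever masks them for that long is still there.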
Joe Gwinn