[ntp:questions] Re: OS recommendations for stratum 2 clocks

Richard B. Gilbert rgilbert88 at comcast.net
Mon Sep 12 12:27:13 UTC 2005


Joseph Gwinn wrote:

>In article <p06200715bf49e0aff7e0@[10.0.1.210]>,
> brad at stop.mail-abuse.org (Brad Knowles) wrote:
>
>>At 10:22 PM -0400 2005-09-10, Danny Mayer wrote:
>>
>>>> There is one trick that never gets mentioned: On OSs that support this,
>>>> give the NTP daemon the highest realtime priority, exceeding even parts
>>>> of the OS, rather than its usual default priority.
>>>>
>>> I believe that NTP does this already in its code.
>>>
>>	It has the "-N" option, yes.  However, that is not used by 
>>default, although it can always be specified on the command line.
>>
>
>At least in Solaris and IRIX, one needs special permissions to use true  
>realtime priorities (where winner takes all), so the -N option by itself 
>may not suffice.
>
>
>>	What I believe Joseph Gwinn was talking about was running at
>>true real-time priority (rtprio), which lets ntpd run at higher
>>priority than most parts of the OS itself, a feature that only
>>some OSes support.
>>
>
>You are correct; that is what I was talking about.
>
>This can also be done with RTOSs like VxWorks. In one application ten 
>years ago, we ran a homebrew NTP responder at realtime priority under 
>VxWorks, answering polls from the NTP daemons in some workstations.  The 
>NTP responder had direct hardware access to the system's master clock, 
>and so was our Stratum 1.
>
>I believe that the commercial GPS-based NTP time servers work the same 
>way.
>
>
>>He is right that this can be 
>>very useful, and can help eliminate lost clock interrupts and other 
>>sources of internal jitter, although it is rarely necessary.
>>
>
>Probability of necessity varies with application. 
>
>Right now I have a problem with a closed network where the computer 
>clocks sometimes get ten or twenty milliseconds out of synch, even 
>though they usually stay within a millisecond or so.  The LANs are very 
>lightly loaded, and the whole system would fit into a sphere 35 meters 
>in diameter, so transport delay isn't the issue.
>
>The problem is that other realtime activities (application code) in the 
>various servers are kicking the NTP daemons sideways during heavy system 
>load.   The daemons are at default priority.  NTP cannot tell this from 
>real transport delay, randomly asymmetrical delay at that, so a lot of 
>really bad samples eventually leak through the median filter and corrupt 
>NTP's notion of the time offset to the master clocks.   NTP is actually 
>fairly resistant to this kind of abuse, but the application code is 
>sufficiently overloaded that the necessary abuse is often arranged.
>
>The immediate solution will have to be to promote the daemons to higher 
>realtime priority than that of those interfering other activities, but 
>the people responsible for those activities are likely to object (more 
>from fear than from thought, but ... the pressure is on).  Or, just live 
>with it.
>
>Joe Gwinn
>
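To put a number on the effect Joe describes, here is a rough sketch of the
standard NTP offset and delay arithmetic (the usual four-timestamp formulas).
The function name and the sample timestamps are made up for illustration, and
it assumes the client's receive timestamp T4 is only taken once the daemon
actually gets to run:

```python
def ntp_sample(t1, t2, t3, t4):
    """Return (offset, delay) from the four NTP timestamps.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in microseconds here).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: clocks in perfect sync, 100 us each way on the wire,
# 50 us turnaround in the server.
clean = ntp_sample(0, 100, 150, 250)

# Same exchange, but the client daemon is held off the CPU for 20 ms
# (20000 us) before it can timestamp the reply.
stalled = ntp_sample(0, 100, 150, 250 + 20000)

print(clean)    # (0.0, 200)
print(stalled)  # (-10000.0, 20200)
```

A 20 ms stall at the receiving end shows up as a 10 ms offset error and a
20 ms jump in round-trip delay, which is about the size of the excursions
described above. From the daemon's side it looks exactly like asymmetric
transport delay.
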
If these servers are running Windows, there's little hope!

If they are running some flavor of Linux and the clock tick rate is set
to 1000 Hz, you can change it to 100 Hz and rebuild the kernel.  That
cuts the number of opportunities to lose a timer interrupt by a factor
of ten.
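
As for the priority promotion discussed above, on Linux it can be sketched
roughly as follows.  This is only an illustration in Python, not what ntpd
itself does; the function name is hypothetical, the call needs root (or an
equivalent privilege), and the priority has to be picked relative to
whatever the competing application code is using:

```python
import os


def promote_to_realtime(pid=0, priority=None):
    """Try to move a process to the SCHED_FIFO realtime class.

    pid=0 means the calling process. Returns True on success, False if the
    platform lacks the call, the priority is invalid, or permission is denied.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # not available on this platform
    if priority is None:
        # Lowest realtime priority; still ahead of all ordinary processes.
        priority = os.sched_get_priority_min(os.SCHED_FIFO)
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Covers both "not permitted" and "invalid priority".
        return False
    return True
```

In practice one would pass the PID of the running ntpd rather than 0, and
check the return value, since an unprivileged call simply fails.
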