[ntp:hackers] Dull Blade

Brian Utterback brian.utterback at sun.com
Mon Jul 25 06:45:12 PDT 2005


I have been mucking about on deacon with dtrace, and have discovered
what is causing the exit. Apparently, one of the two refclocks on
deacon sends something on the serial port that the ldterm STREAMS module
interprets as an interrupt character. It then dutifully sends a
SIGINT signal to ntpd, which likewise dutifully exits. I have no idea
why one binary would be affected and not the other.

This suggests that:

1. The serial lines should have all such control character processing
turned off.

2. Most signals should log some info to the log before causing ntpd
to take the long dive.
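
A minimal sketch of both suggestions, assuming a plain POSIX termios
interface. The function names (refclock_make_raw, refclock_disable_signals,
install_exit_logging) are hypothetical and are not taken from the ntpd
sources, and the handler only writes to stderr as a stand-in for whatever
logging ntpd would really use. Clearing ISIG is what stops the line
discipline from turning an interrupt character in the data into SIGINT.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <termios.h>
#include <unistd.h>

/* Clear the line-discipline features that can turn refclock data into
 * signals or flow-control events. Only edits *t; touches no device. */
void refclock_make_raw(struct termios *t)
{
    /* ISIG off: INTR/QUIT/SUSP characters no longer generate signals. */
    t->c_lflag &= ~(tcflag_t)(ISIG | ICANON | ECHO | IEXTEN);
    /* No XON/XOFF handling, and a break does not raise SIGINT. */
    t->c_iflag &= ~(tcflag_t)(IXON | IXOFF | BRKINT);
    t->c_cc[VMIN] = 1;   /* return as soon as one byte is available */
    t->c_cc[VTIME] = 0;  /* no inter-byte timer */
}

/* Apply the settings to an open serial descriptor.
 * Returns 0 on success, -1 on failure (errno set by tcgetattr/tcsetattr). */
int refclock_disable_signals(int fd)
{
    struct termios t;

    if (tcgetattr(fd, &t) != 0)
        return -1;
    refclock_make_raw(&t);
    return tcsetattr(fd, TCSANOW, &t);
}

/* Leave a trace before dying. Uses only async-signal-safe calls. */
static void log_and_exit(int signo)
{
    static const char msg[] = "exiting on signal ";
    char digits[3];

    digits[0] = (char)('0' + (signo / 10) % 10);
    digits[1] = (char)('0' + signo % 10);
    digits[2] = '\n';
    (void)write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)write(STDERR_FILENO, digits, sizeof digits);
    _exit(128 + signo);
}

/* Install the logging handler for SIGINT and SIGTERM.
 * Returns 0 on success, -1 on failure. */
int install_exit_logging(void)
{
    struct sigaction sa;

    sa.sa_handler = log_and_exit;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGTERM, &sa, NULL);
}
```

A daemon would call install_exit_logging() once at startup and
refclock_disable_signals(fd) right after opening each refclock serial
port. The flag edits are kept in a separate function so they can be
checked without a real tty.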

David L. Mills wrote:
> Brian,
> 
> Thanks for the tip. I took a look at the dtrace man page and quickly 
> drowned. Seems I need to read a lot of stuff. I'm neck deep in the 
> book just now and will have to get back to it sometime after the 
> manuscript deadline in September. Meanwhile, deacon will just have to 
> coast and I'll run ntpdate from time to time.
> 
> Dave
> 
> Brian Utterback wrote:
> 
>> This is just the kind of thing that dtrace was made for. Do you get a 
>> core or anything? If not, it should
>> be fairly simple to make a dtrace script that stops ntpd just before 
>> it exits to get a stack.
>>
>> David L. Mills wrote:
>>
>>> Guys,
>>>
>>> I just did a complete rebuild from scratch on the backroom machines 
>>> after finding suspicious behavior possibly due to the Solaris 10 upgrade. 
>>> It went well on all the Solaris and FreeBSD machines except Solaris 
>>> Blade 1500 deacon. I didn't change anything in the sources and the 
>>> previous build on Solaris 9 worked fine. However, and only on the Blade, 
>>> the ntpd starts apparently successfully and then dies anywhere from a 
>>> few seconds to several hours later with nothing in the log. This 
>>> behavior happens only when the control terminal is detached. It runs 
>>> forever under gdb and with the debug trace turned on.
>>>
>>> This behavior is not new. It has happened on several occasions with 
>>> Linux. On previous occasions the problem went away by itself as 
>>> sources were wiggled in various ways.


-- 
blu

Remember when SOX compliant meant they were both the same color?
----------------------------------------------------------------------
Brian Utterback - OP/N1 RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom


