[ntp:hackers] Dull Blade

Brian Utterback brian.utterback at sun.com
Mon Jul 25 07:21:50 PDT 2005


I just checked the code for ldterm:

http://cvs.opensolaris.org/source/xref/usr/src/uts/common/io/ldterm.c#1184

Apparently, you can also get a SIGINT from a BRK indication, unless the
IGNBRK option is set on the serial line. A BRK indication would occur if
there was a power cycle or disconnect on the serial line. The location in
ldterm where the signal is generated does indeed correspond to the line of
code in the traceback, so we have a smoking gun. Alas, the gun should
have been loaded with blanks, since it appears that IGNBRK is set by
refclock_setup.
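
For reference, here is a minimal sketch of the termios settings that should
keep the line discipline from generating signals on a refclock line. It is
not the ntpd code: make_line_signal_safe(), break_can_signal() and
apply_signal_safe() are hypothetical names used only for illustration. It
relies on the POSIX rules that a BREAK is discarded when IGNBRK is set, can
raise SIGINT only when IGNBRK is clear and BRKINT is set, and that clearing
ISIG stops the INTR, QUIT and SUSP characters from generating signals.

```c
#include <termios.h>

/*
 * Hypothetical helper (not from ntpd): adjust a termios structure so the
 * line discipline neither turns a BREAK into SIGINT nor interprets
 * INTR/QUIT/SUSP characters in the data stream.
 */
void
make_line_signal_safe(struct termios *t)
{
	t->c_iflag |= IGNBRK;	/* discard BREAK conditions entirely */
	t->c_iflag &= ~BRKINT;	/* BREAK must never be mapped to SIGINT */
	t->c_lflag &= ~ISIG;	/* no signals from INTR, QUIT or SUSP */
}

/* Returns 1 if these settings would let a BREAK raise SIGINT, else 0. */
int
break_can_signal(const struct termios *t)
{
	return !(t->c_iflag & IGNBRK) && (t->c_iflag & BRKINT);
}

/*
 * Apply the settings to an open serial descriptor and read them back,
 * since tcsetattr() may succeed without honoring every requested flag.
 * Returns 0 if the line is signal-safe afterwards, -1 otherwise.
 */
int
apply_signal_safe(int fd)
{
	struct termios t;

	if (tcgetattr(fd, &t) < 0)
		return -1;
	make_line_signal_safe(&t);
	if (tcsetattr(fd, TCSANOW, &t) < 0)
		return -1;
	if (tcgetattr(fd, &t) < 0)
		return -1;
	return (break_can_signal(&t) || (t.c_lflag & ISIG)) ? -1 : 0;
}
```

Calling apply_signal_safe() on the refclock descriptor right after opening
it, and logging a failure, would at least show whether the flags actually
stick on the device in question.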


Brian Utterback wrote:
> I have been mucking about on deacon with dtrace, and have discovered
> what is causing the exit. Apparently, one of the two refclocks on
> deacon sends something on the serial port that the ldterm stream module
> interprets as an interrupt character. It then dutifully sends a
> SIGINT signal to ntpd, which likewise dutifully exits. I have no idea
> why one binary would be affected and not the other.
> 
> This suggests that:
> 
> 1. The serial lines should have all such control character processing
> turned off.
> 
> 2. Most signals should log some info to the log before causing ntpd
> to take the long dive.
> 
> David L. Mills wrote:
> 
>> Brian,
>>
>> Thanks for the tip. I took a look at the dtrace man page and quickly 
>> drowned. Seems I need to read a lot of stuff. I'm neck deep in the 
>> book just now and will have to get back to it sometime after the 
>> manuscript deadline in September. Meanwhile, deacon will just have to 
>> coast and I'll run ntpdate from time to time.
>>
>> Dave
>>
>> Brian Utterback wrote:
>>
>>> This is just the kind of thing that dtrace was made for. Do you get a 
>>> core or anything? If not, it should be fairly simple to make a dtrace 
>>> script that stops ntpd just before it exits to get a stack.
>>>
>>> David L. Mills wrote:
>>>
>>>> Guys,
>>>>
>>>> I just did a complete rebuild from scratch on the backroom machines 
>>>> after finding suspicious behavior possibly due to the Solaris 10 
>>>> upgrade. It went well on all the Solaris and FreeBSD machines except 
>>>> Solaris Blade 1500 deacon. I didn't change anything in the sources 
>>>> and the previous build on Solaris 9 worked fine. However, and only on 
>>>> the Blade, the ntpd starts apparently successfully and then dies 
>>>> anywhere from a few seconds to several hours later with nothing in 
>>>> the log. This behavior happens only when the control terminal is 
>>>> detached. It runs forever under gdb and with the debug trace turned on.
>>>>
>>>> This behavior is not new. It has happened on several occasions with 
>>>> Linux. On previous occasions the problem went away by itself as 
>>>> sources were wiggled in various ways.


-- 
blu

Remember when SOX compliant meant they were both the same color?
----------------------------------------------------------------------
Brian Utterback - OP/N1 RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom


