[ntp:hackers] Correctly detecting interfaces

Brian Utterback brian.utterback at oracle.com
Fri Jul 12 17:51:32 UTC 2013


On 7/12/2013 12:48 PM, Danny Mayer wrote:
> On 7/12/2013 11:54 AM, Danny Mayer wrote:
>> On 7/12/2013 10:54 AM, Brian Utterback wrote:
>>
>>> Speaking of which, is there any process to get the changes I made to fix
>>> bug 1485 pushed upstream to the ISC bind people for inclusion in
>>> subsequent release of BIND? Anyone have a contact?
> I noticed that you have not verified the bug as being fixed. I would
> also need details of the bug and the code that fixed it.
>
> Danny
>

Okay, no problem.

The issue is that the ifiter_ioctl.c file has a bunch of places where
the ioctl call is used to query the system about the network interfaces.
However, the code assumes that the ioctl call cannot be interrupted,
i.e., that it never fails with EINTR. As a practical matter, this is
probably mostly true, but it is certainly not guaranteed.

Reading the kernel structures may (and in Solaris, does) involve
acquiring locks on those structures. If there is a delay obtaining a
lock, the kernel may put the thread to sleep while it waits. If the
SIGALRM signal fires at this point, the sleep is interrupted and the
ioctl call fails with EINTR instead of returning the correct data.
Since the library code does not account for EINTR, it treats it like
any other error return. In some cases the error is noted and the
program aborts; in others the incorrect data induces a program error
and the program again aborts.

As is the case with most syscalls that can return EINTR, the correct way 
of handling it is to simply try again. Of course we don't necessarily 
want to keep trying indefinitely.

Rather than rewrite every ioctl call site to include a test for EINTR
and a loop to try again, I wrote a new function (isc_ioctl), which is
simply a wrapper around the regular ioctl call and incorporates that
test and loop. I used a "three strikes and you're out" rule, but any
number could be used. Two tries are probably enough: the first
interrupt could come at any time after ioctl is first called, but a
second failure would imply that the call is taking more than one
second to complete, and if that state is persistent it might never
complete without being interrupted.
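
To illustrate the idea, here is a minimal sketch of such a wrapper. It
is not the actual patch: the name retry_ioctl, its signature, and the
IOCTL_MAX_TRIES constant are assumptions made for this example, and the
real isc_ioctl may differ.

```c
#include <errno.h>
#include <sys/ioctl.h>

/* Hypothetical retry limit: the "three strikes" rule described above. */
#define IOCTL_MAX_TRIES 3

/*
 * Sketch of an ioctl wrapper that retries when the call is interrupted.
 * The name and signature are assumptions, not the real isc_ioctl.
 * Returns whatever ioctl returns; on failure errno is left as set by
 * the last attempt (EINTR if every attempt was interrupted).
 */
static int
retry_ioctl(int fd, unsigned long request, void *arg)
{
	int tries;
	int ret = -1;

	for (tries = 0; tries < IOCTL_MAX_TRIES; tries++) {
		ret = ioctl(fd, request, arg);
		/* Stop on success, or on any error other than EINTR. */
		if (ret != -1 || errno != EINTR)
			break;
	}
	return ret;
}
```

A caller would then use retry_ioctl(fd, request, &arg) wherever it
previously called ioctl(fd, request, &arg), for example with an
interface query such as SIOCGIFCONF, and keep its existing error
handling for the remaining failure cases.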

That's all there is to it.

Brian Utterback

