[ntp:bugs] [Bug 596] ntpd dies after 2 days

bugzilla at ntp.isc.org bugzilla at ntp.isc.org
Sun Apr 16 23:13:28 UTC 2006


http://bugs.ntp.isc.org/show_bug.cgi?id=596



----------------------------------------------------------------------------
Additional Comments From stenn at ntp.org (Harlan Stenn)
Submitted on 2006-04-16 23:13
Subject: Re: [ntp:bugs] ntpd dies after 2 days

Shenanigans.

> Just to be clear, the full_recvbuffs() function is not thread-safe as it does
> not take out a lock before returning the value.

It does not need to take out a lock to read the value.  In this case it
does not matter - the worst that can happen is that a new item is added
to the list before the counter has been incremented from 0 to 1, in
which case the encompassing loop will restart and we'll catch the
incremented value on the next pass (and process the buffer then).

> By contrast the get_full_recv_buffer() is thread safe as it takes out
> a lock first before removing the buffer from the list. There's too
> much overhead to use the locks just to return the value.

You are, IMO, incorrectly mixing things together here.  We want the "Is
there more to do" question to be answered quickly and easily.  That is
separate from the "do some work" question.

And while I have not seen your new patch, your old one seemed
incomplete in that it left comments in the code that are wrong after
your patch, and it left code in place that you argue is broken.

get_full_recv_buffer() is the only place that decrements full_recvbufs,
and it does so inside a LOCK.
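
To make the split between the two questions concrete, here is a minimal
sketch.  It is not the ntpd code: struct buf, full_bufs(), add_full_buf(),
get_full_buf() and drain() are hypothetical stand-ins, a pthread mutex
stands in for LOCK/UNLOCK, and a relaxed C11 atomic load is assumed as the
portable way to spell "read the counter without taking the lock".

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the receive-buffer list; not the ntpd code. */
struct buf {
    struct buf *next;
    int payload;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct buf *head, *tail;
static atomic_ulong full_count;     /* modified only while list_lock is held */

/* "Is there more to do?"  Reads the counter without taking the lock.
 * A stale 0 only delays the work until the caller's loop comes around again. */
unsigned long full_bufs(void)
{
    return atomic_load_explicit(&full_count, memory_order_relaxed);
}

/* Producer side: append a filled buffer and bump the counter, under the lock. */
void add_full_buf(struct buf *b)
{
    pthread_mutex_lock(&list_lock);
    b->next = NULL;
    if (tail != NULL)
        tail->next = b;
    else
        head = b;
    tail = b;
    atomic_fetch_add(&full_count, 1);
    pthread_mutex_unlock(&list_lock);
}

/* "Do some work": the only place the counter is decremented, under the lock.
 * Returns NULL if the list turned out to be empty. */
struct buf *get_full_buf(void)
{
    struct buf *b;

    pthread_mutex_lock(&list_lock);
    b = head;
    if (b != NULL) {
        head = b->next;
        if (head == NULL)
            tail = NULL;
        assert(atomic_load(&full_count) > 0);
        atomic_fetch_sub(&full_count, 1);
    }
    pthread_mutex_unlock(&list_lock);
    return b;
}

/* Consumer loop: cheap unlocked check first, locked removal second.
 * Returns the number of buffers taken off the list. */
int drain(void)
{
    int n = 0;

    while (full_bufs() > 0) {
        if (get_full_buf() == NULL)
            break;              /* the hint was stale; nothing to take */
        n++;
    }
    return n;
}
```

The unlocked read is only ever used as a hint here; every decision that
changes the list is made again inside the lock.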

There are several other places in the loop in ntpd.c that check
full_recvbuffs().

I believe there are 2 issues going on here: one is a threading issue on
Windows, and the other is a signalled I/O issue on *IX.

For Windows, we need to be sure we have LOCK/UNLOCK calls in the right
places.

For the appropriate *IX systems, we need BLOCKIO/UNBLOCKIO calls in the
right places.

I might be wrong about something here, but if so, I bet I'm not wrong by
much.

H



-- 
Harlan Stenn <stenn at ntp.org>




