[ntp:hackers] ntpd shm changes

Dave Hart hart at ntp.org
Sat Mar 19 01:25:10 UTC 2011


2011/3/18 Håkan Johansson <f96hajo at chalmers.se>:
> On Fri, 18 Mar 2011, Dave Hart wrote:
>
>> [cross-posted to hackers at ntp and gpsd-dev@ by Dave Hart]
>>
>> Additionally, I noticed the comments in our driver are one-sided,
>> describing only instructions for ntpd's interaction with the shared
>> memory without spelling out the writer's contract.  Harlan raised a
>> concern about writers getting it wrong, and it took me a few guesses
>> to infer the right steps for a writer before resorting to gpsd source
>> code to to confirm.  In short, writers increment count, then the
>> payload data, then increment count again, then set valid.  Readers
>> fail if valid is zero, otherwise store count, copy payload, compare
>> count, fail if they don't match.  Whether count matches or not, valid
>> is cleared.
>
> As ntpshm stands right now, both valid and count are needed. Note how gpsd
> also clears valid before the first count update, avoids the race pointed out
> by tz.  (Missing in gpsd is memory barriers for SMP.)
>
> http://lists.berlios.de/pipermail/gpsd-dev/2009-November/007006.html

Thank you for pointing that out, Hakan.  The version of gpsd source I
glanced at did not clear valid.

I am not suggesting changing the interpretation or protocol of mode 0
or mode 1, but in an imaginary mode 2 with volatile keywords on all
the shm members, I think we can safely share the memory relying only
on 32-bit operations on count being atomic.

>> I'm not sure valid is achieving anything.  Readers don't read when
>> it's zero, but that seems redundant to me with the count checking.
>> Also I don't like the reader resetting valid to 0 -- that shuts down
>> the possibility of multiple readers, and for what gain?
>
> Probably to avoid using the same sample twice?  But this can be checked by
> either count, or the time-stamp itself.  I also cannot see anything improved
> by having the reader clear valid.

Agreed, we need to be sure to not use the same sample twice but valid
isn't the only way.

>> valid is
>> cleared even when the reader abandoned the attempt because it was
>> mid-update, which seems particularly ill-advised to me.
>>
>> There are missing volatile qualifiers, as far as I see, the shared
>> memory should be accessed solely through volatile pointers, as it is
>> being modified by code outside the scope of the compiler's
>> optimizations.
>>
>> To facilitate correctness and avoid duplication of code, I'd like to
>> see public header file lay out the structure and provide inline code
>> for producer and consumer.  That would be eased by having not just one
>> structure, but a payload structure used by the producer and consumer
>> and a container structure with mode, count, and if needed, valid along
>> with an embedded payload struct.
>>
>> Do you believe we need both valid and count?  Can you think of other
>> changes or extensions that might be an improvement in some way?
>
> One could use only count, if one in addition says that an odd count means
> 'being updated' and an even count 'is valid'.  To allow for a previous crash
> during update, the increment before the update would ensure the count to be
> odd.
>
> count++;
> count |= 1;
> ... do updating ...
> count++;

Assuming all the shared memory struct variables are declared volatile,
preventing the compiler from reordering accesses, I think we can
dispense with the odd/even concept as well, and simply rely on count's
update being atomic.  That does imply we need to allocate 64 bits for
count, so that access _is_ atomic on 64-bit, while using only the low
order 32 bits from 32-bit code.

> Also nice would be if the consumer (ntpd) would write some information of
> its current status (stratum, active source, ... - basically what is in a NTP
> network packet) to also allow the interface to be used for alternative
> transmission methods, such as PPS-like over a serial line.

Feel free to prototype something.  It may be worthwhile to consider
this as a general ntpd capability not a feature restricted to
refclock_shm users.

Thanks,
Dave Hart


More information about the hackers mailing list