[ntp:hackers] ntpd shm changes
hmurray at megapathdsl.net
Sat Mar 19 02:40:05 UTC 2011
Do we want to be having this discussion here or on questions/usenet where it
Do the gpsd guys want to see all of ntp's dirty laundry? (as compared to get
notified when we have a reasonable proposal that needs checking/testing)
I don't care what the answers are. I just want to have the discussion in the
right place and/or not clutter up mailboxes that don't want it.
> I am not suggesting changing the interpretation or protocol of mode 0 or
> mode 1, but in an imaginary mode 2 with volatile keywords on all the shm
> members, I think we can safely share the memory relying only on 32-bit
> operations on count being atomic.
The word "mode" is ambiguous. It refers to either the mode keyword on the
server line that sets up the SHM refclock, or the mode which is the first
word of the chunk in shared memory as described in driver28.html.
Juergen Perlinger is also working on this area. Check the hackers archives
from 9-feb or so.
Subject: Re: [ntp:hackers] SHM clock improvements
I can't convince myself that the current code will always work right. But I
haven't worked out a test case where it will screw up. I'm all in favor of
doing something as long as we do it right.
> Assuming all the shared memory struct variables are declared volatile,
> preventing the compiler from reordering accesses, I think we can dispense
> with the odd/even concept as well, and simply rely on count's update being
> atomic. That does imply we need to allocate 64 bits for count, so that
> access _is_ atomic on 64-bit, while using only the low order 32 bits from
> 32-bit code.
I think it's more complicated than that. (I used to work with wizards that
did this stuff. I don't claim to be much of a wizard in that area.)
The details may depend on your machine architecture. What works for x86
and/or IA-64 may not work for ARM or vice versa.
The trick is that storing a new sample may take more than 1 atomic write. So
there are 3 states during an update: old, broken, new. The
flag/count/whatever needs to indicate that (potentially) broken state.
I like the idea of using an odd/even count. But things might be clearer if
the "broken" state was an explicit bit.
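To make the old/broken/new states concrete, here is a minimal sketch, not a definitive implementation. It assumes a C11 compiler and uses <stdatomic.h> fences rather than relying on volatile alone, which is my assumption and not something the current SHM driver does. The struct sample_slot and the functions slot_write() and slot_read() are hypothetical names invented for illustration; this is not the layout described in driver28.html.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: sharing between processes needs lock-free atomics. */
static_assert(ATOMIC_INT_LOCK_FREE == 2, "need lock-free atomic ints");

/* Hypothetical layout for illustration; NOT the real SHM refclock segment. */
struct sample_slot {
    atomic_uint     count;  /* even: sample consistent, odd: update in progress */
    _Atomic int32_t sec;
    _Atomic int32_t nsec;
};

/* Provider: old -> broken (odd count) -> new (even count). */
void slot_write(struct sample_slot *s, int32_t sec, int32_t nsec)
{
    unsigned c = atomic_load_explicit(&s->count, memory_order_relaxed);

    /* Enter the "broken" state before touching the payload. */
    atomic_store_explicit(&s->count, c | 1u, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    /* The sample takes more than one write, so it is not atomic as a whole. */
    atomic_store_explicit(&s->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&s->nsec, nsec, memory_order_relaxed);

    /* Leave the "broken" state: the count becomes even again. */
    atomic_store_explicit(&s->count, (c | 1u) + 1u, memory_order_release);
}

/* Consumer: returns false if the slot was, or became, broken while reading. */
bool slot_read(struct sample_slot *s, int32_t *sec, int32_t *nsec)
{
    unsigned before = atomic_load_explicit(&s->count, memory_order_acquire);

    if (before & 1u)
        return false;               /* update in progress, or provider died */

    *sec  = atomic_load_explicit(&s->sec, memory_order_relaxed);
    *nsec = atomic_load_explicit(&s->nsec, memory_order_relaxed);

    /* Re-check: if the count moved, the copy may mix old and new fields. */
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&s->count, memory_order_relaxed) == before;
}
```

A consumer that gets false would simply skip that poll and try again later. Whether plain volatile accesses give the same guarantee depends on the architecture, which is the point of the paragraph above.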
> > count++;
> > count |= 1;
> > ... do updating ...
> > count++;
I think the first count++ should be omitted. If the count was even, the |= will
make it odd. If it was odd (because the old provider crashed during an update),
the count++ would bump it to an even/valid state, and the consumer could check
after the count++ and before the |= and accept a broken sample.
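The difference between the two sequences can be checked with plain integers. This is only a sketch: the helper names are hypothetical, and each one just returns the count value a consumer could observe at that step, assuming an even count means "sample is consistent".

```c
#include <assert.h>
#include <stdint.h>

/* Quoted sequence: value seen after its leading count++, before the |= 1. */
static uint32_t after_leading_increment(uint32_t count)
{
    return count + 1u;
}

/* Sequence without the leading count++: mark the update with |= 1 only. */
static uint32_t begin_update(uint32_t count)
{
    return count | 1u;
}

/* Final count++: an odd count becomes even, publishing the new sample. */
static uint32_t end_update(uint32_t count)
{
    return count + 1u;
}

/* Assumed consumer rule: an even count means the sample is consistent. */
static int looks_valid(uint32_t count)
{
    return (count & 1u) == 0;
}

void check_sequences(void)
{
    /* Stale odd count left by a provider that died mid-update. */
    uint32_t stale = 7u;

    assert(looks_valid(after_leading_increment(stale)));  /* 8: bogus "valid" window */
    assert(!looks_valid(begin_update(stale)));            /* 7: still marked broken */
    assert(looks_valid(end_update(begin_update(stale)))); /* 8: new sample published */

    /* Normal case: the count starts out even. */
    assert(!looks_valid(begin_update(8u)));               /* 9: update in progress */
    assert(end_update(begin_update(8u)) == 10u);
}
```

With the leading count++ dropped, the count stays odd from the start of the update until the final count++, whether or not the previous provider exited cleanly.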
I think an important idea for this discussion is to figure out how to
document what a provider should do.
We have to consider the initialization step as well as the new_sample step.
Either side can restart at any time.
I think code samples are better than words. We probably need the words too,
but maybe they are comments in the code. I'll go as far as to propose that
ntp distribute either a module that can be compiled and linked in or a chunk
of code that can be #include-d.
I think the API should be something like
These are my opinions, not necessarily my employer's. I hate spam.