[ntp:hackers] ntpd shm changes

Hal Murray hmurray at megapathdsl.net
Sat Mar 19 02:40:05 UTC 2011


Do we want to be having this discussion here or on questions/usenet where it 
started?

Do the gpsd guys want to see all of ntp's dirty laundry?  (As opposed to being 
notified when we have a reasonable proposal that needs checking/testing.)

I don't care what the answers are.  I just want to have the discussion in the 
right place and/or not clutter up mailboxes that don't want it.

----------

> I am not suggesting changing the interpretation or protocol of mode 0 or
> mode 1, but in an imaginary mode 2 with volatile keywords on all the shm
> members, I think we can safely share the memory relying only on 32-bit
> operations on count being atomic. 

The word "mode" is ambiguous.  It refers to either the mode keyword on the 
server line that sets up the SHM refclock, or the mode which is the first 
word of the chunk in shared memory as described in driver28.html.

----------

Juergen Perlinger is also working on this area.  Check the hackers archives 
from 9-feb or so.
  Subject: Re: [ntp:hackers] SHM clock improvements

I can't convince myself that the current code will always work right.  But I 
haven't worked out a test case where it will screw up.  I'm all in favor of 
doing something as long as we do it right.


> Assuming all the shared memory struct variables are declared volatile,
> preventing the compiler from reordering accesses, I think we can dispense
> with the odd/even concept as well, and simply rely on count's update being
> atomic.  That does imply we need to allocate 64 bits for count, so that
> access _is_ atomic on 64-bit, while using only the low order 32 bits from
> 32-bit code. 

I think it's more complicated than that.  (I used to work with wizards that 
did this stuff.  I don't claim to be much of a wizard in that area.)

The details may depend on your machine architecture.  What works for x86 
and/or IA-64 may not work for ARM or vice versa.

The trick is that storing a new sample may take more than 1 atomic write.  So 
there are 3 states during an update: old, broken, new.  The 
flag/count/whatever needs to indicate that (potentially) broken state.

I like the idea of using an odd/even count.  But things might be clearer if 
the "broken" state was an explicit bit.
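One way to make the "broken" state an explicit bit while keeping a count is to 
reserve the low bit as an update-in-progress flag and treat the remaining bits 
as a generation number.  The sketch below is only an illustration of that 
encoding; the names SHM_UPDATING, shm_begin, shm_finish and friends are 
hypothetical and are not part of ntpd or gpsd.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: bit 0 is the explicit "broken" flag,
 * the upper 31 bits are a generation number. */
#define SHM_UPDATING 0x1u

/* True while a provider is part way through storing a sample. */
static bool shm_is_updating(uint32_t state)
{
    return (state & SHM_UPDATING) != 0;
}

/* Generation number of the sample (only meaningful when not updating). */
static uint32_t shm_generation(uint32_t state)
{
    return state >> 1;
}

/* State to publish before touching the sample: old -> broken. */
static uint32_t shm_begin(uint32_t state)
{
    return state | SHM_UPDATING;
}

/* State to publish after the sample is complete: broken -> new.
 * Forcing the flag on first makes this safe even if the previous
 * provider died with the flag already set. */
static uint32_t shm_finish(uint32_t state)
{
    return (state | SHM_UPDATING) + 1u;
}
```

With this encoding the old/broken/new states are 4, 5, 6 for generation 2 
moving to generation 3, which is the same arithmetic as the odd/even count but 
with the meaning of the low bit spelled out.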

----------

>  > count++;
>  > count |= 1;
>  > ... do updating ...
>  > count++;

I think the first count++ should be omitted.  If the count was even, the |= 
will bump it to odd.  If it was odd (because the old provider crashed during 
an update), the count++ would bump it to an even/valid state, and the consumer 
could check after the count++ and before the |= and accept a broken sample.
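The sequence without the leading count++ can be sketched as follows, together 
with a consumer that rejects a sample if the count was odd or changed while it 
was reading.  This is a sketch under assumptions, not the ntpd code: the 
struct sample_slot layout and the function names are hypothetical, and it 
leans on C11 <stdatomic.h> (which is newer than this thread) to get the 
ordering that volatile alone does not promise on every architecture.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for illustration; NOT the real ntpd shmTime struct. */
struct sample_slot {
    atomic_uint         count;  /* odd = update in progress ("broken") */
    atomic_int_fast64_t sec;
    atomic_int_fast32_t nsec;
};

/* Provider side: force the count odd, store the sample, then make it even. */
static void provider_store(struct sample_slot *s, int64_t sec, int32_t nsec)
{
    unsigned c = atomic_load_explicit(&s->count, memory_order_relaxed);

    /* No leading count++: an odd count left by a crashed provider stays odd. */
    atomic_store_explicit(&s->count, c | 1u, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&s->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&s->nsec, nsec, memory_order_relaxed);

    /* Publish: the data stores above become visible before the even count. */
    atomic_store_explicit(&s->count, (c | 1u) + 1u, memory_order_release);
}

/* Consumer side: returns true only if a consistent sample was copied out. */
static bool consumer_load(struct sample_slot *s, int64_t *sec, int32_t *nsec)
{
    unsigned before = atomic_load_explicit(&s->count, memory_order_acquire);
    if (before & 1u)
        return false;           /* provider is mid-update, or died mid-update */

    *sec  = (int64_t)atomic_load_explicit(&s->sec, memory_order_relaxed);
    *nsec = (int32_t)atomic_load_explicit(&s->nsec, memory_order_relaxed);

    atomic_thread_fence(memory_order_acquire);
    unsigned after = atomic_load_explicit(&s->count, memory_order_relaxed);
    return before == after;     /* unchanged count means no update raced us */
}
```

Starting from a crashed count of 3, the quoted sequence passes through 4 
(even, so it looks valid) before reaching 5, while this sketch goes 3, 3, 4 
and never exposes an even count until the new sample is complete.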

----------

I think an important idea for this discussion is to figure out how to 
document what a provider should do.

We have to consider the initialization step as well as the new_sample step.  
Either side can restart at any time.

I think code samples are better than words.  We probably need the words too, 
but maybe they are comments in the code.  I'll go as far as to propose that 
ntp distribute either a module that can be compiled and linked in or a chunk 
of code that can be #include-d.

I think the API should be something like
  open/start/init(blah...)
  new_sample(xxx)
  close...
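To make the shape of such an API concrete, here is a rough sketch of what a 
provider-side chunk of code might look like.  Everything in it is an 
assumption for illustration: the names shm_provider_init, 
shm_provider_new_sample and shm_provider_close, the struct shm_slot layout, 
and the choice to have the caller map the segment and pass in a pointer.  It 
also assumes C11 <stdatomic.h>.  Init leaves the count odd so that a consumer 
ignores the slot until the first complete sample arrives, which covers a 
provider restarting after dying in the middle of an update.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slot layout; NOT the real ntpd shmTime struct. */
struct shm_slot {
    atomic_uint         count;  /* odd = sample is being rewritten */
    atomic_int_fast64_t sec;
    atomic_int_fast32_t nsec;
};

/* Hypothetical provider handle. */
struct shm_provider {
    struct shm_slot *slot;
};

/* init: attach to a segment the caller has already mapped.
 * Returns 0 on success, -1 on bad arguments. */
static int shm_provider_init(struct shm_provider *p, struct shm_slot *slot)
{
    if (p == NULL || slot == NULL)
        return -1;
    p->slot = slot;
    /* Mark the slot "broken" until the first new sample is stored. */
    atomic_fetch_or_explicit(&slot->count, 1u, memory_order_release);
    return 0;
}

/* new_sample: odd count, store the data, then the next even count. */
static void shm_provider_new_sample(struct shm_provider *p,
                                    int64_t sec, int32_t nsec)
{
    struct shm_slot *s = p->slot;
    unsigned c = atomic_load_explicit(&s->count, memory_order_relaxed) | 1u;

    atomic_store_explicit(&s->count, c, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&s->nsec, nsec, memory_order_relaxed);
    atomic_store_explicit(&s->count, c + 1u, memory_order_release);
}

/* close: detach; the last complete sample is left in place. */
static void shm_provider_close(struct shm_provider *p)
{
    p->slot = NULL;
}
```

Whether init should invalidate a stale but complete sample, as it does here, 
is a design choice that would need to be settled along with the rest of the 
protocol.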



-- 
These are my opinions, not necessarily my employer's.  I hate spam.




