[ntp:questions] Re: NTP clients not throttling back is this behaviour RFC compliant?

David L. Mills mills at udel.edu
Thu Dec 4 02:39:04 UTC 2003


Terje Mathisen wrote:
> 
> David L. Mills wrote:
> 
> > Terje,
> >
> > You describe in fact what I did for the Alpha kernel now in Tru64. On
> > the assumption that not all processors on a single machine run at the
> > same clock rate, each one has to be disciplined separately, in my case
> > relative to the timer seconds rollover. As you cannot predict which
> > processor actually reads the time and each one has its own idiosyncratic
> > PCC (Alpha TSC), each processor has its own estimators and all have to
> > be disciplined once per second using interprocessor interrupts.
> 
> _Very_ nice!
> 
> With just a single sync/sec, the overhead should still stay way down in
> the noise, right?
 
...
 
> I'd consider stuff like a globally shared variable containing the last
> results from a timestamp request, but this would _definitely_ introduce
> a bad chokepoint. How did you handle it?

Well, the kernel time variable is shared and is not hard to update as an
atomic action in a 64-bit machine, as long as only one processor runs
the timer interrupt routine. The problem is all those cantankerous PCCs,
which swish and sway maybe a few dozen nanoseconds each second. On the
other hand, the only thing the PCCs have to do is to interpolate the
tick at a rate determined once per second and all have pretty much the
same (corrected) value at each time during the tick. There must be a
base value to count from, of course, and that's what the interprocessor
interrupt is for.
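
To make the interpolation concrete, here is a minimal sketch of the
per-processor bookkeeping described above. It is not the Tru64 code: the
class, field, and method names are hypothetical, the counter is treated as
an unbounded integer (a real PCC is narrow and wraps), and resync() is
assumed to be called exactly one second after the previous base was taken.

```python
from dataclasses import dataclass

NS_PER_SEC = 1_000_000_000


@dataclass
class CpuClock:
    """Interpolation state for one processor (hypothetical names)."""
    base_time_ns: int    # shared kernel time at the last seconds rollover
    base_pcc: int        # this CPU's cycle counter captured at that rollover
    cycles_per_sec: int  # this CPU's counter rate, re-measured every second

    def resync(self, kernel_time_ns: int, pcc_now: int) -> None:
        """Once-per-second update, standing in for the interprocessor interrupt.

        Assumes exactly one second has passed since the previous base, so the
        cycles counted since then are this CPU's current rate.
        """
        elapsed_cycles = pcc_now - self.base_pcc
        if elapsed_cycles > 0:
            self.cycles_per_sec = elapsed_cycles
        self.base_time_ns = kernel_time_ns
        self.base_pcc = pcc_now

    def read(self, pcc_now: int) -> int:
        """Interpolate inside the current second using this CPU's own rate."""
        delta_cycles = pcc_now - self.base_pcc
        return self.base_time_ns + delta_cycles * NS_PER_SEC // self.cycles_per_sec
```

Each processor would own one such record, so a read never has to consult
another processor's counter; only the base time comes from the shared
kernel variable.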

The bottom line is classic Lamport: take the latest value found for each
read and force the time to be at least one nanosecond later for each
read. This will break, of course, should some process attempt to read
the clock more than once in each PCC tick. Not likely.
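
As an illustration of that rule only, a sketch under the assumption that
readers are serialized (a real kernel would need an atomic update or a
lock). The class and attribute names are made up for the example.

```python
class MonotonicClock:
    """Lamport-style guard: every read returns a strictly later time."""

    def __init__(self) -> None:
        self.last_ns = 0  # latest value handed out to any reader

    def read(self, raw_ns: int) -> int:
        """Return raw_ns, pushed forward if it would not advance the clock."""
        if raw_ns <= self.last_ns:
            # The interpolated time stalled or stepped back (for example a
            # read on a processor whose PCC runs slightly behind): force it
            # one nanosecond past the last value returned.
            raw_ns = self.last_ns + 1
        self.last_ns = raw_ns
        return raw_ns
```

If reads arrive faster than the underlying counter ticks, the returned
values creep ahead of the interpolated time by one nanosecond per read,
which is the breakage mentioned above.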

> >
> > I didn't see the need for further filtering the PCC estimates, but if I
> > did I wouldn't use simple averaging. A median filter is much better for
> > tossing out popcorn spikes. What I find scary now is the little things I
> 
> I've thought a lot about this, and I'd really like to see a comparison
> with the result of taking just the _earliest_ stamp in a series, instead
> of the median.
> 
> My thinking is like this: Barring hw/sw bugs, the earliest timestamp
> should correspond to the sample with the lowest interrupt latency, right?
> 
> If so, then this sample is the best estimate of the real time a PPS
> signal actually occurred.
> 
> OTOH, the median (or maybe a smaller percentile like 10-25) might behave
> in a more repeatable manner?

Well, you can't do an awful lot in a clock read for the same reason the
nanokernel can't do a lot with only a three-stage median filter. Shift
and sort is not trivial for longer lengths. And, taking the minimum is
not possible, since we are measuring actual times and not time
differences.
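
For reference, a three-stage median filter is about as small as a filter
gets. The following is a sketch of the idea, not the nanokernel code; the
class name is hypothetical, and returning the lower median while the window
is still filling is a choice made for the example.

```python
from collections import deque


class Median3:
    """Three-stage median filter: shift in a sample, return the middle one."""

    def __init__(self) -> None:
        self.window = deque(maxlen=3)  # the oldest sample falls off the end

    def update(self, sample: float) -> float:
        self.window.append(sample)
        ordered = sorted(self.window)
        # With a full window this is the true median. With one or two
        # samples it is the lower median, so an early spike is not returned.
        return ordered[(len(ordered) - 1) // 2]
```

A single popcorn spike can never be the middle of three samples, so it is
discarded without any averaging. Longer windows reject longer bursts, at
the cost of the shift and sort work noted above.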

...

> The code I posted would take less than 20 ns to return a timestamp;
> Linus will probably accept something like it for Linux to allow a
> fast-path/userland timestamp.

It has been suggested that a cleaner approach to clockwatching is to let
a counter run continuously and provide a computable offset to add
during a very quick tour of the kernel or perhaps in a shared segment.
Using a shared segment would not require a kernel tour, of course.
Scratch that awhile and you bump up against the same issues trying to
keep the offset(s) current for each processor and avoid glitches. You
just face the same problems in different places. My approach was to
provide nanosecond time without major redesign of the existing kernel
code. On the other hand, Solaris has a quite different design, not
better or worse, just different.

> >
> > Fifteen years ago the ambitious goal was to the millisecond and that
> > required special hardware for the Fuzzball. Now it is to the nanosecond,
> > which is a thousandfold improvement in resolution and we do that even
> 
> Actually, ns is a millionfold better than ms, but you know that.
> 
> > with junkbox PCs.
> 
> Those junkboxes are indeed limited to about 1 us, which makes the
> previous 1000 factor correct.
> :-)

Be careful when you say that. Indeed, the PPS signal connected to either
a serial or parallel port has inherent jitter from 1 us (ISA bus) to 20
us (UART), but has a wonderful uniform probability which yields nicely
to the kind of integration used in the NTP discipline loops in either
ntpd or the kernel. So long as a PCC or TSC is available and the gods of
wiggle and wander cooperate, timekeeping can be very good. See the PTTI
papers at www.eecis.udel.edu/~mills/papers.html and for that matter the
nanokernel distribution.
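
To show why bounded jitter is not fatal, here is a small sketch in which a
plain exponential average stands in for the integration. It is not the
actual NTP discipline loop, and the function name, the gain of 1/64, and
the +/-10 us jitter range are assumptions chosen for the example.

```python
import random


def smooth(samples, gain=1.0 / 64):
    """Exponentially average offset samples, starting from an estimate of 0.

    Each output is the running estimate after folding in one more sample.
    """
    estimate = 0.0
    out = []
    for sample in samples:
        estimate += (sample - estimate) * gain
        out.append(estimate)
    return out


def uniform_jitter(count, half_width_us=10.0, seed=1):
    """Offsets (in microseconds) with jitter spread evenly around zero."""
    rng = random.Random(seed)
    return [rng.uniform(-half_width_us, half_width_us) for _ in range(count)]
```

Feeding smooth() the output of uniform_jitter(2000) and comparing the
largest raw offset with the largest smoothed estimate shows how much of the
jitter the integration absorbs. A constant latency would not be removed
this way, only the scatter around it.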

Dave


