[ntp:questions] NTP vs RADclock?
julien at synclab.org
Thu Jun 7 07:37:13 UTC 2012
On Thursday, June 7, 2012 4:19:31 PM UTC+10, unruh wrote:
> On 2012-06-07, skillzero at gmail.com <skillzero at gmail.com> wrote:
> > On Jun 5, 6:46?pm, Julien Ridoux <jul... at synclab.org> wrote:
> >> On Tuesday, June 5, 2012 9:12:42 AM UTC+10, E-Mail Sent to this address will be added to the BlackLists wrote:
> > Thanks for response. It would be great to have a simplified
> > description of the algorithm. I've read most of the docs on the
> > synclab site.
That is a very impressive effort :)
> > I'm trying to synchronize several devices to a reference
> > clock of a similar device rather than to a true wall clock time of a
> > real NTP server. I can't use the RADclock code directly because it's
> > GPL so I'd like distill the algorithm down to something I can
> > implement from scratch. I'd like to adapt my current NTP client and
> > server code to use RADclock so I can compare performance. I'm aiming
> Why not just use the reference implimentation of radclock for the tests?
The next version of RADclock is likely to be released under a BSD licence. This should save you the trouble of reimplementing the algorithm.
> > for 10 microseconds or less of sync on a wireless LAN with a fast
> Not going to work.
> > initial sync time (less than 5 seconds, but I can send many packets
> In 5 sec you could set the time, but you cannot get a good estimate of
> the rate.
> > close together if needed). I haven't been able to get even close to
> > this with my current NTP code (Ethernet is close, but WiFi introduces
> > a lot of variability). Some of the key points seem to be:
> Precisely. And Radclock will not help with that. It uses the same ( or
> perhaps much slower) synchronization algorithm.
I agree with Unruh, you are going to bump into quite a few issues and the much noisier characteristic of the WiFi channel makes it much harder for any synchronisation algorithm. RADclock however does a pretty good job at filtering noise but there is no magic involved. If the overall noise characteristics of all packets is bad, then you cannot do much about it.
I have run RADclock over WiFi, but I would not dare quoting any achievable clock error value at this stage: we do not have an independent hardware time reference over WiFi at this stage so we would have to rely on more convoluted setups to do an objective assessment of RADclock there. I'll put this on my TODO list.
> > 1. Feed forward: Use raw TSC instead of the time value adjusted by the
> > algorithm. Is there more to this? Should raw TSC values be used for
> > offset sync calculations or should the raw TSC be scaled by the latest
> > adjusted rate (which would seem to be partially feed-back vs fully
> > feed forward)?
> The "time" of the remote clock is measured with the TSC/HPET/.... raw
> clock. The packets exchanged are filtered to get the best ones (takes
> time) and the rate and offset of the TSC calculated. If you want a
> difference clock, only the rate is used. to make the transfer from TSC
> differences to time differences.
Actually, the time of the local clock is measured in counter units (that is Ta and Tf in NTP jargon). The RTT of NTP request/reply is measured in counter units, and filtering of RTTs is made in counter units. This prevents new inputs to the algorithm from being "polluted' by wrong estimates of drift based on past NTP exchanges. This is a simplified example of the feed-forward approach.
Of course, raw counter values have to be converted to time in seconds at one stage, but this is done when a consumer of time request the creation of a timestamp (a kernel sub-system or a userland application).
Note that you do not have to convert the counter values. You can store them, as well as radclock outputs and delay the conversion.
> > 2. Separate offset sync from rate sync. What I wasn't sure about is
> > why the rate sync would be so much more stable than offset sync since
> > they seem to both come from the same packet exchanges. Is the
> > advantage that the rate sync is smoothed over time whereas the offset
> > sync isn't?
> If you correct the clock for offset, the only way you can do that is to
> somehow shift the time. Thus if you want the difference in times from t0
> to t2 when you corrected the offset at t1, that offset correction will
> affect the time difference between t2 and t0. Thus, if you only correct
> determine the rate, the difference in raw times times the rate will
> directly give you the time difference.
Unruh's explication is pretty much accurate. I would add that the algorithm is layered. The rate is estimated first in a very reliable way (that is the easy task) and is used for the difference clock, where drift error cancel out if the interval measured is small enough. For the purpose of absolute time (that is the hard bit, and as Unruh pointed out, very much error prone), the drift estimation algorithm is built on top of the rate estimate one.
Last piece of information: the rate estimation has been designed to change very slowly to guarantee that the case describe by Unruh does not happen (again if you measure small intervals).
A little disclaimer, the 'rate' defined in radclock is computed and defined in a very different way than the way ntpd does. Taking the risk to oversimplify, radclock sees the rate and drift as 2 different components of the clock. Radclock does not 'adjust the rate'. I am well aware that this can create confusion, but it is hard to come up with a better wording. Please refer to the papers mentioned earlier for the nitty gritty details.
> > 3. Use only the lowest RTT samples. NTP also does this so I wasn't
> > sure where the advantage was here.
> Their algorithm seems, in the hints in the paper, to be more
> sophisticates/complex than the simple "best of 8" algorithm of NTP.
Indeed. Radclock maintains a much longer history of NTP exchanges. This is particularly useful to provide a stable rate estimate and adjust the minimum RTT filtering if it 'jumps up'. A typical case would be a network reconfiguration where the previous minimum RTT is never observed again.
The obvious tradeoff here is memory. However, on Linux and FreeBSD, radclock and ntpd have comparable memory footprint. I am well aware that this could be a limitation for embedded systems, and something we will work on if/when someone shows interest with such needs.
Coming back to one of Unruh's comment, the algorithm is not "slower" than ntpd. Unruh, I am actually not sure what you meant by this, but if that clarifies things, radclock does need to wait for the history to fill in to provide rate and drift estimates.
> > 4. Weight samples by their RTT. There's some mention of this in the
> > documents, but I wasn't sure what the algorithm was for determining
> > the weights or adapting them as conditions change (e.g. if network
> > load increased for a long period of time, which would also cause RTT's
> > to increase due to queuing in routers, etc.).
> Then those samples would be thrown out, and the clock allowed to
> freewheel until the samples settled down again.
Pretty much. The weight reflect the increase of RTT and translate into a quality measure. See page 9 of "Robust Synchronization of Absolute and Difference Clocks over Networks" for the details of the exponential weights assigned.
Then, there are a few different cases. If the input is very bad, then freewheel for a little while (based on older but good input) is better than believing bad input. But of course, after a long time, poor input is better than on input at all. This is really getting into the details of the algo, calibration of quality thresholds and the like. Here I would refer you to the code for the details.
> > Are there other key aspects of the algorithm that I'm missing?
I think these are the key points.
> I do not know it, the above are my conclusions from reading the docs I
> have read. They could be all wrong.
Unruh, you definitely did a pretty good job ;-).
More information about the questions