[ntp:questions] Re: Clock selection?
David L. Mills
mills at udel.edu
Mon Jan 19 15:51:15 UTC 2004
The scenario you suggest has been the case many, many times over the
last 20 years, beginning with the DARPA Atlantic SATNET using the
Intelsat IV system and continuing with the UCAR satellite system, as
well as the original Chilean and Australian satellite links. The UCAR
and Australian links were one-way satellite, the other way landline or
undersea cable. The Atlantic SATNET had all sorts of combined
land/sea/satellite paths, some links of which paralleled one of the two
US-Moscow hotlines. (The other uses the Russian Molniya satellite system
and Ft. Detrick uplink.) NTP was running in all those scenarios using a
variety of selection and clustering algorithms.
The DARPA guys and I have had lots and lots of experience with these
hybrid systems, as well as much experience using the NTP simulator and
space paths between moving spacecraft, Earth, Moon and Mars. This
experience was the most important single factor that influenced the
clustering algorithm design and its relative insensitivity to delays.
Once upon a time when the Internet was tiny and slow, delays could be a
useful quality metric, at least when satellite paths were not involved.
Such is no longer the case and now the most important metric is probably
jitter. In many cases reported to this group the major errors are due to
asymmetric paths, but in general it is not possible to measure and
correct errors of this type. However, it is possible to bound these
errors as a function of roundtrip delay as described in the briefing. If
indeed you wanted the lowest possible maximum-error bound rather than
the lowest possible expected (maximum-likelihood) error, then you would
use delay as metric, but the latter using jitter metric is the NTP
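
A minimal sketch of the delay-based bound mentioned above, assuming the standard NTP on-wire calculation from the four timestamps of one exchange. The function names and the simulated path delays are illustrative assumptions, not taken from the briefing. The point is that an asymmetric path shifts the measured offset by half the difference between the two one-way delays, which cannot be measured from the exchange itself, yet the true offset always stays within half the roundtrip delay of the measured one.

```python
def simulate_exchange(true_offset, out_delay, back_delay, server_hold=0.001):
    """Return (t1, t2, t3, t4) for one hypothetical client/server exchange.

    true_offset is server clock minus client clock, in seconds.
    t1 and t4 are read on the client clock, t2 and t3 on the server clock.
    """
    t1 = 0.0                                  # client transmit
    t2 = t1 + out_delay + true_offset         # server receive
    t3 = t2 + server_hold                     # server transmit
    t4 = (t3 - true_offset) + back_delay      # client receive
    return t1, t2, t3, t4


def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP offset and roundtrip delay for one exchange."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def offset_bounds(offset, delay):
    """Interval that must contain the true offset, whatever the asymmetry."""
    half = delay / 2.0
    return offset - half, offset + half


if __name__ == "__main__":
    # Assumed example: satellite outbound (250 ms), landline return (20 ms).
    stamps = simulate_exchange(true_offset=0.005, out_delay=0.250, back_delay=0.020)
    offset, delay = offset_and_delay(*stamps)
    low, high = offset_bounds(offset, delay)
    print("measured offset %.3f s, delay %.3f s" % (offset, delay))
    print("true offset lies in [%.3f, %.3f] s" % (low, high))
```

With a symmetric path the measured offset equals the true offset; with the one-way satellite arrangement assumed here the measured offset is wrong by a large amount, but the bound still holds.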
As for the original posting, you don't know whether the winner selected by
the algorithm contravenes the intended design. Selection is a
dynamic process and can change from update to update. Every time a new
update is received the selection algorithm runs with the current
offset/jitter data. To decide whether the clustering algorithm is
losing its mind, you have to look at the peerstats data on the occasion
of each update and evaluate the same data the algorithm does. I have
done this many, many times and found many cases where the algorithm
seems to be doing odd things that under analysis were completely correct
according to principle. I'm not interested in replicating your suggested
scenario, since the original poster can do that so easily using the
peerstats data and the algorithms described in the briefing.
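
As a starting point for that kind of analysis, here is a rough sketch that reads peerstats lines and computes, for each server, the RMS of its offset differences from all the others, which is how selection error is defined in the quoted message below. It assumes the documented peerstats layout (MJD, seconds past midnight, address, status word in hex, offset, delay, dispersion, jitter). The sample lines, addresses and status words are made up, and for simplicity the sketch treats every server in the file as a survivor, which the real algorithm does not. It is not a reimplementation of the clustering algorithm.

```python
import math
from collections import namedtuple

PeerSample = namedtuple(
    "PeerSample", "mjd seconds address status offset delay dispersion jitter"
)


def parse_peerstats_line(line):
    """Parse one peerstats line (assumed layout, see the text above)."""
    f = line.split()
    return PeerSample(
        int(f[0]), float(f[1]), f[2], int(f[3], 16),
        float(f[4]), float(f[5]), float(f[6]), float(f[7]),
    )


def latest_by_peer(lines):
    """Keep the most recent sample per address; lines are in file order."""
    latest = {}
    for line in lines:
        if line.strip():
            sample = parse_peerstats_line(line)
            latest[sample.address] = sample
    return latest


def selection_jitter(offsets):
    """RMS of offset differences between each server and all the others."""
    result = {}
    for addr, off in offsets.items():
        sq = [(off - other) ** 2 for a, other in offsets.items() if a != addr]
        result[addr] = math.sqrt(sum(sq) / len(sq)) if sq else 0.0
    return result


# Made-up sample data: two landline servers and one satellite server.
SAMPLE = """\
53023 57000.120 192.0.2.1 9614 0.001000 0.020000 0.001000 0.000400
53023 57001.350 192.0.2.2 9414 0.002000 0.035000 0.001200 0.000900
53023 57002.800 192.0.2.3 9414 0.010000 0.550000 0.002000 0.000300
"""

if __name__ == "__main__":
    latest = latest_by_peer(SAMPLE.splitlines())
    offsets = {addr: s.offset for addr, s in latest.items()}
    for addr, sj in sorted(selection_jitter(offsets).items()):
        s = latest[addr]
        print("%s delay=%.3f jitter=%.6f selection_jitter=%.6f"
              % (addr, s.delay, s.jitter, sj))
```

Running this at each update time, restricted to the servers that were actually survivors at that moment, gives the same offset/jitter picture the algorithm had when it made its choice.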
David J Taylor wrote:
> > David,
> > Not quite. While the list is ordered by increasing stratum/distance, it
> > could be and sometimes happens that the lowest one gets torpedoed as the
> > cluster is whittled down. Selection error as defined in the briefings is
> > the RMS offset differences between each survivor and all the others.
> > Dave
> OK, thanks for that. So whilst it may all be working as intended, we do
> have the original poster's observation that the worst server was
> apparently chosen (but of course we didn't see the history of the
> algorithm at work that led up to that decision) and my own observation
> that if I have a mixture of landline and satellite servers, it seems to be
> the satellite servers which are chosen.
> May I suggest that if you have a spare box you might like to try a mixture
> of same-stratum landline (10-40ms delay) and satellite (>150ms) servers,
> see if you get the same effect, and see if it is in accordance with how
> the system should work.