[ntp:hackers] ntpd 4.2.8p1 startup behaviour

Ronan Flood ronan at nosc.ja.net
Thu Feb 12 11:47:28 UTC 2015


Harlan,

> > Note that "peer->disp" is no longer used -- should it be?
> > Putting it back in appears to return to the previous behaviour.
>
> See http://bugs.ntp.org/show_bug.cgi?id=2085

Hmm, complex and difficult.

Section 10 of RFC5905 says:

   The following scheme is used to ensure sufficient samples are in the
   filter and that old stale data are discarded.  Initially, the tuples
   of all stages are set to the dummy tuple (0, MAXDISP, MAXDISP, 0).
   As valid packets arrive, tuples are shifted into the filter causing
   old tuples to be discarded, so eventually only valid tuples remain.
   [...]
   The observer should note (a) if all stages contain the dummy tuple
   with dispersion MAXDISP, the computed dispersion is a little less
   than 16 s, (b) each time a valid tuple is shifted into the register,
   the dispersion drops by a little less than half, depending on the
   valid tuples dispersion, and (c) after the fourth valid packet the
   dispersion is usually a little less than 1 s, which is the assumed
   value of the MAXDIST parameter used by the selection algorithm to
   determine whether or not the peer variables are acceptable.


clock_select() checks for peer_unfit(), which has:

        /*
         * A distance error for a remote peer occurs if the root
         * distance is greater than or equal to the distance threshold
         * plus the increment due to one host poll interval.
         */
        if (!(peer->flags & FLAG_REFCLOCK) && root_distance(peer) >=
            sys_maxdist + clock_phi * ULOGTOD(peer->hpoll))
                rval |= TEST11;         /* distance exceeded */


sys_maxdist defaults to MAXDISTANCE (1.5), so non-refclocks need their
root distance to be below that (plus the small poll-interval increment)
or else they fail TEST11.
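
To put a number on that threshold, here is a minimal sketch of the
comparison. It is not the ntpd source: the function names are made up
for illustration, and the value used for clock_phi (15e-6, the PHI
constant from RFC5905) is my assumption about the default.

```c
#include <assert.h>

#define SKETCH_MAXDISTANCE 1.5    /* default sys_maxdist, as quoted above */
#define SKETCH_PHI         15e-6  /* assumed clock_phi: PHI from RFC5905 */

/* Distance threshold for a host poll exponent (interval is 2^hpoll s). */
double distance_threshold(int hpoll)
{
        assert(hpoll >= 0 && hpoll < 31);
        return SKETCH_MAXDISTANCE + SKETCH_PHI * (double)(1UL << hpoll);
}

/* Returns 1 if this root distance would set TEST11, else 0. */
int fails_test11(double root_dist, int hpoll)
{
        return root_dist >= distance_threshold(hpoll);
}
```

Under those assumptions, distance_threshold(6) comes out a shade above
1.5, so the poll-interval increment hardly matters at a 64 s poll and
the limit is effectively sys_maxdist.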

With "peer->disp" included in the calculation, you can see root_distance()
fall approx 16->8->4->2->1 as the first four packets come in, and only
then is it low enough to be acceptable.
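
The halving follows from the weighted sum in the RFC's clock filter,
where stage i contributes its dispersion divided by 2^(i+1). A rough
sketch, assuming the valid tuples sort ahead of the dummies and all
share one dispersion value (filter_dispersion() and its parameters are
illustrative names, not ntpd's):

```c
#include <assert.h>

#define NSTAGE  8       /* clock filter stages in RFC5905 */
#define MAXDISP 16.0    /* dummy-tuple dispersion, in seconds */

/*
 * Filter dispersion when the first nvalid stages hold valid tuples
 * with dispersion valid_disp and the rest still hold the dummy tuple.
 * Stage i is weighted by 1/2^(i+1).
 */
double filter_dispersion(int nvalid, double valid_disp)
{
        double sum = 0.0, weight = 0.5;
        int i;

        assert(nvalid >= 0 && nvalid <= NSTAGE);
        for (i = 0; i < NSTAGE; i++) {
                sum += weight * (i < nvalid ? valid_disp : MAXDISP);
                weight /= 2.0;
        }
        return sum;
}
```

With valid_disp taken as 0, nvalid = 0..4 gives 15.9375, 7.9375,
3.9375, 1.9375 and 0.9375. Only the last is under 1.5, and that is
before the other root distance terms are added.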

As I read it, this test expects the four-poll peer-dispersion drop to
be factored in, so it looks like either root_distance() should include
it or, if that affects other things, peer_unfit() should do it directly.
Unless Dave Mills changed his mind about this requirement.


On a related issue, RFC5905 section 10 also says (snipped above):

   If the three low-order bits of the reach register are zero,
   indicating three poll intervals have expired with no valid packets
   received, the poll process calls the clock filter algorithm with a
   dummy tuple just as if the tuple had arrived from the network.


That's in transmit() in 4.2.6p5:

                /*
                 * Update the reachability status. If not heard for
                 * three consecutive polls, stuff infinity in the clock
                 * filter. 
                 */
                [...]
                peer->reach <<= 1;
                if (!(peer->reach & 0x0f))
                        clock_filter(peer, 0., 0., MAXDISPERSE);


In 4.2.8p1, the comment is still there but the 0x0f check has gone,
and clock_filter(MAXDISPERSE) is only called when the peer has been
unreachable for seven poll intervals.
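
To illustrate the difference, here is a small sketch that counts the
transmit-time shifts needed before the masked reach register reads
zero. shifts_until_clear() is a made-up helper, and the 0xff mask is
only my way of modelling "the whole 8-bit register is zero"; I have not
copied it from the 4.2.8p1 source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the shifts, with no reply arriving in between, until
 * (reach & mask) is zero.  An 8-bit register is empty after eight
 * shifts, so the loop always ends by then.
 */
int shifts_until_clear(uint8_t reach, uint8_t mask)
{
        int n = 0;

        do {
                reach = (uint8_t)(reach << 1);
                n++;
        } while (reach & mask);
        return n;
}
```

Starting from a fully reachable peer (reach = 0xff), mask 0x0f clears
on the fourth shift and mask 0xff on the eighth, that is, after three
and seven polls respectively have gone unanswered.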

-- 
Ronan Flood


