[ntp:questions] very slow convergence of ntp to correct time.

Eric nospam-01 at jensenresearch.com
Mon Jan 28 16:47:44 UTC 2008

On Sun, 20 Jan 2008 17:50:41 GMT, Unruh <unruh-spam at physics.ubc.ca> wrote
for the entire planet to see:

>david at ex.djwhome.demon.co.uk.invalid (David Woolley) writes:
>>In article <Hiqkj.8953$yQ1.2617 at edtnps89>,
>>Unruh <unruh-spam at physics.ubc.ca> wrote:
>>> I would assume that ntp is giving these samples with long round trip very low weight, or even
>>> eliminating them.
>>Note: if these spikes are positive, they may be the result of lost ticks.
>Don't think so. I think they are 5-10ms transmission delays. The delays disappear if I run at
>maxpoll 7 rather than 10, so I suspect the router is forgetting the
>addresses and taking its own sweet time about finding them if the time
>between transmissions is many minutes.
>chrony has a nice feature of being able to send an
>echo datagram to the other machine if you want (before the ntp packet), to
> wake up the routers along the way. 

There are several related effects here that I have experienced in my NTP

First is the possible ARP resolution overhead.  If the IP addresses of
your host and of the destination or default gateway are not passing traffic
frequently, the ARP cache in your host or the local router can time out and
need to be reloaded on each poll.  The reload can take on the order of
5-10ms and will affect only one side of the transaction's transmission delay.

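Unfortunately ARP caches often use a 15 minute timeout, and NTP's default
maximum poll interval (maxpoll 10, i.e. 1024 seconds) is about 17 minutes,
so the entry has usually expired by the time the next poll goes out.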
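
To put numbers on that, here is a small sketch of the arithmetic.  The poll
interval is 2**poll seconds; the 15 minute ARP figure is just the typical
value mentioned above, not something every stack uses, and the function
names are made up for illustration.

```python
# Assumption: a 15 minute ARP cache lifetime, as in the text above.
ARP_TTL_S = 15 * 60

def poll_interval_s(poll_exponent):
    """NTP poll intervals are powers of two, in seconds."""
    return 2 ** poll_exponent

def outlives_cache(poll_exponent, cache_ttl_s=ARP_TTL_S):
    """True if the cache entry would expire between two polls."""
    return poll_interval_s(poll_exponent) > cache_ttl_s

for poll in (6, 7, 10):
    print(poll, poll_interval_s(poll), outlives_cache(poll))
```

At maxpoll 7 the interval is 128 seconds, comfortably inside the cache
lifetime, which fits the observation that the delays go away at maxpoll 7.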

Then there is the whole problem that many routers all along the path
experience extra overhead on the first packet of a "flow".  Route table
lookups are done by destination IP of course, but the result generally has
to be installed into the cache, or FIB, the first time a new source/dest IP
pair shows up.  This is often a 1-3ms overhead.  And that entry doesn't last
forever either.

Then there is the MAC cache in your switches, which generally purges entries
after 1-5 minutes.  This can often be adjusted higher, but that can sometimes
cause issues for others when they are reconfiguring part of the network.

Another issue is NATing or stateful firewalls.  There is often outbound
(or inbound) connection setup time.  Without special configuration this
state often "times out" before twenty minutes, leading to more asymmetric
delay.

I think the suggestion of a pre-poll ICMP echo is kinda interesting.  It
might be possible to limit the packet TTL to five hops or so, just "warming
up" your side of the network.  It might also be better to make it a mostly
standard UDP NTP packet so it matches whatever "rules" the intermediate
devices are applying (and you want them to remember).  QoS and policy
routing are both sensitive to port numbers, and certainly most firewalls
are protocol sensitive, so matching the initial packet attributes to the
desired high-performance packet attributes would probably help this
technique work.
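
A rough sketch of what such a warm-up packet could look like, in Python.
This is only an illustration of the idea, not anything ntpd or chrony does:
the function names, the default TTL of 5 and the source_port parameter are
all my own assumptions.  The payload is a minimal 48-byte NTP client-mode
header, so intermediate devices see ordinary UDP port 123 traffic.

```python
import socket
import struct

def build_ntp_client_packet(version=4):
    """Minimal 48-byte NTP header: LI=0, VN=version, Mode=3 (client)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return struct.pack("!B47x", first_byte)  # remaining 47 bytes are zero

def send_warmup(server, ttl=5, source_port=0):
    """Send one NTP-shaped UDP packet with a limited IP TTL.

    source_port=0 lets the OS pick a port; binding to 123 needs privileges
    and will fail if an NTP daemon already holds the port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        sock.bind(("", source_port))
        sock.sendto(build_ntp_client_packet(), (server, 123))
    finally:
        sock.close()
```

With a TTL this small the packet normally dies a few hops out and never
reaches the server, so no reply should be expected; the point is only to
touch the caches on the near side of the path.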

To mitigate some of these effects it might not have to be done that often.
In many hierarchical network topologies it might serve just to send one
extra packet every 3-5 minutes using the same source IP/port that NTP
normally uses, to any configured server.  And it could still have a limited
TTL if desired.  That would at least keep the switch and ARP caches fresh
and, depending on the design, the policy and NAT caches as well.
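
As a sketch of the periodic version, again in Python and again purely
hypothetical (make_sender, keep_warm, the 240 second default and the TTL of
5 are my own choices, not part of any NTP implementation): one socket is
kept open so the source port stays the same from packet to packet, and a
simple loop sends one NTP-shaped packet per interval.

```python
import socket
import time

# Minimal NTP client header: LI=0, VN=4, Mode=3, everything else zero.
NTP_CLIENT_HEADER = bytes([0x23]) + bytes(47)

def make_sender(server, ttl=5, source_port=0):
    """Return (send, close) sharing one socket, so the source port is fixed."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    sock.bind(("", source_port))  # 0 = OS-chosen port; 123 needs privileges

    def send():
        sock.sendto(NTP_CLIENT_HEADER, (server, 123))

    return send, sock.close

def keep_warm(send, interval_s=240, rounds=None, sleep=time.sleep):
    """Call send() every interval_s seconds; rounds=None means run forever."""
    sent = 0
    while rounds is None or sent < rounds:
        send()
        sent += 1
        if rounds is None or sent < rounds:
            sleep(interval_s)
    return sent
```

Usage would be something like send, close = make_sender("192.0.2.1")
followed by keep_warm(send), where 192.0.2.1 is only a placeholder for one
of your configured servers.  Whether an OS-chosen source port is good enough
depends on which caches you care about: ARP and MAC tables only look at
addresses, while NAT and policy state is usually keyed on ports too.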

- Eric
