[ntp:questions] very slow convergence of ntp to correct time.

David L. Mills mills at udel.edu
Mon Jan 28 19:19:12 UTC 2008


Eric,

Many years ago the Proteon routers dropped the first packet after the 
cache timed out; that was a disaster. That case and the ones you 
describe are exactly what the NTP burst mode is designed for. The first 
packet in the burst carves the caches all along the route and back. The 
clock filter algorithm tosses it out in favor of the remaining packets 
in the burst. No ICMP is needed or wanted.
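
To illustrate why the slow first packet of a burst does no harm, here is a minimal Python sketch. It is not ntpd's actual clock filter code. It assumes each sample is an (offset, delay) pair in seconds and keeps only the sample with the lowest round-trip delay; the real algorithm works over a register of recent samples and also takes sample age and dispersion into account. The function name and the sample values are invented for the example.

```python
def pick_best_sample(samples):
    """Return the (offset, delay) pair with the smallest round-trip delay."""
    if not samples:
        raise ValueError("need at least one sample")
    return min(samples, key=lambda s: s[1])


# Hypothetical burst: the first packet pays for the cache reloads along
# the path, so its delay (and apparent offset) is inflated.
burst = [
    (0.0062, 0.0310),  # first packet of the burst
    (0.0011, 0.0205),
    (0.0009, 0.0201),
    (0.0012, 0.0207),
]

best_offset, best_delay = pick_best_sample(burst)
print(best_offset, best_delay)
```

Because the first packet has the largest delay in the burst, a minimum-delay selection never picks it.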

Dave

Eric wrote:
> On Sun, 20 Jan 2008 17:50:41 GMT, Unruh <unruh-spam at physics.ubc.ca> wrote
> for the entire planet to see:
> 
> 
>>david at ex.djwhome.demon.co.uk.invalid (David Woolley) writes:
>>
>>
>>>In article <Hiqkj.8953$yQ1.2617 at edtnps89>,
>>>Unruh <unruh-spam at physics.ubc.ca> wrote:
> 
> <snip>
> 
>>>>I would assume that ntp is giving these samples with long round trip very low weight, or even
>>>>eliminating them.
>>
>>>Note: if these spikes are positive, they may be the result of lost ticks.
>>
>>Don't think so. I think they are 5-10ms transmission delays. The delays disappear if I run at
>>maxpoll 7 rather than 10, so I suspect the router is forgetting the
>>addresses and taking its own sweet time about finding them if the time
>>between transmissions is many minutes.
>>chrony has a nice feature of being able to send an
>>echo datagram to the other machine if you want (before the ntp packet), to
>>wake up the routers along the way. 
> 
> 
> There are several related effects here that I have experienced in my NTP
> network.  
> 
> First is the possible ARP resolution overhead.  If the IP addresses of
> your host and of the destination or default gateway are not passing traffic
> frequently, the ARP cache in your host or the local router can time out and
> need to be reloaded on each poll.  The resulting delay can be on the order
> of 5-10ms and will affect only one direction of the exchange.
> 
> Unfortunately ARP often uses a 15 minute TTL, and NTP's default maximum
> poll interval is about 17 minutes (1024 s).
> 
> Then there is the whole problem that many routers all along the path
> experience extra overhead on the first packet of a "flow".  Route table
> look ups are done by destination IP of course, but generally have to be
> installed into the cache, or FIB, the first time a new source/dest IP pair
> shows up.  This is often a 1-3ms overhead.  And that entry doesn't last
> forever either.
> 
> Then there is the MAC cache in your switches, which generally purge after
> 1-5 minutes.  This can often be adjusted higher, but that can sometimes
> cause issues for others when they are reconfiguring part of the network.
> 
> Another issue is NATing or stateful firewalls.  There is often outbound
> (or inbound) connection setup time.  Without special configuration this
> often "times out" before twenty minutes, leading to more asymmetric delay.
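
The timing argument above can be checked with a little arithmetic. The Python sketch below is only an illustration: the timeout values are the rough figures quoted in this thread (not measurements), the helper name is made up, and it assumes no other traffic refreshes the caches between polls.

```python
# Illustrative timeouts in seconds, taken from the figures quoted above.
CACHE_TIMEOUTS = {
    "ARP": 15 * 60,        # "often uses a 15 minute TTL"
    "switch MAC": 5 * 60,  # upper end of the quoted 1-5 minute range
}


def expired_between_polls(poll_exponent, timeouts=CACHE_TIMEOUTS):
    """List caches whose timeout is shorter than the 2**poll_exponent second poll interval."""
    interval = 2 ** poll_exponent
    return sorted(name for name, t in timeouts.items() if t < interval)


for exponent in (7, 10):
    print(exponent, 2 ** exponent, expired_between_polls(exponent))
```

With these assumed values nothing expires at maxpoll 7 (128 s), while both caches expire at maxpoll 10 (1024 s), which is consistent with the delays disappearing at maxpoll 7.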
> 
> I think the suggestion of a pre-poll ICMP echo is kinda interesting.  It
> might be possible to limit the packet TTL to five hops or so, just "warming
> up" your side of the network.  It might also be better to make it a mostly
> standard UDP NTP packet so it matches whatever "rules" the intermediate
> devices are applying (and you want them to remember).  QoS and policy
> routing are both sensitive to port numbers, and certainly most firewalls
> are protocol sensitive, so matching the initial packet attributes to the
> desired high-performance packet attributes would probably help this
> technique work.
> 
> To mitigate some of these effects it might not have to be done that often.
> In many hierarchical network topologies it might serve just to send one
> extra packet every 3-5 minutes using the same source IP/port that NTP
> normally uses, to any configured server.  And it could still have a limited
> TTL if desired.  That would at least keep the switch and ARP caches fresh
> and depending on the design, the policy and NAT caches as well.  
> 
> - Eric
>  
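
For reference, the limited-TTL warm-up packet Eric describes could look roughly like the Python sketch below. This is an assumption-laden illustration, not a feature of ntpd: the function names are made up, it is IPv4 only, and it sends from an ephemeral source port rather than port 123 (binding to 123 needs privileges and would collide with a running daemon), so it may not match port-sensitive NAT or policy state.

```python
import socket

NTP_PORT = 123


def build_client_packet():
    """Build a 48-byte NTP header: LI=0, VN=4, mode=3 (client), other fields zero."""
    return bytes([0x23]) + bytes(47)


def send_warmup(server, ttl=5, port=NTP_PORT):
    """Send one client-mode packet with a limited IP TTL and do not wait for a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Limit how far the packet travels; 5 hops is the figure suggested above.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        sock.sendto(build_client_packet(), (server, port))


# Example call (hypothetical hostname, not executed here):
# send_warmup("ntp.example.org", ttl=5)
```

A packet that expires after a few hops refreshes only the near-side caches, which is the "warming up your side of the network" idea from the quoted text.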



