[ntp:questions] ntp server with two default routes misbehaving after upgrade
nospam at spamless.invalid
Mon Apr 21 18:45:06 UTC 2014
On Sat, 19 Apr 2014 17:43:30 +0100, Caecilius
<nospam at spamless.invalid> wrote:
>I've got a Debian Linux system which uses ntp to synchronise its
>clock, and is itself used by my internal systems for time
>synchronisation. This system has two Internet connections for
>resilience and load balancing, configured using Linux advanced routing
>and with two default routes to the two ISP routers.
>Up until about a week ago, I was running ntp 4.2.4p4 (Debian Lenny),
>which was running without any issues. I've recently upgraded to
>4.2.6p2 (Debian Squeeze), and the ntp server seems to be misbehaving.
>Every ten minutes, the ntp server seems to switch between the two
>external routes, and logs the following messages to syslog:
A progress update:
1. tcpdump shows that the packet-level behaviour is the same for ntp
4.2.4p4 and 4.2.6p2. Namely, packets to a given peer switch from one
source IP to the other approximately every 10 minutes.
So the packet-level behaviour hasn't changed between NTP versions, but
what has changed is that 4.2.4p4 didn't seem to care whereas 4.2.6p2
resets the peer when this happens.
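
For anyone wanting to check their own capture for the same pattern,
here is a rough Python sketch of the comparison I did by eye. It
assumes the tcpdump output has already been reduced to
(timestamp, source IP, peer IP) tuples for outbound NTP requests; the
function name and the sample addresses are made up for illustration
and are not from my real capture.

```python
from collections import defaultdict


def source_switches(packets):
    """Report every change of local source IP per peer.

    packets: iterable of (timestamp_seconds, src_ip, dst_ip) tuples for
    outbound NTP requests, in capture order.
    Returns {dst_ip: [(timestamp, old_src, new_src), ...]}.
    """
    last_src = {}                 # most recent source IP seen per peer
    switches = defaultdict(list)  # recorded source changes per peer
    for ts, src, dst in packets:
        prev = last_src.get(dst)
        if prev is not None and prev != src:
            switches[dst].append((ts, prev, src))
        last_src[dst] = src
    return dict(switches)


# Hypothetical sample: one peer, polled every 256 seconds, with the
# source address flipping between two uplinks part-way through.
sample = [
    (0, "192.0.2.10", "203.0.113.5"),
    (256, "192.0.2.10", "203.0.113.5"),
    (512, "192.0.2.10", "203.0.113.5"),
    (768, "198.51.100.10", "203.0.113.5"),
    (1024, "198.51.100.10", "203.0.113.5"),
]

for peer, changes in source_switches(sample).items():
    for ts, old, new in changes:
        print(f"{peer}: source changed {old} -> {new} at t={ts}s")
```

Each reported change corresponds to a point where 4.2.6p2 would see a
different local address for that peer.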
2. Binding to a single external interface with:
interface ignore all
interface listen lo
interface listen eth0
interface listen eth1
This doesn't work, and actually makes things worse because ntp doesn't
sync at all. I suspect that if the packets go out of the "wrong"
interface, then NTP doesn't see the responses at all.
3. Reducing maxpoll to 8 in an attempt to keep the peers in the route
cache doesn't work.
I thought that reducing the maximum poll interval (maxpoll 8 is 2^8 =
256 seconds) below the default route cache garbage collection timeout
of 300 seconds would ensure that the routes for each peer would stay
in the route cache and thus never change.
This didn't seem to have the desired effect though, as the routing
still changes. Perhaps the load-balancing periodically invalidates
the route cache or something.
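
The arithmetic behind that attempt, as a small Python sketch. The poll
exponents are powers of two in seconds (ntpd's defaults are minpoll 6
and maxpoll 10); the 300 second figure is the route cache timeout
mentioned above, and the helper name is just for illustration.

```python
ROUTE_CACHE_GC_TIMEOUT = 300  # seconds, the timeout figure quoted above


def poll_interval(exponent):
    """NTP poll interval in seconds for a given minpoll/maxpoll exponent."""
    return 2 ** exponent


for exponent in (6, 8, 10):
    interval = poll_interval(exponent)
    verdict = "below" if interval < ROUTE_CACHE_GC_TIMEOUT else "above"
    print(f"poll exponent {exponent}: {interval}s, {verdict} the "
          f"{ROUTE_CACHE_GC_TIMEOUT}s timeout")
```

So with maxpoll 8 every peer should be polled at least once within the
timeout, which is why I expected the cached routes to stay put.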
4. Bug #1378 appears to be related to this behaviour.
This implies that the main reason for resetting peers if the local
address changes is because of crypto authentication, but does this
really require a full reset of the peer?