[ntp:questions] ntp server with two default routes misbehaving after upgrade

Caecilius nospam at spamless.invalid
Sat Apr 19 16:43:30 UTC 2014


I've got a Debian Linux system which uses ntp to synchronise its
clock, and is itself used by my internal systems for time
synchronisation.  This system has two Internet connections for
resilience and load balancing, configured using Linux advanced routing
and with two default routes to the two ISP routers.

Up until about a week ago, I was running ntp 4.2.4p4 (Debian Lenny),
which ran without any issues.  I've recently upgraded to 4.2.6p2
(Debian Squeeze), and the ntp server now seems to be misbehaving.

Every ten minutes or so (sometimes twenty), the ntp server seems to
switch its peers between the two external interface addresses, and
logs the following messages to syslog:

Apr 19 15:09:06 mercury ntpd[2188]: 82.219.4.30 interface 82.68.22.189 -> 82.68.75.137
Apr 19 15:09:06 mercury ntpd[2188]: 94.228.40.3 interface 82.68.22.189 -> 82.68.75.137
Apr 19 15:19:06 mercury ntpd[2188]: 82.219.4.30 interface 82.68.75.137 -> 82.68.22.189
Apr 19 15:19:06 mercury ntpd[2188]: 94.228.40.3 interface 82.68.75.137 -> 82.68.22.189
Apr 19 15:39:06 mercury ntpd[2188]: 82.219.4.30 interface 82.68.22.189 -> 82.68.75.137
Apr 19 15:39:06 mercury ntpd[2188]: 94.228.40.3 interface 82.68.22.189 -> 82.68.75.137
Apr 19 15:49:06 mercury ntpd[2188]: 176.74.25.227 interface 82.68.75.137 -> 82.68.22.189
Apr 19 15:49:06 mercury ntpd[2188]: 5.2.16.107 interface 82.68.75.137 -> 82.68.22.189
Apr 19 15:59:06 mercury ntpd[2188]: 176.74.25.227 interface 82.68.22.189 -> 82.68.75.137
Apr 19 15:59:06 mercury ntpd[2188]: 5.2.16.107 interface 82.68.22.189 -> 82.68.75.137
Apr 19 16:19:06 mercury ntpd[2188]: 82.219.4.30 interface 82.68.75.137 -> 82.68.22.189
Apr 19 16:19:06 mercury ntpd[2188]: 94.228.40.3 interface 82.68.75.137 -> 82.68.22.189
Apr 19 16:29:06 mercury ntpd[2188]: 82.219.4.30 interface 82.68.22.189 -> 82.68.75.137
Apr 19 16:29:06 mercury ntpd[2188]: 94.228.40.3 interface 82.68.22.189 -> 82.68.75.137

When it does this, it seems to become unsynchronised, as I get
messages like this from the internal systems that synchronise from
it:

Apr 19 14:49:19 krypton ntpd[1263]: no servers reachable
Apr 19 14:51:11 sodium ntpd[1609]: no servers reachable
Apr 19 14:53:04 cat ntpd[1189]: no servers reachable
Apr 19 14:53:50 lithium ntpd[1541]: no servers reachable
Apr 19 14:54:38 oxygen ntpd[1254]: no servers reachable
Apr 19 14:55:46 xenon ntpd[1672]: no servers reachable
Apr 19 14:56:38 helium ntpd[1917]: no servers reachable
Apr 19 15:02:38 dog ntpd[1380]: no servers reachable
Apr 19 15:08:43 dubnium ntpd[1548]: no servers reachable

I guess that something has been added between 4.2.4p4 and 4.2.6p2
that's making ntp take notice of the two different routes. But I don't
understand why it should care: that's the network layer's problem, and
there will often be multiple routes between the source and destination
that are invisible to ntp anyway.

In my case, the two routes are both using the same type of connection
(FTTC DSL) with the same ISP, so the latency of each path is going to
be quite similar.

Any thoughts on what's going on, and how to cure it or at least get
some more information in order to understand the problem better?
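
As one way of getting a clearer picture of the pattern, the switch
times in the ntpd log excerpt above can be tabulated with a short
script. This is only a rough sketch: it assumes the syslog line format
shown in the excerpt, the year is supplied by hand because syslog does
not record it, and the function names (parse_switches,
switch_gaps_minutes) are made up for illustration. The sample is three
of the wrapped lines copied from the excerpt.

```python
import re
from datetime import datetime

# Matches ntpd "PEER interface OLD -> NEW" syslog lines. \s+ also matches
# newlines, so lines wrapped by a mail client are still recognised.
SWITCH_RE = re.compile(
    r"(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+ntpd\[\d+\]:\s+"
    r"(\S+)\s+interface\s+(\S+)\s+->\s+(\S+)"
)


def parse_switches(log_text, year=2014):
    """Return (time, peer, old_addr, new_addr) tuples in log order.

    The year is an assumption passed in by the caller; syslog omits it.
    """
    events = []
    for stamp, peer, old, new in SWITCH_RE.findall(log_text):
        stamp = " ".join(stamp.split())  # normalise any odd spacing
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, peer, old, new))
    return events


def switch_gaps_minutes(events):
    """Minutes between successive distinct switch times."""
    times = sorted({when for when, _, _, _ in events})
    return [int((b - a).total_seconds() // 60) for a, b in zip(times, times[1:])]


# Three wrapped lines taken from the log excerpt (peer 82.219.4.30 only).
sample = """\
Apr 19 15:09:06 mercury ntpd[2188]: 82.219.4.30 interface 82.68.22.189
-> 82.68.75.137
Apr 19 15:19:06 mercury ntpd[2188]: 82.219.4.30 interface 82.68.75.137
-> 82.68.22.189
Apr 19 15:39:06 mercury ntpd[2188]: 82.219.4.30 interface 82.68.22.189
-> 82.68.75.137
"""

events = parse_switches(sample)
print(switch_gaps_minutes(events))  # prints [10, 20]
```

On those three lines the gaps come out as 10 and 20 minutes, so the
switches land on a ten-minute grid but do not happen at every tick.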


