[ntp:questions] ntpd behavior with multiple pool definitions

Dan Geist dan at polter.net
Tue Jun 13 21:38:00 UTC 2017


I'm working on deploying a private "pool" system that will let any ntpd client in my company use a single directive for its config (this is thousands of hosts). The intent is to let the round-robin nature of multiple A records in DNS naturally distribute the load across roughly 10 datacenters. The problem is that with the number of hosts involved (20+), the full answer when you "dig" the pool exceeds the maximum size of a UDP DNS response. Some places on the network aren't TCP/53 friendly, so this is a blocker.
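
For reference, one way to check whether a given pool name overflows a plain UDP answer (the name below is just a placeholder, not my real zone) would be to force dig to stay on UDP and report truncation:

  # stay on UDP, don't retry over TCP on truncation, and disable EDNS
  # so the classic 512-byte UDP limit applies
  dig +notcp +ignore +noedns mypool.internal.example A

The "tc" flag in the header and the reported MSG SIZE show whether the answer got cut off.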

I tried breaking the hosts up into three equal-sized blocks (say, pool1, pool2, and pool3) and listing all three pools as separate ntpd pool directives, i.e.:

pool pool1 iburst
pool pool2 iburst
pool pool3 iburst

To my surprise, I got an even distribution of servers from each of the defined pools (the X in the addresses below is for anonymity):

# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 pool1.domain .POOL.          16 p    -   64    0    0.000    0.000   0.000
 pool2.domain .POOL.          16 p    -   64    0    0.000    0.000   0.000
 pool3.domain .POOL.          16 p    -   64    0    0.000    0.000   0.000
-X.186.131.10  172.26.0.53      2 u    -   64    1   33.685   -1.302   0.103
-X.186.131.11  172.24.4.33      2 u    1   64    1   33.352   -0.792   0.358
+X.186.136.11  172.27.0.33      2 u    1   64    1   36.886    0.157   3.880
-X.186.132.10  172.28.0.33      2 u    -   64    1   58.177   -2.953   0.061
-X.186.132.11  172.28.0.33      2 u    1   64    1   58.468   -2.964   0.057
-X.186.137.11  172.31.1.53      2 u    1   64    1   60.387    0.087   1.160
*X.186.128.10  172.24.4.33      2 u    -   64    1   11.857    0.263   0.028
+X.186.133.10  172.24.4.33      2 u    1   64    1   25.973    0.337   0.190
-X.186.138.10  172.30.1.33      2 u    1   64    1   45.686   -0.070   0.019


Eventually, I'd like to implement active monitoring and geoDNS similar to the public pool, but until then, this sort of gets me what I want (smaller record sizes and reliable load distribution).

Two questions:
- Is this intended behavior that anyone knows of, or did I just get lucky (i.e., do the pools get resolved simultaneously so the answers balance out)?
- Can I manually expand the count of active peers so that I get more than 10? With the pool directives, this gives me up to 9 servers after startup, but it somehow drops down to 7 actual sources after a few minutes (a sketch of what I'm planning to try is below).
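
If it helps frame the second question: my working guess (possibly wrong) is that ntpd's "tos maxclock" limit, which I believe defaults to 10 and appears to count the three .POOL. prototype entries as well, is what caps this. Something like the following is what I was planning to experiment with (the value 15 is just a guess):

  # raise the ceiling on mobilized associations so the three pool
  # prototypes plus roughly a dozen real servers can all fit
  tos maxclock 15

  pool pool1 iburst
  pool pool2 iburst
  pool pool3 iburst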

Dan


