[ntp:questions] NTP sync problems

martin.tengklint at spray.se
Tue Jul 1 14:10:51 UTC 2008


Hello,

My name is Martin Tengklint and I have an NTP problem that I have
solved. However, I cannot explain why it didn't work with the original
configuration. Is there anyone here that can help me understand the
logic of NTP in my case explained below?

The topology looks like this:

Ext.NTP Server A
        |
        |
Ext.NTP Server B          Ext.NTP Server C
        |                         |
        |-------------------------|
                     |
                     |
               NTP Server D
                     |
                     |
               NTP Client E

The problem is that my NTP client E rejected its selected NTP server
D, so it stopped syncing and its offset started drifting. I think I
have traced the lack of sync to an excessively large "root dispersion"
value sent by NTP server D. Its value is 1991 ms, as seen below:

# ntpq -c"rv 51316"
status=9014 reach, conf, 1 event, event_reach,
srcadr=cliente, srcport=123, dstadr=169.254.5.34, dstport=123,
leap=00, stratum=2, precision=-16, rootdelay=1.785,
rootdispersion=1991.028, refid=10.112.1.14, reach=377, unreach=0,
hmode=3, pmode=4, hpoll=6, ppoll=6, flash=00 ok, keyid=0,
offset=3466396.411, delay=0.567, dispersion=0.956, jitter=37.305,
reftime=cc0328d1.feabf9bf  Wed, Jun 18 2008  9:25:21.994,
org=cc0329cb.5b962c81  Wed, Jun 18 2008  9:29:31.357,
rec=cc031c40.f62d86e1  Wed, Jun 18 2008  8:31:44.961,
xmt=cc031c40.f5f9b77c  Wed, Jun 18 2008  8:31:44.960,
filtdelay=     0.57    0.53    0.57    0.52    0.56    0.68    0.52
1.11,
filtoffset= 3466396 3466359 3466320 3466282 3466235 3466198 3466160
3466123,
filtdisp=      0.03    0.98    1.95    2.93    3.92    4.86    5.81
6.77

Looking at the output of ntpq -c "as" on Client E, the server is in
condition "reject", most likely due to the high root dispersion.
Correct?
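
A rough way to sanity-check this guess is to add up the "root distance"
that the client would see for server D. The sketch below is only an
approximation: it assumes the distance is about half of (rootdelay +
delay) plus rootdispersion plus dispersion, and it assumes a rejection
threshold of 1000 ms. The real threshold and the exact terms depend on
the ntpd version, and the function name is made up for illustration.

```python
# Rough root-distance estimate from the "rv 51316" values (all in ms).
# ASSUMPTION: distance ~= (rootdelay + delay) / 2 + rootdispersion + dispersion.
# ntpd also adds a term that grows with the time since the last update
# (and, in some versions, the peer jitter); both are ignored here.
MAX_DISTANCE_MS = 1000.0  # ASSUMPTION: threshold differs between ntpd versions


def root_distance_ms(rootdelay, rootdispersion, delay, dispersion):
    """Approximate root distance in milliseconds (hypothetical helper)."""
    return (rootdelay + delay) / 2.0 + rootdispersion + dispersion


d = root_distance_ms(rootdelay=1.785, rootdispersion=1991.028,
                     delay=0.567, dispersion=0.956)
print(f"root distance ~ {d:.1f} ms, over threshold: {d > MAX_DISTANCE_MS}")
```

With the values above, the root dispersion alone is already about twice
the assumed threshold, which would be consistent with the "reject"
condition.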

# ntpq -c"as"

ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 51316  9014   yes   yes  none    reject   reachable  1

The problem occurs when NTP server D syncs with the external NTP
server C (stratum 1), which uses its own system clock as its
reference.

On NTP Server D:

# ntpq -c "as"
ind assID status conf reach auth condition last_event cnt
===========================================================
1 62852 9414 yes yes none candidat reachable 1
2 62853 9614 yes yes none sys.peer reachable 1
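
The status words in these listings (9014, 9414, 9614) can be decoded by
hand. The sketch below assumes the peer status layout from the NTP
control-message format: 5 status bits, 3 selection bits, a 4-bit event
count and a 4-bit event code. The selection names follow what ntpq
prints, but some of them differ between versions, so treat the table as
an assumption.

```python
# Decode an ntpq peer status word, given as the hex string that ntpq prints.
# ASSUMPTION: layout is 5 status bits | 3 select bits | 4-bit event count |
# 4-bit event code. Code 5 is printed as "selected" or "backup" depending
# on the ntpq version.
SELECT_NAMES = ["reject", "falsetick", "excess", "outlier",
                "candidate", "backup", "sys.peer", "pps.peer"]


def decode_peer_status(hex_word):
    """Split a peer status word into its fields (hypothetical helper)."""
    word = int(hex_word, 16)
    return {
        "configured": bool(word & 0x8000),
        "reachable": bool(word & 0x1000),
        "select": SELECT_NAMES[(word >> 8) & 0x7],
        "event_count": (word >> 4) & 0xF,
        "event_code": word & 0xF,
    }


for status in ("9014", "9414", "9614"):
    print(status, decode_peer_status(status))
```

The three words differ only in the selection field, which matches the
"condition" column: reject on client E, candidate and sys.peer on
server D.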

Upon looking in more detail at the two associations above:

 # ntpq -c "rv 62853"
status=9614 reach, conf, sel_sys.peer, 1 event, event_reach,
srcadr=10.112.1.14, srcport=123, dstadr=10.112.2.90, dstport=123,
leap=00, stratum=1, precision=-17, rootdelay=0.000,
rootdispersion=10.284, refid=LCL, reach=377, unreach=0, hmode=3,
pmode=4, hpoll=10, ppoll=10, flash=00 ok, keyid=0, offset=-1128.193,
delay=1.226, dispersion=14.849, jitter=224.514,
reftime=cc12fb96.0a522000 Mon, Jun 30 2008 9:28:38.040,
org=cc12fbad.30179000 Mon, Jun 30 2008 9:29:01.187,
rec=cc12fbae.5110fdd4 Mon, Jun 30 2008 9:29:02.316,
xmt=cc12fbae.50bd8b10 Mon, Jun 30 2008 9:29:02.315,
filtdelay= 1.23 1.40 1.68 1.50 1.19 1.28 1.10 1.27,
filtoffset= -1128.1 -903.68 -1144.7 -1133.5 -814.17 -1125.2 -1125.2
-921.92,
filtdisp= 0.04 15.38 30.73 46.10 61.46 76.82 92.21 107.59

# ntpq -c "rv 62852"
status=9414 reach, conf, sel_candidat, 1 event, event_reach,
srcadr=10.112.1.13, srcport=123, dstadr=10.112.2.90, dstport=123,
leap=00, stratum=2, precision=-17, rootdelay=6.454,
rootdispersion=15.533, refid=10.109.1.164, reach=377, unreach=0,
hmode=3, pmode=4, hpoll=10, ppoll=10, flash=00 ok, keyid=0,
offset=1147.347, delay=1.298, dispersion=14.874, jitter=0.641,
reftime=cc12f9fa.ed579000 Mon, Jun 30 2008 9:21:46.927,
org=cc12fbd3.785bc000 Mon, Jun 30 2008 9:29:39.470,
rec=cc12fbd2.52cdc1fb Mon, Jun 30 2008 9:29:38.323,
xmt=cc12fbd2.52726f6f Mon, Jun 30 2008 9:29:38.322,
filtdelay= 1.30 1.15 1.47 1.24 1.29 2.20 1.54 1.45,
filtoffset= 1147.35 1147.99 1371.63 1132.04 1143.24 1460.54 1150.79
1150.61,
filtdisp= 0.04 15.41 30.79 46.18 61.57 76.91 92.26 107.63

...I can see that the selected one (NTP server C, i.e. assID 62853)
has a refid of LCL (meaning it is syncing to its local system clock?),
while the other one, the candidate (NTP server B, stratum 2), has
NTP server A as its refid, meaning it syncs to NTP server A.
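
To compare the two associations side by side, the "rv" output can be
parsed into key=value fields. This is a minimal sketch, not an ntpq
feature: the helper name is made up, the input strings are shortened
excerpts of the output above, and the regular expression only handles
simple single-value fields (not the filt* lists).

```python
import re


def parse_rv(text):
    """Return simple key=value fields of ntpq 'rv' output as strings.

    Hypothetical helper: rejoins wrapped lines, then collects every
    key=value pair whose value directly follows the '=' sign.
    """
    flat = " ".join(text.split())
    return dict(re.findall(r"(\w+)=([^,\s]+)", flat))


# Shortened excerpts of "rv 62853" (server C) and "rv 62852" (server B).
server_c = parse_rv("srcadr=10.112.1.14, stratum=1, rootdispersion=10.284, "
                    "refid=LCL, offset=-1128.193, jitter=224.514")
server_b = parse_rv("srcadr=10.112.1.13, stratum=2, rootdispersion=15.533, "
                    "refid=10.109.1.164, offset=1147.347, jitter=0.641")

# Difference between the two reported offsets, in ms.
disagreement_ms = float(server_b["offset"]) - float(server_c["offset"])
print(f"C refid={server_c['refid']}, B refid={server_b['refid']}")
print(f"offset disagreement ~ {disagreement_ms:.1f} ms")
```

Seen from server D, the two sources disagree with each other by more
than two seconds, which may be worth keeping in mind when looking at
the dispersion values.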

Again, when NTP server D primarily syncs with NTP server C, the "root
dispersion" apparently gets too high, while having NTP server D sync
with NTP server B fixes the problem.

My question is: why does the root dispersion become too high when
syncing to an external server that uses its own local system clock as
its reference (i.e. NTP server C)?

Many thanks in advance!

/eztoril



