[ntp:questions] Odd behaviour in a multicast setting (bug in ntpd???)

Marco Marongiu brontolinux at gmail.com
Thu Sep 1 13:37:30 UTC 2011


On 30/08/2011 11:04, Harlan Stenn wrote:
>> Does this add any useful information?
> Yes, and I still think it would be Interesting to see if you can
> replicate the problem with a recent -stable release (or -dev even, as we
> are close to starting the release cycle for 4.2.8).

I grabbed a freshly installed server in the same datacenter. Since it
was just installed with FAI, it had all its configurations ready, and
ntpd automatically started and all.

The starting point was even worse than on the other nodes:

> # ntpq -c pe -c as
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> +s52             131.107.13.100   2 m  335 1024  377    0.197    3.783   0.595
> +s12             204.123.2.5      2 m  461 1024  377    0.098   -0.719   1.093
> *s51             69.25.96.13      2 m   42   64  376    0.269   -1.418   0.267
> -s11             128.9.176.30     2 m    3   64  377    0.129   -6.681   0.030
> 
> ind assID status  conf reach auth condition  last_event cnt
> ===========================================================
>   1 63615  7424    no   yes   ok   candidat   reachable  2
>   2 63616  7414    no   yes   ok   candidat   reachable  1
>   3 63617  7614    no   yes   ok   sys.peer   reachable  1
>   4 63618  7314    no   yes   ok    outlyer   reachable  1

As you can see, there are TWO multicast servers behaving as unicast, and
the one that all machines previously picked as unicast is now OK.
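
For readers comparing the "reach" values in the table above (377 versus
376): the column is an octal dump of an 8-bit shift register, with the
least significant bit standing for the most recent poll interval. The
sketch below only illustrates that arithmetic; decode_reach is a
hypothetical helper name, not part of ntpq.

```python
def decode_reach(reach_octal):
    """Return the last eight poll results, oldest first (True = packet seen)."""
    value = int(reach_octal, 8) & 0xFF
    # Bit 7 is the oldest interval, bit 0 the most recent one.
    return [bool((value >> bit) & 1) for bit in range(7, -1, -1)]


if __name__ == "__main__":
    for reach in ("377", "376"):
        history = decode_reach(reach)
        print(reach, "".join("1" if seen else "0" for seen in history))
```

A value of 376 therefore means only that no packet has been counted for
the newest interval yet, which can be transient; it does not by itself
indicate a fault.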

After compiling and installing 4.2.6p3 from source and running it,
ntpq didn't show any misbehaviour for more than 60 minutes. No oddities
appear in the logs.

That tells us nothing, as it is exactly the same behaviour we had with
the stock 4.2.4p4 bundled with Debian Lenny.

The output of ntpq now is:

> # ntpq -c pe -c as
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *s51             69.25.96.13      2 m   21   64  376   -1.051   -1.324   0.527
> -s11             132.163.4.101    2 m    6   64  377   -0.621   -7.521   0.469
> +s52             131.107.13.100   2 m   62   64  376   -0.538    6.061   0.742
> +s12             204.123.2.5      2 m   48   64  376   -0.426   -0.002   0.611
> 
> ind assID status  conf reach auth condition  last_event cnt
> ===========================================================
>   1 47272  761a    no   yes   ok   sys.peer              1
>   2 47273  7324    no   yes   ok    outlyer   reachable  2
>   3 47274  7424    no   yes   ok   candidat   reachable  2
>   4 47275  741a    no   yes   ok   candidat              1
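
The "status" column in the association tables is a 16-bit hex word that
ntpq also spells out in the columns to its right. A minimal sketch of
the decoding follows, assuming the usual NTP control-message layout
(flag bits at the top, then a 3-bit selection code, a 4-bit event count
and a 4-bit event code). The selection names are the labels this ntpq
prints; decode_peer_status is a hypothetical helper, and the two
authentication flag bits are deliberately left undecoded.

```python
SELECT_NAMES = {
    0: "reject", 1: "falsetick", 2: "excess", 3: "outlyer",
    4: "candidat", 5: "selected", 6: "sys.peer", 7: "pps.peer",
}


def decode_peer_status(word_hex):
    """Split a peer status word such as '7614' into its fields."""
    word = int(word_hex, 16)
    high = (word >> 8) & 0xFF
    return {
        "config": bool(high & 0x80),        # persistent (configured) association
        "reach": bool(high & 0x10),         # host reachable
        "condition": SELECT_NAMES[high & 0x07],
        "event_count": (word >> 4) & 0x0F,  # the "cnt" column
        "event_code": word & 0x0F,          # 4 is shown as "reachable" above
    }


if __name__ == "__main__":
    for status in ("7614", "7324", "761a"):
        print(status, decode_peer_status(status))
```

Under this layout, 761a and 741a carry event code 0xa, which is the
code the ntpq used here leaves blank in the last_event column.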

I am not sure all this adds anything to what we already know (or,
rather, don't know).

Regarding the stock 4.2.4p4, I've noticed one more thing. As I said, this
machine was reinstalled (via FAI) today, so I had the chance to look at
the very first logs of this system. When ntpd started for the very first
time, it logged an error (the sixth line below):

> Aug 31 18:40:02 <SERVERNAME> ntpd[3979]: ntpd 4.2.4p4 at 1.1520-o Sun Nov 22 16:14:34 UTC 2009 (1)
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: precision = 1.000 usec
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: Listening on interface #0 wildcard, 0.0.0.0#123 Disabled
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: Listening on interface #1 wildcard, ::#123 Disabled
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: Listening on interface #2 lo, ::1#123 Enabled
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: bind() fd 19, family 10, port 123, scope 4, addr <PUBLIC_IPv6_ADDRESS>, in6_is_addr_multicast=0 flags=0x11 fails: Cannot assign requested address
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: unable to create socket on bond0 (3) for <PUBLIC_IPv6_ADDRESS>#123
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: failed to initialize interface for address <PUBLIC_IPv6_ADDRESS>
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: Listening on interface #4 bond0, <LOCAL_IPv6_ADDRESS>#123 Enabled
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: Listening on interface #5 lo, 127.0.0.1#123 Enabled
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: Listening on interface #6 bond0, <PUBLIC_IPv4_ADDRESS>#123 Enabled
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: kernel time sync status 0040
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: Listening on interface #7 multicast, 224.0.1.1#123 Enabled
> Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: Added Multicast Listener 224.0.1.1 on interface #7 multicast
> Aug 31 18:40:03 <SERVERNAME> ntpd[3980]: Listening on interface #8 bond0, <PUBLIC_IPv6_ADDRESS>#123 Enabled
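
In the log above, the address that fails to bind at 18:40:02 shows up
again at 18:40:03 as interface #8, this time enabled. The sketch below
is one way to pick out such fail-then-succeed pairs from a longer log.
The regular expressions are modelled only on the lines quoted here, and
retried_binds is a hypothetical helper name.

```python
import re

# Patterns modelled on the ntpd 4.2.4p4 log lines quoted above.
FAILED = re.compile(r"unable to create socket on (\S+) \(\d+\) for (\S+)#\d+")
LISTEN = re.compile(r"Listening on interface #(\d+) (\S+), (\S+)#\d+ Enabled")


def retried_binds(lines):
    """Return (device, address, interface number) for binds that failed and later succeeded."""
    failed = set()
    recovered = []
    for line in lines:
        match = FAILED.search(line)
        if match:
            failed.add((match.group(1), match.group(2)))
            continue
        match = LISTEN.search(line)
        if match and (match.group(2), match.group(3)) in failed:
            recovered.append((match.group(2), match.group(3), int(match.group(1))))
    return recovered


SAMPLE = [
    "Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: unable to create socket on bond0 (3) for <PUBLIC_IPv6_ADDRESS>#123",
    "Aug 31 18:40:02 <SERVERNAME> ntpd[3980]: Listening on interface #4 bond0, <LOCAL_IPv6_ADDRESS>#123 Enabled",
    "Aug 31 18:40:03 <SERVERNAME> ntpd[3980]: Listening on interface #8 bond0, <PUBLIC_IPv6_ADDRESS>#123 Enabled",
]

if __name__ == "__main__":
    print(retried_binds(SAMPLE))
```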

The same error did NOT appear again when I restarted 4.2.4p4. It has been
running for 25 minutes now, and it is OK.

Does that tell us anything?

I suspect FAI does something strange when installing and configuring
ntp, but I can't figure out what. I may try to look into it in more depth.

Thanks for any help

Ciao
-- bronto


