[ntp:questions] NTP Daemon Stuck in Resolver Jail When Run from Connman

Grant Erickson marathon96 at gmail.com
Wed Apr 27 00:15:14 UTC 2011


On 4/26/11 9:44 AM, Marcel Holtmann wrote:
>>>> I've been experimenting recently, in version 0.71, with connman's ability
>>>> to
>>>> launch the NTP daemon in one-shot mode when the service state transitions
>>>> to
>>>> online mode.
>>>> 
>>>> However, what I notice in practice is that when the time server is any
>>>> named
>>>> entity (e.g. time.ubuntu.org or pool.ntp.org), the NTP daemon fails to
>>>> resolve the name, ntpd_intres gets forked and launched and then
>>>> indefinitely
>>>> fails to resolve the name.
>>>> 
>>>> If I rework the plugin to delay NTP start up and then periodically retry
>>>> several times over time, I see the same results: NTP never successfully
>>>> resolves the name.
>>>> 
>>>> If I run tcpdump while connman is starting up and bringing up the service,
>>>> I
>>>> see requests going out for wpad.<mydomain> to the name server; however, I
>>>> never see a resolve request going out on the wire for the time server,
>>>> ever.
>>>> 
>>>> If I manually run the NTP daemon in one-shot mode, the name resolves
>>>> immediately and time is updated. If I run nslookup on the time server while
>>>> all these retries and failures are occurring from the instance dispatched
>>>> in
>>>> connman, the name resolves successfully.
>>> 
>>> So what I am thinking is that we might have a small race condition here
>>> where the built-in DNS proxy is not yet ready to answer DNS queries
>>> properly. Can you try to run connmand with --nodnsproxy and see if this
>>> changes at all.
>>> 
>>> We could still have the problem that we do not write /etc/resolv.conf early
>>> enough.
>> 
>> I thought the same thing initially, hence the experiment to introduce a
>> delay (I tried up to 30 seconds) in starting the NTP daemon once the sync
>> method is invoked. No matter the length, the delay made no difference.
>> 
>> My default /etc/resolv.conf is the same that connman writes out anyway, so
>> the only "hole" that might get hit is truncate before the new file gets
>> written (see below regarding res_int why this shouldn't matter though).
>> 
>> Other experiments I tried with no success:
>> 
>> [ ... experiments omitted ... ]
>> 
>> In all seven cases, running the NTP daemon in one-shot mode indefinitely
>> fails to resolve a name. In all seven cases, running the daemon outside of
>> connman or running nslookup on the name succeeds immediately.
>> 
>> Looking at the ntpd_intres code, it repeatedly calls res_init(), so if there
>> were a race or some such, I would expect it'd eventually recover on one of
>> it's subsequent retry attempts.
> 
> maybe you are hitting a GLibc bug. The res_init() with a
> static /etc/resolv.conf should just work fine.
> 
>> Any suggestions on other experiments to run would be welcomed.
>> 
>> At this point, my next step may be to try calling the connman
>> g_resolve_lookup_hostname while building the temporary configuration peer
>> list.
> 
> Try to have ConnMan resolve the name before calling the ntpd one-shot
> command line and give the IP address to that one. I am curious if that
> would work.

Marcel,

As suggested above, if I set up a new gresolv instance and call
g_resolv_lookup_hostname from ntpdate_add_peer, it works consistently (see
output below). Would you like a patch for this?

# /etc/init.d/connman stop
# ip addr flush dev wlan0
# ip link set wlan0 down
# ip link set wlan0 up
# ./connmand -n -W wext & tail -f /var/log/messages | grep -i ntp
connmand[274]: Connection Manager version 0.71
connmand[274]: Checking loopback interface settings
connmand[274]: System hostname is fish
connmand[274]: System domainname is domain.actdsltmp
connmand[274]: Adding profile default
connmand[274]: Failed to open RFKILL control device
connmand[274]: lo {newlink} index 1 operstate 0 <UNKNOWN>
connmand[274]: wlan0 {create} index 2 type 1 <ETHER>
connmand[274]: wlan0 {RX} 104 packets 14594 bytes
connmand[274]: wlan0 {TX} 29 packets 6122 bytes
connmand[274]: wlan0 {update} flags 4099 <UP>
connmand[274]: wlan0 {newlink} index 2 address 00:0c:29:4c:56:a2 mtu 1500
connmand[274]: wlan0 {newlink} index 2 operstate 2 <DOWN>
connmand[274]: Create interface wlan0 [ wifi ]
connmand[274]: mmap error Invalid argument for
/var/lib/connman/stats/wifi_000c294c56a2_54616e6b_managed_psk.data
connmand[274]: 
/var/lib/connman/stats/wifi_000c294c56a2_54616e6b_managed_psk.data might be
on a file system, such as JFFS2, that does not allow shared writable
mappings.
connmand[274]: wlan0 up
connmand[274]: wlan0 lower down
connmand[274]: wlan0 up
connmand[274]: wlan0 lower down
connmand[274]: Skipping disconnect of 54616e6b_managed_psk
connmand[274]: wlan0 {add} route ff00:: gw :: scope 0 <UNIVERSE>
connmand[274]: wlan0 {add} route fe80:: gw :: scope 0 <UNIVERSE>
connmand[274]: wlan0 {update} flags 69635 <UP,LOWER_UP>
connmand[274]: wlan0 {newlink} index 2 address 00:0c:29:4c:56:a2 mtu 1500
connmand[274]: wlan0 {RX} 104 packets 14594 bytes
connmand[274]: wlan0 {TX} 29 packets 6122 bytes
connmand[274]: wlan0 {newlink} index 2 address 00:0c:29:4c:56:a2 mtu 1500
connmand[274]: wlan0 {newlink} index 2 operstate 5 <DORMANT>
connmand[274]: wlan0 {RX} 106 packets 14872 bytes
connmand[274]: wlan0 {TX} 32 packets 6460 bytes
connmand[274]: wlan0 {update} flags 69699 <UP,RUNNING,LOWER_UP>
connmand[274]: wlan0 lower up
connmand[274]: wlan0 lower up
connmand[274]: wlan0 {newlink} index 2 address 00:0c:29:4c:56:a2 mtu 1500
connmand[274]: wlan0 {newlink} index 2 operstate 6 <UP>
connmand[274]: mmap error Invalid argument for
/var/lib/connman/stats/wifi_000c294c56a2_54616e6b_managed_psk.data
connmand[274]: 
/var/lib/connman/stats/wifi_000c294c56a2_54616e6b_managed_psk.data might be
on a file system, such as JFFS2, that does not allow shared writable
mappings.
connmand[274]: Setting domainname to domain.actdsltmp
connmand[274]: Deleting host route failed (No such process)
connmand[274]: Adding DNS server 10.2.0.1
connmand[274]: Enabling DNS server 10.2.0.1
connmand[274]: Enabling DNS server 10.2.0.1
connmand[274]: 946759971: Will start NTPD in 5 seconds.
connmand[274]: Enabling DNS server 10.2.0.1
connmand[274]: Removing DNS server 10.2.0.1
connmand[274]: Adding DNS server 10.2.0.1
connmand[274]: Enabling DNS server 10.2.0.1
connmand[274]: wlan0 {add} address 10.2.0.30/16 label wlan0 family 2
connmand[274]: wlan0 ip bound
connmand[274]: wlan0 ip bound
connmand[274]: wlan0 {add} route 10.2.0.0 gw 0.0.0.0 scope 253 <LINK>
connmand[274]: wlan0 {add} route 10.2.0.1 gw 0.0.0.0 scope 253 <LINK>
connmand[274]: wlan0 {del} route fe80:: gw :: scope 0 <UNIVERSE>
connmand[274]: wlan0 {del} route ff00:: gw :: scope 0 <UNIVERSE>
connmand[274]: wlan0 ip bound
connmand[274]: wlan0 ip bound
connmand[274]: wlan0 {add} route 0.0.0.0 gw 10.2.0.1 scope 0 <UNIVERSE>
Jan  1 20:52:51 fish daemon.debug connmand[274]: 946759971: Will start NTPD
in 5 seconds.
connmand[274]: 946759977: Launching NTPD.
connmand[274]: 946759977: ntpdate_add_peer: hostname pool.ntp.org
connmand[274]: Adding resolved peer 69.50.219.51
connmand[274]: Adding resolved peer 208.94.240.2
connmand[274]: Adding resolved peer 128.10.19.24
Jan  1 20:52:57 fish daemon.debug connmand[274]: 946759977: Launching NTPD.
Jan  1 20:52:57 fish daemon.debug connmand[274]: 946759977:
ntpdate_add_peer: hostname pool.ntp.org
Jan  1 20:52:57 fish daemon.notice ntpd[277]: ntpd 4.2.6p2 at 1.2194-o Tue Apr
26 20:43:58 UTC 2011 (4)
Jan  1 20:52:57 fish daemon.debug ntpd[277]: set_process_priority: Leave
priority alone: priority_done is <2>
Jan  1 20:52:57 fish daemon.notice ntpd[277]: proto: precision = 30.517 usec
Jan  1 20:52:57 fish daemon.debug ntpd[277]: ntp_io: estimated max
descriptors: 768, initial socket boundary: 16
Jan  1 20:52:57 fish daemon.info ntpd[277]: Listen and drop on 0 v4wildcard
0.0.0.0 UDP 123
Jan  1 20:52:57 fish daemon.info ntpd[277]: Listen normally on 1 lo
127.0.0.1 UDP 123
Jan  1 20:52:57 fish daemon.info ntpd[277]: Listen normally on 2 wlan0
10.2.0.30 UDP 123
Jan  1 20:52:57 fish daemon.info ntpd[277]: getaddrinfo v4 69.50.219.51
Jan  1 20:52:57 fish daemon.info ntpd[277]: getaddrinfo 0
Jan  1 20:52:57 fish daemon.info ntpd[277]: Resolution for 69.50.219.51 with
status 1
Jan  1 20:52:57 fish daemon.info ntpd[277]: getaddrinfo v4 208.94.240.2
Jan  1 20:52:57 fish daemon.info ntpd[277]: getaddrinfo 0
Jan  1 20:52:57 fish daemon.info ntpd[277]: Resolution for 208.94.240.2 with
status 1
Jan  1 20:52:57 fish daemon.info ntpd[277]: getaddrinfo v4 128.10.19.24
Jan  1 20:52:57 fish daemon.info ntpd[277]: getaddrinfo 0
Jan  1 20:52:57 fish daemon.info ntpd[277]: Resolution for 128.10.19.24 with
status 1
Apr 26 23:46:34 fish daemon.notice ntpd[277]: ntpd: time set
+357101610.677685 s
connmand[274]: Setting domainname to domain.actdsltmp

Regards,

Grant





More information about the questions mailing list