[ntp:hackers] FreeBSD 6.3 hiccup with ntp-dev and LDAP

Dave Hart davehart at gmail.com
Wed Sep 2 05:41:58 UTC 2009


Since there is a depth of FreeBSD experience represented here, I'm
hoping someone will have insight to help me resolve a problem running
very recent ntpd on not-so-recent FreeBSD.

I am attempting to test a backwards compatibility mechanism for bug
1243.  Steve Kostecke kindly provided some assistance getting an
Autokey configuration going on three developer test machines
at ntp.org, two of which are running FreeBSD 6.3.  Neither of these
machines is happy running stock ntp-dev code, crashing during the
processing of ntp.conf in a call to getaddrinfo().  The crash is
happening while the name resolution code is parsing
/etc/nsswitch.conf, which has one external provider, ldap.  It appears
to fault inside dlopen() of nss_ldap.so.1.  Steve indicated the
package version providing this on host psp-fb1:

psp-fb1:  nss_ldap-1.264_2    RFC 2307 NSS module

Here's a backtrace from gdb, fighting against lack of system symbols:

#0  0x280c1ddc in memset () from /libexec/ld-elf.so.1
#1  0x280da100 in ?? ()
#2  0x280b752d in map_object () from /libexec/ld-elf.so.1
#3  0x280b42d5 in elf_hash () from /libexec/ld-elf.so.1
#4  0x280b4404 in elf_hash () from /libexec/ld-elf.so.1
#5  0x280b62ba in dlopen () from /libexec/ld-elf.so.1
#6  0x2829ab18 in _nsdbtaddsrc () from /lib/libc.so.6
#7  0x2829629e in ___toupper () from /lib/libc.so.6
#8  0x282968c0 in _nsyyparse () from /lib/libc.so.6
#9  0x2829aeaa in nsdispatch () from /lib/libc.so.6
#10 0x2828d794 in getservbyname_r () from /lib/libc.so.6
#11 0x2828d954 in endservent () from /lib/libc.so.6
#12 0x2828da19 in endservent () from /lib/libc.so.6
#13 0x2828996c in freeaddrinfo () from /lib/libc.so.6
#14 0x28289a69 in freeaddrinfo () from /lib/libc.so.6
#15 0x28289dbe in getaddrinfo () from /lib/libc.so.6
#16 0x0804d0dc in get_multiple_netnums (num=0x8109500 "0.0.0.0", addr=0x1,
   res=0x0, complain=1, a_type=t_UNK) at ntp_config.c:2776
#17 0x0804d1a5 in getnetnum (num=0x8109500 "0.0.0.0", addr=0xbfbfe440,
   complain=1, a_type=t_UNK) at ntp_config.c:2726
#18 0x0804de8a in config_ntpd () at ntp_config.c:1422
#19 0x0804f347 in getconfig (argc=0, argv=0xbfbfea8c) at ntp_config.c:2439
#20 0x08056ef8 in ntpdmain (argc=0, argv=0xbfbfea8c) at ntpd.c:847

There's actually a missing frame in the ntpd part of the trace as
well, presumably due to optimization.  config_ntpd() calls
config_access() which calls getnetnum().

I was unable to reproduce the crash with a simple test program which
calls getaddrinfo().  To make progress on 1243, I was able to work
around this crash by adding code to call getaddrinfo() once earlier in
ntpd startup, in init_logging().  For some reason, the dlopen()
succeeds in that case, quite possibly due to some other
as-yet-unidentified bug in ntpd.

Here's the /etc/nsswitch.conf:
====
passwd: files ldap
group: files ldap

hosts: files dns
networks: files

shells: files
====

If you are using LDAP with FreeBSD of any vintage, it would be helpful
if you would try building and running ntp-dev p200 or later to help us
understand how widespread the problem is.  To reproduce the problem it
would help if your ntp.conf contains a "restrict -4 default" line, as
the crash happens processing that directive, asking getaddrinfo() to
look up host "0.0.0.0" service "ntp" with hints.ai_socktype =
SOCK_DGRAM.  Any "restrict -4" line should exercise the same code
regardless of network address or restriction bits, but here are the
lines that we're using:

restrict -4 default kod notrap nomodify nopeer
restrict -4 127.0.0.1
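
For anyone who wants to exercise the same kind of lookup outside ntpd,
here is a minimal sketch.  It is not the ntpd code and not the test
program mentioned above: the helper name lookup_dgram() is made up for
illustration, and leaving ai_family unset (AF_UNSPEC) is an assumption,
since the only hint described here is ai_socktype = SOCK_DGRAM.

```c
#define _POSIX_C_SOURCE 200112L  /* expose getaddrinfo() under strict -std= modes */

#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Hypothetical helper, not ntpd code: look up host/service with
 * hints.ai_socktype = SOCK_DGRAM and report the address family of
 * the first result.  Returns the getaddrinfo() status, 0 on success;
 * gai_strerror() turns a nonzero status into a message.
 */
static int
lookup_dgram(const char *host, const char *service, int *family)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(host != NULL && family != NULL);

    memset(&hints, 0, sizeof(hints));   /* ai_family stays AF_UNSPEC (assumption) */
    hints.ai_socktype = SOCK_DGRAM;

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0)
        return rc;

    *family = res->ai_family;
    freeaddrinfo(res);
    return 0;
}
```

Calling lookup_dgram("0.0.0.0", "ntp", &family) from a small main()
mirrors the host and service named above.  Whether the "ntp" service
name resolves depends on the services database of the machine running
it; a numeric port string such as "123" avoids that dependency.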

(Tangentially, Danny Mayer has changed getnetnum() in his pending
listen-on changes to provide hints.ai_flags = AI_NUMERICHOST in this
codepath for numeric address strings, which would avoid attempting to
resolve numeric addresses as DNS names and might further defer
/etc/nsswitch.conf processing.)
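
A sketch of what such a hint looks like in isolation follows.  This is
not Danny's pending change; the helper name lookup_numeric_host() is
hypothetical, and it only illustrates the documented effect of
AI_NUMERICHOST, which makes getaddrinfo() reject anything that is not
a numeric address string rather than trying to resolve it as a name.

```c
#define _POSIX_C_SOURCE 200112L  /* expose getaddrinfo() under strict -std= modes */

#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Hypothetical helper, not the pending ntpd change: the same
 * SOCK_DGRAM lookup, but with AI_NUMERICHOST set so that only
 * numeric address strings are accepted.  Returns the getaddrinfo()
 * status: 0 for a numeric address, nonzero (EAI_NONAME per POSIX)
 * for a host name.
 */
static int
lookup_numeric_host(const char *host, const char *service)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(host != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_DGRAM;
    hints.ai_flags = AI_NUMERICHOST;    /* never resolve host as a name */

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);
    return rc;
}
```

Note that the flag only covers the host argument.  A service name such
as "ntp" is still looked up in the services database, which is
consistent with the getservbyname_r() frame in the backtrace above, so
this hint alone may not keep the lookup away from nsdispatch().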

I'm hoping someone here already has ldap mentioned in their
/etc/nsswitch.conf or is willing to set things up to add it, and has
symbols for libc and ld-elf.so.1 that will provide a more useful backtrace,
assuming the crash is not specific to these boxes.

Cheers,
Dave Hart

