[ntp:questions] ntpd: malloc deadlock with signal io

juergen perlinger juergen.perlinger at t-online.de
Wed Oct 25 06:47:00 UTC 2017


Hello Anthony!

I don't have an answer to all of your questions, but see below:

On 10/21/2017 01:00 AM, Anthony Amaro Jr wrote:
> Hello! I am compiling ntp for QNX 6.5 and have run into a deadlock issue.
> 
> Here is how I am compiling ntp:
> 
> LD=ntoppc-ld ./configure CC="qcc -Vgcc_ntoppcbe_cpp-ne" CFLAGS="-O0
> -g" --host=powerpc-unknown-nto-qnx6.5.0 CXX=ntoppc-g++
> --with-yielding-select=yes --build=i386-pc-nto-qnx6.5.0 --with-sntp=no
> 
> Then I call "make -i" so it doesn't stop (expectedly) when making the man pages.
> 
> Here is the call stack:
> 
> 20 SyncMutexLock_r()  0xfe33fde0
> 19 T.61()  0xfe32ca0c
> 18 malloc()  0xfe32bf74
> 17 recvmsg()  0xfe3a610c
> 16 read_network_packet() D:\ntpd\ntp-4.2.8p10\ntpd\ntp_io.c:3465 0x480549c4
> 15 input_handler_scan() D:\ntpd\ntp-4.2.8p10\ntpd\ntp_io.c:3790 0x480554fc
> 14 input_handler() D:\ntpd\ntp-4.2.8p10\ntpd\ntp_io.c:3649 0x48055058
> 13 sigio_handler() D:\ntpd\ntp-4.2.8p10\libntp\iosignal.c:303 0x480c18f8
> 12 <signal handler called>()  0xfe31b6dc
> 11 __flist_enqueue_bin()  0xfe32bcc0
> 10 _list_resize()  0xfe32a81c
> 9 __realloc()  0xfe32c288
> 8 realloc()  0xfe32cea4
> 7 ereallocz() D:\ntpd\ntp-4.2.8p10\libntp\emalloc.c:43 0x480c1024
> 6 clock_select() D:\ntpd\ntp-4.2.8p10\ntpd\ntp_proto.c:3105 0x4807bf74
> 5 clock_filter() D:\ntpd\ntp-4.2.8p10\ntpd\ntp_proto.c:3039 0x4807bdcc
> 4 process_packet() D:\ntpd\ntp-4.2.8p10\ntpd\ntp_proto.c:2455 0x4807a408
> 3 receive() D:\ntpd\ntp-4.2.8p10\ntpd\ntp_proto.c:2114 0x480792dc
> 2 ntpdmain() D:\ntpd\ntp-4.2.8p10\ntpd\ntpd.c:1331 0x4805ead8
> 1 main() D:\ntpd\ntp-4.2.8p10\ntpd\ntpd.c:394 0x4805dd0c
> 
> From what I can tell it seems that clock_select calls realloc() which
> takes out the libc heap lock. While it holds the lock and is performing
> heap operations, we receive a signal for a network packet. This signal
> handler then calls recvmsg() which calls malloc() which tries to take
> out the heap lock again... causing the deadlock.
> 
> So a few questions:
> 
> 1) I can probably get around this by not using signalled io. Is there
> a way to compile ntpd without using signalled IO via the configure
> script? I really do not want to modify any code (config.h) for this
> since I don't know what other bugs I could be introducing by touching
> the code. Maybe there is a command line switch somewhere I can use to
> ensure HAVE_SIGNALLED_IO is not defined?
> 
> 2) It seems like this must have been hit before, even in Linux,
> because recvmsg() is not signal handler safe. Is there some other
> guard I am not seeing here to protect against this?

That's the bummer. POSIX requires recvmsg to "be either reentrant or
non-interruptible by signals and be async-signal-safe". See
http://pubs.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_04.html

The Linux and BSD documentation both confirm this. But if the QNX
implementation of recvmsg calls malloc, then it would require
'malloc()' to be async-signal-safe, which is *not* required by
ANSI/POSIX. (And a signal-safe malloc would be really hard to implement,
IMHO!)

So, IMHO the recvmsg implementation on QNX violates the POSIX spec, and
you should indeed refrain from using signal-driven IO with ntpd.
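
For illustration, here is a minimal sketch of the general pattern that
avoids this class of deadlock: the signal handler only records that I/O is
pending, and the main loop does the real work outside signal context, where
using the heap is safe. This is not ntpd code and not how ntpd structures its
I/O loop. All names (io_pending, on_sigio, install_deferred_handler,
service_pending_io) are hypothetical, and the malloc/free pair merely stands
in for recvmsg() and packet processing.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is an illustration, not ntpd code. */

/* Written by the handler, cleared by the main loop. volatile sig_atomic_t
 * is a type the C standard allows a signal handler to write safely. */
static volatile sig_atomic_t io_pending = 0;

/* The handler only records that I/O is ready: no malloc, no recvmsg. */
static void on_sigio(int signo)
{
    (void)signo;
    io_pending = 1;
}

/* Install the handler for the given signal.
 * Returns 0 on success, -1 on failure (as sigaction does). */
int install_deferred_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);              /* precondition: a valid signal number */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Called from the main loop, i.e. outside signal context, so taking the
 * heap lock here cannot collide with an interrupted realloc().
 * Returns 1 if pending work was serviced, 0 if there was none. */
int service_pending_io(void)
{
    void *scratch;

    if (!io_pending)
        return 0;
    io_pending = 0;

    scratch = malloc(64);           /* stands in for recvmsg() and parsing */
    free(scratch);
    return 1;
}
```

A main loop would call service_pending_io() on each iteration (or after
waking from a blocking wait). The trade-off is that the packet is read
slightly later than it would be inside the handler.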

> 
> 3) I tried creating an account on bugzilla to report this but it seems
> that new account creation has been disabled :(

Harlan Stenn just told me he created an account for you -- I guess you
got a direct mail to that effect.

> 4) What is the best avenue to contribute to helping make ntpd cross
> compile for QNX? I'd love to help out!
> 
> Anyway, thank you all so much for your time and effort on this!


