[ntp:security] [Bug 527] ntpd frequently crashes on Windows systems

bugzilla at ntp.isc.org
Thu Apr 6 08:56:06 UTC 2006


http://bugs.ntp.isc.org/show_bug.cgi?id=527



----------------------------------------------------------------------------
Additional Comments From martin.burnicki at meinberg.de (Martin Burnicki)
Submitted on 2006-04-06 08:56
Folks, this is basically what I've already sent to some of you by email.

I've recently tested the RC1 tarball ntp-dev-4.2.0b-rc1-20060404 under Linux, 
Windows, and Solaris 9.

Under Linux everything seems to work fine, and under Windows there's still the 
trap under heavy load. However, I've observed effects under Solaris 9 similar 
to those under Windows, so this does not in fact seem to be a Windows-specific 
bug.

Here's a short summary of what has been observed under Windows:

We have already seen that the effects we could observe depend on the memory 
layout of the compiled binary: depending on small modifications of the source 
code or changed compiler switches, the NTP service either just became 
unresponsive on a specific network interface or even caused a trap.

Obviously there's a piece of code which occasionally overwrites some piece of 
heap or stack where it shouldn't, and thus causes unexpected behaviour when 
some other code is executed which relies on the piece of memory which has been 
overwritten.
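
To illustrate the kind of failure I mean, here is a minimal sketch in which an 
unchecked length lets a write run past a data array and into a function pointer 
stored right behind it. All names (demo_buf, buggy_clear, receiver_intact) are 
made up for this example and are not taken from the ntpd sources. The overrun 
is deliberately kept inside a single struct so that the demo itself stays well 
defined, and it assumes a platform where an all-zero pointer is a null pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a receive buffer; NOT ntpd's real struct recvbuf. */
struct demo_buf {
    unsigned char payload[16];
    void (*receiver)(struct demo_buf *);   /* callback stored behind the data */
};

static void demo_receiver(struct demo_buf *b) { (void)b; }

/*
 * Zero `len` bytes starting at payload WITHOUT checking len against
 * sizeof payload. This is the bug being illustrated. The length is
 * clamped to the end of the struct only so the demo never writes
 * outside the object.
 */
static void buggy_clear(struct demo_buf *b, size_t len)
{
    size_t room = sizeof(*b) - offsetof(struct demo_buf, payload);

    if (len > room)
        len = room;
    memset((unsigned char *)b + offsetof(struct demo_buf, payload), 0, len);
}

/* Returns 1 if the callback survived the clear, 0 if it was clobbered. */
int receiver_intact(size_t len)
{
    struct demo_buf b;

    memset(&b, 0, sizeof b);
    b.receiver = demo_receiver;
    buggy_clear(&b, len);
    return b.receiver == demo_receiver;
}
```

With a length of 16 the callback survives; with sizeof(struct demo_buf) it is 
wiped out, and the damage only shows up later when some other code calls 
through the pointer. Which neighbour gets hit depends on the memory layout, 
which would fit the observation that the symptoms change with the build.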

The probability that this happens increases under heavy load from a high number 
of network requests, but it also happens under light load. Occasionally I've 
seen the service run under light load without problems for some hours, and 
then suddenly trap when I ran "ntpq -p" against it.

The xmas edition of the Windows binary we've published seems to run rock solid 
on many Windows systems (including my test machine), but this seems to be only 
by accident, and just yesterday I received a report from a customer where 
this version trapped, too.

It seems the problem was introduced between ntp-dev-4.2.0a-20050723.tar.gz 
and ntp-dev-4.2.0a-20050726.tar.gz, but it is also possible that the problem 
just became more obvious with those changes, since the memory location which 
is overwritten may simply have moved due to the modified code.

Here's what I have just observed under Solaris 9:

I've built ntpd from the tarball using

./configure --enable-MEINBERG

After ntpd has been started it basically works correctly. The problems I've 
observed happened whether or not I had configured a refclock, and the 
behaviour was slightly different on different tries.

The initial request rate which could be handled by the test machine dropped 
after a few seconds. When I restarted ntpd the rate was high again.

Once I observed that when I ran "ntpq -p" locally, every single line of the 
output billboard appeared only after a delay of a few seconds. This is the same 
behaviour we can occasionally observe under Windows, and which has recently 
been reported by David J. Taylor. Maybe things are related.

Several times ntpd under Solaris became completely unresponsive, i.e. no 
response packets were received anymore, neither to client requests nor to 
"ntpq -p" requests. "ntpq -p" didn't even work locally from the same machine.

Sometimes ntpd aborted with a segmentation fault. Running ntpd under ddd/gdb, 
the debug info after the trap is:

(gdb) run -n
Program received signal SIGSEGV, Segmentation fault.

The backtrace lists:
#0  0x00000000 in ?? ()
#1  0x00034010 in ntpdmain () at ntpd.c:969
#2  0x00033a5c in main () at ntpd.c:274

The relevant code is:

  while (full_recvbuffs())
  {
    /*
     * Call the data procedure to handle each received
     * packet.
     */
    rbuf = get_full_recv_buffer();
    if (rbuf != NULL)       /* This should always be true */
    {
-->   (rbuf->receiver)(rbuf);
      freerecvbuf(rbuf);
    }
  }
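
Frame #0 at address 0x00000000 suggests that the call went through a receiver 
pointer which was NULL at that time, either never set or overwritten. As a 
diagnostic aid (not a fix), the call could be guarded as in the sketch below. 
The struct and function names (demo_recvbuf, demo_dispatch, demo_handle) are 
hypothetical and much simpler than ntpd's real receive buffer handling.

```c
#include <stddef.h>
#include <stdio.h>

struct demo_recvbuf;
typedef void (*demo_receiver_fn)(struct demo_recvbuf *);

/* Hypothetical, simplified buffer; ntpd's real struct has more fields. */
struct demo_recvbuf {
    demo_receiver_fn receiver;   /* callback which handles this packet */
    int handled;                 /* set by the demo callback */
};

/* Example callback: just marks the buffer as handled. */
void demo_handle(struct demo_recvbuf *rb)
{
    rb->handled = 1;
}

/*
 * Call the buffer's receiver only if the pointer is non-NULL.
 * Returns 1 if the receiver was called, 0 if the buffer was skipped.
 */
int demo_dispatch(struct demo_recvbuf *rb)
{
    if (rb == NULL || rb->receiver == NULL) {
        fprintf(stderr, "demo_dispatch: NULL receiver in buffer %p, dropped\n",
                (void *)rb);
        return 0;
    }
    rb->receiver(rb);
    return 1;
}
```

Such a check only catches a pointer which is exactly NULL. A pointer 
overwritten with arbitrary garbage would still pass it and crash, so the log 
message is merely a hint where to look for the code doing the overwrite.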

I therefore think there's a big fat bug in ntpd which is not specific to 
Windows, so maybe we should also change the summary of this bug.

Martin


-- 
Martin Burnicki <martin.burnicki at meinberg.de>




