[ntp:questions] Ntpd in uninterruptible sleep?

A C agcarver+ntp at acarver.net
Fri Nov 4 21:52:19 UTC 2011


Ok, so ntpd does not respond to a SIGQUIT with a core dump, but I did 
manage to attach ktrace to the process before killing it.  The output 
of ktrace is below.  I was actually able to observe the circumstances 
of this particular lockup because I camped out at the system, waiting.

A cron job fired off which does routine disk maintenance (diffs of 
config files, free space calculations, postfix cleanup, etc.).  In this 
case things got swapped around a bit while all the cleanup was occurring. 
However, after those activities finished, ntpd never returned to 
normal.  It just spun out of control, resulting in the trace below.  
Other programs that were running at the same time (gpsd, xclock, xterm) 
all recovered cleanly, though the system was now bogged down by ntpd 
consuming almost all the processor time even though it was not set to 
high priority.  The capture below pretty much loops continuously until 
ntpd is finally killed.  I actually let the system run for an additional 
24 hours in this state just to see if it would bounce back, but it never 
did.  I killed only one process, ntpd, and everything else was fine: 
the CPU load dropped to near zero immediately.
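
For anyone who wants to check how repetitive a capture like this is, 
here is a rough Python sketch that tallies kdump records by type and 
name.  It assumes the text layout seen in this capture (pid, lwp, 
command, record type, then details) and that mail wrapping may push a 
syscall name onto the line after CALL.  The names RECORD and tally are 
made up for this example.

```python
import re
from collections import Counter

# Start of a kdump record as it appears in this capture:
# optional quote markers, pid, lwp id, command name, record type, details.
RECORD = re.compile(
    r"^[>\s]*(\d+)\s+(\d+)\s+(\S+)\s+(CALL|RET|PSIG|GIO|MISC)\b\s*(.*)$"
)


def tally(lines):
    """Count records by (type, name), e.g. ("PSIG", "SIGALRM")."""
    counts = Counter()
    pending_call = False  # True when a CALL line was wrapped before its name
    for line in lines:
        m = RECORD.match(line)
        if not m:
            if pending_call and line.strip():
                # Continuation line carrying the wrapped syscall name.
                counts[("CALL", line.strip().split("(")[0])] += 1
            pending_call = False
            continue
        kind, rest = m.group(4), m.group(5).strip()
        if kind == "CALL" and not rest:
            pending_call = True
            continue
        pending_call = False
        # First token, cut at "(" for calls and stripped of a trailing ":".
        name = rest.split()[0].split("(")[0].rstrip(":") if rest else ""
        counts[(kind, name)] += 1
    return counts
```

With the kdump text saved to a file (say trace.txt, a hypothetical 
name), tally(open("trace.txt")).most_common() lists the most frequent 
records first, which makes a signal storm stand out quickly.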

1210      1 ntpd     CALL  clock_gettime(0,0xefffd0e8)
1210      1 ntpd     RET   clock_gettime 0, -268447512/0xefffd0e8
1210      1 ntpd     CALL  select(0x1c,0xefffd05c,0,0,0xefffd0b4)
1210      1 ntpd     RET   select 1, -268447652/0xefffd05c
1210      1 ntpd     CALL  recvfrom(0x16,0xefffcc74,0x3e8,0,0xefffd098,0xefffd0ec)
1210      1 ntpd     MISC  msghdr: 28, 00000000f02cf7e0f25edeac00000001000000000001b58400000000
1210      1 ntpd     GIO   fd 22 read 12 bytes
      "\^V\^A\0\^A\0\0\0\0\0\0\0\0"
1210      1 ntpd     MISC  sockname: 16, 1002de040a00008d0000000000000000
1210      1 ntpd     RET   recvfrom 12/0xc, -268448652/0xefffcc74
1210      1 ntpd     CALL  recvfrom(0x16,0xefffcc74,0x3e8,0,0xefffd098,0xefffd0ec)
1210      1 ntpd     MISC  msghdr: 28, 00000000f02cf7e0f25edeac00000001000000000001b58400000000
1210      1 ntpd     RET   recvfrom -1 errno 35 Resource temporarily unavailable
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd178)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd178)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd178)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd178)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGIO caught handler=0x6b670 mask=(): code=[1], fd=22, band=41)
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(23): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffce00)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     CALL  clock_gettime(0,0xefffd018)
1210      1 ntpd     RET   clock_gettime 0, -268447720/0xefffd018
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(23): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffcdf8)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     CALL  select(0x1c,0xefffcf8c,0,0,0xefffcfe4)
1210      1 ntpd     RET   select 1, -268447860/0xefffcf8c
1210      1 ntpd     CALL  recvfrom(0x16,0xefffcba4,0x3e8,0,0xefffcfc8,0xefffd01c)
1210      1 ntpd     MISC  msghdr: 28, 00000000f02cf7e0f25edeac00000001000000001039485c00000000
1210      1 ntpd     GIO   fd 22 read 12 bytes
      "\^V\^A\0\^A\0\0\0\0\0\0\0\0"
1210      1 ntpd     MISC  sockname: 16, 1002e7210a00008d0000000000000000
1210      1 ntpd     RET   recvfrom 12/0xc, -268448860/0xefffcba4
1210      1 ntpd     CALL  recvfrom(0x16,0xefffcba4,0x3e8,0,0xefffcfc8,0xefffd01c)
1210      1 ntpd     MISC  msghdr: 28, 00000000f02cf7e0f25edeac00000001000000001039485c00000000
1210      1 ntpd     RET   recvfrom -1 errno 35 Resource temporarily unavailable
1210      1 ntpd     CALL  setcontext(0xefffd178)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGIO caught handler=0x6b670 mask=(): code=[1], fd=22, band=41)
1210      1 ntpd     CALL  clock_gettime(0,0xefffd0e8)
1210      1 ntpd     RET   clock_gettime 0, -268447512/0xefffd0e8
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(23): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffcec8)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     CALL  select(0x1c,0xefffd05c,0,0,0xefffd0b4)
1210      1 ntpd     RET   select 1, -268447652/0xefffd05c
1210      1 ntpd     CALL  recvfrom(0x16,0xefffcc74,0x3e8,0,0xefffd098,0xefffd0ec)
1210      1 ntpd     MISC  msghdr: 28, 00000000f02cf7e0f25edeac00000001000000000001b58400000000
1210      1 ntpd     GIO   fd 22 read 12 bytes
      "\^V\^A\0\^A\0\0\0\0\0\0\0\0"
1210      1 ntpd     MISC  sockname: 16, 1002e7210a00008d0000000000000000
1210      1 ntpd     RET   recvfrom 12/0xc, -268448652/0xefffcc74
1210      1 ntpd     CALL  recvfrom(0x16,0xefffcc74,0x3e8,0,0xefffd098,0xefffd0ec)
1210      1 ntpd     MISC  msghdr: 28, 00000000f02cf7e0f25edeac00000001000000000001b58400000000
1210      1 ntpd     RET   recvfrom -1 errno 35 Resource temporarily unavailable
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd248)
1210      1 ntpd     RET   setcontext JUSTRETURN
1210      1 ntpd     PSIG  SIGALRM caught handler=0x3ea2c mask=(): code=SI_TIMER sigval 0x3)
1210      1 ntpd     CALL  setcontext(0xefffd178)
1210      1 ntpd     RET   setcontext JUSTRETURN
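
For readers less used to reading these traces: each select / recvfrom / 
recvfrom group above looks like the usual "drain a non-blocking socket" 
pattern, where the reader keeps calling recvfrom until it fails with 
errno 35 (EAGAIN, "Resource temporarily unavailable").  Below is a 
minimal Python sketch of that general pattern.  It is not ntpd's actual 
code, and the function names are invented for illustration.  The 
1000-byte buffer mirrors the 0x3e8 length argument in the trace.

```python
import select
import socket


def drain(sock, bufsize=1000):
    """Read every datagram currently queued on a non-blocking socket."""
    packets = []
    while True:
        try:
            data, addr = sock.recvfrom(bufsize)
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: the queue is empty, which is the
            # "errno 35" return seen after each successful read above.
            break
        packets.append((data, addr))
    return packets


def wait_and_drain(sock, timeout=None):
    """Block in select() until the socket is readable, then drain it."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return drain(sock) if readable else []
```

Seen this way, the EAGAIN returns are the normal end of each read 
burst.  What stands out in the capture is the number of SIGALRM 
deliveries between one select and the next.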

