[asterisk-users] puzzle

Tzafrir Cohen tzafrir.cohen at xorcom.com
Wed Nov 19 14:44:05 CST 2008


On Wed, Nov 19, 2008 at 07:57:33PM +0000, Jeff LaCoursiere wrote:
> 
> Sorry again for the only marginal relation to asterisk, but the issue does 
> affect the voice performance I am experiencing, so I am soothing my guilt 
> with that.
> 
> Bet you don't see this every day:
> 
> ast% uptime
>   13:48:08 up 981 days, 18:29,  1 user,  load average: 1.08, 1.02, 1.01
> ast%
> 
> I *REALLY* want this machine to see 1000 days uptime, if for nothing other 
> than bragging rights.  It's been through mysql and asterisk upgrades, a 
> horrible hacking nightmare that very nearly made me reboot, and several 
> power outages where the batteries lasted JUST long enough to keep her up.
> 
> After all of this, I find I may have to reboot after all.  Because there 
> is a !$@#% process running, consuming 100% CPU (note the load average), 
> and I cannot seem to kill it:
> 
> ast% ps auxw | grep modprobe
> root     17744 99.9  0.0  2688  412 ?        RN   Nov03 23223:01 modprobe -r ipt_state

modprobe -r is basically rmmod. rmmod and insmod are nowadays mostly
wrappers around kernel code.

So while an strace of that process might give some more information
about it, I believe that the kernel-level backtrace would be more
interesting.

For that, try either the 'p' or the 't' sysrq command. 'p' dumps the
registers and stack trace of whatever is currently running on the CPU
(which, at 100% CPU, is most likely your modprobe); 't' dumps all
tasks. You can give a sysrq command either through the console (on
x86: alt-sysrq-<key>) or:

  echo <key> >/proc/sysrq-trigger

The output goes to the kernel log, which you can read with dmesg.
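
As a rough sketch of the procfs route, something like the following
could be used. sysrq_send is a hypothetical helper name of mine, not an
existing tool, and the optional second argument exists only so the
function can be tried against a scratch file instead of the real
trigger. Writing to the real /proc/sysrq-trigger needs root.

```shell
#!/bin/sh
# Sketch: send one sysrq command key through procfs.
# sysrq_send is a hypothetical helper, not part of any package.
# Usage: sysrq_send <key> [trigger-file]
sysrq_send() {
    key=$1
    trigger=${2:-/proc/sysrq-trigger}   # the real file is root-only

    # A sysrq command is a single character, e.g. p or t.
    case $key in
        ?) ;;
        *) echo "usage: sysrq_send <single key> [trigger-file]" >&2
           return 1 ;;
    esac

    printf '%s\n' "$key" > "$trigger"
}

# Example (as root): dump all tasks, then read the end of the kernel log.
#   sysrq_send t && dmesg | tail -n 100
```

The 't' dump can be long, so the beginning of it may already have
scrolled out of the kernel log buffer by the time you run dmesg; the
syslog files may hold the rest, depending on how logging is set up.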

> ast% ps ealx | grep modprobe | grep -v grep
> 4     0 17744     1  39  19  2688  412 -      RN   ?        23224:38 modprobe -r ipt_state
> ast% sudo kill 17744
> ast% sudo kill 17744
> ast% sudo kill -9 17744
> ast% sudo kill -9 17744

The signals are probably just pending: they will take effect only when
the process leaves whatever busy context it is stuck in inside the
kernel.
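
One way to check that guess, assuming your kernel exposes the SigPnd
and ShdPnd lines in /proc/<pid>/status (I have not checked how far back
those go), is to look at the pending signal masks. A minimal sketch;
sigkill_pending is a hypothetical helper name, and the optional second
argument only lets you point it at a copy of a status file:

```shell
#!/bin/sh
# Sketch: report whether SIGKILL is queued but not yet delivered.
# sigkill_pending is a hypothetical helper, not an existing tool.
# Usage: sigkill_pending <pid> [status-file]
# Returns 0 if SIGKILL is pending, 1 if not, 2 if no mask was found.
sigkill_pending() {
    status=${2:-/proc/$1/status}

    # SigPnd is the per-thread mask, ShdPnd the process-wide one;
    # both are printed as hexadecimal bitmasks.
    masks=$(awk '/^(SigPnd|ShdPnd):/ { print $2 }' "$status" 2>/dev/null)
    [ -n "$masks" ] || return 2

    for mask in $masks; do
        # SIGKILL is signal 9, i.e. bit 8 of the mask: 0x100.
        if [ $(( 0x$mask & 0x100 )) -ne 0 ]; then
            return 0
        fi
    done
    return 1
}

# Example:
#   sigkill_pending 17744 && echo "SIGKILL is queued, not yet delivered"
```

If SIGKILL shows up as pending, the kill did arrive and the process has
simply not returned from the kernel to act on it.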

> ast% !ps
> ps ealx | grep modprobe | grep -v grep
> 4     0 17744     1  39  19  2688  412 -      RN   ?        23224:41 modprobe -r ipt_state
> ast%
> 
> You may also notice that I tried "renice" to bump it all the way to +19 
> and still it consumes 100% of the CPU.  The result for asterisk is that I 
> hear bits of robot noise during conversations, which is annoying as hell 
> but not necessarily show stopping.  But for another 19 days??  Argg!
> 
> I assume that because it is 'modprobe' it has tickled some kernel bug that 
> is merrily spinning away and won't respond to interrupts.  I even tried to 
> stop it with gdb and strace, both of which also hung and had to be killed 
> with -9.
> 
> It seems to be related to me screwing with the iptables a few weeks ago.
> 
> Any ideas other than rebooting?

BTW: what kernel? What distribution?

-- 
               Tzafrir Cohen
icq#16849755              jabber:tzafrir.cohen at xorcom.com
+972-50-7952406           mailto:tzafrir.cohen at xorcom.com
http://www.xorcom.com  iax:guest at local.xorcom.com/tzafrir


