[asterisk-users] puzzle
Tzafrir Cohen
tzafrir.cohen at xorcom.com
Wed Nov 19 14:44:05 CST 2008
On Wed, Nov 19, 2008 at 07:57:33PM +0000, Jeff LaCoursiere wrote:
>
> Sorry again for the only marginal relation to asterisk, but the issue does
> affect the voice performance I am experiencing, so I am soothing my guilt
> with that.
>
> Bet you don't see this every day:
>
> ast% uptime
> 13:48:08 up 981 days, 18:29, 1 user, load average: 1.08, 1.02, 1.01
> ast%
>
> I *REALLY* want this machine to see 1000 days uptime, if for nothing other
> than bragging rights. It's been through mysql and asterisk upgrades, a
> horrible hacking nightmare that very nearly made me reboot, and several
> power outages where the batteries lasted JUST long enough to keep her up.
>
> After all of this, I find I may have to reboot after all. Because there
> is a !$@#% process running, consuming 100% CPU (note the load average),
> and I cannot seem to kill it:
>
> ast% ps auxw | grep modprobe
> root 17744 99.9 0.0 2688 412 ? RN Nov03 23223:01 modprobe
> -r ipt_state
modprobe -r is basically rmmod. rmmod and insmod are nowadays mostly
wrappers around kernel code.
So while an strace of that process might give some more information
about it, I believe that the kernel-level backtrace would be more
interesting.
For that, try either the 'p' or the 't' sysrq command. 'p' gives a stack
trace of the current process; 't' gives one for all processes. You can
issue a sysrq command either through the console (on x86:
alt-sysrq-<key>) or with:
  echo <key> >/proc/sysrq-trigger
The output goes to the kernel log, which you can read with e.g. dmesg.
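As a rough sketch of the /proc route, the write can be wrapped in a small
shell function. The name sysrq_send and its optional second argument are
made up for illustration; the second argument only exists so the function
can be pointed at a scratch file for a dry run. By default it writes to
/proc/sysrq-trigger, which needs root.

```shell
# Hypothetical helper (not part of any package): send one sysrq command.
# $1 = single sysrq key, e.g. p or t
# $2 = file to write to; defaults to /proc/sysrq-trigger (root only)
sysrq_send() {
    key=$1
    trigger=${2:-/proc/sysrq-trigger}
    case $key in
        ?) ;;   # exactly one character: accept
        *) echo "usage: sysrq_send <key> [trigger-file]" >&2
           return 1 ;;
    esac
    # The kernel acts on the character written to the trigger file.
    printf '%s\n' "$key" > "$trigger"
}
```

Something like "sysrq_send t && dmesg | tail -n 200" (as root) should then
show the task dump, assuming the kernel log buffer is large enough to hold
it.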
> ast% ps ealx | grep modprobe | grep -v grep
> 4 0 17744 1 39 19 2688 412 - RN ? 23223:38
> modprobe -r ipt_state
> ast% sudo kill 17744
> ast% sudo kill 17744
> ast% sudo kill -9 17744
> ast% sudo kill -9 17744
The signal will probably only take effect once the process leaves
whatever busy context it is in.
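One way to check whether the signals are merely queued is to look at the
State, SigPnd and ShdPnd lines of /proc/<pid>/status, on kernels that
expose them. A signal that was sent but not yet acted on may show up as a
set bit in one of the two pending masks. The function name below is made
up, and the optional file argument is only there so it can be tried on a
saved copy of a status file:

```shell
# Hypothetical helper: print the process state and pending-signal masks.
# $1 = status file to read; defaults to the calling shell's own
#      (use /proc/<pid>/status for the stuck process)
pending_signals() {
    status_file=${1:-/proc/self/status}
    # SigPnd = signals pending for the thread, ShdPnd = for the whole process
    grep -E '^(State|SigPnd|ShdPnd):' "$status_file"
}
```

For the process above that would be "pending_signals /proc/17744/status".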
> ast% !ps
> ps ealx | grep modprobe | grep -v grep
> 4 0 17744 1 39 19 2688 412 - RN ? 23224:41
> modprobe -r ipt_state
> ast%
>
> You may also notice that I tried "renice" to bump it all the way to +19
> and still it consumes 100% of the CPU. The result for asterisk is that I
> hear bits of robot noise during conversations, which is annoying as hell
> but not necessarily show stopping. But for another 19 days?? Argg!
>
> I assume that because it is 'modprobe' it has tickled some kernel bug that
> is merrily spinning away and won't respond to interrupts. I even tried to
> stop it with gdb and strace, both of which also hung and had to be killed
> with -9.
>
> It seems to be related to me screwing with the iptables a few weeks ago.
>
> Any ideas other than rebooting?
BTW: what kernel? What distribution?
--
Tzafrir Cohen
icq#16849755 jabber:tzafrir.cohen at xorcom.com
+972-50-7952406 mailto:tzafrir.cohen at xorcom.com
http://www.xorcom.com iax:guest at local.xorcom.com/tzafrir