[asterisk-users] Asterisk on Debian / Sparc taking up 95%+ CPU with No calls on the system

Wed Jul 6 06:11:26 CDT 2011

On Wed, Jul 6, 2011 at 7:02 AM, Tzafrir Cohen <tzafrir.cohen at xorcom.com> wrote:

> On Wed, Jul 06, 2011 at 06:15:26AM -0400, A E [Gmail] wrote:
> > On Wed, Jul 6, 2011 at 3:21 AM, Tzafrir Cohen
> > <tzafrir.cohen at xorcom.com> wrote:
> >
> > > On Tue, Jul 05, 2011 at 08:30:52PM -0400, A E [Gmail] wrote:
> > > > hello people,
> > > >
> > > > I am running v1.8.4.2 on Debian squeeze on a SPARC platform, and for
> > > > some reason I have noticed that after only a few test calls, the
> > > > asterisk process is running at between 95% and 99.9% CPU when there's
> > > > absolutely nothing on the system. This is a clean Asterisk system on
> > > > an internal network with nothing else on it and no calls on it, but
> > > > it's still sitting at 96% CPU.
> > > >
> > > > I'm not a developer, so I'm not that adept at using debug tools etc.
> > > > to figure out why it's doing that. Could anyone please tell me how I
> > > > can figure out why it's doing this and/or help debug this? It makes
> > > > no sense for it to be using CPU with nothing happening on the system.
> > >
> > > The first thing I'd do is run 'top', press shift H, and see what is/are
> > > the offending thread(s).
> > >
> > > Is it a single thread? Two? More?
> > >
> > > Is it all "user" time? Much of it is "system" time?
> > >
> > > If you strace the PID of the top thread (strace -p PID), what do you
> > > see?
> > >
> >
> > Hi Tzafrir,
> >
> > Thanks for the comments and suggestions. I did all of that, and this is
> > what I found:
> >
> > - After pressing Shift-H, there was only a single thread taking all of
> >   the CPU.
> > - 33% was user time and 66% was system time.
> > - When I ran strace on the PID of the offending thread, it just scrolled
> >   some messages past my screen, which I couldn't capture and can't
> >   remember :(
>
> Just press Ctrl-C.

Haha, I did that, but since then I have done a hundred other things in my SSH
window, which only buffers 5000 lines, and those messages have scrolled past.
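
For next time, so the output does not scroll out of the terminal buffer again,
strace can write to a file instead of the screen. A minimal sketch in Python,
assuming strace is installed and the script runs as root or as the user that
owns the asterisk process. The function names, the thread ID and the output
path are made up for illustration:

```python
import signal
import subprocess


def build_strace_cmd(tid, out_path):
    """Build an strace command that attaches to one thread and logs to a file.

    -tt adds timestamps, -o writes the trace to out_path instead of the
    terminal, -p attaches to the given thread/process ID.
    """
    return ["strace", "-tt", "-o", out_path, "-p", str(tid)]


def capture_strace(tid, out_path, seconds=10):
    """Trace the thread for a bounded time, then detach and return out_path."""
    proc = subprocess.Popen(build_strace_cmd(tid, out_path))
    try:
        proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        # SIGINT is what Ctrl-C sends; strace detaches and leaves the
        # traced process running.
        proc.send_signal(signal.SIGINT)
        proc.wait()
    return out_path


# Hypothetical usage, with 1234 standing in for the busy thread ID from top:
#   capture_strace(1234, "/tmp/asterisk-1234.strace", seconds=10)
```

The resulting file can then be attached to a mail or a bug report as it is.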

> >
> > Anyway, I've killed that process, updated the packages on the system,
> > upgraded to 1.8.4.4, and will give it another shot and see what happens.
> > It would have helped if I'd kept the system as it was so people could
> > help me figure out what was going on. But the fact that it stopped
> > responding to commands, whether killing the hung channels, reloading
> > configs, or even stopping the system, is bizarre. I hope the developers
> > pay attention to that.
>
> Developers need some data to work with :-(

Haha, of course. I have a feeling it'll happen again, though, as this is the
second time it has happened. Next time I will keep the system in that state
until we can capture enough info to try and resolve this. If I had a better
memory, I'd have remembered what the message was. Anyway, what I was trying
to say is that this is much more than just taking up all the CPU. That alone
would tell me that some thread has gone loco. But the fact that CLI and AMI
commands become unresponsive when trying to kill these zombie channels, or
when trying a "core reload" or "core stop now", tells me that this is a
bigger issue than just some thread gone nuts and channels being hung.
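
The same per-thread picture that 'top' with Shift-H gives can also be saved as
text by reading /proc, which is easier to paste into a mail. A rough sketch,
assuming a Linux /proc layout (utime and stime as fields 14 and 15 of
/proc/PID/task/TID/stat). The function names are hypothetical, and the numbers
are cumulative clock ticks since each thread started, not a live percentage:

```python
import os


def parse_stat(line):
    """Parse one /proc/PID/task/TID/stat line into (tid, name, utime, stime)."""
    # The thread name sits in parentheses and may itself contain spaces,
    # so split on the last ')' rather than on whitespace.
    head, rest = line.rsplit(")", 1)
    tid_text, name = head.split("(", 1)
    fields = rest.split()
    # 'rest' starts at field 3 (state), so utime (field 14) is index 11
    # and stime (field 15) is index 12.
    return int(tid_text), name, int(fields[11]), int(fields[12])


def thread_cpu(pid):
    """Return (tid, name, utime, stime) for every thread, busiest first."""
    rows = []
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                rows.append(parse_stat(f.read()))
        except (IOError, OSError):
            continue  # the thread exited while we were reading
    rows.sort(key=lambda r: r[2] + r[3], reverse=True)
    return rows


def user_system_split(utime, stime):
    """Return (user %, system %) of the CPU time a thread has consumed."""
    total = utime + stime
    if total == 0:
        return 0.0, 0.0
    return 100.0 * utime / total, 100.0 * stime / total


# Hypothetical usage, with 1234 standing in for the asterisk PID:
#   for tid, name, utime, stime in thread_cpu(1234)[:5]:
#       print(tid, name, user_system_split(utime, stime))
```

A thread stuck in a loop should sort to the top, and a high system share
suggests it is spinning on a system call, which is what strace would show.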