[Asterisk-Users] Processor load spikes

Mon Feb 23 10:42:05 MST 2004

Thanks for the response. I plan on trying Slackware on my backup/test
Asterisk server when I have a new backup server ready in a few weeks. I've
noticed in some database machine testing that Slackware starts up in about
half the time of Red Hat and doesn't have all of that Red Hat junk either.
I'll post my results running Slackware after I've had time to test it.

When I said crashed I meant that the whole operating system crashed, so no
backtrace possible.

Thanks,

MATT---

-----Original Message-----
From: Steven Critchfield [mailto:critch at basesys.com]
Sent: Monday, February 23, 2004 12:27 PM
To: asterisk-users at lists.digium.com
Subject: Re: [Asterisk-Users] Processor load spikes

On Mon, 2004-02-23 at 09:19, mattf wrote:
> I always keep a terminal window open with "top" running for my asterisk
> servers. Since we've had Asterisk in production, for about 9 months, I've
> noticed with every platform and every card we've tried that the load
> average will be going along at about 0.1 to 0.5 with about 30 channels
> (15 SIP -> Zap conversations) going and then at seemingly random times
> the load average will jump to over 2.0.
> 
> All the while the processor idle never goes below 50%.
> 
> Does anyone know what the asterisk process is doing that causes these load
> jumps?
> (I have determined that initiating new calls or hanging up calls is not a
> factor in the timing of these jumps)

First a word on load averages as opposed to percent idle of CPU. Load
average is the average number of processes running or awaiting CPU
service (on Linux, processes blocked in uninterruptible I/O wait are
counted as well). A process could be idle if it has no real work to
complete and has allowed the CPU to skip on to another process. Percent
idle is easier to understand, as it is how much of the CPU's time is
spent waiting for a process to need servicing.

The problem of using top to monitor load is much like quantum physics,
you change the value when you observe it. So part of your spike may be
in timing of the observation. 
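
One way to reduce the observer effect is to sample the kernel's own
counters directly instead of leaving top running. The following is only
a minimal sketch, assuming a Linux system with /proc mounted; the
function names are made up for illustration and are not part of
Asterisk or any existing tool.

```python
import time


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_cpu_times(stat_text):
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Fields are user, nice, system, idle, then (on newer kernels)
            # iowait, irq, softirq, steal. Guest columns are skipped because
            # they are already included in user time.
            ticks = [int(v) for v in fields[1:9]]
            return ticks[3], sum(ticks)
    raise ValueError("no aggregate cpu line found")


def idle_percent(before, after):
    """Percent of CPU time spent idle between two (idle, total) samples."""
    d_idle = after[0] - before[0]
    d_total = after[1] - before[1]
    return 100.0 * d_idle / d_total if d_total else 0.0


def read_file(path):
    with open(path) as fh:
        return fh.read()


def sample(interval=5.0):
    """Take one reading: (1-minute load average, percent idle over the interval)."""
    before = parse_cpu_times(read_file("/proc/stat"))
    time.sleep(interval)
    after = parse_cpu_times(read_file("/proc/stat"))
    load1, _, _ = parse_loadavg(read_file("/proc/loadavg"))
    return load1, idle_percent(before, after)
```

Calling sample() in a loop and appending the result to a log file with
a timestamp gives a record of when the spikes happen, at a much lower
cost than a continuously refreshing top.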

There are many operations that could affect the load average. Any new
threads loading would be in a high busy state until the loading period
is over and the process starts idle looping. Load mozilla up sometime
while watching the load on your system shoot up. Your percent idle may
still stay smallish since it is mostly exercising the disk subsystem and
the CPU is waiting most of that time. 

If you are seeing a load average climb, you should identify the
processes starting or running at that time. If it is falling, the
processes have either completed the busy cycle, or have gone away. 
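
To see which processes are contributing at the moment of a spike, one
option is to list everything that is currently runnable or in
uninterruptible wait. This is a rough sketch, assuming Linux and the
/proc/<pid>/stat layout; busy_processes and parse_stat are hypothetical
helper names. It looks only at the main thread of each process, so
individual threads (found under /proc/<pid>/task on kernels that expose
them) are not listed separately.

```python
import os


def parse_stat(stat_text):
    """Return (pid, command, state) from the text of /proc/<pid>/stat."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split on the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    command = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, command, state


def busy_processes(proc_root="/proc"):
    """List (pid, command, state) for processes in state R (runnable) or D (disk wait)."""
    busy = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, command, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # the process exited while we were looking
        if state in ("R", "D"):
            busy.append((pid, command, state))
    return busy
```

Running busy_processes() whenever the load average crosses a threshold
and logging the result should show whether it is asterisk itself or
something else on the box that is queued up.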

It is still likely though that you are seeing some errant behavior in
RH9 caused by the new thread library. There may be a broken select
function or something similar that is causing your trouble. Maybe you
should try an older RH, or a different distribution and see if this
happens as well.

> I have loaded up the channels on a test server to see what will happen if
> the load spikes while it is already at 2.0. With 100 channels (50 SIP ->
> Zap conversations) it ran for 4 hours with the load averaging around 2.0
> (on a non-SMP P4), and then I got a spike, the load went up to 8.0, and
> the server crashed.

Did the whole system crash or did just asterisk crash? If it was just
asterisk, did you get a core dump and did you do a backtrace on it?

-- 
Steven Critchfield  <critch at basesys.com>
