[Asterisk-Dev] spike / asterisk hang? <- On a 360MHz CPU - no, 440 :)
Matt Hess
mhess at livewirenet.com
Fri Jul 15 12:04:59 MST 2005
response inline..
Jim Van Meggelen wrote:
>asterisk-dev-bounces at lists.digium.com wrote:
>
>
>>We have a steadily growing dial plan for our users on a sparc
>>netra t105.. I have noticed that audio going through the
>>asterisk (stable) system
>>sometimes becomes choppy (cuts out and restores after a few
>>seconds) when new calls are being handled.. on the system
>>running a vmstat 1 I watch the traps sys counter jump up
>>really high from the
>>normal run stats..
>>
>>
>
>I would say that is to be expected on such a platform. Running Asterisk
>on a system with a 360MHz CPU is going to limit the number of concurrent
>calls you can handle.
>
>Although at first glance this may seem a dev issue, it is not. It is a
>performance issue -- Asterisk performs all DSP work in the CPU, so as
>the load on the system increases, the chance that callers will hear the
>degradation increases. This should probably be submitted to the user's
>list for any further discussion (or taken off line).
>
>
>
It's a 440 actually ;) .. 1G ram. I surely thought a sparc would handle
more than it seems to.. but I feel this may very well be a dev issue if
looked into further.. I'll explain as I go..
>>An example dialplan entry for a user is as simple as:
>>exten => 201,1,Dial(SIP/201,,t)
>>(we've got a few hundred of these but note that we only have
>>around 15
>>calls active at any given time)
>>because of the transfer requirement media goes through the
>>asterisk server.. we are using sip on both sides of the
>>asterisk server.. ulaw codec is
>>preferred at the endpoints.
>>
>>Here's a vmstat at 1 second intervals .. at the time there
>>are 8 active
>>calls:
>> procs        memory               page   disks         traps       cpu
>> r b w    avm    fre flt re pi po fr sr sd0 cd0  int  sys  cs us sy  id
>> 0 0 0  42288 868840   7  0  0  0  0  0   0   0  367  469  61  0  0 100
>> 0 0 0  42288 868840   7  0  0  0  0  0   0   0  395  802  92  1  0  99
>> 0 0 0  42288 868840  12  0  0  0  0  0   0   0  377  554  64  0  0 100
>> 0 0 0  42288 868840   7  0  0  0  0  0   0   0  359  476  62  0  0 100
>> 0 0 0  42288 868840   7  0  0  0  0  0   0   0  364  468  62  0  0 100
>> 0 0 0  42288 868840   7  0  0  0  0  0   0   0  375  459  61  0  0 100
>> 0 0 0  42288 868840   7  0  0  0  0  0   0   0  364  480  61  0  0 100
>>
>>But on processing a call vmstat's sys traps and cpu cs
>>counters both jump way up..
>>
>> procs        memory               page   disks         traps       cpu
>> r b w    avm    fre flt re pi po fr sr sd0 cd0  int  sys  cs us sy  id
>> 0 0 0  42280 868848   7  0  0  0  0  0   0   0  709 1429 160  3  1  96
>> 0 0 0  42288 868840  11  0  0  0  0  0   0   0  659 1395 168  0  0 100
>>
>>This issue has steadily gotten worse with the addition of
>>more dialplan
>>entries like the one above..
>>
>>
>
>That is to be expected.
>
>
>
>>Note that as long as all calls are active and setup (nothing
>>happening involving dialplan) audio is perfect.. only when
>>dialplan processing happens does asterisk seem to hang up for
>>a second..
>>
>>
>
>That's because the introduction of a new call to the system requires
>work on the part of the CPU. Once the channel is established there's
>very little for the CPU to do.
>
>
>
See that's the thing.. the idle indicator on vmstat never shows the cpu
bottom out.. a cpu that is 96% idle should not be the reason
asterisk hangs the audio.. at least in my mind it should not be.. I
could understand a high influx of interrupts having an impact like this
but system calls and traps? That sure feels/seems less likely to me to
be able to literally stop the flow of packets from the system when a
call comes in or out.. and I've tested this out extensively.. it is only
asterisk packets that stop.. other processes on the system perform as
they should.. heck, a ping to a router keeps sending packets when
asterisk stops sending its audio..
I strongly believe that there is something more sinister lurking beneath
this problem than simply cpu restrictions.
I've got an identical system acting as a router..
1 0 0  49592 166272   7  0  0  0  0  0   0   0 6194   82  26 54 46   0
1 0 0  49600 166264  13  0  0  0  0  0   0   0 6307   67  24 65 35   0
1 0 0  49600 166264   7  0  0  0  0  0   0   0 6525   58  22 72 28   0
1 0 0  49600 166264   7  0  0  0  0  0   0   0 6994   53  22 58 42   0
So that's 0 idle cpu.. 6k ints a second and it was with me running an
ntop process on it.. the thing didn't even bat an eyelash at the load..
7.4 Mbps in+out was being passed at that time.. yet pings through it
only had a 0.679 ms std-dev to a popular webserver across several ip
providers. (I should note again the reason the cpu was at zero idle was
that I was beating up the system with ntop)
As I understand it ints are the most cpu consuming of all.. and yet the
ints aren't changing too much on the system.. just traps and cs go sky
high while just going through the motions of a new call.. So hopefully
that explains why I am having a little trouble understanding why a
bigger system is needed.. especially when it seems that only asterisk is
affected on the system and everything else I try seems just ducky..
But heck.. I suppose I can just ask this question.. why the heck does
asterisk generate so many traps and system calls when processing a call?
When completely idle my asterisk system sees an average of about 70 sys
traps a second.. when a call comes in.. the sys traps alone increase
by over 1000%.. and that is with a simple dialplan of just:
exten => 401,1,Dial(SIP/401)
0 0 0  42128 868912   7  0  0  0  0  0   0   0  221   65  13  0  0 100
0 0 0  42144 868896  15  0  0  0  0  0   0   0  308  759  95  2  3  95
0 0 0  42152 868888   9  0  0  0  0  0   0   0  253  314  42  0  0 100
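For what it's worth, the size of that jump can be worked out from the first two samples above (idle, then the call arriving). This is only a rough Python sketch: it assumes the column order from the vmstat header quoted earlier in this message, and the helper names parse_row and percent_increase are made up for illustration.

```python
# Column order assumed from the vmstat header quoted earlier in this message.
COLUMNS = ["r", "b", "w", "avm", "fre", "flt", "re", "pi", "po", "fr", "sr",
           "sd0", "cd0", "int", "sys", "cs", "us", "sy", "id"]


def parse_row(line):
    """Turn one vmstat data row into a dict keyed by column name."""
    values = [int(v) for v in line.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    return dict(zip(COLUMNS, values))


def percent_increase(before, after):
    """Relative growth from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


# The idle sample and the sample taken while the call was being set up.
idle = parse_row("0 0 0 42128 868912 7 0 0 0 0 0 0 0 221 65 13 0 0 100")
call = parse_row("0 0 0 42144 868896 15 0 0 0 0 0 0 0 308 759 95 2 3 95")

for col in ("int", "sys", "cs"):
    growth = percent_increase(idle[col], call[col])
    print(f"{col}: {idle[col]} -> {call[col]} ({growth:.0f}% increase)")
```

On those two samples sys goes up by roughly 1068% and cs by about 631%, while int rises only about 39%.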
So I guess my question has morphed into something along the lines of:
Why is the dialplan so expensive to parse, and is any work going
into making it more system-friendly or efficient?
>>Am I going crazy or am I near the mark in troubleshooting
>>this.. and if
>>so (near the mark and not crazy) then how can I help improve
>>the situation?
>>
>>
>
>Perhaps there are some performance optimizations that could improve
>matters somewhat, but I suspect that the least expensive, most painless
>way to resolve it is to replace the system with something more powerful.
>
>
>
'least expensive' is to buy something more powerful? *boggle*
;)
>Oh, also, you are ONLY running Asterisk on that server, yes? No
>database, web server, GUI desktop, or such?
>
>
>
Yes, only asterisk.. it's all alone.
I definitely appreciate continued feedback as I'd love to get this
nailed down..
>Regards,
>
>Jim.
>
>
>--
>Jim Van Meggelen
>jim at vanmeggelen.ca
>www.oreillynet.com/cs/catalog/view/au/2177
>
>
>