[asterisk-dev] Help Solve the Mysterious Slowdown in Sip channel driver in Trunk!

Steve Murphy murf at digium.com
Mon Jun 30 16:10:45 CDT 2008


On Fri, 2008-06-27 at 17:44 -0600, Steve Murphy wrote:
> I am seeking help in solving a Mystery!!
> 
> It goes something like this:
> 
> In 1.4, the sip channel driver works fine.
> In trunk, it's slower than a dog. An old, old
> dog with a missing leg (or two).
> 
> ctooley has done tests, and the response
> time to an INVITE seems to be the culprit.
> 
> I have done several experiments, all of which
> are in the http://svn.digium.com/svn/asterisk/team/murf/mtxprof
> branch:
> 


OK, I've made some progress on this bug, I *think*.

Way back when, when I noted that the __ast_answer() was
waiting a long time for a lock, Russell asked an excellent
question on IRC, namely: Does the parent thread need to
lock the channel before calling pbx_start()?

Well, to me, that's opening that famous channel-locking
can of worms. 

Well, it looks like we are going to have to open 
that can after all.

Well, I wondered about the "apparently" heavy load during
the pbx_start call, what it might be caused by, and the
most obvious, the almost only thing possible, was the 
pbx_start() thread. I do see storms of find_channel
activity, where it is hitting the 199 iteration wall,
dozens of times in a row. Could that be the cause?
So, I chopped out the ast_channel_lock call in the 
ast_handle_request_invite() routine.

I expected to deadlock or crash, but to my pleasant
surprise, everything ran fine, and I started seeing
times like this:
                times in usec
pbx_start      | 593 | 507 | 526 | 529 | 544 | 687
pthread_create | 250 | 230 | 235 | 236 | 253 | 340
inc_call_count |   3 |   3 |   3 |   2 |   3 |   4
ast_log        | 153 |  88 |  89 |  88 |  86 | 152

Which is somewhat better than the PREVIOUS results:


pbx_start      |23533 |45376| 23743 | 50120 | 53767
pthread_create | 1219 |1447 |   306 |   605 |   800
inc_call_count |    2 |7592 |     2 |     2 |     3
ast_log        |  327 | 197 |   214 |   321 |   300


Well, then, I remembered, if you don't lock it in
ast_handle_request_invite(), then you best not
unlock it in handle_request_do(). So, I commented
out the code that unlocked the channel. I fully
expected lockups or crashes. But it ran just fine
with the same improved results.

The problem is this: does the removal of this
channel lock request defy the locking procedures
laid out in Russell's recent "Locking, coding
guidelines addition" thread on the asterisk-dev
mailing list???

One negative side-effect: I'm now locking up
in "stop gracefully". I also meant to check to
see if the other drivers lock the channel before
a pbx_start.... but I haven't done it yet.

murf


-- 
Steve Murphy
Software Developer
Digium
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 3227 bytes
Desc: not available
Url : http://lists.digium.com/pipermail/asterisk-dev/attachments/20080630/6c3c2d08/attachment.bin 


More information about the asterisk-dev mailing list