[asterisk-dev] Help Solve the Mysterious Slowdown in Sip channel driver in Trunk!
Steve Murphy
murf at digium.com
Mon Jun 30 16:10:45 CDT 2008
On Fri, 2008-06-27 at 17:44 -0600, Steve Murphy wrote:
> I am seeking help in solving a Mystery!!
>
> It goes something like this:
>
> In 1.4, the sip channel driver works fine.
> In trunk, it's slower than a dog. An old, old
> dog with a missing leg (or two).
>
> ctooley has done tests, and the response
> time to an INVITE seems to be the culprit.
>
> I have done several experiments, all of which
> are in the http://svn.digium.com/svn/asterisk/team/murf/mtxprof
> branch:
>
OK, I've made some progress on this bug, I *think*.
Way back when, when I noted that the __ast_answer() was
waiting a long time for a lock, Russell asked an excellent
question on IRC, namely: Does the parent thread need to
lock the channel before calling pbx_start()?
Well, to me, that's opening that famous channel-locking
can of worms.
Well, it looks like we are going to have to open
that can after all.
Well, I wondered about the "apparently" heavy load during
the pbx_start call, what it might be caused by, and the
most obvious, the almost only thing possible, was the
pbx_start() thread. I do see storms of find_channel
activity, where it is hitting the 199 iteration wall,
dozens of times in a row. Could that be the cause?
So, I chopped out the ast_channel_lock call in the
ast_handle_request_invite() routine.
I expected to deadlock or crash, but to my pleasant
surprise, everything ran fine, and I started seeing
times like this:
times in usec
pbx_start | 593 | 507 | 526 | 529 | 544 | 687
pthread_create | 250 | 230 | 235 | 236 | 253 | 340
inc_call_count | 3 | 3 | 3 | 2 | 3 | 4
ast_log | 153 | 88 | 89 | 88 | 86 | 152
Which is somewhat better than the PREVIOUS results:
pbx_start |23533 |45376| 23743 | 50120 | 53767
pthread_create | 1219 |1447 | 306 | 605 | 800
inc_call_count | 2 |7592 | 2 | 2 | 3
ast_log | 327 | 197 | 214 | 321 | 300
Well, then, I remembered, if you don't lock it in
ast_handle_request_invite(), then you best not
unlock it in handle_request_do(). So, I commented
out the code that unlocked the channel. I fully
expected lockups or crashes. But it ran just fine
with the same improved results.
The problem is this: does the removal of this
channel lock request defy the locking procedures
laid out in Russell's recent "Locking, coding
guidelines addition" thread on the asterisk-dev
mailing list???
One negative side-effect: I'm now locking up
in "stop gracefully". I also meant to check to
see if the other drivers lock the channel before
a pbx_start.... but I haven't done it yet.
murf
--
Steve Murphy
Software Developer
Digium
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 3227 bytes
Desc: not available
Url : http://lists.digium.com/pipermail/asterisk-dev/attachments/20080630/6c3c2d08/attachment.bin
More information about the asterisk-dev
mailing list