[asterisk-bugs] [Asterisk 0016085]: Crash in __ast_pthread_mutex_unlock chan_sip after park
Asterisk Bug Tracker
noreply at bugs.digium.com
Thu Oct 29 02:47:57 CDT 2009
A NOTE has been added to this issue.
======================================================================
https://issues.asterisk.org/view.php?id=16085
======================================================================
Reported By: Micc
Assigned To:
======================================================================
Project: Asterisk
Issue ID: 16085
Category: Channels/chan_sip/General
Reproducibility: have not tried
Severity: crash
Priority: normal
Status: acknowledged
Asterisk Version: 1.6.1.6
JIRA:
Regression: No
Reviewboard Link:
SVN Branch (only for SVN checkouts, not tarball releases): N/A
SVN Revision (number only!):
Request Review:
======================================================================
Date Submitted: 2009-10-16 13:09 CDT
Last Modified: 2009-10-29 02:47 CDT
======================================================================
Summary: Crash in __ast_pthread_mutex_unlock chan_sip after park
Description:
Looking at the back trace, it looks like someone was parking and hung up
too soon or something.
======================================================================
Relationships ID Summary
----------------------------------------------------------------------
related to 0015538 [patch] Multi-tenant parking broken in ...
======================================================================
----------------------------------------------------------------------
(0112890) Micc (reporter) - 2009-10-29 02:47
https://issues.asterisk.org/view.php?id=16085#c112890
----------------------------------------------------------------------
Apparently someone has already put a lot of thought into this and has
some very interesting code in sip_hangup() in chan_sip.c. The crash is
happening in a part of the code that sits under a comment that says, "This
may get hairy..."
Whoever wrote that must have built their thought pyramid around what is
happening here, and I would love to understand what this is trying to solve.
The crash is in a call to a macro called CHANNEL_DEADLOCK_AVOIDANCE(chan),
which is in a loop that is locking the owner of the channel. I think it's
best if I just include the code snippet here so we can all see it easily.
    /* We need to get the lock on bridge because ast_rtp_set_vars will attempt
     * to lock the bridge. This may get hairy...
     */
    while (bridge && ast_channel_trylock(bridge)) {
        struct ast_channel *chan = p->owner;
        sip_pvt_unlock(p);
        do {
            /* Use chan since p->owner could go NULL on us
             * while p is unlocked
             */
            CHANNEL_DEADLOCK_AVOIDANCE(chan);
        } while (sip_pvt_trylock(p));
        bridge = p->owner ? ast_bridged_channel(p->owner) : NULL;
    }
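The general pattern behind a loop like this appears to be back-off locking:
if the second lock cannot be taken right away, release the lock already
held, give the other thread a moment, take the first lock again and retry.
Below is a minimal sketch of that idea with plain pthreads. The function
name lock_second_with_backoff and its arguments are made up for
illustration; this is not the actual CHANNEL_DEADLOCK_AVOIDANCE macro.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: the caller already holds 'held' and also needs
 * 'wanted'. Returns the number of times it had to back off. */
static int lock_second_with_backoff(pthread_mutex_t *held, pthread_mutex_t *wanted)
{
	int retries = 0;

	/* pthread_mutex_trylock() returns 0 on success, nonzero if busy. */
	while (pthread_mutex_trylock(wanted) != 0) {
		/* Let go of our own lock so that a thread holding 'wanted'
		 * and waiting for 'held' can make progress. */
		pthread_mutex_unlock(held);
		sched_yield();
		pthread_mutex_lock(held);
		/* Anything protected by 'held' may have changed while it
		 * was released and must be re-read before it is used. */
		retries++;
	}
	return retries;
}
```

The catch is the window in which 'held' is released: whatever it protects
can change or go away during that time, so a pointer saved before the
unlock (like chan in the snippet above) can be stale by the time it is
used.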
So, in my opinion, this code could be clearer about what is happening
inside the deadlock avoidance macro. Because this is such a "hairy"
situation, it might be best to bring the code out of the macro and just
include it here. I don't see where the macro is calling
ast_pthread_mutex_unlock. I see the call to lock and ast_channel_lock,
which may call it, but I would expect to see that in the backtrace.
So, let's just take a look at line 671 of lock.h, as the backtrace says.
int canlog = strcmp(filename, "logger.c") & t->tracking;
And here is the line from the backtrace:
#0  0x027a37d6 in __ast_pthread_mutex_unlock (filename=0x280d054 "chan_sip.c", lineno=5342,
    func=0x28101f4 "sip_hangup", mutex_name=0x281047b "&chan->lock_dont_use", t=0x98)
    at /usr/src/asterisk/asterisk-1.6.1.6/include/asterisk/lock.h:671
This is the first time I've used gdb to do any real debugging, so I might
be all wrong here, but it looks to me like some things got a bit mixed up.
The name of the mutex isn't supposed to be "&chan->lock_dont_use", is it?
But I don't think that's why it's crashing here. t is supposed to be a
pointer to ast_mutex_t, but if I read gdb correctly it says t is 0x98,
which is far too small a value to be a valid pointer. It looks like memory
was overwritten here.
So the question is: could the memory have been overwritten by the new calls
to ast_string_field_set? I would hope not, but maybe someone else can check
into that for me. I think it is more likely that the owner channel was
killed and the thing this is trying to prevent is not actually being
prevented.
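If the owner channel really did go away, one way t could end up as 0x98
is that chan was NULL when &chan->lock_dont_use was evaluated. This is an
assumption, not something the backtrace confirms. The address of a member
taken through a NULL pointer is simply that member's offset within the
struct, so it comes out as a small number. The sketch below shows the
arithmetic with a made-up struct; fake_channel and its 0x98 bytes of
padding are hypothetical and chosen only to match the backtrace, not taken
from the real ast_channel layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a channel struct. The real ast_channel
 * layout is not reproduced here. */
struct fake_channel {
	char earlier_members[0x98];
	long lock_dont_use;   /* stands in for the embedded mutex */
};

/* Address that &chan->lock_dont_use would have for a channel located at
 * 'base'. Uses offsetof so nothing is actually dereferenced. */
static uintptr_t lock_address(uintptr_t base)
{
	return base + offsetof(struct fake_channel, lock_dont_use);
}

/* Print the member address for a NULL channel pointer. */
static void show_null_case(void)
{
	printf("t would be 0x%lx\n", (unsigned long)lock_address(0));
}
```

With a pointer like that, the first access through it (such as t->tracking
on line 671 of lock.h) would touch an unmapped low address and crash, which
would fit the backtrace without any memory having been overwritten.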
So I'm going to try to make a few changes and upgrade to 1.6.1.8. I'm
going to put the code directly into the loop instead of the macro so I can
hopefully get a better backtrace. I hope I've shed enough light on this
issue that someone else can see the problem now. Or maybe I don't know how
to read a backtrace and use gdb, in which case I might be completely off
here.
Issue History
Date Modified Username Field Change
======================================================================
2009-10-29 02:47 Micc Note Added: 0112890
======================================================================