[asterisk-bugs] [JIRA] (ASTERISK-27210) Getting segfault in res_pjsip.so and libasteriskpj.so.2
Andreas Krüger (JIRA)
noreply at issues.asterisk.org
Fri Oct 27 04:15:21 CDT 2017
[ https://issues.asterisk.org/jira/browse/ASTERISK-27210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=239677#comment-239677 ]
Andreas Krüger commented on ASTERISK-27210:
-------------------------------------------
We suspect there is a locking issue in PJSIP.
From what we can see (please correct us if we are wrong), Asterisk appears to be crashing at ../src/pj/lock.c:290, in the function grp_lock_acquire().
The function is as follows:
{code}
static pj_status_t grp_lock_acquire(LOCK_OBJ *p)
{
    pj_grp_lock_t *glock = (pj_grp_lock_t*)p;
    grp_lock_item *lck;

    pj_assert(pj_atomic_get(glock->ref_cnt) > 0);

    lck = glock->lock_list.next;
    while (lck != &glock->lock_list) {
        pj_lock_acquire(lck->lock);
        lck = lck->next;
    }
    grp_lock_set_owner_thread(glock);
    pj_grp_lock_add_ref(glock);
    return PJ_SUCCESS;
}
{code}
According to the dump:
{code}
glock = 0x7f1100042df8
lck = 0x0
{code}
So lck is NULL.
It therefore looks like the function calls pj_lock_acquire(lck->lock) while lck is NULL, and dereferencing lck->lock would then cause the segfault.
So the real question is: why would lck = glock->lock_list.next; yield NULL? Is that ever expected?
Or is a NULL check simply missing in this function?
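One way to frame the question: the loop in grp_lock_acquire() stops when lck comes back around to &glock->lock_list, which suggests the list is circular and that an empty list would have its head pointing at itself rather than at NULL. If that reading is right, a NULL next would point to the glock memory having been cleared or released while still in use, and a NULL check would only hide the symptom. The following is a minimal sketch of that idea, not PJSIP code: struct node, list_init, list_push_back and walk_checked are hypothetical names, and lock_id merely stands in for the real lock pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a circular list node; NOT the real grp_lock_item. */
struct node {
    struct node *prev;
    struct node *next;
    int lock_id;                /* placeholder for the real lock pointer */
};

/* An empty circular list points at itself, so next is never NULL. */
static void list_init(struct node *head)
{
    assert(head != NULL);
    head->prev = head;
    head->next = head;
}

/* Append n just before the head, i.e. at the tail of the list. */
static void list_push_back(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Walk the list the same way grp_lock_acquire() does, but report a broken
 * link instead of dereferencing it.
 * Returns the number of nodes visited, or -1 if a NULL link is found. */
static int walk_checked(const struct node *head)
{
    int count = 0;
    const struct node *it = head->next;

    while (it != head) {
        if (it == NULL)
            return -1;          /* list memory was zeroed or overwritten */
        count++;
        it = it->next;
    }
    return count;
}
```

In this sketch a freshly initialized head walks zero nodes, and only a head whose next field has been overwritten with NULL makes walk_checked() return -1, which is the situation the dump above seems to show.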
> Getting segfault in res_pjsip.so and libasteriskpj.so.2
> -------------------------------------------------------
>
> Key: ASTERISK-27210
> URL: https://issues.asterisk.org/jira/browse/ASTERISK-27210
> Project: Asterisk
> Issue Type: Bug
> Security Level: None
> Components: Resources/res_pjsip
> Affects Versions: 14.5.0
> Environment: Ubuntu 16.04.2 LTS
> Reporter: Jeppe Ryskov Larsen
> Assignee: Unassigned
> Attachments: 20171027_asterisk-ASTERISK-27210-results.tar.gz, asterisk-ASTERISK-27210-results.tar.gz
>
>
> Suddenly we got these 4 segfaults in a relatively short timespan.
> {code}
> Aug 21 10:38:28 osl1-voip-cluster01-asterisk05 kernel: [28594.890030] asterisk[27799]: segfault at 1e0 ip 00007f341b163500 sp 00007f34331a1a08 error 4 in res_pjsip.so[7f341b151000+49000]
> Aug 21 10:40:43 osl1-voip-cluster01-asterisk05 kernel: [28729.766816] asterisk[5930]: segfault at 1e0 ip 00007fe2552bb500 sp 00007fe0779b1a08 error 4 in res_pjsip.so[7fe2552a9000+49000]
> Aug 21 10:41:52 osl1-voip-cluster01-asterisk05 kernel: [28799.083666] asterisk[7569]: segfault at 18 ip 00007f7828c780f8 sp 00007f7722f21940 error 4 in libasteriskpj.so.2[7f7828b6f000+15a000]
> Aug 21 10:42:44 osl1-voip-cluster01-asterisk05 kernel: [28850.946232] asterisk[9174]: segfault at 90 ip 00007f8c9fd630e8 sp 00007f8ba1d18940 error 4 in libasteriskpj.so.2[7f8c9fc5a000+15a000]
> {code}
> After investigating the circumstances during that timespan, I saw no behaviour out of the ordinary. No changes have been made in the last months, and this has never occurred before.
> Sadly, this is running in our production system, where we have debug turned off, so I cannot provide a backtrace. I was just hoping that someone has seen something similar before, or can point us in the right direction for collecting more information so we can debug this further.
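
Even without a backtrace, the kernel lines quoted above carry some information: each one gives the faulting instruction pointer (ip) and the start address of the mapping it fell in (the value before the + in the brackets). Subtracting the two gives an offset inside the mapping that can be compared between crashes. A minimal sketch in C, where module_offset is a hypothetical helper name and the offset is assumed to be relative to the reported mapping, which is not necessarily the start of the file on disk:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: offset of the faulting instruction within the mapping.
 * ip   - the "ip" field of the kernel segfault line
 * base - the mapping start, i.e. the value before '+' in "lib.so[base+size]" */
static uint64_t module_offset(uint64_t ip, uint64_t base)
{
    assert(ip >= base);         /* the ip must lie inside the mapping */
    return ip - base;
}
```

For the four lines above, module_offset(0x00007f341b163500, 0x7f341b151000) and module_offset(0x00007fe2552bb500, 0x7fe2552a9000) both give 0x12500 in res_pjsip.so, while the two libasteriskpj.so.2 crashes give 0x1090f8 and 0x1090e8. So each pair of crashes lands at or very near the same instruction. With matching debug symbols installed, such offsets could be given to a tool like addr2line to look for the corresponding source line.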
--
This message was sent by Atlassian JIRA
(v6.2#6252)