[asterisk-bugs] [JIRA] (ASTERISK-27210) Getting segfault in res_pjsip.so and libasteriskpj.so.2

Andreas Krüger (JIRA) noreply at issues.asterisk.org
Fri Oct 27 04:15:21 CDT 2017


    [ https://issues.asterisk.org/jira/browse/ASTERISK-27210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=239677#comment-239677 ] 

Andreas Krüger commented on ASTERISK-27210:
-------------------------------------------

We suspect there is a locking issue in PJSIP.

From what we can see (please correct us if we are wrong), Asterisk is crashing at ../src/pj/lock.c:290 in the function grp_lock_acquire().

The function is as follows:

{code}
static pj_status_t grp_lock_acquire(LOCK_OBJ *p)
{
    pj_grp_lock_t *glock = (pj_grp_lock_t*)p;
    grp_lock_item *lck;

    pj_assert(pj_atomic_get(glock->ref_cnt) > 0);

    lck = glock->lock_list.next;
    while (lck != &glock->lock_list) {

        pj_lock_acquire(lck->lock);
        lck = lck->next;

    }
    grp_lock_set_owner_thread(glock);
    pj_grp_lock_add_ref(glock);
    return PJ_SUCCESS;
}
{code}

According to the dump:

{code}
glock = 0x7f1100042df8
lck = 0x0
{code}

So lck is NULL.

So it looks like it calls pj_lock_acquire(lck->lock) while lck is NULL, and dereferencing lck->lock would then cause the segfault.
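
One way to sanity-check this reading against the kernel log in the issue description is to work out which address a read of lck->lock touches when lck is NULL. The sketch below uses a hypothetical stand-in struct (item_model); its field order is our assumption about how grp_lock_item is laid out (two list links, a priority, then the lock pointer) and is not copied from the pjproject headers.

```c
#include <stddef.h>

/* Hypothetical stand-in for pjlib's grp_lock_item. The field order is an
 * ASSUMPTION (list links first, then a priority, then the lock pointer);
 * verify it against lock.c in your pjproject version. */
struct item_model {
    struct item_model *prev;  /* assumed list link */
    struct item_model *next;  /* assumed list link */
    int                prio;  /* assumed priority field */
    void              *lock;  /* assumed lock pointer, read as lck->lock */
};

/* Address that reading lck->lock touches when lck == NULL:
 * simply the offset of the lock member within the struct. */
static size_t fault_addr_for_null_item(void)
{
    return offsetof(struct item_model, lock);
}
```

If the assumed layout is right, on a 64-bit build this offset is 0x18 (two 8-byte pointers, a 4-byte int plus 4 bytes of padding), which would be consistent with the "segfault at 18" entry for libasteriskpj.so.2 in the log. It would not explain the other fault addresses (1e0, 90).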

So the real question is: why would lck = glock->lock_list.next; yield NULL? Is that a valid state?
Or is this function simply missing a NULL check?
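
To illustrate why we doubt that a NULL check alone would be the right fix: as far as we understand (this is an assumption, not something we have verified in the pjproject source), pjlib lists are circular with a sentinel head, so a properly initialised list never contains a NULL link. The miniature below is a hypothetical model of such a list, not pjlib code; all names in it are ours.

```c
#include <stddef.h>

/* Hypothetical miniature of a circular list with a sentinel head. */
struct node {
    struct node *prev;
    struct node *next;
};

/* An empty list points back at its own head, so next is never NULL. */
static void list_init(struct node *head)
{
    head->prev = head;
    head->next = head;
}

/* Append n just before the head, i.e. at the tail of the list. */
static void list_push_back(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Walk the list the way the loop in grp_lock_acquire() does.
 * Returns the item count, or -1 if a NULL link is found. In this model
 * that only happens if the head was never initialised or its memory was
 * overwritten afterwards. */
static int list_walk(const struct node *head)
{
    int count = 0;
    const struct node *n = head->next;

    while (n != head) {
        if (n == NULL)
            return -1;
        count++;
        n = n->next;
    }
    return count;
}
```

In this model, list_walk() only returns -1 once the head has been zeroed or otherwise clobbered. If the real list behaves the same way, lck == NULL would suggest that the group lock's memory was released or overwritten while still in use, and a NULL check would only hide that.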

> Getting segfault in res_pjsip.so and libasteriskpj.so.2
> -------------------------------------------------------
>
>                 Key: ASTERISK-27210
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-27210
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Resources/res_pjsip
>    Affects Versions: 14.5.0
>         Environment: Ubuntu 16.04.2 LTS
>            Reporter: Jeppe Ryskov Larsen
>            Assignee: Unassigned
>         Attachments: 20171027_asterisk-ASTERISK-27210-results.tar.gz, asterisk-ASTERISK-27210-results.tar.gz
>
>
> Suddenly we got these 4 segfaults in a relatively short timespan. 
> {code}
> Aug 21 10:38:28 osl1-voip-cluster01-asterisk05 kernel: [28594.890030] asterisk[27799]: segfault at 1e0 ip 00007f341b163500 sp 00007f34331a1a08 error 4 in res_pjsip.so[7f341b151000+49000]
> Aug 21 10:40:43 osl1-voip-cluster01-asterisk05 kernel: [28729.766816] asterisk[5930]: segfault at 1e0 ip 00007fe2552bb500 sp 00007fe0779b1a08 error 4 in res_pjsip.so[7fe2552a9000+49000]
> Aug 21 10:41:52 osl1-voip-cluster01-asterisk05 kernel: [28799.083666] asterisk[7569]: segfault at 18 ip 00007f7828c780f8 sp 00007f7722f21940 error 4 in libasteriskpj.so.2[7f7828b6f000+15a000]
> Aug 21 10:42:44 osl1-voip-cluster01-asterisk05 kernel: [28850.946232] asterisk[9174]: segfault at 90 ip 00007f8c9fd630e8 sp 00007f8ba1d18940 error 4 in libasteriskpj.so.2[7f8c9fc5a000+15a000]
> {code}
> After investigating the circumstances during that timespan, I saw no behaviour out of the ordinary. No changes have been made in the last months, and this has never occurred before.
> Sadly, this is running in our production system, where we have debug turned off, so I cannot provide a backtrace. I was just hoping that someone has seen something similar before or can point us in the right direction for collecting more information so we can debug this further.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


