[asterisk-bugs] [JIRA] (ASTERISK-25404) segfault/crash in chan_pjsip_hangup ... at chan_pjsip.c

Mark Michelson (JIRA) noreply at issues.asterisk.org
Wed Sep 30 11:41:33 CDT 2015


    [ https://issues.asterisk.org/jira/browse/ASTERISK-25404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=227716#comment-227716 ] 

Mark Michelson commented on ASTERISK-25404:
-------------------------------------------

It looks like the {{PJSIP/0110-Instr-Dsgn-Prof-Lrng-Ctr-00000000}} channel somehow has two threads that each think they control it. In debug.txt, the following warning is the first sign that things are about to go south:
{noformat}
[Sep 18 19:15:31] WARNING[18107] pbx.c: PJSIP/0110-Instr-Dsgn-Prof-Lrng-Ctr-00000000 already has PBX structure?? 
{noformat}

It looks like the problem is that right after the call is connected, both Asterisk and the calling party attempt to send reinvites to each other. Both sides then answer the other's reinvite with a 491 response. Shortly after, the calling party sends another INVITE to Asterisk. When this arrives, chan_pjsip does not recognize it as a reinvite. Instead, we treat it as a new INVITE because the inv_session's state is "CONNECTING" instead of "CONFIRMED". The result is that a second thread is spawned to route the call. From this point on, messages like the following occasionally appear:
{noformat}
[Sep 18 19:15:31] DEBUG[18107][C-00000001] channel.c: Thread 0x7effaf26f700 is blocking 'PJSIP/0110-Instr-Dsgn-Prof-Lrng-Ctr-00000000', already blocked by thread 0x7f01589c3700 in procedure ast_waitfor_nandfds
{noformat}
Once one of the controlling threads hangs up the channel, the other attempts to do the same, but by that point the structures it needs have already been destroyed.
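To illustrate the "two controllers" condition described above, here is a minimal sketch of a claim-once ownership flag. It is not Asterisk code: {{struct chan}}, its {{claimed}} field and {{chan_claim()}} are hypothetical names invented for this example, and it only shows the general idea that exactly one thread should ever win control of a channel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a channel; NOT the real ast_channel. */
struct chan {
    atomic_bool claimed;   /* true once some thread controls the channel */
};

/*
 * Try to become the controlling thread for the channel.
 * Returns true for the first caller only; every later caller gets false
 * and should back off instead of routing or hanging up the channel.
 */
static bool chan_claim(struct chan *c)
{
    assert(c != NULL);
    /* atomic_exchange returns the previous value, so only the caller
     * that observed "false" actually acquired ownership. */
    return !atomic_exchange(&c->claimed, true);
}
```

In this sketch, a second thread spawned for the misdetected INVITE would get {{false}} from {{chan_claim()}} and could bail out before touching the channel. This only guards against the symptom; the root cause discussed below is the reinvite detection itself.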

It appears that the main problem here is our method of detecting that an incoming INVITE request is a reinvite. We rely on the session state, but apparently the session state can be reset from CONFIRMED back to CONNECTING in certain reinvite situations. If we fix the way that we detect reinvites, we will not have conflicting threads, and we won't have crashes.
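As a rough sketch of the difference, the following compares a state-based check with one that looks at the request itself. It relies on the general SIP rule that a request sent inside an established dialog carries a To tag, while a dialog-creating INVITE does not. All types and function names here are hypothetical stand-ins, not the pjsip or chan_pjsip API, and this is not necessarily how the eventual fix will be implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified invite-session states (not the pjsip enum). */
enum inv_state { INV_CONNECTING, INV_CONFIRMED };

/* Hypothetical view of an incoming INVITE: only the To header tag. */
struct invite_request {
    const char *to_tag;    /* NULL or "" when the To header has no tag */
};

/* Hypothetical session holding only the invite-session state. */
struct session {
    enum inv_state state;
};

/* State-based check: breaks if the state drops back to CONNECTING. */
static bool is_reinvite_by_state(const struct session *s)
{
    assert(s != NULL);
    return s->state == INV_CONFIRMED;
}

/* Request-based check: an in-dialog INVITE carries a To tag. */
static bool is_reinvite_by_to_tag(const struct invite_request *req)
{
    assert(req != NULL);
    return req->to_tag != NULL && req->to_tag[0] != '\0';
}
```

With a session whose state has gone back to CONNECTING, {{is_reinvite_by_state()}} reports a new call, whereas {{is_reinvite_by_to_tag()}} still classifies the request as a reinvite because its answer does not depend on the session state.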

> segfault/crash in chan_pjsip_hangup ... at chan_pjsip.c
> -------------------------------------------------------
>
>                 Key: ASTERISK-25404
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25404
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_pjsip
>    Affects Versions: 13.1.0
>         Environment: certified/13.1-cert3-rc1 (snapshot asterisk-20702e0 dated 17 Sep 2015. Also experienced with snapshot asterisk-5b06b6f dated 7 Sep 2015).
> PJSIP with PJProject 2.4.5
> DAHDI 2.10.2
> libpri version: 1.4.15
> Digium Phone Module for Asterisk Version 13.0_2.2.0
> Digium Phone firmware 2_0_1_0_74452
> Ubuntu 14.04.2 LTS (GNU/Linux 3.16.0-43-generic x86_64)
> ProLiant DL380 Gen9. 16 GB memory 
> Wildcard AEX2400: wctdm24xxp+ 
>            Reporter: Chet Stevens
>            Assignee: Mark Michelson
>            Severity: Critical
>         Attachments: backtrace2.txt, backtrace3.txt, backtrace.txt, debug3.zip, debug.zip
>
>
> We are experiencing frequent crashes of Asterisk (6 times on 9/18/15). kern.log shows segfault with chan_pjsip.so:
> {noformat}
> Sep 18 19:15:36 0651-Facilities-Audit-Campus kernel: [6164166.327465] asterisk[18107]: segfault at 0 ip 00007f00a7f3e7f4 sp 00007effaf26eb00 error 4 in chan_pjsip.so[7f00a7f37000+f000]
> {noformat}
> A backtrace and debug for the minute of the crash with previous 100k lines will be attached.





