[asterisk-bugs] [JIRA] (ASTERISK-27170) segfault in pj_sockaddr_in_set_str_addr

Wed Aug 16 09:39:08 CDT 2017

    [ https://issues.asterisk.org/jira/browse/ASTERISK-27170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=238103#comment-238103 ] 

nappsoft commented on ASTERISK-27170:
-------------------------------------

FYI: today we had the 5th crash on the live system since I started capturing SIP traces. In every single case the customer dialed *8, which makes the AGI script execute PickupChan. In all 5 cases the system crashed just after this particular call ended (within about 0-1 seconds, always when a thread tried to allocate memory). So this is still all we know, and the strange thing is that the crash is still hard to reproduce: I know some of the necessary conditions, but obviously not all of them.

While running agi debug I noticed that the AGI process number changes when doing "Exec PickupChan" from AGI, so I decided to no longer exec PickupChan in the AGI script directly. Tonight I'll deploy a version to the customer that sets a channel variable with the name of the channel to pick up, sets a dialplan priority at which the PickupChan operation will be done, and then leaves the AGI script. Maybe this helps; I'll let you know.
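
For illustration, here is a minimal sketch of that deferred-pickup approach, assuming a Python AGI script. The function names, the variable name PICKUP_TARGET and the example channel name are hypothetical; only the AGI commands SET VARIABLE and SET PRIORITY are standard. It is a sketch of the idea, not the script actually deployed.

```python
import sys


def agi_command(command, out=sys.stdout, inp=sys.stdin):
    """Write one AGI command and return the reply line sent back by Asterisk."""
    out.write(command + "\n")
    out.flush()
    return inp.readline().strip()


def defer_pickup(target_channel, priority, out=sys.stdout, inp=sys.stdin):
    """Hand the pickup over to the dialplan instead of running Exec PickupChan here.

    PICKUP_TARGET is a hypothetical variable name; the dialplan step at
    `priority` is assumed to call PickupChan(${PICKUP_TARGET}).
    """
    replies = [
        # Store the channel to pick up in a channel variable.
        agi_command('SET VARIABLE PICKUP_TARGET "%s"' % target_channel, out, inp),
        # Tell Asterisk where the dialplan should continue once the script exits.
        agi_command("SET PRIORITY %s" % priority, out, inp),
    ]
    return replies

# In a real AGI script the agi_* header lines would be read from stdin first,
# then defer_pickup(...) would be called and the script would simply exit.
```

The script returns right after the two commands, so the PickupChan application runs in the normal dialplan context rather than inside the AGI session.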

> segfault in pj_sockaddr_in_set_str_addr
> ---------------------------------------
>
>                 Key: ASTERISK-27170
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-27170
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: PBX/General
>    Affects Versions: 13.16.0
>         Environment: 64bit linux musl 1.1.15
>            Reporter: nappsoft
>            Assignee: Unassigned
>         Attachments: crashlog.txt, trace_cel_crash.txt, trace.txt, valgrind2.txt
>
>
> From time to time Asterisk crashes in pj_sockaddr_in_set_str_addr. The Asterisk version we use is 13.16.0 with some stability patches that went into 13.17.0 (we will update to 13.17.0 soon). But we already had the same crashes with unpatched 13.16.0 versions and with older versions as well.
> According to the SIP traces the last thing that happened was a SIP transfer. The message flow was:
> REFER (Phone) -> 202 Accepted (PBX) -> NOTIFY Trying (PBX) -> NOTIFY OK (PBX) -> BYE (Phone) -> OK (PBX for the BYE message) -> OK (Phone for the NOTIFY Trying) -> OK (Phone for the NOTIFY OK)
> As these are embedded systems with limited resources it's always difficult to produce crash dumps there or to run Asterisk in gdb. I'll try to get some complete backtraces in the future, but maybe somebody has an idea based on the described scenario. Maybe there is a race condition when the phone sends OK messages for the NOTIFY messages after it has already sent a BYE for the same call?
