[asterisk-bugs] [JIRA] (ASTERISK-27170) segfault in pj_sockaddr_in_set_str_addr

Fri Aug 4 06:28:57 CDT 2017

    [ https://issues.asterisk.org/jira/browse/ASTERISK-27170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=237936#comment-237936 ] 

nappsoft commented on ASTERISK-27170:
-------------------------------------

I can't reproduce the crash with a modified softphone that does the same "late submission" of the answers to the NOTIFY messages, so this doesn't seem to be the root cause of the crash.

However, I've made an interesting observation in the SIP traces we have from the crashes on our systems:

- We currently see exactly two types of Asterisk crashes across several systems (according to the segfault message in dmesg): one in libpjsip.so (pj_sockaddr_in_set_str_addr) and one in libc (malloc or other memory operations). The pjsip crash seems to happen after a REFER (when hanging up the channels/sending the BYE messages); the libc crash happens at the end of a call (also while sending the BYE messages).
- We had both kinds of crashes on the same day on one system. In both cases one of the channels had performed a pickup with PickupChan via AGI.
- There were only 7 PickupChan operations during the whole day, but lots of other calls.

=> So the fact that a picked-up channel was involved in both cases might be significant. I'll try to collect more samples and to reproduce the issue.

> segfault in pj_sockaddr_in_set_str_addr
> ---------------------------------------
>
>                 Key: ASTERISK-27170
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-27170
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: PBX/General
>    Affects Versions: 13.16.0
>         Environment: 64bit linux musl 1.1.15
>            Reporter: nappsoft
>
> From time to time Asterisk crashes in pj_sockaddr_in_set_str_addr. The Asterisk version we use is 13.16.0 with some stability patches that went into 13.17.0 (we will update to 13.17.0 soon). But we already had the same crashes with unpatched 13.16.0 versions and with older versions as well.
> According to the SIP traces, the last thing that happened was a SIP transfer. The message flow was:
> REFER (Phone) -> 202 Accepted (PBX) -> NOTIFY Trying (PBX) -> NOTIFY OK (PBX) -> BYE (Phone) -> OK (PBX for the BYE message) -> OK (Phone for the NOTIFY Trying) -> OK (Phone for the NOTIFY OK)
> As these are embedded systems with limited resources, it's always difficult to capture crash dumps there or to run Asterisk in gdb... I'll try to get some complete backtraces in the future, but maybe somebody has an idea based on the described scenario. => Maybe there is a race condition when the phone sends the OK messages for the NOTIFY messages after it has already sent a BYE for the same call?
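
To look for the ordering described in the message flow above in other SIP traces, something like the following could be used. It is only a rough sketch: the `SipEvent` record and the `late_notify_responses` helper are hypothetical names made up for illustration, the trace is assumed to be already parsed into an ordered list for a single call, and a response is recognised simply by its numeric status code.

```python
from typing import List, NamedTuple


class SipEvent(NamedTuple):
    """One message of a single call, in trace order (hypothetical record)."""
    sender: str       # "Phone" or "PBX"
    kind: str         # request method ("REFER", "NOTIFY", "BYE") or status code ("200")
    cseq_method: str  # method from the CSeq header; for a response, the request it answers


def late_notify_responses(events: List[SipEvent]) -> List[int]:
    """Return the indexes of responses to NOTIFY that arrive after a BYE request."""
    bye_seen = False
    late = []
    for index, event in enumerate(events):
        is_response = event.kind.isdigit()
        if not is_response and event.kind == "BYE":
            bye_seen = True
        elif is_response and event.cseq_method == "NOTIFY" and bye_seen:
            late.append(index)
    return late


# The message flow from the issue description, written as events.
flow = [
    SipEvent("Phone", "REFER", "REFER"),
    SipEvent("PBX", "202", "REFER"),
    SipEvent("PBX", "NOTIFY", "NOTIFY"),  # NOTIFY Trying
    SipEvent("PBX", "NOTIFY", "NOTIFY"),  # NOTIFY OK
    SipEvent("Phone", "BYE", "BYE"),
    SipEvent("PBX", "200", "BYE"),
    SipEvent("Phone", "200", "NOTIFY"),   # answer to NOTIFY Trying
    SipEvent("Phone", "200", "NOTIFY"),   # answer to NOTIFY OK
]

print(late_notify_responses(flow))  # [6, 7]
```

A non-empty result only says that the trace shows the same "late answer" ordering; it does not show that this ordering causes the crash.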
