[asterisk-bugs] [JIRA] (ASTERISK-28869) pjsip: Crash in timer when sending request

Chris (JIRA) noreply at issues.asterisk.org
Mon May 18 12:57:25 CDT 2020


    [ https://issues.asterisk.org/jira/browse/ASTERISK-28869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=250838#comment-250838 ] 

Chris commented on ASTERISK-28869:
----------------------------------

Just wanted to update this. I know this setup isn't supported, but I noticed that the latest PJSIP release (2.10) includes some timer refactoring, so I compiled Asterisk against it. That seems to have resolved these segfaults: no issues for over a week, where we would normally have had a handful of crashes.

> pjsip: Crash in timer when sending request
> ------------------------------------------
>
>                 Key: ASTERISK-28869
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28869
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: pjproject/pjsip
>    Affects Versions: 16.5.0, 16.9.0, 16.10.0
>         Environment: Google cloud, asterisk running on n1-standard-4 (4 vCPUs, 15 GB memory - Ubuntu 18.04.2 LTS ). Multiple asterisk instances using realtime config (share same mysql instance). Pjsip devices. Asterisk nodes are running behind Kamailio+RTPEngine. Softphones are Webrtc based connecting to Kamailio via websockets
>            Reporter: Chris
>            Assignee: Unassigned
>              Labels: webrtc
>         Attachments: 73fgqxw2.txt, coredump.tar.gz
>
>
> We have been running into random segfaults in Asterisk (tried versions 16.5, 16.9, and now 16.10). The segfaults seem random but only appear to happen under a load of 15+ concurrent calls; they never occur after hours, when the system is in use but under lighter load. Performance on the Asterisk machines at the time of a crash seems fine, and we see no errors or warnings in the logs.
> Originally we started with one Asterisk node, but because of the crashes we ended up adding multiple Asterisk nodes (behind Kamailio) so that we could take more calls. It seems that receiving or transmitting any SIP message (publish device state, qualify AOR, hangup, etc.) can cause a segfault.
> Segfaults always occur inside pop_freelist in the pjsip library. ht's timer_ids_freelist always appears corrupt/out of range at the time of the segfault, e.g.:
> ht= {pool = 0x5616898c0140, max_size = 262142, cur_size = 23, max_entries_per_poll = 10, lock = 0x561689d02440, auto_delete_lock = 1, heap = 0x7f45530e7038,
>   timer_ids = 0x561689c02448, timer_ids_freelist = 1587220789, callback = 0x0}
> Thread1 backtrace: https://pastebin.com/73fgqxw2
> We have the full coredumps/logs available, but they contain customer data. The Asterisk nodes are compiled with the "don't optimize" and "better backtraces" options.
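
For readers unfamiliar with index-based free lists, here is a rough sketch of why an out-of-range head value such as timer_ids_freelist = 1587220789 (against max_size = 262142) would fault when the next ID is popped. This is a simplified model, not the pjlib source: the names model_heap, model_init, and model_pop_checked are hypothetical, and the layout (free slots holding the negated index of the next free slot) is an assumption made for illustration. The bounds check is included only to show the failure condition; the dump above suggests the real code indexes the array without one.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified model of a timer-ID free list.
 * Assumption: each free slot stores -(index of the next free slot). */
typedef struct {
    size_t max_size;   /* number of slots in timer_ids */
    long  *timer_ids;  /* free slots hold the negated next-free index */
    long   freelist;   /* index of the first free slot */
} model_heap;

/* Chain every slot into the free list: slot i points at slot i + 1.
 * The last slot points one past the end, which marks exhaustion. */
static int model_init(model_heap *h, size_t max_size)
{
    h->timer_ids = malloc(max_size * sizeof *h->timer_ids);
    if (h->timer_ids == NULL)
        return -1;
    h->max_size = max_size;
    for (size_t i = 0; i < max_size; ++i)
        h->timer_ids[i] = -(long)(i + 1);
    h->freelist = 0;
    return 0;
}

/* Pop the head of the free list, or return -1 if the head index is
 * outside [0, max_size). Without this check, a corrupted head would be
 * used directly as an array index and read far outside the allocation. */
static long model_pop_checked(model_heap *h)
{
    if (h->freelist < 0 || (size_t)h->freelist >= h->max_size)
        return -1;
    long id = h->freelist;
    h->freelist = -h->timer_ids[id];  /* follow the link to the next free slot */
    return id;
}
```

With max_size = 262142, setting the head to 1587220789 makes model_pop_checked return -1, whereas an unchecked pop would dereference an address far beyond the end of timer_ids. The sketch only shows the symptom; it says nothing about what corrupted the head in the first place.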



--
This message was sent by Atlassian JIRA
(v6.2#6252)


