[asterisk-bugs] [JIRA] (ASTERISK-28869) pjsip: Crash in timer when sending request

Chris (JIRA) noreply at issues.asterisk.org
Tue May 5 12:52:25 CDT 2020


    [ https://issues.asterisk.org/jira/browse/ASTERISK-28869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=250647#comment-250647 ] 

Chris commented on ASTERISK-28869:
----------------------------------

Attached the core dump files and a snippet of the full Asterisk log leading up to the crash.

To add more information about our setup: the dialplan is fairly simple. It plays a greeting, starts MixMonitor for certain queues, then hands the call off to our Stasis app. The app handles starting/stopping MOH and assigns the call to a queued agent by dialing the agent and then bridging the two channels via ARI.

> pjsip: Crash in timer when sending request
> ------------------------------------------
>
>                 Key: ASTERISK-28869
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28869
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: pjproject/pjsip
>    Affects Versions: 16.5.0, 16.9.0, 16.10.0
>         Environment: Google Cloud, Asterisk running on n1-standard-4 (4 vCPUs, 15 GB memory, Ubuntu 18.04.2 LTS). Multiple Asterisk instances using realtime config (sharing the same MySQL instance). PJSIP devices. Asterisk nodes are running behind Kamailio + RTPEngine. Softphones are WebRTC-based and connect to Kamailio via WebSockets.
>            Reporter: Chris
>            Assignee: Unassigned
>              Labels: webrtc
>         Attachments: 73fgqxw2.txt, coredump.tar.gz
>
>
> We have been running into random segfaults in Asterisk (tried versions 16.5 and 16.9, and now 16.10). The segfaults seem random but only appear under a load of 15+ concurrent calls (they never occur after hours, when the system is still in use but under lighter load). Performance on the Asterisk machines at the time of a crash seems fine, and we are not seeing any errors or warnings in the logs.
> Originally we started with one Asterisk node, but due to the crashes we ended up adding multiple Asterisk nodes (behind Kamailio) to let us take more calls. It seems that receiving or transmitting any SIP message (publish device state, qualify AOR, hangup, etc.) can cause a segfault.
> The segfaults always occur inside pop_freelist in the pjsip library. The timer_ids_freelist field of ht always appears corrupt/out of range at the time of the segfault, e.g.:
> ht= {pool = 0x5616898c0140, max_size = 262142, cur_size = 23, max_entries_per_poll = 10, lock = 0x561689d02440, auto_delete_lock = 1, heap = 0x7f45530e7038,
>   timer_ids = 0x561689c02448, timer_ids_freelist = 1587220789, callback = 0x0}
> Thread1 backtrace: https://pastebin.com/73fgqxw2
> We have the full core dumps and logs available, but they contain customer data. The Asterisk nodes are compiled with the DONT_OPTIMIZE and BETTER_BACKTRACES options enabled.
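
To illustrate why the timer_ids_freelist value in the dump above looks corrupt, here is a minimal sketch in C of an index-based free list. It assumes, as the field names suggest, that free slots in timer_ids are chained by index and that a valid head must be smaller than max_size (262142 in the dump, versus a head of 1587220789). The type demo_heap and the functions demo_init and demo_pop_freelist are hypothetical names for illustration only. They are not pjlib's actual code, and the range check shown is an addition for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a timer heap's id table.
 * This is NOT pjlib's real structure. */
typedef struct {
    size_t max_size;      /* number of slots in timer_ids */
    long  *timer_ids;     /* a free slot stores the negated index of the next free slot */
    long   freelist_head; /* index of the first free slot */
} demo_heap;

/* Chain every slot into the free list: 0 -> 1 -> ... -> n-1. */
static void demo_init(demo_heap *ht, long *slots, size_t n)
{
    assert(n > 0);
    ht->max_size = n;
    ht->timer_ids = slots;
    ht->freelist_head = 0;
    for (size_t i = 0; i < n; i++)
        slots[i] = -(long)(i + 1);
}

/* Pop the head of the free list.
 * Returns the popped slot index, or -1 when the head is not a valid index
 * (the list is exhausted or the head has been overwritten). Without this
 * check, a corrupt head would index far outside timer_ids. */
static long demo_pop_freelist(demo_heap *ht)
{
    long id = ht->freelist_head;

    if (id < 0 || (size_t)id >= ht->max_size)
        return -1;

    /* Free slots hold negated indices, so negate to get the next head. */
    ht->freelist_head = -ht->timer_ids[id];
    return id;
}
```

With a table of 262142 slots, a head of 1587220789 fails the range check. An unchecked pop would read timer_ids at that index, far outside the array, which is consistent with a segfault at that point.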



--
This message was sent by Atlassian JIRA
(v6.2#6252)


