[asterisk-bugs] [JIRA] (ASTERISK-26421) Segmentation Fault with ARI originate into mixing bridge with 43 clients

Mark Michelson (JIRA) noreply at issues.asterisk.org
Wed Oct 12 16:06:01 CDT 2016


    [ https://issues.asterisk.org/jira/browse/ASTERISK-26421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=232678#comment-232678 ] 

Mark Michelson commented on ASTERISK-26421:
-------------------------------------------

This issue has made it up the queue and is now assigned to me. When I saw how many frames were in the stack trace, my initial thought was that it had to be a loop in the CDR list. *It's not*. There actually is a chain of distinct CDRs that is somehow that long. The unreferencing pattern for a CDR is to call the destructor for the CDR, which unreferences the next CDR in the chain, which calls the destructor for that CDR, and so on down the line. Usually this chain is short, so the pattern does not overflow the stack. In this particular case, though, the chain somehow grew to 1639 CDRs. That resulted in the 6500+ frame stack trace you saw and the eventual stack overflow.
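
To illustrate why the stack depth tracks the chain length, here is a minimal sketch of the unreference/destroy pattern described above. It is not the Asterisk CDR code: `struct chain_cdr`, `chain_cdr_unref()`, `chain_cdr_destroy()`, `chain_cdr_build()` and the `max_depth` counter are hypothetical names invented for this example, and a plain integer stands in for the real reference counting.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a CDR; not the real Asterisk structure. */
struct chain_cdr {
    int refcount;
    struct chain_cdr *next;
};

/* Deepest nesting level reached while destroying a chain. */
static size_t max_depth;

static void chain_cdr_unref(struct chain_cdr *cdr, size_t depth);

/* Destructor: releases this node's reference on the next node, then frees. */
static void chain_cdr_destroy(struct chain_cdr *cdr, size_t depth)
{
    if (depth > max_depth) {
        max_depth = depth;
    }
    /* This call nests one level deeper for every node in the chain. */
    chain_cdr_unref(cdr->next, depth + 1);
    free(cdr);
}

/* Drop one reference; run the destructor when the count reaches zero. */
static void chain_cdr_unref(struct chain_cdr *cdr, size_t depth)
{
    if (!cdr) {
        return;
    }
    assert(cdr->refcount > 0);
    if (--cdr->refcount == 0) {
        chain_cdr_destroy(cdr, depth);
    }
}

/* Build a chain of `length` nodes, each holding the only reference to the next. */
static struct chain_cdr *chain_cdr_build(size_t length)
{
    struct chain_cdr *head = NULL;
    for (size_t i = 0; i < length; i++) {
        struct chain_cdr *node = malloc(sizeof(*node));
        if (!node) {
            abort();
        }
        node->refcount = 1;
        node->next = head;
        head = node;
    }
    return head;
}
```

Calling `chain_cdr_unref(chain_cdr_build(n), 1)` nests one unref/destroy pair per node, so the nesting depth equals `n`. In this sketch that is two frames per node; the real trace has more frames per CDR because additional reference-counting layers sit between the unref and the destructor.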

So, there are two avenues to go down when solving this:
# Change the nature of CDR destruction. Rather than growing the stack as CDRs are destroyed, use an iterative approach that performs all destruction within the same stack frame. This is probably not very difficult, and it will fix the crash itself.
# Figure out how the chain got to be that long, and determine if that is a valid case. For this, I think I'll need to have some smaller test case to be able to reproduce the issue. I understand the basic premise that Asterisk was used to place calls into a mixing bridge. And I'm pretty sure based on the stack trace that we are talking about an ARI mixing bridge here and not ConfBridge. I'm curious if you can describe (just with words, no code necessary -- yet) what your dangerous demo was doing under the hood. From the stack trace, I see some channels running the ARI "smsconference" application, but I can't tell what types of channels these are (Local?, SIP?). I also see some SIP channels dialing out to PSTN numbers. I'm not sure where these SIP channels are connected to, though, on the other side (presumably they're in the ARI mixing bridge?).
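
The first avenue could look roughly like the following. This is only a sketch under assumptions, not a proposed patch: `struct cdr_node`, `cdr_chain_build()` and `cdr_chain_unref_iterative()` are hypothetical names, and a plain integer stands in for the real reference counting. The idea is to walk the chain in a loop, dropping one reference per node and stopping as soon as a node is still referenced elsewhere.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a CDR; not the real Asterisk structure. */
struct cdr_node {
    int refcount;
    struct cdr_node *next;
};

/* Build a chain of `length` nodes, each holding the only reference to the next. */
static struct cdr_node *cdr_chain_build(size_t length)
{
    struct cdr_node *head = NULL;
    for (size_t i = 0; i < length; i++) {
        struct cdr_node *node = malloc(sizeof(*node));
        if (!node) {
            abort();
        }
        node->refcount = 1;
        node->next = head;
        head = node;
    }
    return head;
}

/*
 * Drop one reference on `node`. Each node whose count reaches zero is
 * freed, and the reference it held on its successor is dropped in the
 * same loop, so stack usage stays constant regardless of chain length.
 * Returns the number of nodes destroyed.
 */
static size_t cdr_chain_unref_iterative(struct cdr_node *node)
{
    size_t destroyed = 0;

    while (node) {
        assert(node->refcount > 0);
        if (--node->refcount > 0) {
            break; /* still referenced elsewhere; leave the rest alone */
        }
        struct cdr_node *next = node->next;
        free(node);
        destroyed++;
        node = next;
    }
    return destroyed;
}
```

With this shape, destroying a chain of 1639 nodes (or far more) uses a single stack frame. The loop also preserves the reference semantics of the recursive pattern: a node that something else still holds is left intact, along with everything after it.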

> Segmentation Fault with ARI originate into mixing bridge with 43 clients
> ------------------------------------------------------------------------
>
>                 Key: ASTERISK-26421
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-26421
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>    Affects Versions: 13.11.2
>            Reporter: Andrew Nagy
>            Assignee: Mark Michelson
>            Severity: Minor
>         Attachments: backtrace.txt
>
>
> Asterisk crashed in the middle of our dangerous demo. We set up a system that would receive an SMS, then make an outbound call and bridge that call in. At around 43 calls, Asterisk was at about 177% CPU. Since we lost the Dangerous Demo because of an Asterisk crash, we'd like this moved to our L4 support for immediate fixing. Then please bring back everyone for Astricon's Dangerous Demos so that we can try this again and win, of course.
> .
> .
> .
> .
> .
> .
> .
> .
> .
> .
> .
> .
> Just kidding. This is not a priority; I am just attaching the backtrace in case it's relevant for anyone.
> Thanks for all you do, and thanks for Astricon and Asterisk!





