[asterisk-bugs] [JIRA] Commented: (ASTERISK-20335) Crash in ast_cel_report_event

Mark Michelson (JIRA) noreply at issues.asterisk.org
Thu Sep 6 12:56:07 CDT 2012


    [ https://issues.asterisk.org/jira/browse/ASTERISK-20335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=196686#comment-196686 ] 

Mark Michelson commented on ASTERISK-20335:
-------------------------------------------

I've been working on this the past day, and I've unfortunately not made any good progress with it.

I've run thousands of calls through Asterisk 1.8.16.0-rc1 with CEL enabled and I have not managed to get a crash yet. I have tried dialing from within and from outside of a macro. I've tried dialing extensions that just play back audio, and I've tried dialing extensions that dial out to SIP endpoints. I have issued reload commands on the CLI at random times and I have logged in and out over AMI to try to elicit some sort of change.

I have also tried running both with and without valgrind. Valgrind reports no invalid memory usage during my tests.

I've also looked through the CEL code, both as it exists in 1.8.16.0-rc1 and all the changes made between 1.8.12.0 and 1.8.16.0-rc1. I'm not finding anything that seems like it would have caused a crash at the place your backtrace indicates.

When you ran an unoptimized build and it was unusable, did you also have DEBUG_THREADS enabled? If you did, then disabling that might make the system capable of handling your load. Unfortunately, I have a feeling that you're not going to be re-enabling CEL any time soon since it has caused issues for you. If you have any other backtraces handy, uploading them might help a bit, because it may be that while the crashes were happening in similar places in the code, the circumstances leading to the crash might be different.

In an attempt to replicate your environment a bit more, I think my next move will be to set up a CentOS 5.8 VM and see if I can reproduce the problem there.

Do you happen to know the circumstances behind the crash? Did it happen when a reload was issued? I can tell that the crash is occurring while calling Dial() from within your macro, but as far as I can tell, nothing seems out of the ordinary either in the code path leading to the crash or in any of the CEL code.

> Crash in ast_cel_report_event
> -----------------------------
>
>                 Key: ASTERISK-20335
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-20335
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_local
>    Affects Versions: 1.8.16.0
>         Environment: Centos server 5.8x64, 8 core CPU, 8GB RAM
>            Reporter: aragon
>            Assignee: aragon
>            Severity: Critical
>         Attachments: AST18-core-verbose.txt, asterisk.txt, core show channels.txt, default-dial-cav-joh-002-000918.txt, optimized backtrace.txt, SIP show channels.txt, verbose CLI sip set debug on.txt
>
>
> On a pretty busy system we get deadlocks and crashes daily since installing Asterisk 1.8.16rc1
> Upgraded from 1.8.12 because we were having problems with leaking bye's fixed in ASTERISK-19455
> We were able to collect verbose CLI, core show channels, sip show channels, and Asterisk CLI with sip set debug on.
> Also back traced a core dump file but this is an optimized build since we could not run non-optimized in this environment.
> Including the back trace anyway since it might help diagnose the problem.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


