[asterisk-bugs] [JIRA] (ASTERISK-28093) Asterisk Deadlocks with High Load

Richard Mudgett (JIRA) noreply at issues.asterisk.org
Fri Oct 5 14:33:54 CDT 2018


    [ https://issues.asterisk.org/jira/browse/ASTERISK-28093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=245084#comment-245084 ] 

Richard Mudgett commented on ASTERISK-28093:
--------------------------------------------

[^backtrace-threads.txt] - It's a deadlock involving the channels container.  It appears to be a deadlock between threads 64 and 796.

* Thread 64 holds the channels container lock and wants the 0x7f20ce706e00 channel lock.  It is trying to check whether the uniqueid is unique.
* Thread 793 wants the 0x7f20ce706e00 channel lock to queue a frame onto the local channel.  Without the ASTERISK-27094 fix in 13.22.0 and later, this thread would have deadlock potential.
* Thread 796 likely holds the 0x7f20ce706e00 channel lock and wants the channels container lock to hang up that channel.  (I've found another reason to not like RAII_VAR.  In this case the backtrace doesn't show which exit path in pbx_outgoing_attempt() RAII_VAR is cleaning up.)
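
The pattern between threads 64 and 796 is a lock-order inversion: one side takes the container lock and then a channel lock, the other takes the same two locks in the opposite order. As a rough illustration only, the sketch below reproduces that shape with two plain Python locks and shows one generic way to avoid it, a non-blocking try with back-off on the second lock. All names (container_lock, channel_lock, uniqueid_check, hangup_with_backoff, run_demo) are hypothetical stand-ins; this is not Asterisk code and not necessarily how the issue would be fixed.

```python
import threading
import time

# Hypothetical stand-ins for the two locks seen in the backtrace.
container_lock = threading.Lock()  # the channels container lock
channel_lock = threading.Lock()    # one channel's lock


def uniqueid_check(results):
    """Same order as thread 64: container lock first, then the channel lock."""
    with container_lock:
        time.sleep(0.01)  # widen the window in which the other thread can run
        with channel_lock:
            results.append("uniqueid_check")


def hangup_with_backoff(results):
    """Same order as thread 796, but backing off instead of blocking."""
    while True:
        with channel_lock:
            # Try the second lock without blocking while holding the first.
            if container_lock.acquire(blocking=False):
                try:
                    results.append("hangup")
                    return
                finally:
                    container_lock.release()
        # The container lock was busy. The channel lock has been released by
        # leaving the 'with' block, so the other thread can make progress.
        time.sleep(0.001)


def run_demo(timeout=5.0):
    """Run both paths concurrently; report what ran and whether both finished."""
    results = []
    threads = [
        threading.Thread(target=uniqueid_check, args=(results,), daemon=True),
        threading.Thread(target=hangup_with_backoff, args=(results,), daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    return results, all(not t.is_alive() for t in threads)
```

If hangup_with_backoff instead blocked on container_lock while still holding channel_lock, the two threads could wait on each other forever, which is the situation the backtrace suggests.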


> Asterisk Deadlocks with High Load
> ---------------------------------
>
>                 Key: ASTERISK-28093
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28093
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: General
>    Affects Versions: 13.18.5, 13.22.0, 13.23.1
>         Environment: Debian 8.11 64 bits
> Xeon Ex64
> 32GB RAM
>            Reporter: Nicolas Tanski
>            Assignee: Nicolas Tanski
>            Severity: Critical
>              Labels: crash, deadlock
>         Attachments: backtrace-threads.txt
>
>
> Hi Team,
> We are experiencing frequent crashes of our Asterisk instance. We use our own dialer solution over the AMI protocol.
> The Asterisk service handles on average 250 simultaneous calls and 40 recorded calls, with a load average between 1 and 3. In some cases the load quickly rises to 10, at which point new calls are rejected and the Asterisk instance stops responding.
> The only way we could restart the service was to kill the Asterisk process with the kill command.
> We enabled the DONT_OPTIMIZE, BETTER_BACKTRACES and MALLOC_DEBUG flags to try to identify the problem.
> Summary
> Average simultaneous calls: 250
> Average recorded calls: 40
> Load Average: > 1 and < 3
> High Load Average: > 10
> After the crash, we run the following commands and the service returns to normal operation:
> # kill -9 PID_ASTERISK
> # /etc/init.d/asterisk start





