[asterisk-bugs] [JIRA] (ASTERISK-25000) Deadlock in ast_do_masquerade (specifically in ast_hangup on the zombie clone if it's hungup during the masquerade)

William luke (JIRA) noreply at issues.asterisk.org
Mon Apr 27 16:21:33 CDT 2015


     [ https://issues.asterisk.org/jira/browse/ASTERISK-25000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William luke updated ASTERISK-25000:
------------------------------------

    Status: Triage  (was: Waiting for Feedback)

> Deadlock in ast_do_masquerade (specifically in ast_hangup on the zombie clone if it's hungup during the masquerade)
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-25000
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25000
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/Channels
>    Affects Versions: 11.16.0
>         Environment: CentOS 6. Dual Xeon Dell server. Under relatively heavy load (250k calls/day), with lots of AMI actions.
>            Reporter: William luke
>            Assignee: William luke
>            Severity: Critical
>         Attachments: backtrace-threads-20150422.txt, dialplan_snippet.txt, verboselog.rar
>
>
> We're seeing a deadlock in which the AMI thread locks up completely (thread ID 19109 in the backtrace attachment).
> The backtrace shows that it happens while doing a dual redirect.
> When redirecting the second channel (from manager.c:3895), ast_do_masquerade decides that the clone is a zombie and then calls ast_hangup on it at channel.c line 7331.
> This ast_hangup call tries to grab a channel lock (channel.c:2885) and hangs there indefinitely.
> What's peculiar is that a few lines higher up it has successfully grabbed and then released this same channel lock.
> It would seem that just as the masquerade began, this channel (clonechan) hung up (see line 212288 in the verboselog attachment; the channel in question is "SIP/gl-agw-01-000f7f0e").
> So something has happened to the state of this channel, and something has not released its channel_lock.
> I'm unable to see which other thread is holding the lock.
> The issue occurred at 15:32:20 in the verboselog file. The first part of the dual redirect can be seen at line 212279.
> I executed a "core restart" via the CLI, but this hung the CLI, and I had to kill the Asterisk process.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


