[asterisk-bugs] [JIRA] (ASTERISK-25000) Deadlock in ast_do_masquerade (specifically in ast_hangup on the zombie clone if it's hungup during the masquerade)

William luke (JIRA) noreply at issues.asterisk.org
Tue Apr 28 03:52:33 CDT 2015


     [ https://issues.asterisk.org/jira/browse/ASTERISK-25000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William luke updated ASTERISK-25000:
------------------------------------

    Attachment: backtrace-threads-20150427-first.txt

This looks like the same issue occurring again, although it presents slightly differently: in this case it was the devicestate thread (12268) that fell over the rogue locked zombie channel (SIP/gl-agw-01-000a6da5<ZOMBIE>) whilst holding the lock on the channel container, and everything then piles up behind that.
A masquerade had recently completed, leaving that zombie channel behind, but the channel seemingly never hangs up.
The thread running this channel (20167) is also blocked, which is odd: I was half expecting it to be stuck inside hangup, but it would seem not.
Is it remotely possible ast_channel_unlock could fail?

> Deadlock in ast_do_masquerade (specifically in ast_hangup on the zombie clone if it's hungup during the masquerade)
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-25000
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25000
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/Channels
>    Affects Versions: 11.16.0
>         Environment: CentOS 6. Dual Xeon Dell server. Under relatively heavy load (250k calls/day), with lots of AMI actions.
>            Reporter: William luke
>            Assignee: Matt Jordan
>            Severity: Critical
>         Attachments: backtrace-threads-20150422.txt, backtrace-threads-20150427-first.txt, dialplan_snippet.txt, verboselog.rar
>
>
> We're seeing a deadlock where the AMI thread completely locks up. (Thread ID 19109 in backtrace attachment)
> A backtrace shows that it's while doing a dual redirect.
> When redirecting the second channel (from manager.c:3895), inside ast_do_masquerade, we decide the clone was a zombie, and then in channel.c line 7331 call ast_hangup on it.
> This ast_hangup tries to grab a channel lock (channel.c:2885) and hangs here indefinitely.
> What's peculiar is that a few lines higher up it's successfully managed to grab and then release this same channel lock.
> It would seem that just as the masquerade began, this channel (clonechan) hung up (see line 212288 in the verboselog attachment; the channel in question is "SIP/gl-agw-01-000f7f0e").
> So something has happened to the state of this channel, and something has not released its channel lock.
> I'm unable to see which other thread is holding the lock.
> The issue occurred at 15:32:20 in the verboselog file. The first part of the dual redirect can be seen at line 212279.
> I executed a "core restart" via the CLI, but this hung the CLI, and I had to kill the Asterisk process.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


