[asterisk-bugs] [JIRA] (ASTERISK-25321) [patch]DeadLock ChanSpy with call over Local channel

Fri Mar 4 08:37:57 CST 2016

    [ https://issues.asterisk.org/jira/browse/ASTERISK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=229833#comment-229833 ] 

Walter Doekes commented on ASTERISK-25321:
------------------------------------------

{quote}
14:20 < hexanol> wdoekes: so I just did a few tests with a patched Asterisk 11, and before the patch, I was able to make it freeze systematically (by adding a sleep(1) in ast_do_masquerade, just before the call to ast_autochan_new_channel), and once the patch is applied, I can't reproduce it
{quote}

So, the patch works as advertised.

That leaves us with the possibility of the old channel's refcount dropping to 0 after the unref in ast_autochan_new_channel.

ast_autochan_new_channel is called only from ast_do_masquerade, where the argument to ast_do_masquerade is the channel that would disappear.
{code}
int ast_do_masquerade(struct ast_channel *original)
...
        ast_channel_lock(original);
...
        /* Bump the refs to ensure that they won't disappear on us. */
        ast_channel_ref(original);
        ast_channel_ref(clonechan);
...
        ast_channel_unlock(original);
...
        ast_channel_lock_both(original, clonechan);
...
        ast_autochan_new_channel(clonechan, original);
...
        ast_channel_unlock(original);
        ast_channel_unlock(clonechan);
...
        ast_channel_lock(original);
...
                ast_channel_unlock(original);
...
        ast_channel_unref(original);
        ast_channel_unref(clonechan);
{code}
During {{ast_autochan_new_channel(clonechan, original)}}, the clonechan, which was the previous autochan->chan, will be unreffed once.

Only at the end of ast_do_masquerade is it unreffed again ({{ast_channel_unref(clonechan)}}). The comments seem to suggest that this unref of clonechan could very well drop the last reference to it, causing it to be destroyed.
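
To make the reference counting above easier to follow, here is a minimal single-threaded sketch that replays the same ref/unref order on a toy counter. Every name in it ({{toy_chan}}, {{toy_autochan}}, {{toy_ref}}, {{toy_unref}}, {{toy_masquerade}}) is hypothetical and only models the sequence quoted above; it is not Asterisk code and ignores all locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted channel; NOT struct ast_channel. */
struct toy_chan {
    int refs;       /* current reference count */
    int destroyed;  /* set to 1 once refs drops to 0 */
};

/* Hypothetical stand-in for an autochan that follows one channel. */
struct toy_autochan {
    struct toy_chan *chan;
};

static void toy_ref(struct toy_chan *c)
{
    assert(!c->destroyed);
    c->refs++;
}

static void toy_unref(struct toy_chan *c)
{
    assert(!c->destroyed && c->refs > 0);
    if (--c->refs == 0)
        c->destroyed = 1;  /* a real implementation would free the object here */
}

/* Models the autochan switch: drop the ref on the old channel, take one on the new. */
static void toy_autochan_new_channel(struct toy_autochan *ac,
                                     struct toy_chan *old_chan,
                                     struct toy_chan *new_chan)
{
    if (ac->chan != old_chan)
        return;
    toy_unref(old_chan);
    toy_ref(new_chan);
    ac->chan = new_chan;
}

/* Replays the ref/unref order of the excerpt; returns 1 if clonechan ended up destroyed. */
static int toy_masquerade(struct toy_autochan *ac,
                          struct toy_chan *original,
                          struct toy_chan *clonechan)
{
    toy_ref(original);                                  /* bump the refs */
    toy_ref(clonechan);
    toy_autochan_new_channel(ac, clonechan, original);  /* unrefs clonechan once */
    toy_unref(original);
    toy_unref(clonechan);                               /* may be the last reference */
    return clonechan->destroyed;
}
```

If the autochan held the only reference to the clone when the masquerade started (count 1), the count goes 1, 2, 1, 0 and the clone is destroyed by the final unref. With one additional holder (count 2 at the start) it ends at 1 and survives.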

That means there is a small chance of ast_channel_lock() (via ast_autochan_channel_lock) being called on a channel that was just destroyed. The chance seems slim, though, and I'd be willing to delegate that problem to a separate issue/ticket.
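
For illustration, a generic lock-and-recheck loop shows where that remaining window would sit. This is only a sketch under assumptions: the types and the function {{toy_lock_current}} are hypothetical, plain pthread mutexes stand in for channel locks, and whether the attached patch works this way is a guess based solely on its filename.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical channel with its own lock; NOT struct ast_channel. */
struct toy_lchan {
    pthread_mutex_t lock;
};

/* Hypothetical autochan whose pointer another thread may swap at any time. */
struct toy_lautochan {
    _Atomic(struct toy_lchan *) chan;
};

/*
 * Lock whichever channel the autochan currently points to.
 * Retries when the pointer changed while we were waiting for the lock.
 * Returns the locked channel; the caller must unlock it.
 */
static struct toy_lchan *toy_lock_current(struct toy_lautochan *ac)
{
    assert(ac != NULL);
    for (;;) {
        struct toy_lchan *seen = atomic_load(&ac->chan);

        /* Window: if 'seen' is destroyed right here, the lock call below
         * touches freed memory. The recheck cannot catch that case. */
        pthread_mutex_lock(&seen->lock);

        if (seen == atomic_load(&ac->chan))
            return seen;  /* pointer is stable while we hold the lock */

        pthread_mutex_unlock(&seen->lock);  /* pointer moved, try again */
    }
}
```

The recheck only handles a pointer that was swapped; it does nothing for a channel freed between the load and the lock call, which is the slim case described above. Closing that gap would need a reference taken before locking or a lock that outlives the channel.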

> [patch]DeadLock ChanSpy with call over Local channel
> ----------------------------------------------------
>
>                 Key: ASTERISK-25321
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25321
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Applications/app_chanspy
>    Affects Versions: 11.16.0, 11.18.0
>         Environment: custom CentOS 6 based distro
>            Reporter: Filip Frank
>            Assignee: Filip Frank
>            Severity: Critical
>         Attachments: ASTERISK-25321_retry_chan_lock_until_stable.patch, asterisk.log_19082015_1258, backtrace-threads-ffr19082015_1258.txt, corelocks_ffr19082015_1258.txt, spydeadlock.patch
>
>
> We have a problem with random deadlocks when using ChanSpy on SIP channels while dialing via AMI Originate to a Local channel, which Dials another Local channel, which then Dials a SIP peer.
> Example: 1. SIP/iptel205 doing ChanSpy(SIP/iptel210)
>                2. AMI Originate Local/210@dialer
>                3. Dial(Local/210@internal)
>                4. Dial(SIP/iptel210)
>                5. Answer SIP/iptel210
>                6. Start dialing the caller from the originate, e.g. 00420591122223@outgoning
> Here is part of a backtrace from a core dump, which I got with gcore while Asterisk was deadlocked.
> [Edit by Rusty - removed excessive debug from description. Please attach debug and annotated files to the issue with More > Attach Files]
