[asterisk-bugs] [JIRA] (ASTERISK-21207) [patch] - Deadlock on fax extension calling ast_async_goto() with locked channel

Masahide Yamamoto (JIRA) noreply at issues.asterisk.org
Fri Jul 4 22:17:58 CDT 2014


    [ https://issues.asterisk.org/jira/browse/ASTERISK-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=220282#comment-220282 ] 

Masahide Yamamoto commented on ASTERISK-21207:
----------------------------------------------

We have also been encountering this deadlock in the 1.8 branch.
Unfortunately, it does not appear to be fixed in the latest 1.8 branch either.

bq. The ast_channel_unlock in process_sdp was just cut&pasted into place, and is pointless. It only unlocks the recursive lock from a couple lines up, not the big lock held by the caller of process_sdp – ultimately handle_request_do.

Going by Ashley's comment above, the locking in process_sdp apparently needs to be disabled so that the subsequent ast_exists_extension and ast_async_goto calls can run without deadlocking, for example:

{code}
if (ast_test_flag(&p->flags[1], SIP_PAGE2_FAX_DETECT_T38)) {
        /*
         * Probe with ast_channel_trylock() instead of locking directly with
         * ast_channel_lock(): we are not sure the channel is always locked
         * when execution reaches this point.
         */
        int f_locked_state;

        /* Commented out */
        /* ast_channel_lock(p->owner); */

        f_locked_state = ast_channel_trylock(p->owner);
        if (f_locked_state != EBUSY) {
                /* Not busy: drop the lock the probe acquired */
                ast_channel_unlock(p->owner);
        }

        if (strcmp(p->owner->exten, "fax")) {
                const char *target_context = S_OR(p->owner->macrocontext, p->owner->context);

                /* Commented out */
                /* ast_channel_unlock(p->owner); */

                /* If the channel was locked, unlock it for ast_exists_extension and ast_async_goto */
                if (f_locked_state == EBUSY) {
                        ast_channel_unlock(p->owner);
                }

                if (ast_exists_extension(p->owner, target_context, "fax", 1,
                        S_COR(p->owner->caller.id.number.valid, p->owner->caller.id.number.str, NULL))) {
                        ast_verbose(VERBOSE_PREFIX_2 "Redirecting '%s' to fax extension due to peer T.38 re-INVITE\n", p->owner->name);

                        /* Assume the channel is unlocked here, provided ast_exists_extension did not lock it */
                        ast_channel_lock(p->owner); /* Added */
                        pbx_builtin_setvar_helper(p->owner, "FAXEXTEN", p->owner->exten);
                        ast_channel_unlock(p->owner); /* Added */

                        if (ast_async_goto(p->owner, target_context, "fax", 1)) {
                                ast_log(LOG_NOTICE, "Failed to async goto '%s' into fax of '%s'\n", p->owner->name, target_context);
                        }
                } else {
                        ast_log(LOG_NOTICE, "T.38 re-INVITE detected but no fax extension\n");
                }

                /* Re-lock if the channel was locked on entry */
                if (f_locked_state == EBUSY) {
                        ast_channel_lock(p->owner);
                }
        }

        /* Commented out */
        /*  } else {
                ast_channel_unlock(p->owner);
        } */
}
{code}

FYI: as in the snippet above, ast_channel_trylock / ast_mutex_trylock can be used to attempt a lock without blocking and to check the result.
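
To make the trylock behaviour concrete, here is a minimal standalone sketch using plain POSIX threads rather than Asterisk itself. It assumes ast_channel_trylock behaves like pthread_mutex_trylock on a recursive mutex, which is how the quoted comment describes the channel lock; that mapping is an assumption, and every name in the sketch (struct probe, init_recursive, owner_retry and so on) is hypothetical and not part of Asterisk.

```c
#define _XOPEN_SOURCE 700 /* exposes PTHREAD_MUTEX_RECURSIVE on strict compilers */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical carrier for one trylock attempt made by a second thread. */
struct probe {
    pthread_mutex_t *lock;
    int rc;
};

/* Initialise a recursive mutex (assumed stand-in for the channel lock). */
static void init_recursive(pthread_mutex_t *lock)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    assert(rc == 0);
    rc = pthread_mutex_init(lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
}

/* Thread body: attempt the lock without blocking and record the result. */
static void *probe_thread(void *arg)
{
    struct probe *p = arg;
    p->rc = pthread_mutex_trylock(p->lock);
    if (p->rc == 0) {
        pthread_mutex_unlock(p->lock);
    }
    return NULL;
}

/* Run one trylock from a different thread and return what it saw. */
static int trylock_from_other_thread(pthread_mutex_t *lock)
{
    struct probe p = { lock, -1 };
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, probe_thread, &p);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return p.rc;
}

/* The thread that already holds a recursive mutex tries it again. */
int owner_retry(void)
{
    pthread_mutex_t lock;
    int rc;
    init_recursive(&lock);
    pthread_mutex_lock(&lock);
    rc = pthread_mutex_trylock(&lock); /* succeeds: hold count goes to 2 */
    if (rc == 0) {
        pthread_mutex_unlock(&lock);
    }
    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    return rc;
}

/* Another thread tries the mutex while the caller holds it once. */
int other_thread_while_held(void)
{
    pthread_mutex_t lock;
    int rc;
    init_recursive(&lock);
    pthread_mutex_lock(&lock);
    rc = trylock_from_other_thread(&lock);
    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    return rc;
}

/* An inner lock/unlock pair leaves the caller's outer hold in place. */
int other_thread_after_inner_unlock(void)
{
    pthread_mutex_t lock;
    int rc;
    init_recursive(&lock);
    pthread_mutex_lock(&lock);   /* outer hold, like the caller's lock */
    pthread_mutex_lock(&lock);   /* inner recursive lock */
    pthread_mutex_unlock(&lock); /* releases only the inner level */
    rc = trylock_from_other_thread(&lock);
    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    return rc;
}
```

Under these assumptions, owner_retry() returns 0, while other_thread_while_held() and other_thread_after_inner_unlock() both return EBUSY. In other words, EBUSY from a trylock on a recursive mutex indicates that a different thread holds the lock; a thread that already owns it simply gains one more level, and an inner unlock does not release the outer hold.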

> [patch] - Deadlock on fax extension calling ast_async_goto() with locked channel
> --------------------------------------------------------------------------------
>
>                 Key: ASTERISK-21207
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-21207
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_dahdi, Channels/chan_sip/General
>    Affects Versions: 10.7.1, 11.2.1
>         Environment: CentOS 6.3 x86_64
>            Reporter: Ashley Winters
>            Severity: Critical
>         Attachments: backtrace-threads.txt, core-show-locks.txt, fax-deadlock.patch, fax-deadlock-v2.patch, fax-deadlock-v2.patch-11.3.0, gdb-fax-deadlock.txt, issue_log
>
>
> On an asterisk system with heavy use of AGI and inbound CNG-detected faxing, occasionally all channel activity will freeze. Running 'core show channels' returns nothing, but the logs continue running with anything except channel activity. Running with 'sip set debug on' shows that chan_sip.c doesn't even claim to be reading packets anymore.
> This deadlock was triggered several times daily across our array of asterisk servers, which process hundreds of faxes and tens of thousands of calls daily.


