[asterisk-bugs] [JIRA] (ASTERISK-21406) chan_sip deadlock on monlock between unload_module and do_monitor

Corey Farrell (JIRA) noreply at issues.asterisk.org
Wed Apr 10 19:39:01 CDT 2013


     [ https://issues.asterisk.org/jira/browse/ASTERISK-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Corey Farrell updated ASTERISK-21406:
-------------------------------------

    Attachment: chan_sip-unload-testfix.patch

[^chan_sip-unload-testfix.patch] is a possible fix.  At first I did not use sched_yield(); the ast_debug message was printed, but the deadlock was avoided.  After adding sched_yield() I have not been able to reproduce either the deadlock or the "ast_mutex_trylock failed" message.

This patch has not been tested with any SIP peers or activity; it was only tested as a fix for this specific issue.
                
> chan_sip deadlock on monlock between unload_module and do_monitor
> -----------------------------------------------------------------
>
>                 Key: ASTERISK-21406
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-21406
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_sip/General
>    Affects Versions: 11.4.0
>         Environment: Ubuntu/quantal, eglibc-2.15-0ubuntu20
>            Reporter: Corey Farrell
>         Attachments: chan_sip-unload-testfix.patch
>
>
> unload_module cancels/joins the monitor thread while holding monlock.  If do_monitor attempts to lock monlock while unload_module already has it, they deadlock.  do_monitor waits for monlock while unload_module waits for do_monitor to exit.
> I've experienced this issue a couple of times in production when attempting to shut down.  I found the cause while running valgrind tests.  I believe valgrind slowed things down so much that it caused the deadlock to occur somewhat reliably.  I could not replicate the issue with lock debugging enabled.  I added ast_log messages to unload_module and found that they stopped while monlock was held.  The valgrind testing was done with 'make samples', with no changes to /etc/asterisk.  I tried attaching gdb once the deadlock occurred, but it could not find symbols (probably because of valgrind).



