[asterisk-bugs] [JIRA] (ASTERISK-25468) Deadlock in chan_sip - core show locks shows do_monitor lock

Barry Flanagan (JIRA) noreply at issues.asterisk.org
Fri Oct 16 15:46:33 CDT 2015


    [ https://issues.asterisk.org/jira/browse/ASTERISK-25468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=227909#comment-227909 ] 

Barry Flanagan commented on ASTERISK-25468:
-------------------------------------------

Hi Rusty, Stefan,

The problem appears to have been introduced between 13.4.0 and 13.5.0. I have tested 13.2.0 through 13.6.0, and the issue appears only in 13.5.0 and 13.6.0.

=======================================================================
=== 13.5.0
=== Currently Held Locks
=======================================================================
===
=== <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: 0x7f4bc0f8d700 LWP:14221 (do_monitor           started at [28915] chan_sip.c restart_monitor())
=== ---> Lock #0 (chan_sip.c): MUTEX 28886 do_monitor &monlock 0x7f4bcfffd4a0 (1)
        main/backtrace.c:59 __ast_bt_get_addresses() (0x467743+1D)
        main/lock.c:258 __ast_pthread_mutex_lock() (0x537a4d+C7)
        channels/chan_sip.c:28887 do_monitor()
        main/utils.c:1237 dummy_start()
        nptl/pthread_create.c:312 start_thread()
        libc.so.6 clone() (0x7f4c58789410+6D)
=== -------------------------------------------------------------------
===
=======================================================================

I'll upload the full gdb backtrace for 13.5 as well, in case it is of any use.
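
When comparing 'core show locks' dumps across several versions, it can help to reduce each dump to the threads and the locks they hold. The following is a minimal sketch, not part of Asterisk: the names parse_core_show_locks, ThreadEntry and LockEntry are hypothetical, and the regular expressions assume the line layout shown in the dump above. Treating any text between "--->" and "Lock #" as a pending marker is also an assumption.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Matches the "=== Thread ID: ..." header line of each thread entry.
THREAD_RE = re.compile(
    r"=== Thread ID: (?P<tid>0x[0-9a-fA-F]+)(?: LWP:(?P<lwp>\d+))? \((?P<name>\S+)"
)

# Matches one "=== ---> Lock #N ..." line. Any text between the arrow and
# "Lock #" is treated as a pending marker (assumption, see lead-in).
LOCK_RE = re.compile(
    r"=== ---> (?P<pending>.*?)Lock #(?P<num>\d+) \((?P<file>[^)]+)\): "
    r"(?P<type>\S+) (?P<line>\d+) (?P<func>\S+) (?P<name>\S+) "
    r"(?P<addr>0x[0-9a-fA-F]+) \((?P<count>\d+)\)"
)


@dataclass
class LockEntry:
    number: int
    file: str
    lock_type: str
    line: int
    function: str
    name: str
    address: str
    times_locked: int
    pending: bool


@dataclass
class ThreadEntry:
    thread_id: str
    name: str
    locks: List[LockEntry] = field(default_factory=list)


def parse_core_show_locks(text: str) -> List[ThreadEntry]:
    """Return one ThreadEntry per thread found in a 'core show locks' dump."""
    threads: List[ThreadEntry] = []
    for line in text.splitlines():
        # search() rather than match() so "> "-quoted dumps also parse.
        m = THREAD_RE.search(line)
        if m:
            threads.append(ThreadEntry(m.group("tid"), m.group("name")))
            continue
        m = LOCK_RE.search(line)
        if m and threads:
            threads[-1].locks.append(
                LockEntry(
                    number=int(m.group("num")),
                    file=m.group("file"),
                    lock_type=m.group("type"),
                    line=int(m.group("line")),
                    function=m.group("func"),
                    name=m.group("name"),
                    address=m.group("addr"),
                    times_locked=int(m.group("count")),
                    pending=bool(m.group("pending").strip()),
                )
            )
    return threads


# Two lines copied from the 13.5.0 dump above.
SAMPLE = """\
=== Thread ID: 0x7f4bc0f8d700 LWP:14221 (do_monitor           started at [28915] chan_sip.c restart_monitor())
=== ---> Lock #0 (chan_sip.c): MUTEX 28886 do_monitor &monlock 0x7f4bcfffd4a0 (1)
"""

if __name__ == "__main__":
    for thread in parse_core_show_locks(SAMPLE):
        for lock in thread.locks:
            state = "waiting for" if lock.pending else "holds"
            print(f"{thread.name} ({thread.thread_id}) {state} "
                  f"{lock.name} at {lock.file}:{lock.line}")
```

Backtrace lines under each lock are ignored here; only the thread and lock summary lines are read.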



> Deadlock in chan_sip - core show locks shows do_monitor lock
> ------------------------------------------------------------
>
>                 Key: ASTERISK-25468
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25468
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_sip/General
>    Affects Versions: 13.6.0
>         Environment: Debian 8 and Ubuntu 14.04.3.
> Asterisk latest 13.6.0 from Git.
> Realtime using odbc/mysql
>            Reporter: Barry Flanagan
>            Assignee: Barry Flanagan
>         Attachments: ASTERISK-25468_gdb-output.txt
>
>
> I am trying to bring a new server into an existing cluster of Asterisk boxes and I keep getting the same problem.
> Servers are all behind a Kamailio LB, and when I add this new server to the dispatcher group, Kamailio starts sending REGISTER and SUBSCRIBE requests to it. After a few minutes chan_sip just hangs and no longer processes any traffic at all. Nothing shows up in the logs, and Asterisk itself is still running. I can see the incoming SIP packets using sngrep, but Asterisk does not see them.
> I have tried this on KVM and OpenVZ virtual servers as well as physical servers, and have tried both Debian 8 and Ubuntu 14.04, with exactly the same results.
> When chan_sip freezes, 'core show locks' shows the following every time:
> {code}
> =======================================================================
> === GIT-13-f8707ae
> === Currently Held Locks
> =======================================================================
> ===
> === <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
> ===
> === Thread ID: 0x7fba21a4c700 LWP:13423 (do_monitor           started at [28932] chan_sip.c restart_monitor())
> === ---> Lock #0 (chan_sip.c): MUTEX 28903 do_monitor &monlock 0x7fba319054a0 (1)
>         main/backtrace.c:59 __ast_bt_get_addresses() (0x46777f+1D)
>         main/lock.c:258 __ast_pthread_mutex_lock() (0x5379ef+C7)
>         channels/chan_sip.c:28904 do_monitor()
>         main/utils.c:1237 dummy_start()
>         :0 start_thread()
>         libc.so.6 clone() (0x7fbab9e25410+6D)
> === -------------------------------------------------------------------
> {code}
> There is no core dump when this happens. SIP simply stops responding, peers do not expire, etc.
> I have managed to get a gdb backtrace from the running process using:
> {code}
> gdb -ex "thread apply all bt" --batch /usr/sbin/asterisk <pid>
> {code}
> Hopefully that will give some clue. I will upload it as an attachment.
> Any help much appreciated.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


