[asterisk-bugs] [JIRA] (ASTERISK-25468) Deadlock in chan_sip - core show locks shows do_monitor lock

Barry Flanagan (JIRA) noreply at issues.asterisk.org
Fri Oct 16 04:31:32 CDT 2015


    [ https://issues.asterisk.org/jira/browse/ASTERISK-25468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=227898#comment-227898 ] 

Barry Flanagan commented on ASTERISK-25468:
-------------------------------------------

Thanks Stephan. Could be. I am going to try version 13.2.0 as the other servers in this cluster are running that version and are stable. If that turns out to work, I'll try each release up to 13.6.0 and see when it breaks.


> Deadlock in chan_sip - core show locks shows do_monitor lock
> ------------------------------------------------------------
>
>                 Key: ASTERISK-25468
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25468
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_sip/General
>    Affects Versions: 13.6.0
>         Environment: Debian 8 and Ubuntu 14.04.3.
> Asterisk latest 13.6.0 from Git.
> Realtime using odbc/mysql
>            Reporter: Barry Flanagan
>         Attachments: ASTERISK-25468_gdb-output.txt
>
>
> I am trying to bring a new server into an existing cluster of Asterisk boxes and I keep getting the same problem.
> Servers are all behind a Kamailio LB, and when I add this new server to the dispatcher group, Kamailio starts sending REGISTER and SUBSCRIBE requests to the new server. After a few minutes chan_sip just hangs and no longer processes any traffic at all. Nothing shows up in the logs, and Asterisk itself is still running. I can see the incoming SIP packets using sngrep, but Asterisk does not see them.
> I have tried this on KVM and Openvz virtual servers as well as physical servers, and have tried both Debian 8 and Ubuntu 14.04 with the exact same results.
> When chan_sip freezes, 'core show locks' shows the following every time:
> {code}
> =======================================================================
> === GIT-13-f8707ae
> === Currently Held Locks
> =======================================================================
> ===
> === <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
> ===
> === Thread ID: 0x7fba21a4c700 LWP:13423 (do_monitor           started at [28932] chan_sip.c restart_monitor())
> === ---> Lock #0 (chan_sip.c): MUTEX 28903 do_monitor &monlock 0x7fba319054a0 (1)
>         main/backtrace.c:59 __ast_bt_get_addresses() (0x46777f+1D)
>         main/lock.c:258 __ast_pthread_mutex_lock() (0x5379ef+C7)
>         channels/chan_sip.c:28904 do_monitor()
>         main/utils.c:1237 dummy_start()
>         :0 start_thread()
>         libc.so.6 clone() (0x7fbab9e25410+6D)
> === -------------------------------------------------------------------
> {code}
> There is no core dump when this happens. SIP simply stops responding, peers do not expire, etc.
> I have managed to get a gdb backtrace from the running process using:
> {code}
> gdb -ex "thread apply all bt" --batch /usr/sbin/asterisk <pid>
> {code}
> Hopefully that will give some clue. I will upload it as an attachment.
> Any help much appreciated.
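
When comparing several captures of 'core show locks' taken during a hang, it can help to reduce each capture to a list of threads and the locks shown under them. The following is a minimal sketch, not part of Asterisk: the function names `parse_locks` and `holders_by_lock` are hypothetical, and the regular expressions are derived only from the single sample in the report above, so other Asterisk versions or lock states may need adjustments.

```python
import re
from collections import defaultdict

# Patterns inferred from the one sample in this report (an assumption, not a
# documented format). re.search is used so any line prefix ("=== ", "> ")
# is tolerated.
THREAD_RE = re.compile(r"Thread ID: (0x[0-9a-fA-F]+) LWP:(\d+) \((\S+)")
LOCK_RE = re.compile(
    r"Lock #(\d+) \(([^)]+)\): (\w+) (\d+) (\S+) (\S+) (0x[0-9a-fA-F]+) \((\d+)"
)


def parse_locks(text):
    """Return a list of thread dicts, each with the locks listed under it."""
    threads = []
    for line in text.splitlines():
        m = THREAD_RE.search(line)
        if m:
            threads.append({
                "thread_id": m.group(1),
                "lwp": int(m.group(2)),
                "name": m.group(3),
                "locks": [],
            })
            continue
        m = LOCK_RE.search(line)
        if m and threads:
            # Attach the lock to the most recently seen thread.
            threads[-1]["locks"].append({
                "index": int(m.group(1)),
                "file": m.group(2),
                "type": m.group(3),
                "line": int(m.group(4)),
                "function": m.group(5),
                "name": m.group(6),
                "addr": m.group(7),
                "times_locked": int(m.group(8)),
            })
    return threads


def holders_by_lock(threads):
    """Map each lock address to the names of the threads that list it."""
    holders = defaultdict(list)
    for thread in threads:
        for lock in thread["locks"]:
            holders[lock["addr"]].append(thread["name"])
    return dict(holders)


if __name__ == "__main__":
    import sys

    # Usage: asterisk -rx 'core show locks' | python3 this_script.py
    for addr, names in holders_by_lock(parse_locks(sys.stdin.read())).items():
        print(addr, ", ".join(names))
```

A lock address that appears under more than one thread is a natural place to start when reading the matching gdb backtrace. In the capture above only `do_monitor` lists `&monlock`, so the output alone does not identify a second thread involved in the hang.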



--
This message was sent by Atlassian JIRA
(v6.2#6252)


