[asterisk-bugs] [JIRA] (ASTERISK-25023) Deadlock in chan_sip in update_provisional_keepalive

Etienne Lessard (JIRA) noreply at issues.asterisk.org
Fri Oct 9 12:43:34 CDT 2015


    [ https://issues.asterisk.org/jira/browse/ASTERISK-25023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=227837#comment-227837 ] 

Etienne Lessard commented on ASTERISK-25023:
--------------------------------------------

Hello,

We are currently experiencing the same problem with Asterisk 13.5.0.

I'm attaching a backtrace and the output of "core show locks". When the freeze was detected, the process was killed with an ABRT signal, which is why the backtrace shows that the process terminated with signal 6. This Asterisk was compiled with DONT_OPTIMIZE and DEBUG_THREAD, but without BETTER_BACKTRACES, sorry.

There is a bit of noise in these two files since they were taken from a "load test" system. The interesting threads are thread 202 (0xa7d4bb70) and thread 504 (0xb47bbb70). Thread 504 is executing ast_sched_runq and is trying to lock a sip_pvt, but thread 202 already holds a lock on this sip_pvt and is calling ast_sched_del, which never returns because it is waiting on a condition variable that will be signaled only when thread 504 returns.
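
The lock ordering described above can be reproduced outside Asterisk with a small pthread model. This is only a sketch under stated assumptions: the names struct model, scheduler_thread, wait_for_callback and run_scenario are hypothetical and are not Asterisk functions, and the timed wait exists only so that the demonstration terminates (the real wait in ast_sched_del never returns in this situation).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical model; none of these names are Asterisk APIs. */
struct model {
    pthread_mutex_t pvt_lock;   /* stands in for the sip_pvt lock */
    pthread_mutex_t sched_lock; /* protects 'running' */
    pthread_cond_t sched_cond;  /* broadcast on every change of 'running' */
    bool running;               /* true while the callback executes */
};

/* Plays thread 504: the scheduler runs the keepalive callback. */
static void *scheduler_thread(void *arg)
{
    struct model *m = arg;

    pthread_mutex_lock(&m->sched_lock);
    m->running = true;
    pthread_cond_broadcast(&m->sched_cond);
    pthread_mutex_unlock(&m->sched_lock);

    pthread_mutex_lock(&m->pvt_lock);   /* blocks while the caller holds it */
    pthread_mutex_unlock(&m->pvt_lock);

    pthread_mutex_lock(&m->sched_lock);
    m->running = false;
    pthread_cond_broadcast(&m->sched_cond);
    pthread_mutex_unlock(&m->sched_lock);
    return NULL;
}

/* Wait until the callback has returned; give up after timeout_ms. */
static bool wait_for_callback(struct model *m, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;
    bool done;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&m->sched_lock);
    while (m->running && rc != ETIMEDOUT) {
        rc = pthread_cond_timedwait(&m->sched_cond, &m->sched_lock, &deadline);
    }
    done = !m->running;
    pthread_mutex_unlock(&m->sched_lock);
    return done;
}

/* Plays thread 202. Returns true if the wait for the callback completed. */
static bool run_scenario(bool hold_pvt_while_waiting)
{
    struct model m = { .running = false };
    pthread_t tid;
    bool completed;

    pthread_mutex_init(&m.pvt_lock, NULL);
    pthread_mutex_init(&m.sched_lock, NULL);
    pthread_cond_init(&m.sched_cond, NULL);

    pthread_mutex_lock(&m.pvt_lock);    /* the caller locks the pvt first */
    pthread_create(&tid, NULL, scheduler_thread, &m);

    /* Race window: the callback starts before the delete is attempted. */
    pthread_mutex_lock(&m.sched_lock);
    while (!m.running) {
        pthread_cond_wait(&m.sched_cond, &m.sched_lock);
    }
    pthread_mutex_unlock(&m.sched_lock);

    if (!hold_pvt_while_waiting) {
        pthread_mutex_unlock(&m.pvt_lock);
    }
    completed = wait_for_callback(&m, 500);
    if (hold_pvt_while_waiting) {
        pthread_mutex_unlock(&m.pvt_lock);  /* break the cycle so the demo exits */
    }

    pthread_join(tid, NULL);
    pthread_cond_destroy(&m.sched_cond);
    pthread_mutex_destroy(&m.sched_lock);
    pthread_mutex_destroy(&m.pvt_lock);
    return completed;
}
```

In this model, run_scenario(true) should time out and return false, which corresponds to the freeze we see, while run_scenario(false) should return true. Releasing the lock before waiting is only one way to break the cycle in the model and is not necessarily how chan_sip should be fixed. Build with something like "cc -pthread".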

Thank you.



> Deadlock in chan_sip in update_provisional_keepalive
> ----------------------------------------------------
>
>                 Key: ASTERISK-25023
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25023
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_sip/General
>    Affects Versions: 13.3.2
>         Environment: centos 6 / 64Bit
>            Reporter: Arnd Schmitter
>         Attachments: backtrace.txt, core-show-locks.txt, trace.txt
>
>
> There is a race condition / deadlock when update_provisional_keepalive is called while a scheduler run is already planned for send_provisional_keepalive_full.
> The function update_provisional_keepalive gets called with a locked sip_pvt struct, and the first thing it does is delete the planned scheduler entry.
> If the planned scheduler job starts after the sip_pvt is locked and before the AST_SCHED_DEL_UNREF call in update_provisional_keepalive is executed, the scheduler job blocks in send_provisional_keepalive_full, waiting to get a lock on the sip_pvt struct.
> The call to AST_SCHED_DEL_UNREF, in turn, waits for a condition signal that the running scheduler job has finished, so neither thread can proceed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


