[asterisk-bugs] [JIRA] (ASTERISK-28098) ATTENDED_TRANSFER_COMPLETE_SOUND deadlocks in Local channels (Asterisk 11)

Walter Doekes (JIRA) noreply at issues.asterisk.org
Thu Oct 11 04:13:54 CDT 2018


Walter Doekes created ASTERISK-28098:
----------------------------------------

             Summary: ATTENDED_TRANSFER_COMPLETE_SOUND deadlocks in Local channels (Asterisk 11)
                 Key: ASTERISK-28098
                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28098
             Project: Asterisk
          Issue Type: Bug
      Security Level: None
          Components: Channels/chan_sip/Transfers
    Affects Versions: 11.25.3
            Reporter: Walter Doekes
            Severity: Minor


Hi. Just filing this here for the sake of reference. _As Asterisk 11 is EOL, I'm not suggesting that anyone should fix this. Please close this ticket._

After enabling ATTENDED_TRANSFER_COMPLETE_SOUND today, Asterisken started locking up. It appears to be two threads waiting on each other:

{noformat}
[Switching to thread 12 (Thread 0x7f44d4118700 (LWP 10318))]

(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f45015b2c06 in __GI___pthread_mutex_lock (mutex=0x56414ab24110) at ../nptl/pthread_mutex_lock.c:115
#2  0x00005641494fdad0 in ast_do_masquerade (original=original at entry=0x7f44b42174c8) at channel.c:6914
#3  0x000056414950473d in __ast_read (chan=0x7f44b42174c8, dropaudio=0) at channel.c:3887
#4  0x000056414954cf92 in waitstream_core (c=<optimized out>, c at entry=0x7f44b42174c8, breakon=breakon at entry=0x7f44be87918d "", forward=0x5641495f3bda "", forward at entry=0x0, reverse=0x5641495f3bda "", reverse at entry=0x0, skip_ms=skip_ms at entry=0, audiofd=audiofd at entry=-1, cmdfd=<optimized out>, context=0x0, cb=0x0) at file.c:1326
#5  0x000056414954f62c in ast_waitstream (c=c at entry=0x7f44b42174c8, breakon=breakon at entry=0x7f44be87918d "") at file.c:1436
#6  0x00007f44be822ad8 in local_attended_transfer (transferer=transferer at entry=0x7f449c0664c8, current=current at entry=0x7f44d41158e0, seqno=seqno at entry=104, nounlock=nounlock at entry=0x7f44d4117274, req=0x7f44d4117340) at chan_sip.c:26550
#7  0x00007f44be832f0d in handle_request_refer (p=p at entry=0x7f449c0664c8, req=req at entry=0x7f44d4117340, seqno=104, nounlock=nounlock at entry=0x7f44d4117274) at chan_sip.c:26886
#8  0x00007f44be8684ad in handle_incoming (p=p at entry=0x7f449c0664c8, req=req at entry=0x7f44d4117340, addr=addr at entry=0x7f44d41172b0, recount=recount at entry=0x7f44d4117270, nounlock=nounlock at entry=0x7f44d4117274) at chan_sip.c:28798
#9  0x00007f44be86a853 in handle_request_do (req=req at entry=0x7f44d4117340, addr=addr at entry=0x7f44d41172b0) at chan_sip.c:28984
#10 0x00007f44be86bfa3 in sipsock_read (id=<optimized out>, fd=<optimized out>, events=<optimized out>, ignore=<optimized out>) at chan_sip.c:28915

(gdb) print *(pthread_mutex_t*)0x56414ab24110
$5 = {__data = {__lock = 2, __count = 1, __owner = 26036, __nusers = 1, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\001\000\000\000\264e\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}
{noformat}
^-- 12 waiting on 236 (LWP 26036)

{noformat}
[Switching to thread 236 (Thread 0x7f43eaf0b700 (LWP 26036))]

(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f45015b2c06 in __GI___pthread_mutex_lock (mutex=0x7f44b0cddfa0) at ../nptl/pthread_mutex_lock.c:115
#2  0x00005641494fdaf5 in ast_do_masquerade (original=original at entry=0x7f44b0cddff8) at channel.c:6920
#3  0x000056414950473d in __ast_read (chan=0x7f44b0cddff8, dropaudio=0) at channel.c:3887
#4  0x000056414950777a in ast_read (chan=0x7f44b0cddff8) at channel.c:4411
#5  ast_generic_bridge (rc=0x7f43eaf05708, fo=0x7f43eaf05700, config=0x7f43eaf05c60, c1=0x7f44b42174c8, c0=0x7f44b0cddff8) at channel.c:7648
#6  ast_channel_bridge (c0=c0 at entry=0x7f44b0cddff8, c1=c1 at entry=0x7f44b42174c8, config=config at entry=0x7f43eaf05c60, fo=fo at entry=0x7f43eaf05700, rc=rc at entry=0x7f43eaf05708) at channel.c:8123
#7  0x00005641495456f0 in ast_bridge_call (chan=chan at entry=0x7f44b0cddff8, peer=peer at entry=0x7f44b42174c8, config=config at entry=0x7f43eaf05c60) at features.c:4487
#8  0x00007f44c8caa0e6 in dial_exec_full (chan=0x7f44b0cddff8, data=<optimized out>, peerflags=peerflags at entry=0x7f43eaf066e0, continue_exec=continue_exec at entry=0x0) at app_dial.c:3065
#9  0x00007f44c8cab1c6 in dial_exec (chan=<optimized out>, data=<optimized out>) at app_dial.c:3148
#10 0x0000564149594dfe in pbx_exec (c=c at entry=0x7f44b0cddff8, app=app at entry=0x56414acd20f0, data=data at entry=0x7f43eaf06c60 "Local/IDxxx at osvpi_route_phoneaccount,20") at pbx.c:1680

(gdb) print *(pthread_mutex_t*)0x7f44b0cddfa0
$3 = {__data = {__lock = 2, __count = 1, __owner = 10318, __nusers = 1, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\001\000\000\000N(\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}
{noformat}
^-- 236 waiting on 12 (LWP 10318)

Without diving too deeply into this, our theory is:
- both threads are trying to do the same masquerade, and/or
- the refer (sipsock_read) thread should not call ast_waitstream while it still has channel locks

The deadlock happens when the audio is played on a Local channel:
{noformat}
# grep Playing.*xfer debug.log   | sed -e 's/.*-- <//;s/\/.*//' | uniq -c
     30 SIP
      1 Local
{noformat}
^-- one Local channel -> deadlock

The relevant Asterisk 11 code looks like this:
{noformat}
                if (target.chan2 && !ast_strlen_zero(xfersound) && ast_streamfile(target.chan2, xfersound, ast_channel_language(target.chan2)) >= 0) {
                        ast_waitstream(target.chan2, ""); // deadlock here
                }
...
                ast_channel_ref(current->chan1);
                ast_channel_unlock(current->chan1); /* current.chan1 is p->owner before the masq, it was locked by socket_read()*/
                ast_channel_unlock(target.chan1);
                *nounlock = 1;  /* we just unlocked the dialog's channel and have no plans of locking it again. */
                sip_pvt_unlock(targetcall_pvt);
                sip_pvt_unlock(transferer);
{noformat}
^-- lots of unlocks _after_ the the playing, instead of before

In Asterisk 13, the relevant code looks like this:
{noformat}
                ast_channel_unlock(to_transfer_target);
                if (!target_complete_sound) {
                        ast_channel_lock(to_transferee);
...
                        ast_channel_unlock(to_transferee);
                }
                if (target_complete_sound) {
                        ast_bridge_channel_write_playfile(to_target_bridge_channel, NULL,
                                target_complete_sound, NULL);
                }
{noformat}
^-- unlocking of channel target _before_ playing

We'll assume things are good in 13. Will report back if it isn't.



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list