[asterisk-bugs] [JIRA] (ASTERISK-25321) DeadLock ChanSpy with call over Local channel

Rusty Newton (JIRA) noreply at issues.asterisk.org
Thu Aug 13 09:21:32 CDT 2015


     [ https://issues.asterisk.org/jira/browse/ASTERISK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rusty Newton updated ASTERISK-25321:
------------------------------------

    Description: 
We have a problem with ramdom deadlocks with using ChanSpy running on SIP channels, and dialing by AMI Originate to Local channel, which Dial another Local channel, and then Dial SIP peer. 

Example: 1. SIP/iptel205 doing ChanSpy(SIP/iptel210)
               2. AMI Originate Local/210 at dialer
               3. Dial(Local/210 at internal)
               4. Dial(SIP/iptel210)
               5. Answer SIP/iptel210
               6. Start dial caller from originate for ex 00420591122223 at outgoning

Here is part of backtrace from coredump, I get it by gcore when asterisk was deadlocked.

[Edit by Rusty - removed excessive debug from description. Please attach debug and annotated files to the issue with More > Attach Files]


  was:
We have a problem with ramdom deadlocks with using ChanSpy running on SIP channels, and dialing by AMI Originate to Local channel, which Dial another Local channel, and then Dial SIP peer. 

Example: 1. SIP/iptel205 doing ChanSpy(SIP/iptel210)
               2. AMI Originate Local/210 at dialer
               3. Dial(Local/210 at internal)
               4. Dial(SIP/iptel210)
               5. Answer SIP/iptel210
               6. Start dial caller from originate for ex 00420591122223 at outgoning

Here is part of backtrace from coredump, I get it by gcore when asterisk was deadlocked.

(gdb) thread 67
[Switching to thread 67 (Thread 0x7f9cc1e52700 (LWP 23734))]#0  0x00007f9d5e1dd264 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007f9d5e1dd264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f9d5e1d8523 in _L_lock_892 () from /lib64/libpthread.so.0
#2  0x00007f9d5e1d8407 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000466605 in ast_channel_by_uniqueid_cb (obj=0x7f9c88167288, arg=0x7f9cc1e4c710, data=<value optimized out>, flags=<value optimized out>) at channel.c:1520
#4  0x0000000000444d00 in internal_ao2_callback (c=0x2153ba8, flags=0, cb_fn=<value optimized out>, arg=0x7f9cc1e4c710, data=0x7f9cc1e4c418, type=WITH_DATA, tag=0x0, file=0x0, line=0, func=0x0) at astobj2.c:1105
#5  0x0000000000445241 in __ao2_callback_data (c=<value optimized out>, flags=<value optimized out>, cb_fn=<value optimized out>, arg=<value optimized out>, data=<value optimized out>) at astobj2.c:1225
#6  0x000000000046ba9c in ast_channel_get_by_name_prefix (name=0x7f9cc1e4c710 "Local/264 at internal_phones-000009b0;1", name_len=0) at channel.c:1630
#7  0x00000000004b4ad4 in ast_bridge_call (chan=<value optimized out>, peer=0x7f9c882ebca8, config=0x7f9c00000000) at features.c:4912
#8  0x00007f9d1ea91df9 in dial_exec_full (chan=<value optimized out>, data=<value optimized out>, peerflags=0x7f9cc1e4d570, continue_exec=0x0) at app_dial.c:3047
#9  0x00007f9d1ea92ef6 in dial_exec (chan=<value optimized out>, data=<value optimized out>) at app_dial.c:3130
#10 0x0000000000507f14 in pbx_exec (c=0x7f9c8842ae58, app=0x2e3e780, data=0x7f9cc1e4faf0 "Local/264 at internal_phones,10,c") at pbx.c:1622
#11 0x0000000000508755 in pbx_extension_helper (c=0x7f9c8842ae58, con=<value optimized out>, context=<value optimized out>, exten=<value optimized out>, priority=4, label=0x0, callerid=0x7f9c8812e3a0 "00421911927437", action=E_SPAWN, 
    found=0x7f9cc1e51c7c, combined_find_spawn=1) at pbx.c:4915
#12 0x0000000000508c00 in ast_spawn_extension (c=<value optimized out>, context=<value optimized out>, exten=<value optimized out>, priority=<value optimized out>, callerid=<value optimized out>, found=<value optimized out>, 
    combined_find_spawn=1) at pbx.c:6037
#13 0x000000000050dbb4 in __ast_pbx_run (c=0x7f9c8842ae58, args=0x0) at pbx.c:6512
#14 0x000000000050fb2b in pbx_thread (data=<value optimized out>) at pbx.c:6842
#15 0x000000000054b4a8 in dummy_start (data=<value optimized out>) at utils.c:1223
#16 0x00007f9d5e1d69d1 in start_thread () from /lib64/libpthread.so.0
#17 0x00007f9d5f7918fd in clone () from /lib64/libc.so.6
(gdb) frame 2
#2  0x00007f9d5e1d8407 in pthread_mutex_lock () from /lib64/libpthread.so.0
(gdb) print *(pthread_mutex_t *)$r8
$1 = {__data = {__lock = 2, __count = 1, __owner = 19054, __nusers = 1, __kind = 1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
  __size = "\002\000\000\000\001\000\000\000nJ\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}
(gdb) 

Thread LWP19054 i thread 43:

[Switching to thread 43 (Thread 0x7f9cc2be2700 (LWP 19054))]#0  0x00007f9d5f7880d3 in poll () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f9d5f7880d3 in poll () from /lib64/libc.so.6
#1  0x00000000004751ae in ast_waitfor_nandfds (c=0x7f9cc2bd6408, n=1, fds=0x0, nfds=956909, exception=0x0, outfd=0x0, ms=0x7f9cc2bd6404) at channel.c:3266
#2  0x00000000004756f3 in ast_waitfor (c=0x7f9cf88f11d8, ms=100000) at channel.c:3555
#3  0x00007f9d0c3abb31 in channel_spy (chan=0x7f9cf88f11d8, spyee_autochan=0x7f9c88266e20, volfactor=0x7f9cc2bdb59c, fd=<value optimized out>, user_options=0x7f9cc2bdc540, flags=0x7f9cc2bdc530, exitcontext=0x7f9cc2bdc2b0 "")
    at app_chanspy.c:648
#4  0x00007f9d0c3acbd4 in common_exec (chan=0x7f9cf88f11d8, flags=0x7f9cc2bdc530, volfactor=0, fd=0, user_options=0x7f9cc2bdc540, mygroup=0x0, myenforced=0x0, spec=0x7f9cc2bdc380 "SIP/iptel280", exten=0x0, context=0x0, mailbox=0x0, 
    name_context=0x0) at app_chanspy.c:1041
#5  0x00007f9d0c3adee5 in chanspy_exec (chan=0x7f9cf88f11d8, data=<value optimized out>) at app_chanspy.c:1199
#6  0x0000000000507f14 in pbx_exec (c=0x7f9cf88f11d8, app=0x2e3ec80, data=0x7f9cc2bdfaf0 "SIP/iptel280") at pbx.c:1622
#7  0x0000000000508755 in pbx_extension_helper (c=0x7f9cf88f11d8, con=<value optimized out>, context=<value optimized out>, exten=<value optimized out>, priority=1, label=0x0, callerid=0x7f9cf82ba7d0 "285", action=E_SPAWN, 
    found=0x7f9cc2be1c7c, combined_find_spawn=1) at pbx.c:4915
#8  0x0000000000508c00 in ast_spawn_extension (c=<value optimized out>, context=<value optimized out>, exten=<value optimized out>, priority=<value optimized out>, callerid=<value optimized out>, found=<value optimized out>, 
    combined_find_spawn=1) at pbx.c:6037
#9  0x000000000050dbb4 in __ast_pbx_run (c=0x7f9cf88f11d8, args=0x0) at pbx.c:6512
#10 0x000000000050fb2b in pbx_thread (data=<value optimized out>) at pbx.c:6842
#11 0x000000000054b4a8 in dummy_start (data=<value optimized out>) at utils.c:1223
#12 0x00007f9d5e1d69d1 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f9d5f7918fd in clone () from /lib64/libc.so.6
(gdb) 

You can see in frame 2 call function ast_waitfor in channel.c:3555
here is
ast_waitfor_nandfds(&c, 1, NULL, 0, NULL, NULL, &ms);
but in frame 1 there is something wrong with nfds=956909, it was called with 0 value. So it seems like stack corruption ?

Also in locals value from timeval start(tv_usec) is same as you can see in nfds arg.

(gdb) frame 1
#1  0x00000000004751ae in ast_waitfor_nandfds (c=0x7f9cc2bd6408, n=1, fds=0x0, nfds=956909, exception=0x0, outfd=0x0, ms=0x7f9cc2bd6404) at channel.c:3266
3266    channel.c: No such file or directory.
        in channel.c
(gdb) info locals
kbrms = 100000
start = {tv_sec = 1438933624, tv_usec = 956909}
pfds = <value optimized out>
res = <value optimized out>
rms = 100000
x = <value optimized out>
y = <value optimized out>
max = 4
sz = <value optimized out>
now = <value optimized out>
whentohangup = <value optimized out>
diff = <value optimized out>
winner = 0x0
fdmap = 0x7f9cc2bd6200
__PRETTY_FUNCTION__ = "ast_waitfor_nandfds"






> DeadLock ChanSpy with call over Local channel
> ---------------------------------------------
>
>                 Key: ASTERISK-25321
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25321
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>    Affects Versions: 11.16.0, 11.18.0
>         Environment: custom CentOS 6 based distro
>            Reporter: Filip Frank
>            Severity: Critical
>
> We have a problem with ramdom deadlocks with using ChanSpy running on SIP channels, and dialing by AMI Originate to Local channel, which Dial another Local channel, and then Dial SIP peer. 
> Example: 1. SIP/iptel205 doing ChanSpy(SIP/iptel210)
>                2. AMI Originate Local/210 at dialer
>                3. Dial(Local/210 at internal)
>                4. Dial(SIP/iptel210)
>                5. Answer SIP/iptel210
>                6. Start dial caller from originate for ex 00420591122223 at outgoning
> Here is part of backtrace from coredump, I get it by gcore when asterisk was deadlocked.
> [Edit by Rusty - removed excessive debug from description. Please attach debug and annotated files to the issue with More > Attach Files]



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list