[asterisk-bugs] [JIRA] (ASTERISK-29474) core: Exceptionally long queue length with Local channels and ConfBridge

Kevin Harwell (JIRA) noreply at issues.asterisk.org
Fri Jun 11 13:32:08 CDT 2021


    [ https://issues.asterisk.org/jira/browse/ASTERISK-29474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=255269#comment-255269 ] 

Kevin Harwell edited comment on ASTERISK-29474 at 6/11/21 1:31 PM:
-------------------------------------------------------------------

I took a look at one of the latest backtraces ([^core-asterisk-running-2021-06-11T00-17-11+0000-brief.txt]). I do see a number of threads attempting to get locks, but so far not seeing an obvious deadlock, or a case where at least two separate threads are waiting on each other.

For instance in Thread 98:
{noformat}
Thread 98 (Thread 0x7f9c58512700 (LWP 21726)):
#0  0x00007f9c5485a9d0 in _Unwind_Find_FDE at plt () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#1  0x00007f9c54866b13 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#2  0x00007f9c54868618 in _Unwind_Backtrace () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#3  0x00007f9c54b64e7f in __GI___backtrace (array=<optimized out>, size=<optimized out>) at ../sysdeps/x86_64/backtrace.c:110
#4  0x000055b8a58ca0da in __ast_bt_get_addresses (bt=0x7f9c5850ae60) at backtrace.c:94
#5  0x000055b8a597b1f5 in __ast_pthread_mutex_lock (filename=0x55b8a5ad4bc0 "bridge_channel.c", lineno=1083, func=0x55b8a5ad5d30 <__PRETTY_FUNCTION__.17261> "ast_bridge_channel_queue_frame", mutex_name=0x55b8a5ad4bd1 "bridge_channel", t=0x7f9c30023910) at lock.c:270
#6  0x000055b8a58bd0ee in __ao2_lock (user_data=0x7f9c30023960, lock_how=AO2_LOCK_REQ_MUTEX, file=0x55b8a5ad4bc0 "bridge_channel.c", func=0x55b8a5ad5d30 <__PRETTY_FUNCTION__.17261> "ast_bridge_channel_queue_frame", line=1083, var=0x55b8a5ad4bd1 "bridge_channel") at astobj2.c:241
#7  0x000055b8a58e5386 in _ast_bridge_channel_lock (bridge_channel=0x7f9c30023960, file=0x55b8a5ad4bc0 "bridge_channel.c", function=0x55b8a5ad5d30 <__PRETTY_FUNCTION__.17261> "ast_bridge_channel_queue_frame", line=1083, var=0x55b8a5ad4bd1 "bridge_channel") at /usr/src/asterisk-18.4.0/include/asterisk/bridge_channel.h:234
#8  0x000055b8a58e7c0f in ast_bridge_channel_queue_frame (bridge_channel=0x7f9c30023960, fr=0x7f9c1c031330) at bridge_channel.c:1083
#9  0x000055b8a58e7f51 in ast_bridge_queue_everyone_else (bridge=0x7f9c300133f0, bridge_channel=0x0, frame=0x7f9c1c031330) at bridge_channel.c:1136
#10 0x00007f9c507e92ad in softmix_bridge_write_control (bridge=0x7f9c300133f0, bridge_channel=0x7f9c3003ffc0, frame=0x7f9c1c031330) at bridge_softmix.c:1257
#11 0x00007f9c507e94d6 in softmix_bridge_write (bridge=0x7f9c300133f0, bridge_channel=0x7f9c3003ffc0, frame=0x7f9c1c031330) at bridge_softmix.c:1354
#12 0x000055b8a58e6f39 in bridge_channel_write_frame (bridge_channel=0x7f9c3003ffc0, frame=0x7f9c1c031330) at bridge_channel.c:711
#13 0x000055b8a58ec409 in bridge_handle_trip (bridge_channel=0x7f9c3003ffc0) at bridge_channel.c:2706
#14 0x000055b8a58ec83a in bridge_channel_wait (bridge_channel=0x7f9c3003ffc0) at bridge_channel.c:2842
#15 0x000055b8a58ed03e in bridge_channel_internal_join (bridge_channel=0x7f9c3003ffc0) at bridge_channel.c:2993
#16 0x000055b8a58cfb17 in ast_bridge_join (bridge=0x7f9c300133f0, chan=0x7f9c1c022940, swap=0x0, features=0x7f9c5850e870, tech_args=0x7f9c5850e898, flags=(unknown: 0)) at bridge.c:17250x7f9c30023960
#17 0x00007f9be9f3c41d in confbridge_exec (chan=0x7f9c1c022940, data=0x7f9c5850ea50 "coincall1623370594,silentbridge,incoglistener") at app_confbridge.c:2758
#18 0x000055b8a59b0f43 in pbx_exec (c=0x7f9c1c022940, app=0x55b8a9737740, data=0x7f9c5850ea50 "coincall1623370594,silentbridge,incoglistener") at pbx_app.c:492
#19 0x000055b8a599c006 in pbx_extension_helper (c=0x7f9c1c022940, con=0x0, context=0x7f9c1c023300 "caller2", exten=0x7f9c1c023350 "s", priority=2, label=0x0, callerid=0x0, action=E_SPAWN, found=0x7f9c58511cec, combined_find_spawn=1) at pbx.c:2947
#20 0x000055b8a59a00d4 in ast_spawn_extension (c=0x7f9c1c022940, context=0x7f9c1c023300 "caller2", exten=0x7f9c1c023350 "s", priority=2, callerid=0x0, found=0x7f9c58511cec, combined_find_spawn=1) at pbx.c:4206
#21 0x000055b8a59a0df5 in __ast_pbx_run (c=0x7f9c1c022940, args=0x0) at pbx.c:4380
#22 0x000055b8a59a266e in pbx_thread (data=0x7f9c1c022940) at pbx.c:4704
#23 0x000055b8a5a414e4 in dummy_start (data=0x7f9c1c0148e0) at utils.c:1299
#24 0x00007f9c55f4d4a4 in start_thread (arg=0x7f9c58512700) at pthread_create.c:456
#25 0x00007f9c54b57d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
{noformat}
_bridge_channel_write_frame_ locks the bridge in this stack trace, and then this thread gets blocked waiting {{user_data=0x7f9c30023960}}, which turns up also in Thread 94:
{noformat}
Thread 94 (Thread 0x7f9bdc19d700 (LWP 21695)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f9c55f4fc26 in __GI___pthread_mutex_lock (mutex=0x7f9c5862b970 <_rtld_global+2352>) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f9c54b8dd7f in __GI___dl_iterate_phdr (callback=0x7f9c54868da0, data=0x7f9bdc19c360) at dl-iteratephdr.c:41
#3  0x00007f9c5486a11e in _Unwind_Find_FDE () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#4  0x00007f9c54866b13 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#5  0x00007f9c54867d30 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#6  0x00007f9c548685e8 in _Unwind_Backtrace () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7  0x00007f9c54b64e7f in __GI___backtrace (array=<optimized out>, size=<optimized out>) at ../sysdeps/x86_64/backtrace.c:110
#8  0x000055b8a58ca0da in __ast_bt_get_addresses (bt=0x7f9bdc19c930) at backtrace.c:94
#9  0x000055b8a597b1f5 in __ast_pthread_mutex_lock (filename=0x55b8a5ad8171 "channel.c", lineno=3087, func=0x55b8a5adc790 <__PRETTY_FUNCTION__.19719> "ast_waitfor_nandfds", mutex_name=0x55b8a5ad8dbd "c[x]", t=0x7f9c30025c30) at lock.c:270
#10 0x000055b8a58bd0ee in __ao2_lock (user_data=0x7f9c30025c80, lock_how=AO2_LOCK_REQ_MUTEX, file=0x55b8a5ad8171 "channel.c", func=0x55b8a5adc790 <__PRETTY_FUNCTION__.19719> "ast_waitfor_nandfds", line=3087, var=0x55b8a5ad8dbd "c[x]") at astobj2.c:241
#11 0x000055b8a5900f3d in ast_waitfor_nandfds (c=0x7f9c30023998, n=1, fds=0x7f9c30023a30, nfds=1, exception=0x0, outfd=0x7f9bdc19cd00, ms=0x7f9bdc19cd04) at channel.c:3087
#12 0x000055b8a58ec71f in bridge_channel_wait (bridge_channel=0x7f9c30023960) at bridge_channel.c:2827
#13 0x000055b8a58ed03e in bridge_channel_internal_join (bridge_channel=0x7f9c30023960) at bridge_channel.c:2993
#14 0x000055b8a58cfe0f in bridge_channel_ind_thread (data=0x7f9c30023960) at bridge.c:1797
#15 0x000055b8a5a414e4 in dummy_start (data=0x7f9c30023ac0) at utils.c:1299
#16 0x00007f9c55f4d4a4 in start_thread (arg=0x7f9bdc19d700) at pthread_create.c:456
#17 0x00007f9c54b57d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
{noformat}
Here though that _bridge_channel_ is not itself blocked but instead this is blocked by {{user_data=0x7f9c30025c80}}, which is not found elsewhere in the backtrace.

By the way if you search on the locked bridge too it turns up elsewhere but only on thread is actually being blocked on it and from it's stack trace it appears to not lock any effected objects prior to it being blocked:
{noformat}
Thread 92 (Thread 0x7f9bdc297700 (LWP 21693)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f9c55f4fc26 in __GI___pthread_mutex_lock (mutex=0x7f9c300133a0) at ../nptl/pthread_mutex_lock.c:115
#2  0x000055b8a597b2d1 in __ast_pthread_mutex_lock (filename=0x7f9c507f0989 "bridge_softmix.c", lineno=1968, func=0x7f9c507f1ac0 <__PRETTY_FUNCTION__.17530> "softmix_mixing_loop", mutex_name=0x7f9c507f10cc "bridge", t=0x7f9c300133a0) at lock.c:326
#3  0x000055b8a58bd0ee in __ao2_lock (user_data=0x7f9c300133f0, lock_how=AO2_LOCK_REQ_MUTEX, file=0x7f9c507f0989 "bridge_softmix.c", func=0x7f9c507f1ac0 <__PRETTY_FUNCTION__.17530> "softmix_mixing_loop", line=1968, var=0x7f9c507f10cc "bridge") at astobj2.c:241
#4  0x00007f9c507e5d62 in _ast_bridge_lock (bridge=0x7f9c300133f0, file=0x7f9c507f0989 "bridge_softmix.c", function=0x7f9c507f1ac0 <__PRETTY_FUNCTION__.17530> "softmix_mixing_loop", line=1968, var=0x7f9c507f10cc "bridge") at /usr/src/asterisk-18.4.0/include/asterisk/bridge.h:483
#5  0x00007f9c507eabca in softmix_mixing_loop (bridge=0x7f9c300133f0) at bridge_softmix.c:1968
#6  0x00007f9c507eaf5c in softmix_mixing_thread (data=0x7f9c30009e90) at bridge_softmix.c:2047
#7  0x000055b8a5a414e4 in dummy_start (data=0x7f9c3000f820) at utils.c:1299
#8  0x00007f9c55f4d4a4 in start_thread (arg=0x7f9bdc297700) at pthread_create.c:456
#9  0x00007f9c54b57d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
{noformat}
Even though it's not entirely obvious whether there is a deadlock one may still exist. However, given the latest backtrace my guess is either one of the channels, or bridge channels is being locked somewhere (just not seeing it yet), or another possibility is a thread locked one of those objects and then never unlocked it but that thread ended thus it does not show up in the backtrace.

In the latter case you'd be looking for a "lock" someplace, with no associated "unlock" (maybe an off nominal path). Not saying that's what's happening here, but something to keep in mind.

This probably wasn't all that helpful, but maybe someone will see something I missed.



was (Author: kharwell):
I took a look at one of the latest backtraces ([^core-asterisk-running-2021-06-11T00-17-11%2B0000-brief.txt]). I do see a number of threads attempting to get locks, but so far not seeing an obvious deadlock, or a case where at least two separate threads are waiting on each other.

For instance in Thread 98:
{noformat}
Thread 98 (Thread 0x7f9c58512700 (LWP 21726)):
#0  0x00007f9c5485a9d0 in _Unwind_Find_FDE at plt () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#1  0x00007f9c54866b13 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#2  0x00007f9c54868618 in _Unwind_Backtrace () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#3  0x00007f9c54b64e7f in __GI___backtrace (array=<optimized out>, size=<optimized out>) at ../sysdeps/x86_64/backtrace.c:110
#4  0x000055b8a58ca0da in __ast_bt_get_addresses (bt=0x7f9c5850ae60) at backtrace.c:94
#5  0x000055b8a597b1f5 in __ast_pthread_mutex_lock (filename=0x55b8a5ad4bc0 "bridge_channel.c", lineno=1083, func=0x55b8a5ad5d30 <__PRETTY_FUNCTION__.17261> "ast_bridge_channel_queue_frame", mutex_name=0x55b8a5ad4bd1 "bridge_channel", t=0x7f9c30023910) at lock.c:270
#6  0x000055b8a58bd0ee in __ao2_lock (user_data=0x7f9c30023960, lock_how=AO2_LOCK_REQ_MUTEX, file=0x55b8a5ad4bc0 "bridge_channel.c", func=0x55b8a5ad5d30 <__PRETTY_FUNCTION__.17261> "ast_bridge_channel_queue_frame", line=1083, var=0x55b8a5ad4bd1 "bridge_channel") at astobj2.c:241
#7  0x000055b8a58e5386 in _ast_bridge_channel_lock (bridge_channel=0x7f9c30023960, file=0x55b8a5ad4bc0 "bridge_channel.c", function=0x55b8a5ad5d30 <__PRETTY_FUNCTION__.17261> "ast_bridge_channel_queue_frame", line=1083, var=0x55b8a5ad4bd1 "bridge_channel") at /usr/src/asterisk-18.4.0/include/asterisk/bridge_channel.h:234
#8  0x000055b8a58e7c0f in ast_bridge_channel_queue_frame (bridge_channel=0x7f9c30023960, fr=0x7f9c1c031330) at bridge_channel.c:1083
#9  0x000055b8a58e7f51 in ast_bridge_queue_everyone_else (bridge=0x7f9c300133f0, bridge_channel=0x0, frame=0x7f9c1c031330) at bridge_channel.c:1136
#10 0x00007f9c507e92ad in softmix_bridge_write_control (bridge=0x7f9c300133f0, bridge_channel=0x7f9c3003ffc0, frame=0x7f9c1c031330) at bridge_softmix.c:1257
#11 0x00007f9c507e94d6 in softmix_bridge_write (bridge=0x7f9c300133f0, bridge_channel=0x7f9c3003ffc0, frame=0x7f9c1c031330) at bridge_softmix.c:1354
#12 0x000055b8a58e6f39 in bridge_channel_write_frame (bridge_channel=0x7f9c3003ffc0, frame=0x7f9c1c031330) at bridge_channel.c:711
#13 0x000055b8a58ec409 in bridge_handle_trip (bridge_channel=0x7f9c3003ffc0) at bridge_channel.c:2706
#14 0x000055b8a58ec83a in bridge_channel_wait (bridge_channel=0x7f9c3003ffc0) at bridge_channel.c:2842
#15 0x000055b8a58ed03e in bridge_channel_internal_join (bridge_channel=0x7f9c3003ffc0) at bridge_channel.c:2993
#16 0x000055b8a58cfb17 in ast_bridge_join (bridge=0x7f9c300133f0, chan=0x7f9c1c022940, swap=0x0, features=0x7f9c5850e870, tech_args=0x7f9c5850e898, flags=(unknown: 0)) at bridge.c:17250x7f9c30023960
#17 0x00007f9be9f3c41d in confbridge_exec (chan=0x7f9c1c022940, data=0x7f9c5850ea50 "coincall1623370594,silentbridge,incoglistener") at app_confbridge.c:2758
#18 0x000055b8a59b0f43 in pbx_exec (c=0x7f9c1c022940, app=0x55b8a9737740, data=0x7f9c5850ea50 "coincall1623370594,silentbridge,incoglistener") at pbx_app.c:492
#19 0x000055b8a599c006 in pbx_extension_helper (c=0x7f9c1c022940, con=0x0, context=0x7f9c1c023300 "caller2", exten=0x7f9c1c023350 "s", priority=2, label=0x0, callerid=0x0, action=E_SPAWN, found=0x7f9c58511cec, combined_find_spawn=1) at pbx.c:2947
#20 0x000055b8a59a00d4 in ast_spawn_extension (c=0x7f9c1c022940, context=0x7f9c1c023300 "caller2", exten=0x7f9c1c023350 "s", priority=2, callerid=0x0, found=0x7f9c58511cec, combined_find_spawn=1) at pbx.c:4206
#21 0x000055b8a59a0df5 in __ast_pbx_run (c=0x7f9c1c022940, args=0x0) at pbx.c:4380
#22 0x000055b8a59a266e in pbx_thread (data=0x7f9c1c022940) at pbx.c:4704
#23 0x000055b8a5a414e4 in dummy_start (data=0x7f9c1c0148e0) at utils.c:1299
#24 0x00007f9c55f4d4a4 in start_thread (arg=0x7f9c58512700) at pthread_create.c:456
#25 0x00007f9c54b57d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
{noformat}
_bridge_channel_write_frame_ locks the bridge in this stack trace, and then this thread gets blocked waiting {{user_data=0x7f9c30023960}}, which turns up also in Thread 94:
{noformat}
Thread 94 (Thread 0x7f9bdc19d700 (LWP 21695)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f9c55f4fc26 in __GI___pthread_mutex_lock (mutex=0x7f9c5862b970 <_rtld_global+2352>) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f9c54b8dd7f in __GI___dl_iterate_phdr (callback=0x7f9c54868da0, data=0x7f9bdc19c360) at dl-iteratephdr.c:41
#3  0x00007f9c5486a11e in _Unwind_Find_FDE () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#4  0x00007f9c54866b13 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#5  0x00007f9c54867d30 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#6  0x00007f9c548685e8 in _Unwind_Backtrace () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7  0x00007f9c54b64e7f in __GI___backtrace (array=<optimized out>, size=<optimized out>) at ../sysdeps/x86_64/backtrace.c:110
#8  0x000055b8a58ca0da in __ast_bt_get_addresses (bt=0x7f9bdc19c930) at backtrace.c:94
#9  0x000055b8a597b1f5 in __ast_pthread_mutex_lock (filename=0x55b8a5ad8171 "channel.c", lineno=3087, func=0x55b8a5adc790 <__PRETTY_FUNCTION__.19719> "ast_waitfor_nandfds", mutex_name=0x55b8a5ad8dbd "c[x]", t=0x7f9c30025c30) at lock.c:270
#10 0x000055b8a58bd0ee in __ao2_lock (user_data=0x7f9c30025c80, lock_how=AO2_LOCK_REQ_MUTEX, file=0x55b8a5ad8171 "channel.c", func=0x55b8a5adc790 <__PRETTY_FUNCTION__.19719> "ast_waitfor_nandfds", line=3087, var=0x55b8a5ad8dbd "c[x]") at astobj2.c:241
#11 0x000055b8a5900f3d in ast_waitfor_nandfds (c=0x7f9c30023998, n=1, fds=0x7f9c30023a30, nfds=1, exception=0x0, outfd=0x7f9bdc19cd00, ms=0x7f9bdc19cd04) at channel.c:3087
#12 0x000055b8a58ec71f in bridge_channel_wait (bridge_channel=0x7f9c30023960) at bridge_channel.c:2827
#13 0x000055b8a58ed03e in bridge_channel_internal_join (bridge_channel=0x7f9c30023960) at bridge_channel.c:2993
#14 0x000055b8a58cfe0f in bridge_channel_ind_thread (data=0x7f9c30023960) at bridge.c:1797
#15 0x000055b8a5a414e4 in dummy_start (data=0x7f9c30023ac0) at utils.c:1299
#16 0x00007f9c55f4d4a4 in start_thread (arg=0x7f9bdc19d700) at pthread_create.c:456
#17 0x00007f9c54b57d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
{noformat}
Here though that _bridge_channel_ is not itself blocked but instead this is blocked by {{user_data=0x7f9c30025c80}}, which is not found elsewhere in the backtrace.

By the way if you search on the locked bridge too it turns up elsewhere but only on thread is actually being blocked on it and from it's stack trace it appears to not lock any effected objects prior to it being blocked:
{noformat}
Thread 92 (Thread 0x7f9bdc297700 (LWP 21693)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f9c55f4fc26 in __GI___pthread_mutex_lock (mutex=0x7f9c300133a0) at ../nptl/pthread_mutex_lock.c:115
#2  0x000055b8a597b2d1 in __ast_pthread_mutex_lock (filename=0x7f9c507f0989 "bridge_softmix.c", lineno=1968, func=0x7f9c507f1ac0 <__PRETTY_FUNCTION__.17530> "softmix_mixing_loop", mutex_name=0x7f9c507f10cc "bridge", t=0x7f9c300133a0) at lock.c:326
#3  0x000055b8a58bd0ee in __ao2_lock (user_data=0x7f9c300133f0, lock_how=AO2_LOCK_REQ_MUTEX, file=0x7f9c507f0989 "bridge_softmix.c", func=0x7f9c507f1ac0 <__PRETTY_FUNCTION__.17530> "softmix_mixing_loop", line=1968, var=0x7f9c507f10cc "bridge") at astobj2.c:241
#4  0x00007f9c507e5d62 in _ast_bridge_lock (bridge=0x7f9c300133f0, file=0x7f9c507f0989 "bridge_softmix.c", function=0x7f9c507f1ac0 <__PRETTY_FUNCTION__.17530> "softmix_mixing_loop", line=1968, var=0x7f9c507f10cc "bridge") at /usr/src/asterisk-18.4.0/include/asterisk/bridge.h:483
#5  0x00007f9c507eabca in softmix_mixing_loop (bridge=0x7f9c300133f0) at bridge_softmix.c:1968
#6  0x00007f9c507eaf5c in softmix_mixing_thread (data=0x7f9c30009e90) at bridge_softmix.c:2047
#7  0x000055b8a5a414e4 in dummy_start (data=0x7f9c3000f820) at utils.c:1299
#8  0x00007f9c55f4d4a4 in start_thread (arg=0x7f9bdc297700) at pthread_create.c:456
#9  0x00007f9c54b57d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
{noformat}
Even though it's not entirely obvious whether there is a deadlock one may still exist. However, given the latest backtrace my guess is either one of the channels, or bridge channels is being locked somewhere (just not seeing it yet), or another possibility is a thread locked one of those objects and then never unlocked it but that thread ended thus it does not show up in the backtrace.

In the latter case you'd be looking for a "lock" someplace, with no associated "unlock" (maybe an off nominal path). Not saying that's what's happening here, but something to keep in mind.

This probably wasn't all that helpful, but maybe someone will see something I missed.


> core: Exceptionally long queue length with Local channels and ConfBridge
> ------------------------------------------------------------------------
>
>                 Key: ASTERISK-29474
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-29474
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Applications/app_confbridge
>    Affects Versions: 18.4.0
>            Reporter: N A
>            Assignee: Unassigned
>         Attachments: confbridge.conf, core-asterisk-running-2021-06-10T12-30-28-0400-brief.txt, core-asterisk-running-2021-06-10T12-30-28-0400-full.txt, core-asterisk-running-2021-06-10T12-30-28-0400-info.txt, core-asterisk-running-2021-06-10T12-30-28-0400-locks.txt, core-asterisk-running-2021-06-10T12-30-28-0400-thread1.txt, core-asterisk-running-2021-06-10T12-38-12-0400-brief.txt, core-asterisk-running-2021-06-10T12-38-12-0400-full.txt, core-asterisk-running-2021-06-10T12-38-12-0400-info.txt, core-asterisk-running-2021-06-10T12-38-12-0400-locks.txt, core-asterisk-running-2021-06-10T12-38-12-0400-thread1.txt, core-asterisk-running-2021-06-11T00-06-54+0000-brief.txt, core-asterisk-running-2021-06-11T00-06-54+0000-full.txt, core-asterisk-running-2021-06-11T00-06-54+0000-info.txt, core-asterisk-running-2021-06-11T00-06-54+0000-locks.txt, core-asterisk-running-2021-06-11T00-06-54+0000-thread1.txt, core-asterisk-running-2021-06-11T00-17-11+0000-brief.txt, core-asterisk-running-2021-06-11T00-17-11+0000-full.txt, core-asterisk-running-2021-06-11T00-17-11+0000-info.txt, core-asterisk-running-2021-06-11T00-17-11+0000-locks.txt, core-asterisk-running-2021-06-11T00-17-11+0000-thread1.txt, core-asterisk-running-2021-06-11T17-18-53+0000-full.txt, debug_log_123456.txt, debug_threads.txt, extensions.conf, locks.txt
>
>
> It's our friend "__ast_queue_frame: Exceptionally long queue length queuing to..." again.
> I'm mark this as app_confbridge related because 99% of the time, deadlocks on my systems are caused by ConfBridge and using other applications if possible does not cause this issue at all.
> This issue happens with ConfBridges in multiple different scenarios - below is one that replicates every single time:
> - bring another channel playing audio into a confbridge
> - bring another channel that records locally into the confbridge, but muted
> - have that original channel join that ConfBridge
> All is good. Now, as soon as, from another phone, I join that same confbridge, even if muted, this whole thing happens immediately until at least after all channels and the bridge are torn down. Usually Asterisk crashes fairly quickly when this kind of thing happens; in this specific scenario, it doesn't crash quickly but it does basically kill both channels effectively and these warnings continue until everything involved is torn down, though sometimes this persists 10-20 seconds after that. In certain other similar scenarios, Asterisk usually crashes within a minute as CPU usage climbs to 100%.
> This is just one scenario that leads to this deadlock, but it can happen in several different ways. Typically it happens when multiple parties are in this aforementioned bridge using ConfBridge. Replacing ConfBridge with ChanSpy in places where possible is enough to make the issue disappear, hence why this seems to be a bug with app_confbridge.
> Attached are output from astcoredumper and core show locks during the deadlock.
> In this particular deadlock, there seems to be some kind of infinite loop caused by infinite progress updates, although I only see the infinite progress stuff when Asterisk is compiled with DONT_OPTIMIZE and DEBUG_THREADS. Regardless, this deadlocks occurs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list