[asterisk-bugs] [JIRA] (ASTERISK-28469) AMI frozen on high load

Steve Sether (JIRA) noreply at issues.asterisk.org
Tue Jan 24 11:17:03 CST 2023


    [ https://issues.asterisk.org/jira/browse/ASTERISK-28469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=261239#comment-261239 ] 

Steve Sether commented on ASTERISK-28469:
-----------------------------------------

I'd recently had some time to re-visit this issue, as it's become a higher priority.  I came up with a trivial way to reproduce it below using netcat (described below)  in hopes of having someone fix it I'm happy to report that it seems to be fixed as of Asterisk 16.30.  There's several fixes applied to the AMI from 16.25 through 16.30 and I'm not sure which one fixes it, but I happily can't reproduce the problem via the method below on 16.30, while I can on 16.24.

My step to reproduce on Linux.

nc -I 30000 targetServerNameOrIPHere 5038 < amiLogin.txt 
hit cntrl-z to suspend this process
make a call on the target Asterisk server (critical)
attempt to connect to the AMI via telnet. Telnet connects, but Asterisk doesn't respond with Asterisk Call Manager/1.3
When nc command is killed, Asterisk continues to accept connections on the AMI port.
 
amiLogin.txt consists of:
Action: Login
ActionID: 1
Username: yourUsernameHere
Events: on
Secret: yourPasswordHere




> AMI frozen on high load
> -----------------------
>
>                 Key: ASTERISK-28469
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28469
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/ManagerInterface
>    Affects Versions: 16.4.0
>         Environment: Linux deacsrtsk12 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux
>            Reporter: Stefan Repke
>            Assignee: Stefan Repke
>            Severity: Major
>         Attachments: AMIfrozen.txt, case2.zip, core-asterisk-running-2019-07-01T16-15-22+0200-brief.txt, core-asterisk-running-2019-07-01T16-15-22+0200-full.txt, core-asterisk-running-2019-07-01T16-15-22+0200-locks.txt, core-asterisk-running-2019-07-01T16-15-22+0200-thread1.txt, core show locks, core show threads, debug.deacsrtsk12
>
>
> The AMI queue gets stuck when running Asterisk 16.0.4 with ~30 calls per minute: 
> After a few hundred calls have been processed, no further manager events are distrubuted and no new manager connections are accepted.
> The problem is reproducable and we've created a dump. I think the reason is that thread 0x7f6e6e56c700 is blocked in {{send_string}} of {{manager.c:3033}} (while holding/blocking lock 0x7f6ee0003260).
> Our version of libc6 is 2.19-18+deb8u10.
> The CLI command {{manager show eventq}} produces dozens of megabytes of output consisting of the queued events.
> {noformat:title=core show locks}
> =======================================================================
> === 16.4.0
> === Currently Held Locks
> =======================================================================
> ===
> === <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
> ===
> === Thread ID: 0x7f6e6ffff700 LWP:11439 (desc->accept_fn      started at [  770] tcptls.c ast_tcptls_server_start())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 6896 purge_sessions session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6e6e56c700 LWP:11491 (handle_tcptls_connection started at [  309] tcptls.c ast_tcptls_server_root())
> === ---> Lock #0 (manager.c): MUTEX 6104 process_events s->session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5fff11] asterisk manager.c:6105 process_events()
> 	[0x604be8] asterisk manager.c:6714 do_message()
> 	[inlined] asterisk manager.c:6853 session_do()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6e6dc38700 LWP:23563 (netconsole           started at [ 1529] asterisk.c listener())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 2673 handle_showmanconn session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6e6d380700 LWP:23932 (netconsole           started at [ 1529] asterisk.c listener())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 2673 handle_showmanconn session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6e6d3fc700 LWP:24956 (netconsole           started at [ 1529] asterisk.c listener())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 2673 handle_showmanconn session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6efb470700 LWP:25271 (netconsole           started at [ 1529] asterisk.c listener())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 2673 handle_showmanconn session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> =======================================================================
> {noformat}
> {noformat:title=brief.txt}
> ...
> Thread 52 (Thread 0x7f6e6e56c700 (LWP 11491)):
> #0  0x00007f6efc25faed in poll () at ../sysdeps/unix/syscall-template.S:81
> #1  0x00000000005a6b78 in poll (__timeout=<optimized out>, __nfds=1, __fds=0x7f6e6e56b2d0) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
> #2  ast_wait_for_output (fd=<optimized out>, ms=<optimized out>) at utils.c:1357
> #3  0x00000000004f6f89 in ast_iostream_write (stream=0x7f6e680041e0, buffer=0x7f6e7018e4f8, size=5000, size at entry=487) at iostream.c:473
> #4  0x00000000004f747d in ast_iostream_write (stream=stream at entry=0x7f6e680041e0, buffer=buffer at entry=0x7f6e7018e4f8, size=size at entry=487) at iostream.c:475
> #5  0x0000000000600121 in send_string (string=0x7f6e7018e4f8 "Event: NewextenrnPrivilege: dialplan,allrnTimestamp: 1561990424.176441rnSystemName: deacsrtsk12rnChannel: SIP/999253-000002c8rnChannelState: 4rnChannelStateDesc: RingrnCallerIDNum: 999253rnCallerIDNam"..., s=0x7f6e6e56b4d0) at manager.c:3033
> #6  process_events (s=0x7f6e6e56b4d0) at manager.c:6117
> #7  0x0000000000604be8 in do_message (s=<optimized out>) at manager.c:6714
> #8  session_do (data=0x7f6e6e56b2d0) at manager.c:6853
> #9  0x00000000005a40c5 in dummy_start (data=0x7f6e68003dc0) at utils.c:1249
> #10 0x00007f6efd2c5064 in start_thread (arg=0x7f6e6e56c700) at pthread_create.c:309
> #11 0x00007f6efc26862d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> ...
> {noformat}
> {noformat:title=full.txt}
> ...
> Thread 52 (Thread 0x7f6e6e56c700 (LWP 11491)):
> #0  0x00007f6efc25faed in poll () at ../sysdeps/unix/syscall-template.S:81
> No locals.
> #1  0x00000000005a6b78 in poll (__timeout=<optimized out>, __nfds=1, __fds=0x7f6e6e56b2d0) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
> No locals.
> #2  ast_wait_for_output (fd=<optimized out>, ms=<optimized out>) at utils.c:1357
>         pfd = {{fd = 22, events = 4, revents = 0}}
> #3  0x00000000004f6f89 in ast_iostream_write (stream=0x7f6e680041e0, buffer=0x7f6e7018e4f8, size=5000, size at entry=487) at iostream.c:473
> No locals.
> #4  0x00000000004f747d in ast_iostream_write (stream=stream at entry=0x7f6e680041e0, buffer=buffer at entry=0x7f6e7018e4f8, size=size at entry=487) at iostream.c:475
> No locals.
> #5  0x0000000000600121 in send_string (string=0x7f6e7018e4f8 "Event: NewextenrnPrivilege: dialplan,allrnTimestamp: 1561990424.176441rnSystemName: deacsrtsk12rnChannel: SIP/999253-000002c8rnChannelState: 4rnChannelStateDesc: RingrnCallerIDNum: 999253rnCallerIDNam"..., s=0x7f6e6e56b4d0) at manager.c:3033
>         stream = 0x7f6e680041e0
>         len = 487
>         res = <optimized out>
>         string = 0x7f6e7018e4f8 "Event: NewextenrnPrivilege: dialplan,allrnTimestamp: 1561990424.176441rnSystemName: deacsrtsk12rnChannel: SIP/999253-000002c8rnChannelState: 4rnChannelStateDesc: RingrnCallerIDNum: 999253rnCallerIDNam"...
>         s = 0x7f6e6e56b4d0
> #6  process_events (s=0x7f6e6e56b4d0) at manager.c:6117
>         eqe = <optimized out>
>         ret = <optimized out>
>         __PRETTY_FUNCTION__ = "process_events"
> #7  0x0000000000604be8 in do_message (s=<optimized out>) at manager.c:6714
>         m = {hdrcount = 6, headers = {0x7f6ee0004620 "action: Monitor", 0x7f6ee04085d0 "actionid: 537#355425276_673410#", 0x7f6ee0008fb0 "file: /tmp/U23TJQ5V23X4MZQB", 0x7f6ee0009b70 "format: wav", 0x7f6ee0009b20 "channel: IAX2/999005-3957", 0x7f6ee0008fe0 "mix: true", 0x0 <repeats 122 times>}}
>         header_buf = "mix: true000AX2/999005-3957000B000061060#", '000' <repeats 993 times>
>         now = 1561988218
> #8  session_do (data=0x7f6e6e56b2d0) at manager.c:6853
>         s = {session = 0x7f6ee00032b8, stream = 0x7f6e680041e0, tcptls_session = 0x7f6e68004068, parsing = MESSAGE_OKAY, write_error = -1, hook = 0x0, lock = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '000' <repeats 16 times>, "001", '000' <repeats 22 times>, __align = 0}, track = 0x7f6ee0005ee0, flags = {tracking = 1, setup = 1}}}
>         res = 5
>         arg = 1
>         ser_remote_address_tmp = {ss = {ss_family = 10, __ss_align = 0, __ss_padding = "000000377377177000000001", '000' <repeats 48 times>, "|254367373n177000000000000000000000000000000370377377377377377377377220324367373n177000000340266Vnn177000000220I036377n177000000000000000000000000000"}, len = 28}
>         __PRETTY_FUNCTION__ = "session_do"
> #9  0x00000000005a40c5 in dummy_start (data=0x7f6e68003dc0) at utils.c:1249
>         __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {140114181235520, 57934947619282876, 140112167979552, 140114047020320, 21, 140112274310912, 57947742297497532, -139326592165109828}, __mask_was_saved = 0}}, __pad = {0x7f6e6e56bef0, 0x0, 0x7f6efc521878 <__elf_set___libc_thread_subfreeres_element___rpc_thread_destroy__>, 0x7f6efc521880 <internal_trans_names>}}
>         __cancel_arg = 0x7f6e6e56c700
>         __not_first_call = <optimized out>
>         ret = <optimized out>
>         a = {start_routine = 0x5955b0 <handle_tcptls_connection>, data = 0x7f6e68004068, name = 0x7f6e68004a20 "handle_tcptls_connection started at [  309] tcptls.c ast_tcptls_server_root()"}
>         mutex_attr = {__size = "001000000", __align = 1}
>         __PRETTY_FUNCTION__ = "dummy_start"
> #10 0x00007f6efd2c5064 in start_thread (arg=0x7f6e6e56c700) at pthread_create.c:309
>         __res = <optimized out>
>         pd = 0x7f6e6e56c700
>         now = <optimized out>
>         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140112274310912, -139326233515286596, 0, 140114047020320, 21, 140112274310912, 57947742289108924, 57695635703832508}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
>         not_first_call = <optimized out>
>         pagesize_m1 = <optimized out>
>         sp = <optimized out>
>         freesize = <optimized out>
>         __PRETTY_FUNCTION__ = "start_thread"
> #11 0x00007f6efc26862d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> No locals.
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list