[asterisk-bugs] [JIRA] (ASTERISK-28469) AMI frozen on high load

Freetech Solutions (JIRA) noreply at issues.asterisk.org
Tue Jul 7 15:43:25 CDT 2020


    [ https://issues.asterisk.org/jira/browse/ASTERISK-28469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=251375#comment-251375 ] 

Freetech Solutions commented on ASTERISK-28469:
-----------------------------------------------

Hi Gents,

I was also able to reproduce the issue in Asterisk 16.9 version, under CentOS7. 

Asterisk 16.9.0 built by root @ build-example.com on a x86_64 running Linux on 2020-06-09 19:08:29 UTC
CentOS Linux release 7.8.2003 (Core)
4 cores-CPU
8 GB RAM

Scenario: Pure Asterisk, 1 queue configured to receive incoming calls, 1 extension registered in the queue call. Sipp program was used to test the scenario for 30 calls using a CPS = 2 (call per second).

1) Login/Register the extension in queue.
2) Pause extension over AMI (pause successfully done).
3) Issue command: asterisk -rx 'manager show eventq' (just an event coming from my ami user):

Usecount: 0
Category: 262144
Event:
Event: SuccessfulAuth
Privilege: security,all
EventTV: 2020-07-07T17:32:37.211-0300
Severity: Informational
Service: AMI
EventVersion: 1
AccountID: labuserami
SessionID: 0x7f2b780021a0
LocalAddress: IPV4/TCP/0.0.0.0/5038
RemoteAddress: IPV4/TCP/127.0.0.1/48266
UsingPassword: 0
SessionTV: 2020-07-07T17:32:37.211-0300

3) Issue sipp command as follow: sipp -sn uac 172.16.20.19:5159 -s 445566 -r 2 -rp 1000 -d 10000 (5159 is my SIP udp testing port, 172.16.20.19 is our asterisk 16.9, 445566 is the DID destination, "r" and "rp" makes 2 calls per second, "d" is duration of the call in milliseconds).
4) After reaching 40 calls waiting on queue (at 2 CPS rate), issue again: asterisk -rx 'manager show eventq' (dozens of megabytes of queued events as described in the very beginning of this track). 

5) Issue command: asterisk -rx 'core show taskprocessors'. Queued events can be seen below:

stasis/m:manager:core-00000006                                             161644      57334      57336       2700       3000

6) Any AMI command attempt to port 5038 (or http manager) gives timeout.

Note: with CPS=1 issue is not observed.

> AMI frozen on high load
> -----------------------
>
>                 Key: ASTERISK-28469
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28469
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/ManagerInterface
>    Affects Versions: 16.4.0
>         Environment: Linux deacsrtsk12 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux
>            Reporter: Stefan Repke
>            Assignee: Stefan Repke
>         Attachments: AMIfrozen.txt, case2.zip, core-asterisk-running-2019-07-01T16-15-22+0200-brief.txt, core-asterisk-running-2019-07-01T16-15-22+0200-full.txt, core-asterisk-running-2019-07-01T16-15-22+0200-locks.txt, core-asterisk-running-2019-07-01T16-15-22+0200-thread1.txt, core show locks, core show threads, debug.deacsrtsk12
>
>
> The AMI queue gets stuck when running Asterisk 16.0.4 with ~30 calls per minute: 
> After a few hundred calls have been processed, no further manager events are distrubuted and no new manager connections are accepted.
> The problem is reproducable and we've created a dump. I think the reason is that thread 0x7f6e6e56c700 is blocked in {{send_string}} of {{manager.c:3033}} (while holding/blocking lock 0x7f6ee0003260).
> Our version of libc6 is 2.19-18+deb8u10.
> The CLI command {{manager show eventq}} produces dozens of megabytes of output consisting of the queued events.
> {noformat:title=core show locks}
> =======================================================================
> === 16.4.0
> === Currently Held Locks
> =======================================================================
> ===
> === <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
> ===
> === Thread ID: 0x7f6e6ffff700 LWP:11439 (desc->accept_fn      started at [  770] tcptls.c ast_tcptls_server_start())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 6896 purge_sessions session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6e6e56c700 LWP:11491 (handle_tcptls_connection started at [  309] tcptls.c ast_tcptls_server_root())
> === ---> Lock #0 (manager.c): MUTEX 6104 process_events s->session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5fff11] asterisk manager.c:6105 process_events()
> 	[0x604be8] asterisk manager.c:6714 do_message()
> 	[inlined] asterisk manager.c:6853 session_do()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6e6dc38700 LWP:23563 (netconsole           started at [ 1529] asterisk.c listener())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 2673 handle_showmanconn session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6e6d380700 LWP:23932 (netconsole           started at [ 1529] asterisk.c listener())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 2673 handle_showmanconn session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6e6d3fc700 LWP:24956 (netconsole           started at [ 1529] asterisk.c listener())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 2673 handle_showmanconn session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> === Thread ID: 0x7f6efb470700 LWP:25271 (netconsole           started at [ 1529] asterisk.c listener())
> === ---> Waiting for Lock #0 (manager.c): MUTEX 2673 handle_showmanconn session 0x7f6ee0003260 (1)
> 	[0x50277a] asterisk lock.h:446 ast_reentrancy_lock()
> 	[inlined] asterisk lock.c:272 __ast_pthread_mutex_lock()
> 	[0x45bd6d] asterisk astobj2.c:243 __ao2_lock()
> 	[0x5faaca] asterisk manager.c:2678 handle_showmanconn()
> 	[0x4bf2e5] asterisk cli.c:2834 ast_cli_command_full()
> 	[0x4bf454] asterisk cli.c:2861 ast_cli_command_multiple_full()
> 	[0x45865f] asterisk asterisk.c:1420 netconsole()
> 	[0x5a40c5] asterisk utils.c:1251 dummy_start()
> 	[0x7f6efd2c5064] libpthread.so.0 pthread_create.c:309 start_thread()
> 	[0x7f6efc26862d] libc.so.6 clone.S:113 clone()
> === --- ---> Locked Here: manager.c line 6104 (process_events)
> === -------------------------------------------------------------------
> ===
> =======================================================================
> {noformat}
> {noformat:title=brief.txt}
> ...
> Thread 52 (Thread 0x7f6e6e56c700 (LWP 11491)):
> #0  0x00007f6efc25faed in poll () at ../sysdeps/unix/syscall-template.S:81
> #1  0x00000000005a6b78 in poll (__timeout=<optimized out>, __nfds=1, __fds=0x7f6e6e56b2d0) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
> #2  ast_wait_for_output (fd=<optimized out>, ms=<optimized out>) at utils.c:1357
> #3  0x00000000004f6f89 in ast_iostream_write (stream=0x7f6e680041e0, buffer=0x7f6e7018e4f8, size=5000, size at entry=487) at iostream.c:473
> #4  0x00000000004f747d in ast_iostream_write (stream=stream at entry=0x7f6e680041e0, buffer=buffer at entry=0x7f6e7018e4f8, size=size at entry=487) at iostream.c:475
> #5  0x0000000000600121 in send_string (string=0x7f6e7018e4f8 "Event: NewextenrnPrivilege: dialplan,allrnTimestamp: 1561990424.176441rnSystemName: deacsrtsk12rnChannel: SIP/999253-000002c8rnChannelState: 4rnChannelStateDesc: RingrnCallerIDNum: 999253rnCallerIDNam"..., s=0x7f6e6e56b4d0) at manager.c:3033
> #6  process_events (s=0x7f6e6e56b4d0) at manager.c:6117
> #7  0x0000000000604be8 in do_message (s=<optimized out>) at manager.c:6714
> #8  session_do (data=0x7f6e6e56b2d0) at manager.c:6853
> #9  0x00000000005a40c5 in dummy_start (data=0x7f6e68003dc0) at utils.c:1249
> #10 0x00007f6efd2c5064 in start_thread (arg=0x7f6e6e56c700) at pthread_create.c:309
> #11 0x00007f6efc26862d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> ...
> {noformat}
> {noformat:title=full.txt}
> ...
> Thread 52 (Thread 0x7f6e6e56c700 (LWP 11491)):
> #0  0x00007f6efc25faed in poll () at ../sysdeps/unix/syscall-template.S:81
> No locals.
> #1  0x00000000005a6b78 in poll (__timeout=<optimized out>, __nfds=1, __fds=0x7f6e6e56b2d0) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
> No locals.
> #2  ast_wait_for_output (fd=<optimized out>, ms=<optimized out>) at utils.c:1357
>         pfd = {{fd = 22, events = 4, revents = 0}}
> #3  0x00000000004f6f89 in ast_iostream_write (stream=0x7f6e680041e0, buffer=0x7f6e7018e4f8, size=5000, size at entry=487) at iostream.c:473
> No locals.
> #4  0x00000000004f747d in ast_iostream_write (stream=stream at entry=0x7f6e680041e0, buffer=buffer at entry=0x7f6e7018e4f8, size=size at entry=487) at iostream.c:475
> No locals.
> #5  0x0000000000600121 in send_string (string=0x7f6e7018e4f8 "Event: NewextenrnPrivilege: dialplan,allrnTimestamp: 1561990424.176441rnSystemName: deacsrtsk12rnChannel: SIP/999253-000002c8rnChannelState: 4rnChannelStateDesc: RingrnCallerIDNum: 999253rnCallerIDNam"..., s=0x7f6e6e56b4d0) at manager.c:3033
>         stream = 0x7f6e680041e0
>         len = 487
>         res = <optimized out>
>         string = 0x7f6e7018e4f8 "Event: NewextenrnPrivilege: dialplan,allrnTimestamp: 1561990424.176441rnSystemName: deacsrtsk12rnChannel: SIP/999253-000002c8rnChannelState: 4rnChannelStateDesc: RingrnCallerIDNum: 999253rnCallerIDNam"...
>         s = 0x7f6e6e56b4d0
> #6  process_events (s=0x7f6e6e56b4d0) at manager.c:6117
>         eqe = <optimized out>
>         ret = <optimized out>
>         __PRETTY_FUNCTION__ = "process_events"
> #7  0x0000000000604be8 in do_message (s=<optimized out>) at manager.c:6714
>         m = {hdrcount = 6, headers = {0x7f6ee0004620 "action: Monitor", 0x7f6ee04085d0 "actionid: 537#355425276_673410#", 0x7f6ee0008fb0 "file: /tmp/U23TJQ5V23X4MZQB", 0x7f6ee0009b70 "format: wav", 0x7f6ee0009b20 "channel: IAX2/999005-3957", 0x7f6ee0008fe0 "mix: true", 0x0 <repeats 122 times>}}
>         header_buf = "mix: true000AX2/999005-3957000B000061060#", '000' <repeats 993 times>
>         now = 1561988218
> #8  session_do (data=0x7f6e6e56b2d0) at manager.c:6853
>         s = {session = 0x7f6ee00032b8, stream = 0x7f6e680041e0, tcptls_session = 0x7f6e68004068, parsing = MESSAGE_OKAY, write_error = -1, hook = 0x0, lock = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '000' <repeats 16 times>, "001", '000' <repeats 22 times>, __align = 0}, track = 0x7f6ee0005ee0, flags = {tracking = 1, setup = 1}}}
>         res = 5
>         arg = 1
>         ser_remote_address_tmp = {ss = {ss_family = 10, __ss_align = 0, __ss_padding = "000000377377177000000001", '000' <repeats 48 times>, "|254367373n177000000000000000000000000000000370377377377377377377377220324367373n177000000340266Vnn177000000220I036377n177000000000000000000000000000"}, len = 28}
>         __PRETTY_FUNCTION__ = "session_do"
> #9  0x00000000005a40c5 in dummy_start (data=0x7f6e68003dc0) at utils.c:1249
>         __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {140114181235520, 57934947619282876, 140112167979552, 140114047020320, 21, 140112274310912, 57947742297497532, -139326592165109828}, __mask_was_saved = 0}}, __pad = {0x7f6e6e56bef0, 0x0, 0x7f6efc521878 <__elf_set___libc_thread_subfreeres_element___rpc_thread_destroy__>, 0x7f6efc521880 <internal_trans_names>}}
>         __cancel_arg = 0x7f6e6e56c700
>         __not_first_call = <optimized out>
>         ret = <optimized out>
>         a = {start_routine = 0x5955b0 <handle_tcptls_connection>, data = 0x7f6e68004068, name = 0x7f6e68004a20 "handle_tcptls_connection started at [  309] tcptls.c ast_tcptls_server_root()"}
>         mutex_attr = {__size = "001000000", __align = 1}
>         __PRETTY_FUNCTION__ = "dummy_start"
> #10 0x00007f6efd2c5064 in start_thread (arg=0x7f6e6e56c700) at pthread_create.c:309
>         __res = <optimized out>
>         pd = 0x7f6e6e56c700
>         now = <optimized out>
>         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140112274310912, -139326233515286596, 0, 140114047020320, 21, 140112274310912, 57947742289108924, 57695635703832508}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
>         not_first_call = <optimized out>
>         pagesize_m1 = <optimized out>
>         sp = <optimized out>
>         freesize = <optimized out>
>         __PRETTY_FUNCTION__ = "start_thread"
> #11 0x00007f6efc26862d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> No locals.
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list