[asterisk-bugs] [JIRA] (ASTERISK-27862) ARI: Crashing on json_mem_free (after sending event)
Stefan Repke (JIRA)
noreply at issues.asterisk.org
Tue May 22 09:00:56 CDT 2018
[ https://issues.asterisk.org/jira/browse/ASTERISK-27862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stefan Repke updated ASTERISK-27862:
------------------------------------
Status: Waiting for Feedback (was: Waiting for Feedback)
It's memory corruption for sure! I did some extensive testing with different memory-debugging options:
1. With valgrind, I wasn't able to reproduce the crash!
2. With {{MALLOC_DEBUG}} (and {{memory backtrace off}}), it crashed again (after a few seconds of stress testing):
{noformat:title=mmlog}
WARNING: Memory corrupted after free of 0x7f92c0020480 allocated at json.c ast_json_malloc() line 155
WARNING: Memory corrupted after free of 0x7f92ec0066c0 allocated at json.c ast_json_malloc() line 155
WARNING: Memory corrupted after free of 0x7f92c00381f0 allocated at json.c ast_json_malloc() line 155
WARNING: Freeing unregistered memory 0x7f92ac041cf0 by json.c json_mem_free() line 84
{noformat}
{noformat:title=core-thread1.txt}
!@!@!@! thread1.txt !@!@!@!
$1 = {si_signo = 11, si_errno = 0, si_code = 128, _sifields = {_pad = {0 <repeats 28 times>}, _kill = {si_pid = 0, si_uid = 0}, _timer = {si_tid = 0, si_overrun = 0, si_sigval = {sival_int = 0, sival_ptr = 0x0}}, _rt = {si_pid = 0, si_uid = 0, si_sigval = {sival_int = 0, sival_ptr = 0x0}}, _sigchld = {si_pid = 0, si_uid = 0, si_status = 0, si_utime = 0, si_stime = 0}, _sigfault = {si_addr = 0x0}, _sigpoll = {si_band = 0, si_fd = 0}}}
Signal Stop Print Pass to program Description
SIGSEGV Yes Yes Yes Segmentation fault
Thread 1 (Thread 0x7f92b3d17700 (LWP 9121)):
#0 0x0000000000536992 in ast_json_unref (json=0x7f92ac04adc8) at json.c:253
__cur = 0xdeaddeaddeaddead
free_list = 0x7f92ac010150
mem = 0x7f92ac041cf0
#1 0x00007f92f37cc6c5 in sub_default_handler (data=0x7f935000cf70, sub=0x7f935000f650, message=0x7f92ac045d80) at stasis/app.c:357
app = 0x7f935000cf70
json = 0x7f92ac04adc8
#2 0x00000000005deb0e in router_dispatch (data=0x7f935000f510, sub=0x7f935000f650, message=0x7f92ac045d80) at stasis_message_router.c:201
router = 0x7f935000f510
route = {message_type = 0x0, callback = 0x7f92f37cc616 <sub_default_handler>, data = 0x7f935000cf70}
#3 0x00000000005cd840 in subscription_invoke (sub=0x7f935000f650, message=0x7f92ac045d80) at stasis.c:434
__PRETTY_FUNCTION__ = "subscription_invoke"
#4 0x00000000005ce319 in dispatch_exec_async (local=0x7f92b3d16da0) at stasis.c:733
sub = 0x7f935000f650
message = 0x7f92ac045d80
#5 0x00000000005eb2e9 in ast_taskprocessor_execute (tps=0x7f935000f7c0) at taskprocessor.c:970
local = {local_data = 0x7f935000f650, data = 0x7f92ac045d80}
t = 0x7f92b80131e0
size = 0
__PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#6 0x00000000005e957c in default_tps_processing_function (data=0x7f9350002100) at taskprocessor.c:191
listener = 0x7f9350002100
tps = 0x7f935000f7c0
pvt = 0x7f9350001fe0
sem_value = 6701170
res = 0
__PRETTY_FUNCTION__ = "default_tps_processing_function"
#7 0x0000000000600b19 in dummy_start (data=0x7f935000fb20) at utils.c:1239
__cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, -1581999541431752663, 0, 140270822379616, 0, 140268058801920, -1581999541406586839, 1526435149223897129}, __mask_was_saved = 0}}, __pad = {0x7f92b3d16ef0, 0x0, 0x0, 0x0}}
__cancel_routine = 0x451d2a <ast_unregister_thread>
__cancel_arg = 0x7f92b3d17700
__not_first_call = 0
ret = 0x0
a = {start_routine = 0x5e94e6 <default_tps_processing_function>, data = 0x7f9350002100, name = 0x7f935000fbe0 "default_tps_processing_function started at [ 208] taskprocessor.c default_listener_start()"}
__PRETTY_FUNCTION__ = "dummy_start"
#8 0x00007f9356986064 in start_thread (arg=0x7f92b3d17700) at pthread_create.c:309
__res = <optimized out>
pd = 0x7f92b3d17700
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140268058801920, 1526434874704078889, 0, 140270822379616, 0, 140268058801920, -1581999541429655511, -1582925802170013655}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
__PRETTY_FUNCTION__ = "start_thread"
#9 0x00007f935592962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.
{noformat}
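Note the {{__cur = 0xdeaddeaddeaddead}} in frame #0: that looks like the fill pattern the debug allocator writes into freed memory, so {{ast_json_unref}} is apparently walking a node that has already been freed. For context on how the mmlog warnings above can be produced at all: a debug allocator can quarantine freed blocks instead of returning them to the system right away, fill them with a known byte pattern, and re-check the pattern later; if it changed, something wrote into already-freed memory. The following is only a minimal standalone sketch of that general technique (the names and the fill byte are made up for illustration; this is not Asterisk's astmm code):
{noformat}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FREED_FILL 0xdf          /* illustrative fill byte for quarantined blocks */
#define QUARANTINE_SLOTS 64

static struct { void *ptr; size_t size; } quarantine[QUARANTINE_SLOTS];
static size_t q_next;

/* "Free" a block: fill it with the pattern and park it instead of releasing it. */
static void dbg_free(void *ptr, size_t size)
{
    memset(ptr, FREED_FILL, size);
    size_t slot = q_next++ % QUARANTINE_SLOTS;
    free(quarantine[slot].ptr);  /* really release the oldest quarantined block */
    quarantine[slot].ptr = ptr;
    quarantine[slot].size = size;
}

/* Verify that nobody wrote into quarantined (i.e. already freed) memory. */
static void dbg_check_freed(void)
{
    for (size_t i = 0; i < QUARANTINE_SLOTS; i++) {
        unsigned char *p = quarantine[i].ptr;
        for (size_t j = 0; p && j < quarantine[i].size; j++) {
            if (p[j] != FREED_FILL) {
                fprintf(stderr, "WARNING: Memory corrupted after free of %p\n", (void *)p);
                break;
            }
        }
    }
}

int main(void)
{
    char *p = malloc(32);
    dbg_free(p, 32);
    p[5] = 'X';                  /* write-after-free: corrupts the quarantined block */
    dbg_check_freed();           /* reports the corruption, like the mmlog lines above */
    return 0;
}
{noformat}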
3. With {{MALLOC_DEBUG}} and {{memory backtrace on}}, I wasn't able to reproduce the crash either (even after hours of stress testing)!
But this gives interesting insight into the other memory corruptions "at json.c ast_json_malloc() line 155":
{noformat:title=mmlog}
WARNING: Memory corrupted after free of 0x7f894c088400 allocated at json.c ast_json_malloc() line 155
Memory allocation backtrace:
#0: [0x45cc17] main/astmm.c:523 __ast_malloc() (0x45cbdb+3C)
#1: [0x53674f] main/json.c:155 ast_json_malloc() (0x536724+2B)
#2: [0x7f89a1cd72ee] libjansson.so.4 json_integer() (0x7f89a1cd72e0+E)
#3: [0x7f89a1cd5c76] libjansson.so.4 <unknown>()
#4: [0x7f89a1cd5afb] libjansson.so.4 <unknown>()
#5: [0x7f89a1cd5f5f] libjansson.so.4 json_vpack_ex() (0x7f89a1cd5ed0+8F)
#6: [0x537b6d] main/json.c:859 ast_json_vpack() (0x537b1c+51)
#7: [0x537b0c] main/json.c:850 ast_json_pack() (0x537a65+A7)
#8: [0x7f890b49ebb7] channels/chan_iax2.c:9082 update_registry()
#9: [0x7f890b4ae9ed] channels/chan_iax2.c:11555 socket_process_helper()
#10: [0x7f890b4b26e0] channels/chan_iax2.c:11974 socket_process()
#11: [0x7f890b4b2dba] channels/chan_iax2.c:12086 iax2_process_thread()
#12: [0x600b19] main/utils.c:1239 dummy_start()
{noformat}
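For what it's worth, the producer side of that allocation is easy to picture: chan_iax2 builds a small JSON blob via {{ast_json_pack()}} (which allocates through {{ast_json_malloc()}}), hands it off, and every consumer is expected to hold its own reference with {{ast_json_ref()}} and drop it with {{ast_json_unref()}}. Roughly like this (purely illustrative; the keys/values are invented and this is not the actual update_registry() payload):
{noformat}
#include "asterisk.h"
#include "asterisk/json.h"

/* Illustrative only: build a small JSON payload the way a channel driver
 * would via ast_json_pack(), then drop the producer's reference once the
 * object has been handed off.  Every consumer that keeps the object is
 * expected to take its own reference with ast_json_ref() and release it
 * with ast_json_unref(). */
static void publish_example(void)
{
    struct ast_json *blob = ast_json_pack("{s: s, s: i}",
        "peer_status", "Registered",
        "port", 4569);
    if (!blob) {
        return;
    }

    /* ... publish blob (e.g. to Stasis) here; consumers ref it as needed ... */

    ast_json_unref(blob);   /* drop the producer's reference */
}
{noformat}
So if one consumer drops a reference it doesn't own, or two threads manipulate the count of the same object without protection, a perfectly normal allocation like the one in the backtrace above is exactly what ends up corrupted.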
4. Furthermore, I've observed several occurrences of the following line during stress testing:
{noformat}
[2018-05-22 13:30:17] ERROR[21499]: ari/ari_websockets.c:175 ast_ari_websocket_session_write: Failed to encode JSON object
{noformat}
All this looks like a fatal race condition on the JSON objects ...
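If that suspicion is right, the failure mode would be the classic ref/unref race on a non-atomic reference count: one thread takes or drops a reference while another thread concurrently drops the last one, the object is freed underneath the first thread, which then writes into freed memory and later frees it a second time. That would match both the {{Memory corrupted after free}} warnings and the {{double free or corruption (fasttop)}} aborts from the other backtraces. Below is only a minimal standalone illustration of that pattern (none of this is Asterisk or jansson code, and whether it actually crashes is timing-dependent):
{noformat}
#include <pthread.h>
#include <stdlib.h>

/* Illustrative object with a NON-atomic reference count (plain int, no lock). */
struct obj {
    int refcount;
    char *payload;
};

static struct obj *obj_ref(struct obj *o)
{
    o->refcount++;                  /* not atomic */
    return o;
}

static void obj_unref(struct obj *o)
{
    if (--o->refcount == 0) {       /* not atomic: the race is right here */
        free(o->payload);
        free(o);
    }
}

/* Thread A: hands the object to one more consumer, then drops that reference.
 * If thread B frees the object in between, this writes into freed memory and
 * frees it a second time. */
static void *take_extra_ref(void *arg)
{
    struct obj *o = obj_ref(arg);
    obj_unref(o);
    return NULL;
}

/* Thread B: drops the original (possibly last) reference. */
static void *drop_ref(void *arg)
{
    obj_unref(arg);
    return NULL;
}

int main(void)
{
    struct obj *o = calloc(1, sizeof(*o));
    o->payload = malloc(64);
    o->refcount = 1;                /* only one reference is actually held */

    pthread_t a, b;
    pthread_create(&a, NULL, take_extra_ref, o);
    pthread_create(&b, NULL, drop_ref, o);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}
{noformat}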
> ARI: Crashing on json_mem_free (after sending event)
> ----------------------------------------------------
>
> Key: ASTERISK-27862
> URL: https://issues.asterisk.org/jira/browse/ASTERISK-27862
> Project: Asterisk
> Issue Type: Bug
> Security Level: None
> Components: Applications/app_stasis
> Affects Versions: 13.21.0
> Reporter: Stefan Repke
> Assignee: Stefan Repke
> Attachments: backtrace 1.txt, backtrace 2.txt, backtrace 3.txt, cli.txt
>
>
> ARI is activated and there are several apps connected with "subscribeAll=true".
> Then, Asterisk crashes occasionally when delivering ARI events!
> With normal workload, this happens once or twice a day. It can be provoked by adding many (eg. 10) ARI subscribers and producing lots of events (eg. PeerStatusChange by registering/unregistering a peer).
> I coredumped three crashes, all having a memory corruption in {{json_mem_free}} at {{json.c:84}}; cf. attached backtraces.
> * Backtrace 2 and 3 have similar problems: {{double free or corruption (fasttop)}},
> * Backtrace 1 looks like wrong memory access.
> I've also attached the last logger messages from the CLI (debug=5 / verbose=5).
--
This message was sent by Atlassian JIRA
(v6.2#6252)