[asterisk-bugs] [JIRA] (ASTERISK-28906) ari: Race condition when destroying holding bridge multiple times with channels in it

Sean Bright (JIRA) noreply at issues.asterisk.org
Fri Dec 11 15:46:16 CST 2020


    [ https://issues.asterisk.org/jira/browse/ASTERISK-28906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=250882#comment-250882 ] 

Sean Bright edited comment on ASTERISK-28906 at 12/11/20 3:45 PM:
------------------------------------------------------------------

Another crash log; this time the backtrace pointed to a different fault:

{noformat}
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f598d874801 in __GI_abort () at abort.c:79
#2  0x00007f598d8bd897 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f598d9eab9a "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007f598d8c490a in malloc_printerr (str=str@entry=0x7f598d9ec3f0 "malloc_consolidate(): invalid chunk size") at malloc.c:5350
#4  0x00007f598d8c4bae in malloc_consolidate (av=av@entry=0x7f5890000020) at malloc.c:4441
#5  0x00007f598d8c87d8 in _int_malloc (av=av@entry=0x7f5890000020, bytes=bytes@entry=8272) at malloc.c:3703
#6  0x00007f598d8ce0b1 in __libc_calloc (n=n@entry=1, elem_size=elem_size@entry=8272) at malloc.c:3436
#7  0x000055e9afb3f421 in __ast_repl_calloc (func=0x55e9afd647d0 <__PRETTY_FUNCTION__.17546> "playtones_alloc", lineno=133, file=0x55e9afd63dee "indications.c", size=8272, nmemb=1) at astmm.c:1558
#8  __ast_calloc (nmemb=nmemb@entry=1, size=size@entry=8272, file=file@entry=0x55e9afd63dee "indications.c", lineno=lineno@entry=133, func=func@entry=0x55e9afd647d0 <__PRETTY_FUNCTION__.17546> "playtones_alloc") at astmm.c:1628
#9  0x000055e9afce79f2 in playtones_alloc (chan=0x7f58ec7c7e10, params=0x7f58481c6b90) at indications.c:133
#10 0x000055e9afb9044f in ast_activate_generator (chan=chan@entry=0x7f58ec7c7e10, gen=gen@entry=0x55e9affd8c20 <playtones>, params=params@entry=0x7f58481c6b90) at channel.c:2940
#11 0x000055e9afce80a1 in ast_playtones_start (chan=chan@entry=0x7f58ec7c7e10, vol=vol@entry=0, playlst=<optimized out>, interruptible=interruptible@entry=1) at indications.c:385
#12 0x000055e9afb8e9d9 in indicate_data_internal (chan=chan@entry=0x7f58ec7c7e10, _condition=_condition@entry=3, data=data@entry=0x0, datalen=datalen@entry=0) at channel.c:4558
#13 0x000055e9afb90ccd in ast_indicate_data (chan=chan@entry=0x7f58ec7c7e10, _condition=3, data=0x0, datalen=0) at channel.c:4617
#14 0x000055e9afb6e09b in bridge_channel_handle_control (fr=0x7f58ec356710, bridge_channel=0x7f59740f6070) at bridge_channel.c:2348
#15 bridge_channel_handle_write (bridge_channel=0x7f59740f6070) at bridge_channel.c:2420
#16 bridge_channel_wait (bridge_channel=0x7f59740f6070) at bridge_channel.c:2771
#17 bridge_channel_internal_join (bridge_channel=bridge_channel@entry=0x7f59740f6070) at bridge_channel.c:2911
#18 0x000055e9afb53045 in bridge_channel_depart_thread (data=data@entry=0x7f59740f6070) at bridge.c:1787
{noformat}



was (Author: jbreicis):
Another crash log; this time the backtrace pointed to a different fault:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f598d874801 in __GI_abort () at abort.c:79
#2  0x00007f598d8bd897 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f598d9eab9a "%s\n")
    at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007f598d8c490a in malloc_printerr (str=str@entry=0x7f598d9ec3f0 "malloc_consolidate(): invalid chunk size")
    at malloc.c:5350
#4  0x00007f598d8c4bae in malloc_consolidate (av=av@entry=0x7f5890000020) at malloc.c:4441
#5  0x00007f598d8c87d8 in _int_malloc (av=av@entry=0x7f5890000020, bytes=bytes@entry=8272) at malloc.c:3703
#6  0x00007f598d8ce0b1 in __libc_calloc (n=n@entry=1, elem_size=elem_size@entry=8272) at malloc.c:3436
#7  0x000055e9afb3f421 in __ast_repl_calloc (func=0x55e9afd647d0 <__PRETTY_FUNCTION__.17546> "playtones_alloc",
    lineno=133, file=0x55e9afd63dee "indications.c", size=8272, nmemb=1) at astmm.c:1558
#8  __ast_calloc (nmemb=nmemb@entry=1, size=size@entry=8272, file=file@entry=0x55e9afd63dee "indications.c",
    lineno=lineno@entry=133, func=func@entry=0x55e9afd647d0 <__PRETTY_FUNCTION__.17546> "playtones_alloc")
    at astmm.c:1628
#9  0x000055e9afce79f2 in playtones_alloc (chan=0x7f58ec7c7e10, params=0x7f58481c6b90) at indications.c:133
#10 0x000055e9afb9044f in ast_activate_generator (chan=chan@entry=0x7f58ec7c7e10,
    gen=gen@entry=0x55e9affd8c20 <playtones>, params=params@entry=0x7f58481c6b90) at channel.c:2940
#11 0x000055e9afce80a1 in ast_playtones_start (chan=chan@entry=0x7f58ec7c7e10, vol=vol@entry=0,
    playlst=<optimized out>, interruptible=interruptible@entry=1) at indications.c:385
#12 0x000055e9afb8e9d9 in indicate_data_internal (chan=chan@entry=0x7f58ec7c7e10, _condition=_condition@entry=3,
    data=data@entry=0x0, datalen=datalen@entry=0) at channel.c:4558
#13 0x000055e9afb90ccd in ast_indicate_data (chan=chan@entry=0x7f58ec7c7e10, _condition=3, data=0x0, datalen=0)
    at channel.c:4617
#14 0x000055e9afb6e09b in bridge_channel_handle_control (fr=0x7f58ec356710, bridge_channel=0x7f59740f6070)
    at bridge_channel.c:2348
#15 bridge_channel_handle_write (bridge_channel=0x7f59740f6070) at bridge_channel.c:2420
#16 bridge_channel_wait (bridge_channel=0x7f59740f6070) at bridge_channel.c:2771
#17 bridge_channel_internal_join (bridge_channel=bridge_channel@entry=0x7f59740f6070) at bridge_channel.c:2911
#18 0x000055e9afb53045 in bridge_channel_depart_thread (data=data@entry=0x7f59740f6070) at bridge.c:1787

> ari: Race condition when destroying holding bridge multiple times with channels in it
> -------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-28906
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28906
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/Bridging/bridge_roles, Resources/res_stasis
>    Affects Versions: 16.6.1
>         Environment: Ubuntu 18.04.3 LTS in KVM. Asterisk built from source.
>            Reporter: Juris Breicis
>            Assignee: Unassigned
>            Severity: Minor
>         Attachments: 04062020-core-info.txt, 04062020-core-locks.txt, 04062020-core-thread1.txt, 04062020-sanitized.core-brief.txt, 04062020-sanitized.core-full.txt, core-locks.txt, core-locks.txt, core-thread1.txt, core-thread1.txt, sanitized.core-brief.txt, sanitized.core-brief.txt, sanitized.core-full.txt, sanitized.core-full.txt
>
>
> Bit of background: I am on Asterisk 16.6.2 and the box itself is handling ~500 simultaneous calls.
> The whole call management part is handled by a Stasis app through ARI.
> Due to a recent bug in our own Stasis software (multiple bridge destroys in a row), we have seen Asterisk crash on rare occasions:
> #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #1 0x00007f7e96b77801 in __GI_abort () at abort.c:79
> #2 0x00007f7e96bc0897 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f7e96cedb9a "%s\n")
> at ../sysdeps/posix/libc_fatal.c:181
> #3 0x00007f7e96bc790a in malloc_printerr (str=str@entry=0x7f7e96cef828 "double free or corruption (fasttop)")
> at malloc.c:5350
> #4 0x00007f7e96bcf004 in _int_free (have_lock=0, p=0x7f7e7c01d670, av=0x7f7e7c000020) at malloc.c:4230
> #5 __GI___libc_free (mem=0x7f7e7c01d680) at malloc.c:3124
> #6 0x000055997dfc833d in __ast_free (ptr=<optimized out>, file=file@entry=0x55997e1afe4f "bridge_roles.c",
> lineno=lineno@entry=78, func=func@entry=0x55997e1b0150 <__PRETTY_FUNCTION__.14665> "bridge_role_destroy")
> at astmm.c:1577
> #7 0x000055997dff8779 in bridge_role_destroy (role=<optimized out>) at bridge_roles.c:78
> #8 ast_channel_clear_bridge_roles (chan=<optimized out>) at bridge_roles.c:374
> #9 0x00007f7e35d9d71a in bridge_stasis_pull (self=0x7f7dc8001020, bridge_channel=0x7f7e7012adf0)
> at stasis/stasis_bridge.c:292
> #10 0x000055997dff5239 in bridge_channel_internal_pull (bridge_channel=bridge_channel@entry=0x7f7e7012adf0)
> at bridge_channel.c:2170
> #11 0x000055997dff67da in bridge_channel_internal_join (bridge_channel=bridge_channel@entry=0x7f7e7012adf0)
> at bridge_channel.c:2921
> #12 0x000055997dfdc045 in bridge_channel_depart_thread (data=data@entry=0x7f7e7012adf0) at bridge.c:1787
> #13 0x000055997e122f85 in dummy_start (data=<optimized out>) at utils.c:1249
> #14 0x00007f7e9771f6db in start_thread (arg=0x7f7d6f6b8700) at pthread_create.c:463
> #15 0x00007f7e96c5888f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> I suspect it could be a race condition where there are multiple attempts to free the same memory segment (after it has already been freed).
> We will apply a fix to our Stasis app ourselves to remedy the issue in our case. However, I suspect that this is not expected behaviour. Is there anything else you would need from me to be able to fix this for other users?
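
The client-side workaround described above (not sending the same bridge destroy more than once) might look roughly like the following. This is only a minimal C11 sketch, assuming the Stasis app keeps one record per bridge. `tracked_bridge`, `tracked_bridge_init` and `tracked_bridge_claim_destroy` are hypothetical names, not part of Asterisk or ARI, and the sketch does not address the suspected race inside Asterisk itself.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical client-side record for one ARI bridge (not an Asterisk type). */
struct tracked_bridge {
    const char *id;                /* bridge id as known to the Stasis app */
    atomic_bool destroy_requested; /* true once a destroy has been claimed */
};

/* Prepare a record for a newly created bridge. */
static void tracked_bridge_init(struct tracked_bridge *b, const char *id)
{
    b->id = id;
    atomic_init(&b->destroy_requested, false);
}

/*
 * Atomically mark the bridge as "destroy requested".
 * Returns true only for the first caller, even when several threads
 * call it at the same time; every later caller gets false and should
 * skip sending another destroy request.
 */
static bool tracked_bridge_claim_destroy(struct tracked_bridge *b)
{
    /* atomic_exchange returns the previous value of the flag. */
    return !atomic_exchange(&b->destroy_requested, true);
}
```

The caller would send the bridge destroy request through ARI only when `tracked_bridge_claim_destroy()` returns true, so repeated or concurrent cleanup paths in the app collapse into a single request.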


