[asterisk-bugs] [JIRA] (ASTERISK-28906) ari: Race condition when destroying holding bridge multiple times with channels in it

Thu May 21 06:13:25 CDT 2020

     [ https://issues.asterisk.org/jira/browse/ASTERISK-28906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joshua C. Colp updated ASTERISK-28906:
--------------------------------------

    Assignee: Juris Breicis  (was: Unassigned)
      Status: Waiting for Feedback  (was: Triage)

An Asterisk log with debug enabled would be best, but if that is not available then a pcap would at least give some insight into what is going on. Essentially it has to be determined what in your usage patterns is causing the issue, as this hasn't been experienced by others.

> ari: Race condition when destroying holding bridge multiple times with channels in it
> -------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-28906
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28906
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/Bridging/bridge_roles, Resources/res_stasis
>    Affects Versions: 16.6.1
>         Environment: Ubuntu 18.04.3 LTS in KVM. Asterisk built from source.
>            Reporter: Juris Breicis
>            Assignee: Juris Breicis
>            Severity: Minor
>         Attachments: core-locks.txt, core-locks.txt, core-thread1.txt, core-thread1.txt, sanitized.core-brief.txt, sanitized.core-brief.txt, sanitized.core-full.txt, sanitized.core-full.txt
>
>
> Bit of background: I am on asterisk 16.6.2 and the box itself is handling ~  500 simultaneous calls.
> The whole call management part is fully handled by Stasis app through ARI.
> With a recent bug with our own Stasis software (multiple consecutive bridge destroys in a row) we have spotted an asterisk crash on rare occasions:
> #0 __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #1 0x00007f7e96b77801 in __GI_abort () at abort.c:79
> #2 0x00007f7e96bc0897 in __libc_message (action=action at entry=do_abort, fmt=fmt at entry=0x7f7e96cedb9a "%s\n")
> at ../sysdeps/posix/libc_fatal.c:181
> #3 0x00007f7e96bc790a in malloc_printerr (str=str at entry=0x7f7e96cef828 "double free or corruption (fasttop)")
> at malloc.c:5350
> #4 0x00007f7e96bcf004 in _int_free (have_lock=0, p=0x7f7e7c01d670, av=0x7f7e7c000020) at malloc.c:4230
> #5 __GI___libc_free (mem=0x7f7e7c01d680) at malloc.c:3124
> #6 0x000055997dfc833d in __ast_free (ptr=<optimized out>, file=file at entry=0x55997e1afe4f "bridge_roles.c",
> lineno=lineno at entry=78, func=func at entry=0x55997e1b0150 <__PRETTY_FUNCTION__.14665> "bridge_role_destroy")
> at astmm.c:1577
> #7 0x000055997dff8779 in bridge_role_destroy (role=<optimized out>) at bridge_roles.c:78
> #8 ast_channel_clear_bridge_roles (chan=<optimized out>) at bridge_roles.c:374
> #9 0x00007f7e35d9d71a in bridge_stasis_pull (self=0x7f7dc8001020, bridge_channel=0x7f7e7012adf0)
> at stasis/stasis_bridge.c:292
> #10 0x000055997dff5239 in bridge_channel_internal_pull (bridge_channel=bridge_channel at entry=0x7f7e7012adf0)
> at bridge_channel.c:2170
> #11 0x000055997dff67da in bridge_channel_internal_join (bridge_channel=bridge_channel at entry=0x7f7e7012adf0)
> at bridge_channel.c:2921
> #12 0x000055997dfdc045 in bridge_channel_depart_thread (data=data at entry=0x7f7e7012adf0) at bridge.c:1787
> #13 0x000055997e122f85 in dummy_start (data=<optimized out>) at utils.c:1249
> #14 0x00007f7e9771f6db in start_thread (arg=0x7f7d6f6b8700) at pthread_create.c:463
> #15 0x00007f7e96c5888f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> I suspect it could be a race condition where there are multiple attempts to free same memory segment (after it has already been freed).
> We will apply fix to our Stasis app ourselves to remedy the issue for our case. However - I suspect that this is not expected behaviour - is there anything else, you would need from me, to be able to fix this for other users?

--
This message was sent by Atlassian JIRA
(v6.2#6252)