[asterisk-bugs] [JIRA] (ASTERISK-28906) ari: Race condition when destroying holding bridge multiple times with channels in it

Juris Breicis (JIRA) noreply at issues.asterisk.org
Thu Jun 4 09:54:25 CDT 2020


    [ https://issues.asterisk.org/jira/browse/ASTERISK-28906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=251022#comment-251022 ] 

Juris Breicis commented on ASTERISK-28906:
------------------------------------------

We just got the same crash on 17.4.0, compiled with the DONT_OPTIMIZE flag enabled; I will attach the core dump.
I also have the full log with ARI commands, but our sanitization script is still removing phone numbers and IP addresses from it, so I'll attach it later.
It is quite large: about 500 MB for the last 15 minutes before the crash. Should I gzip it before uploading, or are files that large fine?

> ari: Race condition when destroying holding bridge multiple times with channels in it
> -------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-28906
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28906
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/Bridging/bridge_roles, Resources/res_stasis
>    Affects Versions: 16.6.1
>         Environment: Ubuntu 18.04.3 LTS in KVM. Asterisk built from source.
>            Reporter: Juris Breicis
>            Assignee: Juris Breicis
>            Severity: Minor
>         Attachments: core-locks.txt, core-locks.txt, core-thread1.txt, core-thread1.txt, sanitized.core-brief.txt, sanitized.core-brief.txt, sanitized.core-full.txt, sanitized.core-full.txt
>
>
> Bit of background: I am on Asterisk 16.6.2 and the box itself is handling ~500 simultaneous calls.
> The whole call management part is handled entirely by a Stasis app through ARI.
> Because of a recent bug in our own Stasis software (multiple consecutive bridge destroys in a row), we have seen Asterisk crash on rare occasions:
> #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #1 0x00007f7e96b77801 in __GI_abort () at abort.c:79
> #2 0x00007f7e96bc0897 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f7e96cedb9a "%s\n")
>     at ../sysdeps/posix/libc_fatal.c:181
> #3 0x00007f7e96bc790a in malloc_printerr (str=str@entry=0x7f7e96cef828 "double free or corruption (fasttop)")
>     at malloc.c:5350
> #4 0x00007f7e96bcf004 in _int_free (have_lock=0, p=0x7f7e7c01d670, av=0x7f7e7c000020) at malloc.c:4230
> #5 __GI___libc_free (mem=0x7f7e7c01d680) at malloc.c:3124
> #6 0x000055997dfc833d in __ast_free (ptr=<optimized out>, file=file@entry=0x55997e1afe4f "bridge_roles.c",
>     lineno=lineno@entry=78, func=func@entry=0x55997e1b0150 <__PRETTY_FUNCTION__.14665> "bridge_role_destroy")
>     at astmm.c:1577
> #7 0x000055997dff8779 in bridge_role_destroy (role=<optimized out>) at bridge_roles.c:78
> #8 ast_channel_clear_bridge_roles (chan=<optimized out>) at bridge_roles.c:374
> #9 0x00007f7e35d9d71a in bridge_stasis_pull (self=0x7f7dc8001020, bridge_channel=0x7f7e7012adf0)
>     at stasis/stasis_bridge.c:292
> #10 0x000055997dff5239 in bridge_channel_internal_pull (bridge_channel=bridge_channel@entry=0x7f7e7012adf0)
>     at bridge_channel.c:2170
> #11 0x000055997dff67da in bridge_channel_internal_join (bridge_channel=bridge_channel@entry=0x7f7e7012adf0)
>     at bridge_channel.c:2921
> #12 0x000055997dfdc045 in bridge_channel_depart_thread (data=data@entry=0x7f7e7012adf0) at bridge.c:1787
> #13 0x000055997e122f85 in dummy_start (data=<optimized out>) at utils.c:1249
> #14 0x00007f7e9771f6db in start_thread (arg=0x7f7d6f6b8700) at pthread_create.c:463
> #15 0x00007f7e96c5888f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> I suspect it could be a race condition in which the same memory segment is freed more than once (after it has already been freed).
> We will fix our Stasis app ourselves to remedy the issue in our case. However, I suspect this is not expected behaviour. Is there anything else you would need from me to be able to fix this for other users?
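
The client-side workaround mentioned above (not issuing repeated destroys for the same bridge) might look roughly like the sketch below. It is only an illustration under assumptions: the reporter's actual Stasis app and its language are not shown in this thread, the class name BridgeDestroyGuard and the injected destroy callable are hypothetical, and the callable is assumed to wrap whatever ARI request the app uses to destroy a bridge. It does not address the underlying Asterisk crash.

```python
import threading
from typing import Callable, Set


class BridgeDestroyGuard:
    """Hypothetical helper: let each bridge ID be destroyed at most once.

    `destroy` is an assumed callable supplied by the application, for
    example a function that sends the ARI bridge-destroy request.
    """

    def __init__(self, destroy: Callable[[str], None]) -> None:
        self._destroy = destroy
        self._lock = threading.Lock()
        self._claimed: Set[str] = set()

    def destroy_once(self, bridge_id: str) -> bool:
        """Return True if this call issued the destroy, False if skipped."""
        # Claim the bridge ID under the lock so concurrent callers cannot
        # both decide to send the destroy request.
        with self._lock:
            if bridge_id in self._claimed:
                return False
            self._claimed.add(bridge_id)
        try:
            # Perform the (possibly slow) request outside the lock.
            self._destroy(bridge_id)
        except Exception:
            # Release the claim so a later retry is still possible.
            with self._lock:
                self._claimed.discard(bridge_id)
            raise
        return True
```

With such a guard, a second destroy_once("some-bridge-id") call for the same ID returns False without contacting Asterisk, so consecutive destroys from the application collapse into a single request.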



--
This message was sent by Atlassian JIRA
(v6.2#6252)


