[asterisk-bugs] [JIRA] (ASTERISK-28906) Crash in bridge_roles.c:78 while doing memory free for bridge roles

Juris Breicis (JIRA) noreply at issues.asterisk.org
Tue May 19 05:27:25 CDT 2020


    [ https://issues.asterisk.org/jira/browse/ASTERISK-28906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=250858#comment-250858 ] 

Juris Breicis commented on ASTERISK-28906:
------------------------------------------

First off, we have not been able to reproduce this with a limited test (a very simple Stasis service that does exactly what you described).
That may be because the crash is rare: the machine constantly services about 1k channels and 500 bridges with a lifetime of about 3 minutes, and the crash happens only once every 24 to 48 hours.

We use a large part of the ARI functionality, almost everything related to channels and bridges: from creating individual channels and bridges to recording through Snoop channels, and so on. However, this crash seems to be triggered by a bug in our Stasis app, which tracked bridge status incorrectly in our backend and would call bridge destroy multiple times without checking whether the bridge had already been destroyed.
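
For illustration, a client-side guard against repeated destroys could look roughly like the sketch below. It is written in Python purely as an example; the BridgeTracker class and the destroy_fn callback are hypothetical names, and destroy_fn is assumed to be whatever issues the ARI DELETE /bridges/{bridgeId} request for a given bridge id. It only shows the idea of checking and updating the tracked state under a lock before sending the request.

```python
import threading


class BridgeTracker:
    """Remembers which bridges are still live so each is destroyed at most once."""

    def __init__(self, destroy_fn):
        # destroy_fn is a hypothetical callable that sends the ARI
        # DELETE /bridges/{bridgeId} request for one bridge id.
        self._destroy_fn = destroy_fn
        self._lock = threading.Lock()
        self._live = set()

    def created(self, bridge_id):
        """Record a bridge that this app has created."""
        with self._lock:
            self._live.add(bridge_id)

    def destroy(self, bridge_id):
        """Send the destroy request once; return True only if it was sent."""
        with self._lock:
            if bridge_id not in self._live:
                return False  # unknown or already destroyed: send nothing
            self._live.discard(bridge_id)
        # Call out after releasing the lock so a slow request blocks nobody.
        self._destroy_fn(bridge_id)
        return True


# Usage: the second destroy of the same bridge is silently skipped.
calls = []
tracker = BridgeTracker(calls.append)
tracker.created("b1")
first = tracker.destroy("b1")
second = tracker.destroy("b1")
```

Because the membership check and the removal happen under the same lock, two threads racing to destroy the same bridge cannot both send the request.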

Yes, I can add the full crash log. However, the specific Asterisk instance this backtrace came from was not built with the DONT_OPTIMIZE flag. Will that still be okay, or should I wait for a crash on one of the instances built with DONT_OPTIMIZE?

> Crash in bridge_roles.c:78 while doing memory free for bridge roles
> -------------------------------------------------------------------
>
>                 Key: ASTERISK-28906
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28906
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/Bridging/bridge_roles, Resources/res_stasis
>    Affects Versions: 16.6.1
>         Environment: Ubuntu 18.04.3 LTS in KVM. Asterisk built from source.
>            Reporter: Juris Breicis
>            Assignee: Juris Breicis
>            Severity: Minor
>
> Bit of background: I am on Asterisk 16.6.2 and the box itself is handling ~500 simultaneous calls.
> Call management is handled entirely by a Stasis app through ARI.
> Because of a recent bug in our own Stasis software (multiple consecutive bridge destroys in a row), we have seen Asterisk crash on rare occasions:
> #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #1 0x00007f7e96b77801 in __GI_abort () at abort.c:79
> #2 0x00007f7e96bc0897 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f7e96cedb9a "%s\n")
> at ../sysdeps/posix/libc_fatal.c:181
> #3 0x00007f7e96bc790a in malloc_printerr (str=str@entry=0x7f7e96cef828 "double free or corruption (fasttop)")
> at malloc.c:5350
> #4 0x00007f7e96bcf004 in _int_free (have_lock=0, p=0x7f7e7c01d670, av=0x7f7e7c000020) at malloc.c:4230
> #5 __GI___libc_free (mem=0x7f7e7c01d680) at malloc.c:3124
> #6 0x000055997dfc833d in __ast_free (ptr=<optimized out>, file=file@entry=0x55997e1afe4f "bridge_roles.c",
> lineno=lineno@entry=78, func=func@entry=0x55997e1b0150 <__PRETTY_FUNCTION__.14665> "bridge_role_destroy")
> at astmm.c:1577
> #7 0x000055997dff8779 in bridge_role_destroy (role=<optimized out>) at bridge_roles.c:78
> #8 ast_channel_clear_bridge_roles (chan=<optimized out>) at bridge_roles.c:374
> #9 0x00007f7e35d9d71a in bridge_stasis_pull (self=0x7f7dc8001020, bridge_channel=0x7f7e7012adf0)
> at stasis/stasis_bridge.c:292
> #10 0x000055997dff5239 in bridge_channel_internal_pull (bridge_channel=bridge_channel@entry=0x7f7e7012adf0)
> at bridge_channel.c:2170
> #11 0x000055997dff67da in bridge_channel_internal_join (bridge_channel=bridge_channel@entry=0x7f7e7012adf0)
> at bridge_channel.c:2921
> #12 0x000055997dfdc045 in bridge_channel_depart_thread (data=data@entry=0x7f7e7012adf0) at bridge.c:1787
> #13 0x000055997e122f85 in dummy_start (data=<optimized out>) at utils.c:1249
> #14 0x00007f7e9771f6db in start_thread (arg=0x7f7d6f6b8700) at pthread_create.c:463
> #15 0x00007f7e96c5888f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> I suspect a race condition in which the same memory is freed more than once (that is, freed again after it has already been freed).
> We will fix our Stasis app ourselves to remedy the issue in our case. However, I suspect this is not expected behaviour. Is there anything else you would need from me to be able to fix this for other users?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


