[asterisk-bugs] [JIRA] (ASTERISK-29129) Race condition in pbx_lua dial subroutine leads to audio issues and FRACKs

Asterisk Team (JIRA) noreply at issues.asterisk.org
Fri Oct 16 07:53:36 CDT 2020


    [ https://issues.asterisk.org/jira/browse/ASTERISK-29129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=252464#comment-252464 ] 

Asterisk Team commented on ASTERISK-29129:
------------------------------------------

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

> Race condition in pbx_lua dial subroutine leads to audio issues and FRACKs
> --------------------------------------------------------------------------
>
>                 Key: ASTERISK-29129
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-29129
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: PBX/pbx_lua
>    Affects Versions: 17.7.0
>         Environment: Arch linux
>            Reporter: Daniil Gentili
>
> Consider the following simple lua dialplan:
> {noformat}
> extensions = {
>     ["from-internal"] = {
>         ["_."] = function(context, exten)
>             app.Answer()
>             app.Dial("Local/music@music", 100, "G(pony^999^1)")
>             app.Hangup()
>         end
>     },
>     ["pony"] = {
>         ["caller"] = function(context, exten)
>             app.Bridge(channel.DIALEDPEERNAME:get())
>         end,
>         ["callee"] = function(context, exten)
>             app.Answer()
>             os.execute("sleep 2")
>             app.Wait(5)
>             app.Congestion()
>             app.Hangup()
>         end
>     },
>     ["music"] = {
>         ["music"] = function(context, exten)
>             app.Answer()
>             app.Playback("/home/daniil/Musica/bruh")
>         end
>     }
> }
> {noformat}
> and conf dialplan:
> {noformat}
> [pony]
> exten => 999,1,GoTo(pony,caller,1)
> exten => 999,2,GoTo(pony,callee,1)
> {noformat}
> When the {{callee}} side of the {{G}} dial subroutine runs, a large enough delay before the call to {{Wait}} (here, the 2-second {{os.execute("sleep 2")}}) causes audio issues and FRACKs once {{Wait}} is called.
> Dial() creates the following two bridges:
> * bridge A: SIP/XXXX <=> Local/pony@caller (A;1 <=> A;2)
> * bridge B: Local/pony@callee <=> Local/music@music (B;1 <=> B;2)
> Bridge(A;1, B;1) does the following:
> * B;1 is yanked out of bridge B and put into the new bridge C:
> * bridge C: SIP/XXXX <=> Local/music@music (A;1 <=> B;1)
> Yanking the B;1 channel out is done internally by copying the channel struct members to a new channel and [hanging up the old zombie channel with the zombie flag+nullframe|https://github.com/asterisk/asterisk/blob/e831952ebac04042051538e444dfb917782b01c4/main/channel.c#L7187].
> Finally, the new channel structure is swapped into the currently running lua PBX by the [fixup function|https://github.com/asterisk/asterisk/blob/master/pbx/pbx_lua.c#L137].
> Race condition:
> * If the {{Wait()}} function gets called before {{Bridge()}}, it will be called on the old to-be-killed channel.
> The nullframe+zombie flag combination sent to the to-be-killed channel will be received correctly by {{Wait()}}, and will hang up the routine.
> * If the {{Wait()}} function gets called after {{Bridge()}}, it will be called on the newly swapped channel.
> The nullframe+zombie flag combination will be ignored, and instead {{Wait()}} will start sending null packets into the newly cloned channel (which is already in use by Playback in another context!), causing audio issues, and finally segfaults on hangup, probably when pbx_lua tries starting an autoservice on a hung-up channel.
> Removing the lua fixup function solves the issue, but I'm not sure if this will introduce issues elsewhere.
> [ASTERISK-17635|https://issues.asterisk.org/jira/browse/ASTERISK-17635], the issue that originally led to the introduction of the fixup function, seems to use a pre-masquerading function of Asterisk; however, the ast_channel_move (=> fixup) function seems to be used in places other than the bridging function.
> So I'm posting this here, in hopes of getting help for a fix that doesn't potentially break other stuff :)
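
The two orderings listed under "Race condition" in the description above can be illustrated with a toy model in plain Lua. This is only a sketch of the reasoning, not Asterisk code: the tables and the names new_channel, masquerade, wait and simulate are hypothetical stand-ins, and the model assumes that the only difference between the two cases is which channel Wait() ends up holding.

```lua
-- Toy model only: plain Lua tables stand in for channels.
-- None of these names are Asterisk or pbx_lua APIs.

local function new_channel(name)
    return { name = name, zombie = false, null_frames = 0 }
end

-- Models the "yank": clone the channel and flag the old one as a zombie.
local function masquerade(old)
    local clone = new_channel(old.name)
    old.zombie = true
    return clone
end

-- Models Wait(): a zombie channel ends the routine,
-- a live channel receives a null frame instead.
local function wait(chan)
    if chan.zombie then
        return "hangup"
    end
    chan.null_frames = chan.null_frames + 1
    return "running"
end

local function simulate(wait_first)
    local pbx = { chan = new_channel("B;1") }
    local held = pbx.chan               -- channel Wait() holds if it starts first
    local clone = masquerade(pbx.chan)  -- Bridge() yanks the channel ...
    pbx.chan = clone                    -- ... and the fixup swaps the clone into the PBX
    if not wait_first then
        held = pbx.chan                 -- Wait() starts late, so it holds the clone
    end
    local outcome = wait(held)
    return outcome, clone.null_frames
end

print(simulate(true))   -- prints "hangup" and 0: the routine ends cleanly
print(simulate(false))  -- prints "running" and 1: a null frame lands on the clone
```

In the second case the routine keeps running against the cloned channel, which in the reported scenario is already being used by Playback in another context.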



--
This message was sent by Atlassian JIRA
(v6.2#6252)


