[asterisk-dev] [Code Review] 3300: Don't crash on lack of bridged rtp instance

Mark Michelson reviewboard at asterisk.org
Mon Mar 10 14:25:08 CDT 2014



> On March 6, 2014, 9:20 p.m., Mark Michelson wrote:
> > I'm not a fan of this change, because I think it's not really fixing the root problem. bridge_p2p_rtp_write() is called in only one place in res_rtp_asterisk.c, and it's called like this:
> > 
> >     /* If we are directly bridged to another instance send the audio directly out */
> >     if (ast_rtp_instance_get_bridged(instance) && !bridge_p2p_rtp_write(instance, rtpheader, res, hdrlen)) {
> >         return &ast_null_frame;
> >     }
> > 
> > In other words, before entering bridge_p2p_rtp_write(), ast_rtp_instance_get_bridged() was returning non-NULL. So something is NULLing out the pointer between its initial check and the call to bridge_p2p_rtp_write(). This means that adding a second check for the same thing is likely not to actually catch the problem and instead just make the crash happen less often.
> > 
> > Looking at the linked issue, the crash is occurring outside of bridging code, when a channel is read from in app_dial(). My guess is that this channel had been RTP bridged but was yanked out of the bridge via a masquerade and thrown into the dial application. The dial application attempted to read from the channel before the previous local bridge loop had finished with the channel, thus resulting in crashiness.
> > 
> > The crash on the linked issue seems to verify this. The crashing thread has channel 0x7f5e943a3cc0 being read from while simultaneously in thread 7380, the same channel is part of a masquerade that occured as a result of a SIP attended transfer.
> 
> Leif Madsen wrote:
>     I have to agree, and have to thank you for your analysis of the situation. For now this patch may help us stabilize some of our platform, but I definitely would prefer to figure out what the underlying problem is.
>     
>     We just had another crash today with the same type of issue. What else can I do to help debug this further?

<leifmadsen> putnopvut: on r3300 you commented after reviewing the backtrace. We had another crash in the same part of the code today, and thought I'd check to see if any more information could be had through gdb to help?
<putnopvut> leifmadsen: yeah,I saw your reviewboard comment. I think at this stage, the information-gathering stage is pretty much done, so I can't think of anything else that you could give us that would help out. The problem is that there's a race condition that needs to be patched.
<putnopvut> From the provided backtrace, I can tell what the problem is, but fixing it is not going to be easy.
<putnopvut> That's also why I didn't do my usual thing and provide a suggestion on the review of a different approach.
<leifmadsen> indeed
<putnopvut> leifmadsen: as a temporary workaround, you can apply russellb's patch. It's not going to hurt anything, and it may lessen the chances of the crash occurring, but the possibility will still be there.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3300/#review11120
-----------------------------------------------------------


On March 5, 2014, 6:49 p.m., Russell Bryant wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/3300/
> -----------------------------------------------------------
> 
> (Updated March 5, 2014, 6:49 p.m.)
> 
> 
> Review request for Asterisk Developers and leifmadsen.
> 
> 
> Bugs: ASTERISK-23419
>     https://issues.asterisk.org/jira/browse/ASTERISK-23419
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> This bug shows a crash in bridge_p2p_rtp_write().  There is no bridged rtp instance so it goes boom.  The patch just catches the case and returns.  I'm not really sure *why* there's no bridged instance, but this seems like a pretty safe function input sanity check.
> 
> 
> Diffs
> -----
> 
>   /tags/1.8.18.1/res/res_rtp_asterisk.c 405155 
> 
> Diff: https://reviewboard.asterisk.org/r/3300/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Russell Bryant
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20140310/efefd8cb/attachment-0001.html>


More information about the asterisk-dev mailing list