[asterisk-dev] [Code Review] 3990: CDRs/Dial: Fix an assertion caused by advancing a neutral state channel straight into dial pending without going through dial
Jonathan Rose
reviewboard at asterisk.org
Thu Sep 18 15:47:57 CDT 2014
> On Sept. 18, 2014, 3:10 p.m., rmudgett wrote:
> > /branches/12/main/stasis_channels.c, line 1257
> > <https://reviewboard.asterisk.org/r/3990/diff/5/?file=67398#file67398line1257>
> >
> > dial_masquerade() is called with several locks held: the global channels lock, old_chan, and new_chan.
> >
> > The calls to ast_channel_publish_dial_forward() will then try to lock cur->peer.
> >
> > I'm just noting this situation because locking more than one channel at a time normally has deadlock potential. However, since the global channels container lock is held, it should be safe enough.
> On Sept. 18, 2014, 3:10 p.m., rmudgett wrote:
> > /branches/12/main/stasis_channels.c, lines 381-394
> > <https://reviewboard.asterisk.org/r/3990/diff/5/?file=67398#file67398line381>
> >
> > Why is the caller channel lock held for the ast_channel_publish_dial_forward() call?
> >
> > A deadlock can happen while holding the caller lock because ast_channel_publish_dial_forward() locks caller and peer in turn.
Well, I want to keep the lock held in case a masquerade happens before I call ast_channel_publish_dial_forward(). I'm not sure how realistic such a scenario is (probably not very), but perhaps I should just do a lock_both here instead?
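
For reference, the lock_both idea can be sketched roughly as below. This is a generic pthread illustration of taking two locks with back-off, not the Asterisk implementation: demo_chan, demo_lock_both(), demo_unlock_both() and demo_publish_forward() are hypothetical names made up for this example, and the counter only stands in for whatever the real publish call does.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-in for a channel; not Asterisk's struct ast_channel. */
struct demo_chan {
    pthread_mutex_t lock;
    int dial_forwards;   /* placeholder for state touched while locked */
};

/*
 * Acquire both locks without ever blocking while one is held.
 * If the second lock is busy, release the first and retry, so two
 * threads locking the same pair in opposite order cannot deadlock.
 */
static void demo_lock_both(struct demo_chan *a, struct demo_chan *b)
{
    assert(a != b); /* locking the same mutex twice would never succeed */
    for (;;) {
        pthread_mutex_lock(&a->lock);
        if (pthread_mutex_trylock(&b->lock) == 0) {
            return; /* both locks are now held */
        }
        /* b is busy: back off completely so its owner can make progress */
        pthread_mutex_unlock(&a->lock);
        sched_yield();
    }
}

static void demo_unlock_both(struct demo_chan *a, struct demo_chan *b)
{
    pthread_mutex_unlock(&b->lock);
    pthread_mutex_unlock(&a->lock);
}

/* Hypothetical publish step: takes both locks itself, then updates state. */
static void demo_publish_forward(struct demo_chan *caller, struct demo_chan *peer)
{
    demo_lock_both(caller, peer);
    caller->dial_forwards++;
    peer->dial_forwards++;
    demo_unlock_both(caller, peer);
}
```

The point of the sketch is that the caller lock would not be held on entry; both locks get taken together inside the publish step, so the masquerade window is covered without a fixed caller-then-peer ordering.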
- Jonathan
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3990/#review13345
-----------------------------------------------------------
On Sept. 18, 2014, 1:25 p.m., Jonathan Rose wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/3990/
> -----------------------------------------------------------
>
> (Updated Sept. 18, 2014, 1:25 p.m.)
>
>
> Review request for Asterisk Developers, Matt Jordan and rmudgett.
>
>
> Bugs: ASTERISK-24237
> https://issues.asterisk.org/jira/browse/ASTERISK-24237
>
>
> Repository: Asterisk
>
>
> Description
> -------
>
> Reproduction:
> pj123 calls 1601
> pj123 does a SIP blonde transfer to 1603
> 1603 answers
> FRACK occurs
> all are PJSIP endpoints.
>
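> Basically, what happens is that there is a second outbound dial from another pj123 channel. Before that dial is answered, the pj123 channel gets masqueraded out of the picture and replaced with a channel that isn't in the dial state, so the replacement channel is advanced from the neutral state straight into dial pending and the assertion fires.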
>
> This patch fixes that by advancing the incoming channel to the dial state in the channel breakdown function of a datastore on the pj123 channel. Honestly, I'm not sure this is entirely adequate, but it seems to work in all of the conditions I've tried so far and it doesn't impede normal attended transfers. I might need to try and figure out what happens if I masquerade in a channel that is already dialing though. I'm not sure if that's something we can really expect to happen under normal conditions, but that seems like something that would screw up this approach.
>
> Note that I think this patch is the right idea, I just don't know if I need to account for the possibility that the channel that masquerades into pj123's dialing channel might not be in the neutral state.
>
>
> Diffs
> -----
>
> /branches/12/main/stasis_channels.c 422882
>
> Diff: https://reviewboard.asterisk.org/r/3990/diff/
>
>
> Testing
> -------
>
> Ran against reproduction of the above scenario. Assertion was gone and the CDR results were as follows:
>
> "","123","1601","default",""""" <123>","PJSIP/pj123-00000000","PJSIP/1601-00000001","Dial","PJSIP/1601,,tT","2014-09-11 21:48:51","2014-09-11 21:48:53","2014-09-11 21:48:57",5,4,"ANSWERED","DOCUMENTATION","1410472131.0",""
> "","123","","default",""""" <123>","PJSIP/pj123-00000002","PJSIP/1603-00000003","Dial","PJSIP/1603","2014-09-11 21:48:55",,"2014-09-11 21:48:57",2,0,"NO ANSWER","DOCUMENTATION","1410472135.6",""
> "","1601","1603","default",""""" <1601>","PJSIP/1601-00000001","PJSIP/1603-00000003","AppDial","(Outgoing Line)","2014-09-11 21:48:57","2014-09-11 21:48:57","2014-09-11 21:49:04",6,6,"ANSWERED","DOCUMENTATION","1410472131.1",""
>
> And dial events reported by AMI:
> http://pastebin.com/tWuwL7xa
> (note that there is a mismatch between the number of dial end and dial begin events... not sure if this is a problem, but I might be able to fix it just by updating the old chan, not sure what status code to use though).
>
> Ran against normal attended transfer, feature attended transfers, and blind transfers with no noticeable effect.
>
> Ran against testsuite blonde transfer tests, some attended transfer tests, some blind transfer tests, and all pjsip transfer tests (at least ones that will run on my box... a few won't due to sipp version requirements that I really need to get around to fixing eventually) for comparison purposes. All passed exhibiting the same behavior as before as far as warning messages and such go.
>
>
> Thanks,
>
> Jonathan Rose
>
>