<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">
<tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="https://reviewboard.asterisk.org/r/1400/">https://reviewboard.asterisk.org/r/1400/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<p style="margin-top: 0;">On August 30th, 2011, 3:30 a.m., <b>Alec Davis</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Works in all the senarios's that I've reported.
I'm also wondering whether the 'pickup datastore' you had previously added to prevent the race condition of picking up the same extension is now redundant. As I commented it out, and once got the new message "can't setup masquerade" as below;
-- Starting simple switch on 'DAHDI/35-1'
-- Executing [823@family:1] Dial("DAHDI/35-1", "Local/823@en_phone") in new stack
-- Called Local/823@en_phone
-- Executing [823@en_phone:1] Dial("Local/823@en_phone-9f80;2", "SIP/gxp-823,20,rwt") in new stack
== Using UDPTL CoS mark 5
== Using SIP RTP CoS mark 5
== Extension Changed 823[phones] new state Ringing for Notify User gxp-822
-- Called SIP/gxp-823
-- Local/823@en_phone-9f80;1 is ringing
-- SIP/gxp-823-00000003 is ringing
-- Local/823@en_phone-9f80;1 is ringing
== Using UDPTL CoS mark 5
== Using UDPTL CoS mark 5
== Using SIP RTP CoS mark 5
== Extension Changed 824[phones] new state InUse for Notify User gxp-822
[2011-08-30 19:28:33.606655] NOTICE[3506]: features.c:6978 ast_pickup_call: pickup SIP/gxp-823-00000003 attempt by SIP/gxp-824-00000004
== Using SIP RTP CoS mark 5
== Extension Changed 822[phones] new state InUse for Notify User gxp-822
[2011-08-30 19:28:33.622086] NOTICE[3507]: features.c:6978 ast_pickup_call: pickup SIP/gxp-823-00000003 attempt by SIP/gxp-822-00000005
-- SIP/gxp-824-00000004 answered Local/823@en_phone-9f80;2
== Extension Changed 823[phones] new state Idle for Notify User gxp-822
-- Local/823@en_phone-9f80;1 stopped sounds
-- Local/823@en_phone-9f80;1 answered DAHDI/35-1
== Spawn extension (en_phone, 823, 1) exited non-zero on 'Local/823@en_phone-9f80;2'
[2011-08-30 19:28:33.939881] WARNING[3507]: channel.c:5769 __ast_channel_masquerade: Can't setup masquerade. One or both channels is dead. (Local/823@en_phone-9f80;1<ZOMBIE> <-- SIP/gxp-822-00000005)
[2011-08-30 19:28:33.939937] WARNING[3507]: features.c:7054 ast_do_pickup: Unable to masquerade 'SIP/gxp-822-00000005' into 'SIP/gxp-823-00000003'
[2011-08-30 19:28:33.940023] WARNING[3507]: features.c:6987 ast_pickup_call: pickup Local/823@en_phone-9f80;1<ZOMBIE> failed by SIP/gxp-822-00000005
</pre>
</blockquote>
<p>On August 30th, 2011, 3:42 a.m., <b>irroot</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I think the datastore is still needed as there can be a race condition for a channel when multiple attempts are made to pickup a call in do_pickup after target is unlocked and before the masq its possible to have a channel pick it up again .... setting the DS will cause can_pickup to pass it over ...
Have got this patch out at >50 sites about 10 of them were getting orphaned channels regularly and about 5 a day were affected.</pre>
</blockquote>
<p>On August 30th, 2011, 3:52 a.m., <b>Alec Davis</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">testing dialplan senarios tested:
where 3 sip phones are registered as gxp-822, gxp-823, gxp-824
Senario 1: Localchan dial, Local chan pickup.
Dial 801 from 2 phones.
Dial either 800 or *8 from 2 other phones.
Both calls were successfully connected to 1 of each of the channels.
Senario 2:
Dial 802 or 803 or 804 from 1 phone.
Dial *8 from 2 other phones.
Success
exten => 800,1,NoOp(Debug Localchan pickup)
exten => 800,n,Dial(Local/824@en_pickup&Local/823@en_pickup)
exten => 801,1,NoOp(Debug Localchan Pickup, Ring Phones)
exten => 801,n,Dial(Local/823@en_phone&Local/824@en_phone)
exten => 802,1,Dial(SIP/gxp-823&SIP/gxp-824)
exten => 803,1,Dial(DAHDI/33&DAHDI/35)
exten => 804,1,Dial(Local/823@en_phone&Local/824@en_phone&DAHDI/33&DAHDI/35)
[en_pickup]
exten => _[0-9*#]!, 1, PickupChan(Local/${EXTEN}@en_phone)
[en_phone]
exten => _[0-9*#]!, 1, Dial(SIP/gxp-${EXTEN},20,rwt)
</pre>
</blockquote>
<p>On August 30th, 2011, 10:34 a.m., <b>rmudgett</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">The datastore prevents two users from picking up the exact same channel. The datastore is more graceful than if it were not there because the race changes to who can setup the masquerade first. The loser then gets a warning message in the log about a masquerade already setup for someone else.
The pickup race that this patch fixes is because the call is forked and the users are picking up different legs of the forked call.
FYI: One of my test attempts when running under valgrind had both legs of the call "successfully" picked up. As the answer was propagated to the forking dial, the dial determined the winner of the pickup based on which leg it saw answer first. The losing pickup attempt just got hung up on.</pre>
</blockquote>
<p>On August 30th, 2011, 12:56 p.m., <b>Alec Davis</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Agreed, regrading the datastore detects multiple pickups of the exact same channel is tidier, only one message, and the loser doesn't even get to masquerade stage.
re. running under valgrind, the result was correct wasn't it?
</pre>
</blockquote>
<p>On August 30th, 2011, 1:12 p.m., <b>rmudgett</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">The result was "correct" in that nothing bad happened. It was just an odd case since both pickup attempts were "successful". Both calls picked up using *8 and heard the successful pickup sound (tt-monkeys in this case :) ). The loser was just hung up on as if the caller went away; which would be expected from what happened.</pre>
</blockquote>
</blockquote>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Maybe setting a pickup datastore on the head calling channel may have avoided the multiple successful pickups, where only one should have won :)
However, I understand your concerns related to attempting to track the parent or head caller.
But maybe all that I needed to do after the pickup election was to set parent=NULL, then any further transfers don't have the parent set.</pre>
<br />
<p>- Alec</p>
<br />
<p>On August 29th, 2011, 6:51 p.m., rmudgett wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: url('https://reviewboard.asterisk.org/media/rb/images/review_request_box_top_bg.png'); background-position: left top; background-repeat: repeat-x; border: 1px black solid;">
<tr>
<td>
<div>Review request for Asterisk Developers, Alec Davis and irroot.</div>
<div>By rmudgett.</div>
<p style="color: grey;"><i>Updated Aug. 29, 2011, 6:51 p.m.</i></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Attempting to pickup a call that has been forked to multiple extensions by two users either crashes or fails a masquerade with a "bad things may happen" message.
This is the scenario that is causing all the grief:
1) Pickup target is selected
2) target is marked as being picked up in ast_do_pickup()
3) target is unlocked by ast_do_pickup()
4) app dial or queue gets a chance to hang up losing calls and calls ast_hangup() on target
5) SINCE A MASQUERADE HAS NOT BEEN SETUP YET BY ast_do_pickup() with ast_channel_masquerade(), ast_hangup() completes successfully and the channel is no longer in the channels container.
6) ast_do_pickup() then calls ast_channel_masquerade() to schedule the masquerade on the dead channel.
7) ast_do_pickup() then calls ast_do_masquerade() on the dead channel
8) bad things happen while doing the masquerade and in the process ast_do_masquerade() puts the dead channel back into the channels container
9) The "orphaned" channel is visible in the channels list if a crash does not happen.
This patch does the following:
1) ast_hangup() sets the zombie flag on a successfully hung-up channel and does not release the channel lock until that has happened.
2) __ast_channel_masquerade() is fixed to not setup a masquerade if either channel is a zombie.
</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Replicated the crash in ASTERISK-18222 with PickupChan() and *8 methods.
With the patch, the crash no longer happens and the new message of attempting to masquerade into a dead channel is output.</pre>
</td>
</tr>
</table>
<div style="margin-top: 1.5em;">
<b style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Bugs: </b>
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-18222">ASTERISK-18222</a>,
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-18273">ASTERISK-18273</a>
</div>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>/branches/1.8/main/channel.c <span style="color: grey">(333893)</span></li>
</ul>
<p><a href="https://reviewboard.asterisk.org/r/1400/diff/" style="margin-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>