No subject
Fri Sep 2 03:59:05 CDT 2011
d off, and with my change the phone wasn't sent the timeout to the devi=
ce.
I'm wondering why we didn't finish up with 2 MWI subscriptions, the=
re may be a clue there, did the code kill off the previous mwi subscription=
when another for the same device arrived, or did the phone unsubscribe.
</pre>
</blockquote>
<p>On March 23rd, 2012, 9:44 a.m., <b>Mark Michelson</b> wrote:</p>
<blockquote style=3D"margin-left: 1em; border-left: 2px solid #d0d0d0; pad=
ding-left: 10px;">
<pre style=3D"white-space: pre-wrap; white-space: -moz-pre-wrap; white-sp=
ace: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;"> "=
;Working with the senario that a phone has restarted due to loss of power f=
or some reason.
Trusting a phone to unsubscribe to a previous subscription is just wro=
ng, it won't know what it's previous call-id was.
The cleaner removed the power. Then plugged it back in... so many sena=
rios :)"
Yes, this is why there's a timeout in place :)
Was this output taken from your patched system or from an unpatched Asteris=
k?
Looking at your change again, I missed something. It looks like old subscri=
ptions will time out in Asterisk, it's just that Asterisk won't bot=
her sending a NOTIFY to let subscribers know that they have timed out. What=
I still don't understand though is how this change actually fixes the =
problem you originally had. I think I'd need to see a SIP debug of the =
post-phone-reboot packets both before and after your patch to fully underst=
and what's up here.
Despite not having all the information in front of me, I have a bit of an i=
dea of what might be happening.
1. The phone is subscribed to Asterisk.
2. Asterisk sends some NOTIFYs to the phone. Let's say it sends 5 of th=
em. This means that the last version number Asterisk has sent to the phone =
is 4 (since version starts at 0)
3. Phone reboots. It loses all memory of the old subscription and the old v=
ersion number.
4. Phone starts a new subscription with Asterisk. Asterisk sends an initial=
NOTIFY for this new subscription with version 0.
5. Original subscription times out. Asterisk sends a NOTIFY to end the old =
subscription. The version number in this NOTIFY is 5.
6. The phone, for some reason, even though it should have not knowledge of =
this old subscription, stores 5 as the current version number (How does the=
phone respond to this NOTIFY? 200? 481?)
7. Asterisk sends NOTIFYs for the new subscription, incrementing version ev=
ery time. Since the phone has 5 as its current version, it will ignore all =
NOTIFYs until version 6 arrives.
If this is what's happening, then the problem isn't that the phone =
is remembering version numbers across reboots. The problem is actually that=
the phone will honor version numbers for NOTIFYs for subscriptions it does=
not, or at least should not, know about (which seems pretty exploitable if=
you ask me). The solution I proposed before may work, although it might no=
t. At step 4 above, Asterisk will send an initial NOTIFY with version 0. Th=
e phone may be programmed not to accept NOTIFYs on a subscription that do n=
ot start with version 0. If that's the case, my solution will not work.=
And if my solution does not work, then I don't know of a good one off =
the top of my head.
I still do not agree with this patch though. Most devices, including the ne=
wer generation phones you are using, properly handle a NOTIFY that ends a t=
imed-out subscription. If anything, the solution for this will have to be i=
mplemented as a new option because there is no way we will unconditionally =
never send a subscription-ending NOTIFY for dialog-info events.</pre>
</blockquote>
<p>On March 23rd, 2012, 3:54 p.m., <b>Alec Davis</b> wrote:</p>
<blockquote style=3D"margin-left: 1em; border-left: 2px solid #d0d0d0; pad=
ding-left: 10px;">
<pre style=3D"white-space: pre-wrap; white-space: -moz-pre-wrap; white-sp=
ace: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">
Yes this is unpatched, expected for the 3 DEBUG prints.
[code}
Index: channels/chan_sip.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
--- channels/chan_sip.c (revision 359029)
+++ channels/chan_sip.c (working copy)
@@ -3865,6 +3865,7 @@
p->subscribed =3D NONE;
append_history(p, "Subscribestatus", "timeou=
t");
ast_debug(3, "Re-scheduled destruction of SIP subscrip=
tion %s\n", p->callid ? p->callid : "<unknown>");
+ ast_log(LOG_NOTICE, "ALEC TIMEOUT of SIP subscription =
%s\n", p->callid ? p->callid : "<unknown>");
return 10000; /* Reschedule this destruction so that we k=
now that it's gone */
}
@@ -13127,10 +13130,12 @@
local_display, local_target=
, local_target, mto, mto);
} else {
ast_str_append(tmp, 0, "<dialog id=
=3D\"%s\" direction=3D\"recipient\">\n", exten);
+ ast_log(LOG_NOTICE, "ALEC <dialog i=
d=3D\"%s\" call-id=3D\"%s\" direction=3D\"recipien=
t\"> version=3D%u %s\n", exten, p->callid, p->dialogver,=
statestring);
}
} else {
ast_str_append(tmp, 0, "<dialog id=3D\"=
;%s\">\n", exten);
+ ast_log(LOG_NOTICE, "ALEC <dialog id=3D\&qu=
ot;%s\" call-id=3D\"%s\"> version=3D%u %s\n", exten,=
p->callid, p->dialogver, statestring);
}
ast_str_append(tmp, 0, "<state>%s</state>\=
n", statestring);
if (state =3D=3D AST_EXTENSION_ONHOLD) {
{code}
Regarding steps 1-7, you have it exactly.
answer for step 6: the phone obviously isn't matching call-id.
I agree, it's only a GXP20XX issue, Grandstream have fixed it, with the=
GXP21XX series.
I have access to a SNOM, and it works fine.
I have seen on one of the other open PBX forums, the same reported issue, t=
here fix was to remove any previous subscriptions that match peer and entit=
y. I will try to refind it. =
</pre>
</blockquote>
</blockquote>
<pre style=3D"white-space: pre-wrap; white-space: -moz-pre-wrap; white-spac=
e: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Seems 1.2 h=
ad code to remove old duplicate subscriptions:
{code}
/* remove any old subscription from this peer for t=
he same exten/context,
as the peer has obviously forgotten about it and=
it's wasteful to wait
for it to expire and send NOTIFY messages to the=
peer only to have them
ignored (or generate errors)
*/
ast_mutex_lock(&iflock);
for (p_old =3D iflist; p_old; p_old =3D p_old->n=
ext) {
if (p_old =3D=3D p)
continue;
if (p_old->initreq.method !=3D SIP_SUBSC=
RIBE)
continue;
if (p_old->subscribed =3D=3D NONE)
continue;
ast_mutex_lock(&p_old->lock);
if (!strcmp(p_old->username, p->usern=
ame)) {
if (!strcmp(p_old->exten, p->=
exten) &&
!strcmp(p_old->context, p-&g=
t;context)) {
ast_set_flag(p_old, SIP_NEE=
DDESTROY);
ast_mutex_unlock(&p_old=
->lock);
break;
}
}
ast_mutex_unlock(&p_old->lock);
}
ast_mutex_unlock(&iflock);
{code}</pre>
<br />
<p>- Alec</p>
<br />
<p>On March 13th, 2012, 5:35 p.m., Alec Davis wrote:</p>
<table bgcolor=3D"#fefadf" width=3D"100%" cellspacing=3D"0" cellpadding=3D"=
8" style=3D"background-image: url('https://reviewboard.asterisk.org/media/r=
b/images/review_request_box_top_bg.png'); background-position: left top; ba=
ckground-repeat: repeat-x; border: 1px black solid;">
<tr>
<td>
<div>Review request for Asterisk Developers.</div>
<div>By Alec Davis.</div>
<p style=3D"color: grey;"><i>Updated March 13, 2012, 5:35 p.m.</i></p>
<h1 style=3D"color: #575012; font-size: 10pt; margin-top: 1.5em;">Descripti=
on </h1>
<table width=3D"100%" bgcolor=3D"#ffffff" cellspacing=3D"0" cellpadding=3D"=
10" style=3D"border: 1px solid #b8b5a0">
<tr>
<td>
<pre style=3D"margin: 0; padding: 0; white-space: pre-wrap; white-space:=
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap=
: break-word;">Some phones (Grandstream GXP20XX series 1.2.5.3 firmware) if=
rebooted, loose the BLF lights due to the new subscription's "ver=
sion" being less than the previous subscription's "version&qu=
ot;.
This happens the instant that asterisk timeouts the previous subscription -=
which has the much larger "version" sequence number.
The only way for the lights to start working again, is for the current subs=
cription's "version" number to increment past the previous ol=
d "version" number.
That period of time could be huge, if the phone isn't rebooted for days.
The workaround has been to reboot the phone twice, within a few minutes of =
each reboot.
1st time wait for the BLF's to fail (then reboot again) - which is th=
e timeout period - a few minutes.
2nd time the lights will work, then fail, then after a few minutes start =
working again.
This fix prevents the old subscription timeout from updating the phones &qu=
ot;version" number. =
This issue doesn't affter the GXP21XX series.
Attempt to show issue byway of example below with some debug output showing=
call-id and version number.
astrid-test*CLI> core show hints
-=3D Registered Asterisk Dial Plan Hints =3D-
8601GXP0001 at trusted : Custom:q8601_a8612 State:I=
nUse Watchers 1
Debug output:
=3D=3D=3DSet the BLF state to InUse
- Executing [s at macro-custom-agent-inout:13] Set("SIP/GXP0001-00000005&=
quot;, "DEVICE_STATE(Custom:q8601_a8612)=3DINUSE") in new stack
...
[Mar 14 11:01:39] NOTICE[10819]: chan_sip.c:13136 state_notify_build_xml: A=
LEC <dialog id=3D"8601GXP0001" call-id=3D"3c65aed02ae1d5d=
f at 192.168.5.141"> version=3D5
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
Reboot the phone, it gets a new BLF subscription.
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
[Mar 14 11:02:30] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: A=
LEC <dialog id=3D"8601GXP0001" call-id=3D"79956c016812540=
0 at 192.168.5.141"> version=3D0
=3D=3D=3DThe BLF has correct state of InUse.
...
[Mar 14 11:05:00] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: A=
LEC <dialog id=3D"8601GXP0001" call-id=3D"79956c016812540=
0 at 192.168.5.141"> version=3D1
=3D=3D=3DThe BLF still has correct state of InUse.
...
The timeout 'last notification'
[Mar 14 11:05:09] NOTICE[10826]: chan_sip.c:13131 state_notify_build_xml: A=
LEC <dialog id=3D"8601GXP0001" call-id=3D"3c65aed02ae1d5d=
f at 192.168.5.141" direction=3D"recipient"> version=3D6
[Mar 14 11:05:09] NOTICE[10826]: chan_sip.c:3868 __sip_autodestruct: ALEC T=
IMEOUT of SIP subscription 3c65aed02ae1d5df at 192.168.5.141
=3D=3D=3DThe BLF has incorrectly gone Idle.
...
[Mar 14 11:15:01] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: A=
LEC <dialog id=3D"8601GXP0001" call-id=3D"79956c016812540=
0 at 192.168.5.141"> version=3D5
=3D=3D=3DThe BLF is still incorrect at Idle.
-- Registered SIP 'GXP0001' at 192.168.5.141:5066
[Mar 14 11:17:31] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: A=
LEC <dialog id=3D"8601GXP0001" call-id=3D"79956c016812540=
0 at 192.168.5.141"> version=3D6
=3D=3D=3DThe BLF now has correct state of InUse.
</pre>
</td>
</tr>
</table>
<h1 style=3D"color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing <=
/h1>
<table width=3D"100%" bgcolor=3D"#ffffff" cellspacing=3D"0" cellpadding=3D"=
10" style=3D"border: 1px solid #b8b5a0">
<tr>
<td>
<pre style=3D"margin: 0; padding: 0; white-space: pre-wrap; white-space:=
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap=
: break-word;">In place on a 1.8 deployment for months, our production box =
with ~80 GXP phones, half of which are the older GXP20XX series.
</pre>
</td>
</tr>
</table>
<h1 style=3D"color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b>=
</h1>
<ul style=3D"margin-left: 3em; padding-left: 0;">
<li>trunk/channels/chan_sip.c <span style=3D"color: grey">(358608)</span><=
/li>
</ul>
<p><a href=3D"https://reviewboard.asterisk.org/r/1813/diff/" style=3D"margi=
n-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>
--===============3074523449968740501==--
More information about the asterisk-dev
mailing list