[asterisk-dev] Linksys SPA962 losing registration

Jeff LaCoursiere jeff at jeff.net
Tue Sep 8 08:51:35 CDT 2009


On Tue, 8 Sep 2009, Olle E. Johansson wrote:

>
> 8 sep 2009 kl. 03.18 skrev Jeff LaCoursiere:
>
>>
>> Update:
>>
>> After finding this issue: 15084
>>
>> I started looking at the logic in "handle_request_notify()" and don't
>> understand why I am not getting one or the other of "489 Bad Event" or
>> "200 OK", as surely the code must follow one branch or the other of
>> this:
>>
>> if (strcmp(event, "refer")) {
>> 	[snip]
>> 	transmit_response(p, "489 Bad event", req);
>> 	res = -1;
>> } else {
>> 	[snip a bunch of tests and a switch that shouldn't apply]
>> 	/* Confirm that we received this packet */
>> 	transmit_response(p, "200 OK", req);
>> };
>>
>> But instead I get no response at all, which is surely the difference
>> between the two versions.  I am willing to bet that 1.4.18 sent the
>> 489.
> chan_sip should *ALWAYS* send a response. If not, it's a bug.
>
> Open a new issue (that we can relate to 15084) and upload as an
> attachment a full SIP debug with
> debug level and verbose level set to 5.
>

Hi,

I will open the issue and supply the traces, but will have to wait for 
this evening when I have a chance to back out a patch I made last night 
and capture the info.

I am hoping to better understand some parts of the flow through the sip 
code - spent a long time last night tracing it with many "ast_verbose" 
inserts :)

I started from "sipsock_read()", confirmed that my Linksys NOTIFY 
keep-alive packet was being read, and eventually found that this code:

         /* Find the active SIP dialog or create a new one */
         p = find_call(&req, &sin, req.method);  /* returns p locked */

returned NULL.  This explains why handle_request_notify() was never 
called.  I guess this makes sense - there is no active dialog - this is 
just an out-of-band keep alive!

Going into this routine I see that:

         } else if (intended_method == SIP_NOTIFY) {
             /* We do not support out-of-dialog NOTIFY either,
             like voicemail notification, so cancel that early */
             transmit_response_using_temp(callid, sin, 1, intended_method, req, "489 Bad event");
         }

is eventually executed, and indeed, I was able to find the 489 being sent 
with tcpdump, but to the INTERNAL IP address of the Linksys phone, which 
is remote and behind a NAT.  As an aside this is probably where the logic 
would go to send a "200 OK" if we can identify this as a keep alive and 
not any other kind of NOTIFY, as was patched in issue 15084 for 1.6.X.
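
To make that aside concrete, here is a minimal standalone sketch of the 
decision, not code from chan_sip.c and not the 15084 patch. The helper 
name out_of_dialog_notify_status() is hypothetical, and the "keep-alive" 
event name is an assumption about what the phone puts in its Event header 
that should be checked against a real trace:

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/*
 * Hypothetical helper, not part of chan_sip: choose the status line for a
 * NOTIFY that arrives outside of any dialog.
 *
 * ASSUMPTION: the phone marks its keep-alive with an Event header value of
 * "keep-alive".  Anything else keeps the existing "489 Bad event" answer.
 */
static const char *out_of_dialog_notify_status(const char *event)
{
	if (event != NULL && strcasecmp(event, "keep-alive") == 0)
		return "200 OK";
	return "489 Bad event";
}
```

The returned string would take the place of the hard-coded "489 Bad event" 
in the branch above; where the reply is sent is a separate question.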

This makes me wonder if there is some kind of configuration error in 
asterisk or on the phone, but all other features (calls, BLF, transfers, 
etc.) seem to work fine.  It is only the response to the NOTIFY 
keep-alive that seems to have a NAT issue, and it only appeared after 
upgrading from 1.4.18.

Because this is a production box and we were faced with going back to 
1.4.18 or trying any wild patch to make this work for customers returning 
to work this morning, I made this change in __sip_xmit():

root@hades:/usr/local/src/asterisk-1.4.26.2/channels# diff chan_sip.c chan_sip.c.orig
1797,1801c1797
< 	/*const struct sockaddr_in *dst = sip_real_dst(p);*/
< 	const struct sockaddr_in *dst;
<
< 	dst = &p->recv;
<
---
> 	const struct sockaddr_in *dst = sip_real_dst(p);

This seems pretty heavy-handed, and affects EVERY sip transmit, not just 
the ones I am interested in, but after installing this patch and testing 
various things it seemed to do the trick.  Obviously this is not the right 
thing to do, but I have a few questions:

First of all, what danger have I put us in by forcing dst to p->recv?

What is the intended difference between p->recv and p->sa?  By dumping 
their contents at various places I could see that at least for this 
particular NOTIFY the sin_addr was the same in both (the public address), 
but the sin_port was different (p->recv.sin_port was the public port and 
p->sa.sin_port was the internal port).  This made no sense to me at all. 
I kind of expected p->sa.sin_addr to be the internal address of the phone.

I was also able to see that once *dst = sip_real_dst(p) was executed, 
p->recv and p->sa now contained the internal private IP address - both of 
them! - and that is what was used to incorrectly send the reply.  Thus my 
patch.  I don't understand why sip_real_dst() would make a change at all 
to p, and feel that if there is a bug, this is it.
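
For comparison, this is the behaviour I would have expected from a 
"pick the destination" helper, written as a standalone sketch.  It is not 
the chan_sip source: struct dialog_addrs, its nat field and 
pick_reply_dst() are hypothetical stand-ins, and the field comments 
reflect my assumption about what the two addresses mean, not a confirmed 
answer.  The point is only that the dialog is taken by const pointer, so 
choosing a destination cannot change either address:

```c
#include <netinet/in.h>   /* struct sockaddr_in */

/* Hypothetical stand-in for the two address fields on the dialog. */
struct dialog_addrs {
	struct sockaddr_in sa;    /* ASSUMPTION: address taken from the SIP headers */
	struct sockaddr_in recv;  /* ASSUMPTION: address the packet really came from */
	int nat;                  /* hypothetical flag: non-zero if the peer is NATed */
};

/*
 * Return where a reply should go without touching the dialog: the observed
 * source address for a NATed peer, the advertised address otherwise.
 */
static const struct sockaddr_in *pick_reply_dst(const struct dialog_addrs *d)
{
	return d->nat ? &d->recv : &d->sa;
}
```

With a helper shaped like this, p->recv and p->sa would hold the same 
values before and after the call, which is not what I observed.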

Any enlightenment?  Will be happy to collect the requested info and post 
an issue as long as there isn't some glaring asterisk config that I am 
missing here.  Don't want any negative karma ;)

Cheers,

j
