[asterisk-dev] SIP Registration Failing randomly (analyzed)
timo.teras at iki.fi
Tue Oct 12 06:40:16 CDT 2010
On 09/30/2010 06:06 PM, Timo Teräs wrote:
> On 09/30/2010 05:12 PM, Olle E. Johansson wrote:
>> On 30 Sep 2010, at 15:45, Timo Teräs wrote:
>>> I have had for a while pretty strange SIP Registration related issue.
>>> The client seems to randomly fail registration and the registry entry
>>> goes to REG_STATE_NOAUTH. I'm currently using Asterisk 126.96.36.199.
>>> Key observation was that my link seems to have random latency variation
>>> (normally it's maybe 10ms to the SIP Server; sometimes over 100ms).
>>> So what seems to happen is:
>>> 1. Asterisk sends (re)REGISTER
>>> 2. Time passes (~50-60ms), we are having more latency than normal,
>>> retransmit triggers and Asterisk sends REGISTER again thinking the
>>> previous was lost (on the resent packet Cseq is increased and From tag
>>> is new too; so it's maybe new registration attempt and not resend?)
>>> 3. Server receives 1st register and does not like reused nonce thus
>>> challenging us again for new authorization with 401 Unauthorized
>>> 4. Server receives 2nd register and does not like the old nonce at all
>>> anymore: it replies with 403 Forbidden
>>> 5. Asterisk receives the 401 and after that the 403. Receiving the 403
>>> makes Asterisk put the registry entry for the server into
>>> REG_STATE_NOAUTH, so the number stops working at all and Asterisk
>>> gives up on all reregistration.
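The five steps above can be replayed with a toy model. This is only a sketch: ToyRegistrar, its one-use nonce policy and simulate() are hypothetical names and rules inferred from the description in this thread, not the real server's or Asterisk's implementation.

```python
class ToyRegistrar:
    """Hypothetical server: a nonce is good for one request only."""

    def __init__(self):
        self._count = 0
        self.fresh = None   # issued in a challenge, not yet used
        self.stale = None   # used once; reusing it earns one new challenge

    def challenge(self):
        self._count += 1
        self.fresh = f"nonce-{self._count}"
        return self.fresh

    def handle_register(self, nonce):
        """Return (status, new_nonce_or_None) for one REGISTER."""
        if nonce is not None and nonce == self.fresh:
            self.fresh, self.stale = None, nonce
            return 200, None
        if nonce is None or nonce == self.stale:
            # No credentials, or a reused nonce: challenge again (step 3).
            self.stale = None
            return 401, self.challenge()
        # Anything older than that is rejected outright (step 4).
        return 403, None


def simulate():
    server = ToyRegistrar()
    _, nonce = server.handle_register(None)   # initial challenge
    server.handle_register(nonce)             # first registration succeeds
    # Re-registration reuses the cached nonce. The reply is slow, so the
    # client sends a second REGISTER with the same nonce (steps 1 and 2).
    replies = [server.handle_register(nonce)[0],
               server.handle_register(nonce)[0]]
    # Step 5: a 403 among the replies stops all further attempts.
    state = "REG_STATE_NOAUTH" if 403 in replies else "retry with new nonce"
    return replies, state
```

With these assumptions simulate() yields a 401 followed by a 403, which matches the sequence seen on the wire.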
>>> So my questions are:
>>> 1. Why is the nonce reused at all? Regular digest authentication is
>>> vulnerable to replay if a nonce is accepted after reuse.
>> It doesn't hurt to reuse it and many providers depend on it.
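For reference, the replay concern in question 1 follows from how the basic digest response is computed (RFC 2617, without qop). The sketch below uses made-up credentials and a made-up URI; only the formula itself comes from the RFC.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """RFC 2617 digest response when no qop is in use."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical values, for illustration only.
first = digest_response("alice", "example.invalid", "secret",
                        "REGISTER", "sip:example.invalid", "nonce-1")
again = digest_response("alice", "example.invalid", "secret",
                        "REGISTER", "sip:example.invalid", "nonce-1")
other = digest_response("alice", "example.invalid", "secret",
                        "REGISTER", "sip:example.invalid", "nonce-2")
```

The nonce is the only input that changes between requests here, so a server that accepts the same nonce twice also accepts a captured response twice.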
>>> 2. Any ideas why the reregistration gets triggered after the 50-60ms
>>> with new Cseq and From tag?
>> Depends on if you have qualify turned on and the number of registration
>> attempts you have in sip.conf.
> I currently have:
> registertimeout = 4
> registerattempts = 0
> Globally qualify=yes, but for a type=friend entry matching the
> registration destination I have qualify=no.
> In addition I'm doing two registrations with different usernames to the
> same server. I also have two type=friend entries for this host; one for
> each username.
>>> 3. Why do we not attempt anything after the 403? I remember seeing
>>> posts on sip-implementers saying that it would be acceptable to retry
>>> after an extended period of time.
>> 403 means "never come back at all". You need to reconfigure if you
>> get this. 503 is different, in that case you often have a retry-after
>> setting so you can come back.
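The distinction drawn here between 403 and 503 could be written down roughly as follows. This is a sketch of the policy as described in this reply, not Asterisk code; the function name, the action labels and the 60 second default are assumptions.

```python
def registration_action(status, retry_after=None, default_delay=60):
    """Map a final REGISTER response to (action, delay_in_seconds)."""
    if status == 200:
        return ("registered", None)
    if status == 401:
        # Challenge: answer it right away with fresh credentials.
        return ("answer_challenge", 0)
    if status == 403:
        # Treated as permanent: stop until the configuration changes.
        return ("give_up", None)
    if status == 503:
        # Honour Retry-After when the server sent one.
        delay = retry_after if retry_after is not None else default_delay
        return ("retry_later", delay)
    return ("retry_later", default_delay)
```

Retrying after a 403, as asked about in question 3, would amount to returning ("retry_later", some long delay) from the 403 branch instead.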
>> We should probably implement "registry restart <name>" so you don't
>> have to run "sip reload" to restart the registrations.
> I still think it would make sense to try after some period for 403. See e.g:
> But I'm still worried why Asterisk/the server gets confused on the
> retransmit message. It seems odd. I'll try to debug this further.
> I have similar setup on two places: the other one with stable latency
> works perfectly. The site with latency variations sees this problem. So
> it's definitely a timing issue and the server/Asterisk not liking the
> duplicate Register and the following 403.
Thinking more on this, should we not reset the previous authentication
data on timeout? This would prevent the use of outdated
authentication info on a retry after timeout, and should fix this issue.
It would still allow things to work with the servers that want to reuse
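A rough sketch of that idea, to make the proposal concrete. The Registration class and its fields are hypothetical and do not correspond to the actual registry structure in chan_sip; the dictionary stands in for a real SIP message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Registration:
    username: str
    realm: Optional[str] = None   # cached from the last challenge
    nonce: Optional[str] = None   # cached from the last challenge
    attempts: int = 0

    def build_register(self):
        """Build a REGISTER, adding credentials only if a nonce is cached."""
        self.attempts += 1
        message = {"method": "REGISTER", "user": self.username}
        if self.nonce is not None:
            message["auth"] = {"realm": self.realm, "nonce": self.nonce}
        return message

    def on_challenge(self, realm, nonce):
        """Remember the challenge from a 401 and answer it."""
        self.realm, self.nonce = realm, nonce
        return self.build_register()

    def on_timeout(self):
        """Proposed change: forget cached auth data before retrying."""
        self.realm = None
        self.nonce = None
        return self.build_register()
```

A normal refresh still goes through build_register() with the cached nonce, so only the retry after a timeout starts from scratch and waits for a new challenge.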