[asterisk-dev] [Code Review] 2475: NOTIFYs for BLF start queuing up and fail to be sent out after retries fail

Alec Davis reviewboard at asterisk.org
Tue May 7 03:14:11 CDT 2013


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/2475/#review8460
-----------------------------------------------------------


{quote}
From: asterisk-dev lists mail by Walter Doekes
I recently patched one of my (opensips) presence servers to *not* expire the subscription after a few unanswered NOTIFYs.

Temporary network issues (probably unpunctured NAT) made subsequent NOTIFYs not go out because the presence server had dropped the subscription. And then the subscribers were (rightly) wondering why they were not getting new notifies while nearby devices were getting theirs.
{quote}

With the above in mind, I can see a way forward:

A new int is required, 'failed_transactions_count'.

In the NOTIFY 200 OK response:
  reset failed_transactions_count

After the 10 packet retransmissions fail:
  increment failed_transactions_count
  if count < some value (3)
    clear pendinginvite (this unlocks the network interlock we're experiencing)
  else
    remove subscription

This is not strictly RFC compliant, but we weren't in the first place, and it makes us tolerant.
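
The steps above could be sketched roughly as follows. This is only an illustration, not the patch itself: 'struct subscription', 'on_notify_ok()', 'on_notify_timeout()' and MAX_FAILED_TRANSACTIONS are hypothetical names standing in for the real chan_sip.c dialog state and handlers, and the threshold of 3 is just the "some value" suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold; the proposal only says "some value (3)". */
#define MAX_FAILED_TRANSACTIONS 3

/* Simplified stand-in for the per-dialog state, NOT the real struct sip_pvt. */
struct subscription {
    int pendinginvite;             /* non-zero while a NOTIFY awaits its response */
    int failed_transactions_count; /* consecutive NOTIFY transactions that timed out */
    bool active;                   /* false once the subscription has been removed */
};

/* A 200 OK arrived for a NOTIFY: the subscriber is reachable again. */
static void on_notify_ok(struct subscription *sub)
{
    assert(sub != NULL);
    sub->failed_transactions_count = 0;
    sub->pendinginvite = 0;
}

/* Every retransmission of a NOTIFY went unanswered. */
static void on_notify_timeout(struct subscription *sub)
{
    assert(sub != NULL);
    sub->failed_transactions_count++;
    if (sub->failed_transactions_count < MAX_FAILED_TRANSACTIONS) {
        /* Tolerate the failure: unblock the queue so the next NOTIFY can go out. */
        sub->pendinginvite = 0;
    } else {
        /* Too many consecutive failures: give up and drop the subscription. */
        sub->active = false;
    }
}
```

With a threshold of 3, two consecutive failed NOTIFY transactions only clear pendinginvite, while the third removes the subscription; any 200 OK in between starts the count again from zero.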

The outcome after network connectivity is fixed:
  For 1.8 to trunk, if the subscription expiry has *NOT* elapsed, the BLF will come right on the next State Notify event trigger.

  For 11 and above:
  When [re-]subscribing, the BLF will update correctly, as a State Notify is sent on every [re-]subscribe.

  But for 1.8:
  After a re-subscribe, the BLF may still be wrong, as 1.8 doesn't send a State Notify after a re-subscribe (the subscription still exists, as we have not had enough errors).
  After a subscribe, the BLF will update correctly. In that case the subscription doesn't exist, due to failed_transactions_count having reached some value, expiry, a device reboot, etc.
  


- Alec Davis


On May 6, 2013, 8:40 p.m., Alec Davis wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/2475/
> -----------------------------------------------------------
> 
> (Updated May 6, 2013, 8:40 p.m.)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Bugs: ASTERISK-21677
>     https://issues.asterisk.org/jira/browse/ASTERISK-21677
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> The notify subsystem relies on a NOTIFY 200 OK response to clear the SIP_PAGE2_STATECHANGEQUEUE flag and p->pendinginvite.
> If the response never arrives, then no further NOTIFYs can EVER be sent; they just 'queue' up, each replacing the previously queued NOTIFY.
> 
> The fix: follow RFC 6665 section 4.2.2 more closely and, after a failed NOTIFY transaction, remove the subscription.
> Then, after a period of time, the client will (re-)subscribe, which will create a new subscription.
> 
> To minimise the time the BLF is 'not working', maxexpiry in sip.conf needs to be around 300 seconds, not the default of 3600 seconds.
> 
> 
> Diffs
> -----
> 
>   branches/1.8/channels/chan_sip.c 380212 
> 
> Diff: https://reviewboard.asterisk.org/r/2475/diff/
> 
> 
> Testing
> -------
> 
> As per bug report  https://issues.asterisk.org/jira/browse/ASTERISK-21677
> 
> With Asterisk 1.8, subscribers will NOT update their status when they re-subscribe, but will on the next event.
> With Asterisk 11, subscribers WILL update their status when they re-subscribe.
> 
> 
> Reporter of ASTERISK-21677 has also tested on a number of production servers.
> 
> 
> Thanks,
> 
> Alec Davis
> 
>
