[asterisk-dev] Proposal to separate qualify & keep alive

John Todd jtodd at loligo.com
Mon Jun 26 13:06:30 MST 2006


At 8:32 PM +0200 6/26/06, Johansson Olle E wrote:
>26 jun 2006 kl. 19.23 skrev John Lange:
>
>>In the current implementation, qualify sends out a SIP request at the
>>specified interval and if it doesn't receive a reply within that same
>>interval asterisk flags the peer as unreachable.
>>
>>This also acts as a sort of keep-alive for devices behind NAT when
>>combined with the nat=yes parameter. The regular flow of SIP packets
>>keeps the NAT connection alive for the device behind the firewall.
>>
>>The problem is, these are two very different concepts and at times it
>>would be nice if we could separate the two.
>>
>>Specifically, we have some clients with devices behind NAT and
>>satellite. Their NAT and satellite link require a more-or-less constant
>>flow of packets to keep the connection alive.  However, due to the quirky
>>nature of satellite combined with long round-trip times the qualify
>>option needs to be set high (5000ms) or Asterisk won't send calls to the
>>client.
>>
>>In fact we would like to set qualify=no because often the client appears
>>to be very lagged when the satellite perceives the connection to be idle
>>(apparently it queues packets until it has a bunch and sends them in
>>groups) but if you initiate a call the lag drops immediately to an
>>acceptable level (800ms).
>>
>>But if we set qualify=no then the firewall closes the connection and
>>they can't receive any calls.
>>
>>So the question is: is it reasonable to undertake the implementation of
>>a keep alive for sip clients?
>>
>>Any thoughts on how this should be done? SIP NOTIFY or would something
>>else make more sense?
>
>I don't see a reason for changing method. We should probably find a way
>to override and be able to dial out regardless of the monitoring status.
>That seems like a simple fix.
>
>/O

I would actually agree that the two functions should be separated.  I 
often find myself in the same position, using "qualify=" purely as a 
NAT mapping tool without particularly caring about the actual 
response time in milliseconds.  I also think we would be well served 
by making these timers a bit more flexible, since right now every 
peer is in the "same bucket" as far as how frequently OPTIONS 
requests are sent.  I'd like to be more aggressive for foolish people 
who have poorly-configured firewalls that close NAT UDP sessions 
after 30 (or fewer) seconds.  Currently the only way to do this is to 
change the code to send ALL of my OPTIONS requests much more 
frequently, which eventually puts a huge amount of nonsense noise on 
my network just to accommodate a few poorly behaved clients.

SER sends "bogus" packets fairly frequently as part of its NAT 
module, and this seems to work well.
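
To make the idea concrete, here is a minimal sketch of a one-way 
keepalive sender.  It is not SER's or Asterisk's code; the payload, 
the function names and the bind address are all assumptions made up 
for illustration.  The point is only that nothing is stored per 
packet and no reply is awaited.

```python
import socket

# Hypothetical payload: the content is an assumption, chosen only to be tiny.
KEEPALIVE_PAYLOAD = b"\x00\x00\x00\x00"


def open_udp_socket(local_addr):
    """Create a UDP socket bound to local_addr, a (host, port) tuple."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(local_addr)
    return sock


def send_keepalives(sock, peer_addrs, payload=KEEPALIVE_PAYLOAD):
    """Send one datagram to each (host, port) in peer_addrs.

    No state is kept and no reply is expected.  Returns the number of
    datagrams handed to the socket without an error.
    """
    sent = 0
    for addr in peer_addrs:
        try:
            sock.sendto(payload, addr)
            sent += 1
        except OSError:
            # A send failure is ignored; the next cycle simply tries again.
            pass
    return sent
```

In use, something like send_keepalives(open_udp_socket(("0.0.0.0", 
5060)), addrs) would be called once per keepalive interval.  Binding 
to the SIP port is deliberate in this sketch, since the packets would 
have to leave from the same address and port the peer registered to 
in order to match its NAT mapping.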

The current method in Asterisk has a few downsides:

   1) OPTIONS packets are larger than just simple UDP keepalives (but 
not by much)

   2) OPTIONS requests require stateful storage of status, so if I 
have 6000 SIP "peers" each using "qualify=", then Asterisk needs to 
set aside a fairly large amount of memory to track each one of those 
transmitted OPTIONS requests, and if at any time 10% of those peers 
are slow to respond (say, two cycles) then I have a huge backlog of 
stateful requests in the queue.  If a UDP packet that did not require 
a reply were sent just for NAT keepalives, this would be much lighter 
weight, and we could move the "heavier" OPTIONS request interval to a 
larger time value.

   3) The current OPTIONS mechanism is bursty: all of the OPTIONS 
requests are sent at 60 second intervals using the same interval 
timer.  This is really ugly, with big spikes of data every 60 
seconds.  This should probably be distributed so that each entry has 
its own timer.
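
As a rough illustration of point 3, the sketch below gives every 
peer its own timer with a random starting offset, so that probes are 
spread across the interval instead of all firing together.  This is 
not how chan_sip schedules anything today; the function name, the 
use of a heap and the uniform random offset are my own assumptions.

```python
import heapq
import random


def staggered_schedule(peers, interval, horizon, rng=random):
    """Return a time-ordered list of (time, peer) probe events.

    Each peer starts at a random offset in [0, interval) and then
    repeats every `interval` seconds, until `horizon` seconds.
    """
    # One heap entry per peer: (next time this peer is due, peer name).
    heap = [(rng.uniform(0, interval), peer) for peer in peers]
    heapq.heapify(heap)

    events = []
    while heap and heap[0][0] < horizon:
        due, peer = heapq.heappop(heap)
        events.append((due, peer))
        # Re-arm this peer's own timer.
        heapq.heappush(heap, (due + interval, peer))
    return events
```

With a 60 second interval, each peer is still probed once a minute, 
but the probes for different peers land at different points within 
that minute rather than in one spike.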


I propose a different way to do this, with an example out of sip.conf 
listed below.  I know that this will require allocating memory for 
each of these timers (and bring a whole slew of timer-related issues 
internal to Asterisk), but it does seem more flexible to do it this 
way, and it may reduce the amount of processing for the OPTIONS 
requests if just lightweight UDP packets can be sent to keep NAT 
translations open.  With this method, I could possibly crank up the 
OPTIONS qualify interval to something like 5 minutes, but leave the 
NAT translation keepalives down at 20 seconds, and hopefully see less 
load on my Asterisk servers and network with large numbers of 
REGISTER'ed hosts.  This is all kind of pointless for 20 users, but 
Asterisk is no longer being used only for sites with double- or 
triple-digit numbers of users, and it makes a difference at scale.
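
A back-of-the-envelope way to compare the two setups is sketched 
below.  It only counts packets, and it assumes every OPTIONS request 
gets exactly one reply with no retransmissions, which is a 
simplification.  The function and key names are made up for this 
example; the 6000 peers, 20 second and 5 minute figures are the ones 
used in this message.

```python
def estimate_load(peers, options_interval_s, keepalive_interval_s=None):
    """Estimate steady-state packet rates for a given timer setup.

    Assumes one reply per OPTIONS request and no retransmissions.
    """
    options_per_s = peers / options_interval_s
    if keepalive_interval_s:
        keepalives_per_s = peers / keepalive_interval_s
    else:
        keepalives_per_s = 0.0
    return {
        # Requests that need per-request state until the reply arrives.
        "stateful_requests_per_s": options_per_s,
        # Fire-and-forget packets with nothing to track.
        "one_way_packets_per_s": keepalives_per_s,
        # Request plus reply for each OPTIONS, plus the keepalives.
        "total_packets_per_s": 2 * options_per_s + keepalives_per_s,
    }
```

For 6000 peers, estimate_load(6000, 20) (OPTIONS alone doing the 
keepalive job every 20 seconds) gives 300 stateful requests and 600 
packets per second, while estimate_load(6000, 300, 20) gives 20 
stateful requests, 300 one-way packets and 340 packets per second in 
total.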


; Hypothetical sip.conf settings for "new" qualify/NAT timers
;
; Send OPTIONS requests to measure latency (450ms in this ex.)
;  every 120 seconds.  The qualifytime timer starts based
;  on the time the last REGISTER was successfully parsed, or
;  if a static IP host, then based on the time the entry was
;  parsed in this file plus a random number of seconds not
;  greater than the value in "qualifytime=".  If "qualify="
;  is non-zero but there is no "qualifytime=", then qualifytime
;  defaults to 60 seconds.  If "qualifytime=" is non-zero but
;  there is no "qualify=", then qualify defaults to
;  500 milliseconds.
qualify=450
qualifytime=120
;
; Send very minimal, one-way packets to hosts in order
;   to keep NAT translations open.  Send once every 20 seconds.
;   No default value.
nat-keepalive=20
;
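
The default rules in the comments above could be resolved along the 
lines of the sketch below.  This is only an illustration, not 
chan_sip code: the function name and the returned keys are 
hypothetical, it handles numeric values only (so not "qualify=yes"), 
and it reads the 500 millisecond default as applying to "qualify=", 
which is an assumption on my part.

```python
def resolve_timers(settings):
    """Apply the proposed defaults to a dict of raw sip.conf values.

    Returns qualify in milliseconds, qualifytime and nat-keepalive in
    seconds; None means the feature is switched off.
    """
    def positive_int(key):
        raw = settings.get(key)
        if raw is None:
            return None
        value = int(raw)
        # Zero (or a negative number) is treated as "not set".
        return value if value > 0 else None

    qualify_ms = positive_int("qualify")
    qualifytime_s = positive_int("qualifytime")

    if qualify_ms and not qualifytime_s:
        qualifytime_s = 60      # default probe interval, in seconds
    elif qualifytime_s and not qualify_ms:
        qualify_ms = 500        # assumed default threshold, in ms

    return {
        "qualify_ms": qualify_ms,
        "qualifytime_s": qualifytime_s,
        # No default: keepalives stay off unless explicitly set.
        "nat_keepalive_s": positive_int("nat-keepalive"),
    }
```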


JT


