[asterisk-dev] Proposal to separate qualify & keep alive
John Todd
jtodd at loligo.com
Mon Jun 26 13:06:30 MST 2006
At 8:32 PM +0200 6/26/06, Johansson Olle E wrote:
>26 jun 2006 kl. 19.23 skrev John Lange:
>
>>In the current implementation, qualify sends out a SIP request at the
>>specified interval and if it doesn't receive a reply within that same
>>interval asterisk flags the peer as unreachable.
>>
>>This also acts as a sort of keep-alive for devices behind NAT when
>>combined with the nat=yes parameter. The regular flow of SIP packets
>>keeps the NAT connection alive for the device behind the firewall.
>>
>>The problem is, these are two very different concepts and at times it
>>would be nice if we could separate the two.
>>
>>Specifically; we have some clients with devices behind nat and
>>satellite. Their nat and satellite requires a more-or-less constant flow
>>of packets to keep the connection alive. However due to the quirky
>>nature of satellite combined with long round-trip times the qualify
>>option needs to be set high (5000ms) or Asterisk won't send calls to the
>>client.
>>
>>In fact we would like to set qualify=no because often the client appears
>>to be very lagged when the satellite perceives the connection to be idle
>>(apparently it queues packets until it has a bunch and sends them in
>>groups) but if you initiate a call the lag drops immediately to an
>>acceptable level (800ms).
>>
>>But if we set qualify=no then the firewall closes the connection and
>>they can't receive any calls.
>>
>>So, the question is; is it reasonable to undertake the implementation of
>>a keep alive for sip clients?
>>
>>Any thoughts on how this should be done? SIP NOTIFY or would something
>>else make more sense?
>
>I don't see a reason for changing method. We should probably find a way
>to override and be able to dial out regardless of the monitoring status.
>That seems like a simple fix.
>
>/O

I would actually agree that the two functions should be separated. I
often find myself in the same position, where "qualify=" is used only
as a NAT mapping tool, and I don't particularly care about the actual
milliseconds of response time to the request. I also think we would
be well-served to make these timers a bit more flexible, since right
now everyone is in the "same bucket" as far as how frequently OPTIONS
requests are sent. I'd like to be more aggressive for foolish people
who have poorly-configured firewalls that close NAT UDP sessions after
30 (or fewer) seconds, and currently the only way to do this is to
change the code to send ALL of my OPTIONS requests much more
frequently, which eventually leads to a huge amount of nonsense noise
on my network just to accommodate a few poorly behaved clients.

SER sends "bogus" packets fairly frequently as part of its NAT
module, and this seems to work well.
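
As a rough illustration of how light such a "bogus" packet can be, here is a minimal Python sketch of a fire-and-forget UDP keepalive. The function name and the four-byte payload are assumptions made for this example; they are not taken from SER or Asterisk. Nothing is stored per peer and no reply is awaited.

```python
import socket

# Assumption: any tiny payload is enough to refresh the NAT mapping.
KEEPALIVE_PAYLOAD = b"\x00\x00\x00\x00"


def send_keepalive(sock, peer_addr, payload=KEEPALIVE_PAYLOAD):
    """Send one datagram and return at once; no reply is expected or tracked.

    Returns the number of bytes handed to the network stack.
    """
    return sock.sendto(payload, peer_addr)
```

To be useful, the datagram would have to leave from the same local address and port as the SIP signalling, since the NAT mapping is keyed on that source. In practice that means reusing the already-bound SIP socket rather than opening a new one.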

The current method in Asterisk has a few downsides:

 1) OPTIONS packets are larger than simple UDP keepalives (but
    not by much)

 2) OPTIONS requests require stateful storage of status, so if I
    have 6000 SIP "peers" each using "qualify=", then Asterisk needs
    to set aside a fairly large amount of memory to track each one of
    those transmitted OPTIONS requests, and if at any time 10% of
    those peers are slow to respond (say, two cycles) then I have a
    huge backlog of stateful requests in queue. If a UDP packet that
    did not require a reply was sent just for NAT keepalives, this
    would be much lighter weight, and we could move the "heavier"
    OPTIONS request interval to a larger time value.

 3) The current OPTIONS requests are bursty: all of them are sent
    at 60-second intervals using the same interval timer. This is
    really ugly, with big spikes of data every 60 seconds. They
    should probably be distributed so that each entry has its own
    timer.

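
To make the burst problem in point 3 concrete, the following Python sketch gives each peer its own random timer phase within the interval and counts how many requests land in each one-second slot. The helper names are hypothetical and this is not how Asterisk schedules anything; it only illustrates the effect of per-peer timers.

```python
import random


def first_fire_offsets(peers, interval, rng=None):
    """Give each peer its own timer phase, a random offset within the interval."""
    rng = rng or random.Random()
    return {peer: rng.uniform(0, interval) for peer in peers}


def per_second_load(offsets, interval):
    """Count how many requests fall into each one-second slot of one interval."""
    slots = [0] * interval
    for offset in offsets.values():
        # The modulo guards the rare case where uniform() returns the endpoint.
        slots[int(offset) % interval] += 1
    return slots
```

With 6000 peers and a 60-second interval, a shared timer puts all 6000 requests into one slot, while per-peer offsets average 100 requests per second (6000 / 60), with some random variation from slot to slot.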
I propose a different way to do this, with an example out of sip.conf
listed below. I know that this will require the creation of memory
space for each of these timers (and a whole slew of timer-related
issues internal to Asterisk) but it does seem like it would be more
flexible to do it this way and may reduce the amount of processing
for the OPTIONS requests if just lightweight UDP can be sent for NAT
translations. With this method, I could possibly crank up the
OPTIONS qualifiers to something like 5 minutes, but leave the NAT
translation keepalives down at 20 seconds and hopefully see less load
on my Asterisk servers and network with large numbers of REGISTER'ed
hosts. This is all kind of pointless for 20 users, but Asterisk is
no longer being used only for sites with double or triple-digit
numbers of users, and it makes a difference at scale.

; Hypothetical sip.conf settings for "new" qualify/NAT timers
;
; Send OPTIONS requests to measure latency (450ms in this ex.)
; every 120 seconds. The qualifytime timer starts based
; on the time the last REGISTER was successfully parsed, or
; if a static IP host, then based on the time the entry was
; parsed in this file plus a random number of seconds not
; greater than the value in "qualifytime=". If "qualify="
; is non-zero but there is no "qualifytime=", then the default
; for qualifytime is 60 seconds. If "qualifytime=" is
; non-zero but there is no "qualify=", then qualify defaults
; to 500 milliseconds.
qualify=450
qualifytime=120
;
; Send very minimal, one-way packets to hosts in order
; to keep NAT translations open. Send once every 20 seconds.
; No default value.
nat-keepalive=20
;
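
The defaulting rules in the comments above could be resolved along these lines. This is only a Python sketch of the hypothetical settings: it reads the last rule as "qualify defaults to 500 milliseconds", handles numeric values only (not "qualify=yes" or "qualify=no"), and the function name is made up for the example.

```python
def resolve_timers(conf):
    """Return (qualify_ms, qualifytime_s, nat_keepalive_s) from raw settings.

    conf maps option names to strings, as read from a peer entry.
    """
    qualify = int(conf.get("qualify", 0))
    qualifytime = int(conf.get("qualifytime", 0))
    if qualify and not qualifytime:
        qualifytime = 60   # proposed default interval, in seconds
    elif qualifytime and not qualify:
        qualify = 500      # assumed default threshold, in milliseconds
    # nat-keepalive has no default: None means no keepalives are sent.
    keepalive = int(conf["nat-keepalive"]) if "nat-keepalive" in conf else None
    return qualify, qualifytime, keepalive
```

For the example entry above, resolve_timers({"qualify": "450", "qualifytime": "120", "nat-keepalive": "20"}) gives (450, 120, 20).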
JT