[asterisk-dev] STUN support in chan_sip revisited

Fri Aug 6 09:54:33 CDT 2010

On 06.08.2010 14:12, Simon Perreault wrote:
> On 2010-08-06 04:56, Klaus Darilion wrote:
>> Further, STUN should be used also to detect the type of the NAT device.
>
> I disagree. The idea that one can detect the type of the NAT device is
> dead. NAT devices vary much more than STUN is able to detect. To take an
> extreme example, some NAT devices' behaviour changes following current
> load, the time of day, etc.

That's true. But there are also plenty of NAT devices whose behavior
does not change, and having Asterisk report "this is a symmetric NAT,
VoIP will not work" is IMO a useful option to help users with NAT traversal.
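
To make the idea concrete, here is a minimal Python sketch of such a check.
It assumes the same local UDP socket has already queried two or more STUN
servers on different IP addresses and that the reported public mappings are
passed in as a dict. The names Mapping and classify_mapping are made up for
illustration and are not part of Asterisk. The result is only a snapshot of
how the NAT behaved during the test.

```python
from typing import Dict, NamedTuple


class Mapping(NamedTuple):
    """Public address/port a STUN server reported (hypothetical helper type)."""
    ip: str
    port: int


def classify_mapping(local_port: int, observed: Dict[str, Mapping]) -> str:
    """Classify NAT mapping behavior from several STUN answers.

    observed maps a STUN server name to the mapping that server reported
    for the SAME local socket. All names here are illustrative assumptions.
    """
    if len(observed) < 2:
        # One answer cannot show whether the mapping depends on the destination.
        return "unknown"

    distinct = set(observed.values())
    if len(distinct) > 1:
        # Different destinations saw different public mappings:
        # STUN-learned addresses are useless for other peers.
        return "symmetric"

    mapping = next(iter(distinct))
    if mapping.port == local_port:
        # Public port equals local port (port preservation).
        return "port-preserving"
    return "endpoint-independent"


if __name__ == "__main__":
    answers = {
        "stun-a": Mapping("198.51.100.7", 40001),
        "stun-b": Mapping("198.51.100.7", 40002),
    }
    if classify_mapping(5060, answers) == "symmetric":
        print("WARNING: symmetric NAT detected, STUN-based traversal will not work")
```
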

> The better approach is to be agnostic to the type of NAT. Just try to
> traverse it using all possible ways, and see what works. Dynamically
> pick the best alternative.

Pick the best alternative based on what?

>
>>> Another useful effect of using STUN support is as a keepalive
>>> mechanism for the state entry on the NAT device.  By setting the
>>> 'externrefresh=180 ;where 180 is the configured number of seconds'
>>> option in sip.conf we should expect the STUN request to be sent every
>>> 180 seconds in order to refresh the state entry on the NAT device.
>>
>> The keep-alive is very important: if the NAT mapping times out,
>> incoming calls won't traverse the NAT anymore, and further outgoing
>> STUN/SIP requests will get assigned a new binding, probably with a new
>> public port. Thus, the STUN keep-alive interval should be shorter than
>> the NAT timeout to keep a constant public address/port.
>
> Note also that keep-alive can be done with pure SIP. This has the
> advantage that the peer doesn't need to support STUN. See RFC 5626
> section 3.5.1.

Of course, yes. But using STUN would also detect changes of the public IP.
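
As a rough illustration of that point, the following Python sketch builds a
STUN Binding Request and extracts the XOR-MAPPED-ADDRESS from a Binding
Success Response, following the message layout of RFC 5389. Comparing the
result with the previously learned mapping shows whether the public
address/port has changed. The function names are hypothetical, only IPv4 is
handled, and the actual sending is left out: for the keep-alive to refresh
the right binding, the request would have to go out on the same UDP socket
that carries the SIP traffic.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442       # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(txid=None):
    """Return (packet, transaction_id) for a Binding Request without attributes."""
    txid = txid or os.urandom(12)
    header = struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, txid)
    return header, txid


def parse_binding_response(data, txid):
    """Return (ip, port) from XOR-MAPPED-ADDRESS, or None if not usable."""
    if len(data) < 20:
        return None
    msg_type, length, cookie, rx_txid = struct.unpack("!HHI12s", data[:20])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE or rx_txid != txid:
        return None

    pos, end = 20, min(20 + length, len(data))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        # Family 0x01 is IPv4; IPv6 is ignored in this sketch.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        # Attribute values are padded to a multiple of 4 bytes.
        pos += 4 + attr_len + (-attr_len % 4)
    return None


def mapping_changed(previous, current):
    """True only when both mappings are known and they differ."""
    return previous is not None and current is not None and previous != current
```

A pure SIP keep-alive would refresh the binding as well, but it would not
report the public address back in this way.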

>
>>> --- Current STUN implementation problems/limitations
>>>
>>> 1.  When STUN is used, the response's port mapping is internally used
>>> for both the external address for TCP and UDP.  Since STUN queries
>>> are connectionless and use the configured UDP port, the STUN response
>>> is not accurate as an external address for TCP connections.
>>
>> Indeed. Several SIP clients use a different approach for TCP: with the
>> first SIP request sent to a certain target, the response indicates the
>> public IP/port in the Via header. This one will be used for all further
>> requests (mainly Contact header) within this TCP connection.
>>
>>> 2.  When a STUN request is sent out, we block on the SIP UDP socket,
>>> throwing away any non-STUN related traffic until the STUN response is
>>> received.  This means that any SIP signaling received during a STUN
>>> query is just thrown away.  It also makes for some confusing error
>>> messages.
>>>
>>> 3. Using STUN queries as a keepalive mechanism does not work exactly
>>> like it is documented in sip.conf.  According to the documentation by
>>> setting the 'externrefresh=x ; where x is number of seconds' option
>>> we should expect an event to fire every 'x' number of seconds.  This
>>> is not an accurate assumption as there is no scheduled event that
>>> causes a STUN request.  We are only guaranteed that a new STUN
>>> request will be sent sometime after 'x' number of seconds after the
>>> ast_sip_ouraddrfor() function is invoked as a result of processing a
>>> new incoming or outgoing SIP request of some sort.
>>>
>>> 4. Asterisk's STUN implementation has no way of determining the
>>> correct external port mappings for media.  This means that while the
>>> SIP signaling may work correctly, the media may not.  Because of this, I
>>> question why anyone would even choose to use Asterisk's STUN support
>>> at all.
>>
>> There are some cases where it will work:
>> - NAT device is configured with static port forwarding of port 5060 and
>> RTP port range, and the public IP address is dynamic
>> - NAT device uses port-preservation (public port = local port)
>>
>>> --- Conclusion.
>>>
>>>  From what I have gathered, STUN support in Asterisk is useless in its
>>> current state.  Even if the current expected behavior worked I am
>>
>> almost :-)
>>
>>> unaware of how it would be useful.  If it is necessary to determine
>>> the external port for SIP traffic because of some NAT device, then it
>>> seems like we would need the correct external media port mappings as
>>> well.
>>
>> yes, except the scenarios described above.
>>
>>> --- What I need to know
>>>
>>> Is my analysis of the current expected behavior of STUN support in
>>> chan_sip accurate?
>>
>> At least you found several issues. Maybe there are other issues as well :-)
>>
>>> Assuming the expected behavior was not broken, what are some
>>> realistic use cases where this current behavior would be used?
>>
>> It is broken.
>>
>>> Forgetting how Asterisk does STUN support in chan_sip altogether,
>>> how do you expect/want STUN support to work in relation to chan_sip?
>>
>> As Simon said, SIP-outbound and ICE support would be nice. But to work
>> with old servers too, old-style STUN support (using STUN to detect
>> public addresses and put these addresses in the SIP messages) is needed too.
>>
>> IMO, the STUN client should run in background all the time to detect the
>> NAT type and to detect the public SIP address/port. It should do
>> keep-alive and update the public address information if it changes.
>>
>> If the STUN server is not reachable, or the NAT type does not allow NAT
>> traversal by STUN (e.g. symmetric NAT), it should log warnings and use
>> the local IP only. Maybe a proprietary header can be added to the SIP
>> messages too, e.g. (X-NAT-Info: symmetric NAT, NAT traversal disabled)
>
> This is reinventing the wheel in a broken manner. Why do our own thing
> alone, when there is standard STUN usage that works?

What is the standard STUN usage that works? RFC 5626? How long will it
take until all deployed SIP proxies/registrars are updated to
support RFC 5626?
>
> Working with servers that do not support SIP-outbound and/or ICE is
> simple: don't do anything. It is up to these servers to do their thing
> (i.e. latching). If they don't do any of that, then they can't
> reasonably expect to work with clients behind NATs.

Is this the IETF approach? Servers that do not support RFC 5626 are
themselves responsible for NAT traversal?

That would be great - then we could drop "old style" STUN from all 
clients immediately.

regards
Klaus