[asterisk-dev] Wrong IP-address in SDP when on NAT in challenged Invite

Fri May 7 09:03:30 CDT 2021

On 07.05.21 at 00:12 Michael Maier wrote:
> Hello Joshua,
> 
> On 06.05.21 at 20:21 Joshua C. Colp wrote:
>> On Wed, May 5, 2021 at 4:40 PM Michael Maier <m1278468 at mailbox.org> wrote:
>>
>>> Hello!
>>>
>>> When running asterisk on a system holding WAN and local IP, the IP used
>>> for SDP in an outgoing call in the challenged INVITE is the local one
>>> instead of the WAN IP when using a NATed transport instead of a
>>> transport bound to the WAN IP.
>>>
>>> The SDP in the initial INVITE is absolutely correct. But the following
>>> Invite with the Auth header contains the wrong IP in SDP (the IP in the
>>> SIP Contact and Via header are correct).
>>>
>>> After digging in the code, I could see, that in
>>> session_outgoing_nat_hook (res_pjsip_session.c) the nat rewrite is
>>> stopped, because of an existing hook (I figured it out by some
>>> additional debug outputs):
>>>
>>>          /* SDP produced by us directly will never be multipart */
>>>          if (!transport_state || hook || !tdata->msg->body ||
>>>                  !ast_sip_is_content_type(&tdata->msg->body->content_type, "application", "sdp") ||
>>>                  ast_strlen_zero(transport->external_media_address)) {
>>>                  return;
>>>          }
>>>
> 
> [...]
> 
>> The only thing that comes to mind is the code in
>> res/res_pjsip/pjsip_message_filter.c that alters the SDP in some scenario
>> to update it for the transport the message is going out on.
> 
> True - there is a dedicated function checking for multihomed
> environments. Maybe this one changes back the IP (not tested though).
> Meanwhile I modified the above check for aborting the nat rewrite this
> way, that the check for an existing hook follows after the execution of
> the NAT rewrite. Now, it's working.
> 
> Another question follows now, which is firewall related: At which point
> exactly starts asterisk to send outgoing RTP?

The problem disappeared after restarting Asterisk. Unfortunately, it has happened quite often that a restart fixed strange behavior that appeared after reloads triggered by changes in FreePBX,
even when those changes were not transport related.

The DNAT rules are needed anyway, but now it's possible to add an established INPUT rule, so things are working as expected.

Finally, I can say that I got NAT working correctly on a multihomed server (including WAN IPv4) after applying two patches:


res/res_pjsip_nat.c @find_transport_state_in_use

         if (transport_state && ((details->transport && details->transport == transport_state->transport) ||
                 (details->factory && details->factory == transport_state->factory) ||
                 ((details->type == transport_state->type) && (transport_state->factory) &&
                         !pj_strcmp(&transport_state->factory->addr_name.host, &details->local_address) &&
-                       transport_state->factory->addr_name.port == details->local_port))) {
+                       (transport_state->factory->addr_name.port == details->local_port || transport_state->factory->addr_name.port == 0)))) {
                 return CMP_MATCH;
         }

=> This ensures that transports used as a client can also be found when they can't be matched directly.

With this patch, the ACK that Asterisk sends after receiving a 407 during the INVITE sequence gets the correct IP in the Via header.
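
To make the effect of the changed comparison easier to see, here is a minimal standalone sketch of just the host/port part of the condition. The struct and function names are hypothetical stand-ins, not the real PJSIP/Asterisk types; the only thing taken from the patch is that a factory port of 0 is additionally accepted as a match.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the factory's published address; not the real PJSIP type. */
struct addr_name {
    const char *host;
    int port;
};

/* Original rule: host and port must both match exactly. */
static bool factory_matches_strict(const struct addr_name *factory,
                                   const char *local_host, int local_port)
{
    return strcmp(factory->host, local_host) == 0 &&
           factory->port == local_port;
}

/* Patched rule: a factory port of 0 is also accepted, whatever the local port is. */
static bool factory_matches_relaxed(const struct addr_name *factory,
                                    const char *local_host, int local_port)
{
    return strcmp(factory->host, local_host) == 0 &&
           (factory->port == local_port || factory->port == 0);
}
```

With a factory entry of host 192.168.1.10 and port 0 (example values), the strict rule rejects a message sent from 192.168.1.10:5061, while the relaxed rule accepts it. The host still has to match in both cases.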


The second patch is needed to ensure that the INVITE sent after the 407, which contains the auth header, gets the correct IP (= WAN) in the SDP:

res/res_pjsip_session.c session_outgoing_nat_hook(pj

         /* SDP produced by us directly will never be multipart */
-       if (!transport_state || hook || !tdata->msg->body ||
+       if (!transport_state || !tdata->msg->body ||
                 !ast_sip_is_content_type(&tdata->msg->body->content_type, "application", "sdp") ||
                 ast_strlen_zero(transport->external_media_address)) {
                 return;
         }


Move the check for the hook after executing the NAT functionality:

@@ -5492,6 +5512,12 @@ static void session_outgoing_nat_hook(pj
                 }
         }

+       /* There is a hook - don't do it again */
+       if (hook) {
+               ast_debug(5, "stop - hook\n");
+               return;
+       }
+
         for (stream = 0; stream < sdp->media_count; ++stream) {
                 /* See if there are registered handlers for this media stream type */
                 char media[20];
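
The change in ordering can be illustrated with a small standalone model. This is only a sketch under assumptions: struct fake_msg and both functions are hypothetical and heavily simplified, and the single snprintf stands in for whatever NAT rewrite the real function performs before the stream loop. It only shows why a request that already carries the hook (such as the re-sent INVITE) keeps its local address with the original order and gets the external one with the patched order.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified model of an outgoing message; not the Asterisk types. */
struct fake_msg {
    char sdp_addr[64];   /* address advertised in the SDP */
    bool hook_set;       /* true when the message was already processed once */
    int stream_passes;   /* how often the per-stream handlers ran */
};

/* Original order: an existing hook aborts before any rewrite happens. */
static void nat_hook_original(struct fake_msg *msg, const char *external_addr)
{
    if (msg->hook_set) {
        return;
    }
    snprintf(msg->sdp_addr, sizeof(msg->sdp_addr), "%s", external_addr);
    msg->stream_passes++;
    msg->hook_set = true;
}

/* Patched order: rewrite the address first, then honour the hook. */
static void nat_hook_patched(struct fake_msg *msg, const char *external_addr)
{
    snprintf(msg->sdp_addr, sizeof(msg->sdp_addr), "%s", external_addr);
    if (msg->hook_set) {
        return; /* only the per-stream handlers are skipped */
    }
    msg->stream_passes++;
    msg->hook_set = true;
}
```

In this model the per-stream part still runs at most once per message, which is what the hook check was guarding in the first place.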


Others who have problems with NAT may want to try these patches. I know of at least one provider that doesn't care about those errors, but there are others that do care,
and with them calls fail.

Some more information:
It's important to add the following parameters to *all* transports:

external_media_address=external.mydom.org
external_signaling_address=external.mydom.org
local_net=192.168.0.0/16 or whatever you need

I enabled dnsmgr, which should keep the WAN IP up to date via periodic lookups of external.mydom.org. I have not yet tested whether it actually works when the WAN IP changes.

That's ultimately the reason I'm trying to use NAT: this way I can bind Asterisk to a static local IP address / device that doesn't change. My hope is that it now detects and handles
a changed WAN IP correctly without restarting.

Unfortunately, it turned out that when binding directly to the WAN IP (ppp0), Asterisk takes quite long to notice that the WAN IP has changed, and afterwards (after the transport has restarted itself)
it doesn't work correctly any more (e.g. you get an internal server error when dialing out) - you have to restart it.


JFI: I'm using SIP / TLS and asterisk 18.4 rc1.


Thanks
Michael