[Asterisk-Users] Am I missing something?

Wed Oct 29 07:27:56 MST 2003

Gus,

There have obviously been a lot of postings about "why won't nat work",
and several responses say it does, and many more say it doesn't. In the
interest of using your posting as "just one example", I'm going to walk
through the technical stuff for "only" this one example. There are many
more reasons why nat doesn't work in other examples not addressed in this
posting. (That's a disclaimer. No flames intended; no flames accepted.)

You have probably seen my earlier posting about nat. It's my opinion that
nat can be made to work, but one needs to understand "exactly" what each
component is really doing, and then configure to address those issues.

> Should the following setup work?
> 
> SIP UA---NAT---Internet---NAT---SIP UA

My generic answer is yes, but not without understanding the two nat boxes,
the interaction with asterisk (assumed to be a third leg in the above),
and the exact sip phones in use.

> If both UA's support STUN and report the external IP address in the SIP 
> packet..

Reporting the external IP address in the SIP headers is an absolute must
to get any of this to work.

> I am trying to get away from using canreinvite=no so that traffic can go 
> directly between the UA's and not via the central server but I can't 
> seem to get it to work..
> 
> Has anyone set this up and can give me some pointers??

We need to better understand those two nat boxes. What are they and can
you get a packet trace from the inside and outside edges of those nat
boxes at the same time?  Only need a few dozen packets of the call set
up in order to "see" what the boxes are doing in terms of "port address 
translation", etc. (Use ethereal, sniffer, tcpdump, or whatever tool you 
have, but need both inside and outside at the same time.)

Since I don't know your background at all, I'll walk through what a nat
box does internally (so we're on the same technical level), and then
show what happens on some boxes that cause issues.

First, every nat box keeps table entries for packets that "originate"
from the inside. (No entries are kept for packets that might originate
from the outside.) The table entry looks like:
   SRC IP -> SRC Port -> DST Port -> DST IP
   192.168.1.7 -> 5060 -> 5060 -> 206.222.193.101

All boxes do this.  If the 206.222.193.101 address is the outside
of another nat box, then that nat box "must" have a static table entry
created by you that tells the box where to send a packet that arrives
with a "destination port of 5060". (Let's assume you have a static entry
that maps 5060 to an inside address of 192.168.7.30, could be asterisk
or it could be another sip phone, but for this example let's assume the
box is an asterisk server.)

Once that packet arrives at the asterisk box, what does * actually
receive in terms of IP addresses and port numbers? If it is purely a nat
box, it "should" see a packet that looks like:
   201.1.2.3 -> 5060 -> 5060 -> 192.168.7.30
(where 201.1.2.3 is the outside address of the originating nat box).

If there is "no port address translation", that packet flow will work
and has been proven by many people to work fine. Remember, at this point
we're only dealing with raw packets; we have not talked as yet about what
happens within the SIP packets in terms of inserting IP addresses in
headers to handle nat'ed traffic, etc.
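
As a rough illustration of the raw packet flow above, here is a minimal
Python sketch of two nat boxes with no port address translation. The NatBox
class and its method names are made up for this posting and are not
something a real box exposes. It only models outbound table entries and
static inbound forwards, using the addresses from this example.

```python
from typing import Dict, Optional, Set, Tuple

# (src_ip, src_port, dst_port, dst_ip), same order as the table entries above
Flow = Tuple[str, int, int, str]


class NatBox:
    """Toy nat box with no port address translation (hypothetical model)."""

    def __init__(self, outside_ip: str) -> None:
        self.outside_ip = outside_ip
        self.dynamic: Set[Flow] = set()     # entries created by inside hosts
        self.static: Dict[int, str] = {}    # dst port -> inside IP, set by you

    def outbound(self, flow: Flow) -> Flow:
        """Record the flow, then rewrite the source to the outside address."""
        self.dynamic.add(flow)
        _src_ip, src_port, dst_port, dst_ip = flow
        return (self.outside_ip, src_port, dst_port, dst_ip)

    def inbound(self, flow: Flow) -> Optional[Flow]:
        """Deliver an unsolicited packet only if a static entry covers it."""
        src_ip, src_port, dst_port, _dst_ip = flow
        inside_ip = self.static.get(dst_port)
        if inside_ip is None:
            return None                     # no static entry: dropped
        return (src_ip, src_port, dst_port, inside_ip)


phone_nat = NatBox("201.1.2.3")
asterisk_nat = NatBox("206.222.193.101")
asterisk_nat.static[5060] = "192.168.7.30"  # the static entry you created

on_the_wire = phone_nat.outbound(("192.168.1.7", 5060, 5060, "206.222.193.101"))
delivered = asterisk_nat.inbound(on_the_wire)
print(delivered)
```

The tuple printed at the end is what the asterisk box receives in this
simplified model. Return traffic matching is deliberately left out here.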

If the originating sip phone together with asterisk is going to redirect
this session to another sip phone somewhere on the Internet that also
uses nat, here's what happens...

Asterisk tells the originating sip phone to contact the sip phone at
2.3.4.5 via the SIP protocol using Invite, etc. The originating sip phone
then sends a sip packet on port 5060 to 2.3.4.5, and the nat box at this
originating location creates another nat table entry similar to the one 
shown above.

Part of that handshaking process within the SIP protocol says... let's
start an RTP data flow "and" the sip phone uses port 23456 (because that
falls within the port range preprogrammed into the sip phone for RTP, or
Media, usage). Now we have the first sip phone trying to contact the
second sip phone with an originating nat table entry that looks like:
   192.168.1.7 -> 23456 -> 5060 -> 2.3.4.5
The packet makes it all the way to the distant nat box, and if that nat
box has been preprogrammed to have DST Port 5060 point to an internal
SIP phone, the packet will get to that distant sip phone. That sip phone
has been preprogrammed for RTP data flows (Media) to pick a port
number in a certain range as well (let's say it picks 34567) as its
RTP source port.

However, the return path (from the distant sip phone to the 192.168.1.7
sip phone) is going to have a problem. That problem is back at the nat
box at the originating sip phone. The RTP packet arrives as:
  2.3.4.5 -> 34567 -> 23456 -> 201.1.2.3
but there is no nat table entry that matches. The packet is dropped on
the floor and the call essentially is terminated due to no audio (RTP).
Asterisk (located somewhere else as the third leg) doesn't have a clue
that a session problem existed, and drops out of the picture.
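
To make the dropped packet concrete, here is a small Python sketch of the
lookup the originating nat box performs on inbound traffic. It assumes the
box only accepts a packet that is the exact reverse of an entry it already
holds (remote address, remote port, and local port all matching); real
boxes differ in how strictly they match. The function name and table layout
are hypothetical.

```python
from typing import Set, Tuple

# (src_ip, src_port, dst_port, dst_ip), same order as the table entries above
Flow = Tuple[str, int, int, str]

OUTSIDE_IP = "201.1.2.3"  # outside address of the originating nat box

# Entries the originating nat box created for traffic from the inside phone.
nat_table: Set[Flow] = {
    ("192.168.1.7", 5060, 5060, "206.222.193.101"),  # SIP to asterisk
    ("192.168.1.7", 23456, 5060, "2.3.4.5"),         # entry from the walkthrough
}


def accepts_inbound(packet: Flow) -> bool:
    """True if the packet is the exact reverse of a recorded outbound entry."""
    src_ip, src_port, dst_port, dst_ip = packet
    if dst_ip != OUTSIDE_IP:
        return False
    return any(
        (remote_ip, remote_port, local_port) == (src_ip, src_port, dst_port)
        for _inside_ip, local_port, remote_port, remote_ip in nat_table
    )


rtp_return = ("2.3.4.5", 34567, 23456, "201.1.2.3")
print(accepts_inbound(rtp_return))
```

Under that strict-matching assumption the RTP packet finds no entry,
because the only entry toward 2.3.4.5 has a remote port of 5060, not 34567.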

What's the nat problem?

The issue in this "one" example is the nat boxes have to deal with:
  1. mapping 5060 to/from "two" different remote nat boxes (* & a sip phone)
  2. negotiation of the RTP session (which udp ports are to be used)
  3. start the flow of RTP traffic between two distant sip phones, both
     located behind nat boxes, and asterisk drops out of the picture
     from an RTP data flow perspective.

The majority of us on the * list understand and can deal with the call
setup on port 5060 and the associated nat table entries needed to support
it, but it seems few understand the SIP protocol and the redirection
that "must" also happen in order to handle RTP packets (which is the actual
voice data and a major source of failure for those trying nat).

The following lists only some of the default RTP ports used by vendors:
  Cisco:  udp 16384 to 32766
  XLite:  udp 8000 to 8012 (or something like that)
  Grandstream:  1024-65535 with port 5004 used as the default (per someone)
  Snom: some other ports that I couldn't find right now

If a Cisco sip phone is going to call a Grandstream sip phone and both
are located behind nat boxes, what ports need to be statically defined
at those nat boxes in order for RTP to function?  (Obviously, it varies
by vendor.) I'll bet 99% of the people on the * list will guess wrong,
and that "is" the underlying root-cause for those postings.

If you are setting up a small number of corporate phones and you have
total control over those setups, you could change each sip phone to only
use udp 21000 to 21008 for RTP (as an example) and statically map the nat 
boxes for those ports. If that can't be done, there is little hope of 
getting nat to work. (I've not tried this with the phones that I have 
available, but I'd bet some money that at least some phone vendors have 
software bugs that don't allow this change.)
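
For what the static mapping would amount to, here is a minimal Python
sketch that builds the list of forwards for one phone. The function name
and the dict layout are only illustrative; every nat box has its own way of
entering these. It uses the example range of udp 21000 to 21008 and the
192.168.1.7 phone from earlier, and it refuses to hand a port to a second
inside host, since a given outside port can only point at one inside
address.

```python
from typing import Dict


def add_static_range(table: Dict[int, str], inside_ip: str,
                     first_port: int, last_port: int) -> None:
    """Add udp forwards for first_port..last_port (inclusive) to inside_ip."""
    if not (0 < first_port <= last_port <= 65535):
        raise ValueError("invalid udp port range")
    ports = range(first_port, last_port + 1)
    # Check for conflicts before changing anything.
    for port in ports:
        owner = table.get(port)
        if owner is not None and owner != inside_ip:
            raise ValueError(f"udp {port} already forwarded to {owner}")
    for port in ports:
        table[port] = inside_ip


forwards: Dict[int, str] = {}
add_static_range(forwards, "192.168.1.7", 5060, 5060)    # SIP signalling
add_static_range(forwards, "192.168.1.7", 21000, 21008)  # RTP range set on the phone
print(len(forwards), "static entries")
```

A second phone behind the same box would need its own, non-overlapping
range, which is one more reason you need control over every phone's setup.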

There are many other issues involved with nat'ing as well, so might as well
touch on a couple of them...
 1. Port Address Translation: if a sip phone is located behind a nat box
    and creates the "first" table entry for udp 5060 going to an outside
    asterisk box (or another sip phone), what happens when that same sip
    phone tries to create a "second" session on port 5060 to the same
    destination?  Many nat boxes will substitute another port for the 5060
    "source port" (let's assume 5061 for example) and will keep track of
    that change internally (that is port address translation). Since the
    SIP protocol revolves around 5060 only, port 5061 is not recognized
    and will fail.  The failure will show up as either: a) the second
    phone call will fail, b) the second sip phone behind the nat will
    fail, c) the second line on a single sip phone may fail, or something
    like that. (Note: not all nat vendors handle PAT the same either.)
 2. The sip phones (and asterisk) have to be able to deal with handling
    the outside nat address within the sip headers. Obviously not all
    vendors are supporting this, and my guess is not all that support it
    actually support it in the same way.
 3. Some nat boxes provide the ability to statically map a single port
    and not a range of ports. There is little hope for those boxes.
 4. Some nat boxes have been programmed by the vendor to be aware of 
    certain protocols such as ftp passive mode, sip, etc. Those vendors
    actually look inside certain IP packets and "anticipate" what is
    going to happen given their interpretation of the RFCs. Some get
    it right, some get it wrong including some very major networking
    vendors.
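
Item 1 above can be sketched in a few lines of Python. This is only one
plausible way a box might do port address translation (pick the next free
outside port when the requested one is taken); as noted, vendors differ.
The PatBox class is hypothetical, and the second phone at 192.168.1.8 is an
address invented for the example, modelling case (b) where two phones sit
behind the same nat box.

```python
from typing import Dict, Tuple

Inside = Tuple[str, int]  # (inside IP, inside source port)


class PatBox:
    """Toy port address translation: one outside port per inside (ip, port)."""

    def __init__(self) -> None:
        self.by_outside_port: Dict[int, Inside] = {}

    def outside_port(self, inside_ip: str, inside_port: int) -> int:
        """Return the outside source port, substituting one if it is taken."""
        wanted = (inside_ip, inside_port)
        port = inside_port
        while port in self.by_outside_port and self.by_outside_port[port] != wanted:
            port += 1                       # try the next port up
            if port > 65535:
                raise RuntimeError("no free outside port")
        self.by_outside_port[port] = wanted
        return port


box = PatBox()
first = box.outside_port("192.168.1.7", 5060)   # keeps 5060
second = box.outside_port("192.168.1.8", 5060)  # 5060 is taken, so substituted
print(first, second)
```

The far end then sees the second phone's SIP traffic arriving from a source
port other than 5060, which is the situation described in item 1.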

The majority of the nat box vendors do not give the user any menus or
other ways to actually "see" the translation tables in actual use, therefore
the only way to resolve nat issues is to obtain a packet trace of the
session setup on the "inside" and the "outside" simultaneously to determine
what the nat box and sip phones are actually doing. Once that information
is available, then it's rather clear exactly what needs to be done to fix
the issues. Without that info the only recourse is trial & error
involving parameters in at least six boxes using your original question.

Whether STUN is required or not depends 100% on the answers that become
obvious "after" understanding what is really happening at each box. (In
some cases, VPN might be the answer to resolving nat problems.)

For those that have actually read all the way through this, I'm sure there
are typos or other errors in the narrative, but the point of the entire
email is valid regardless. Nat can be made to work in most cases; just
need to know exactly what "each" box is actually doing, etc.

Rich