[asterisk-dev] PJSIP and RTP address selection

Tue Sep 11 13:51:16 CDT 2018

Hi,

I've got a scenario where PJSIP (chan_sip does what I expect) advertises
one address in the SDP during a conversation but then transmits RTP from
another.  In my case PJSIP is advertising 197.96.209.1 in the SDP, but
197.96.209.251 is being used to send.

I can manipulate that by altering the IPv4 routing table to influence
address selection.

This happens because PJSIP binds the RTP socket to ANY ([::]
specifically, so that both IPv4 and IPv6 will function).  chan_sip
on the other hand has the RTP port bound to the same address as the
transport.  After discussion with Joshua on IRC it became clear that the
PJSIP behaviour may be preferred in many cases, and that things are
plainly more complicated than one would hope.

I have two potential fixes (plus two that I don't think are practical,
but might be given knowledge I don't have), each with advantages and
disadvantages:

1.  Bind the socket against the advertised address.
2.  Upon receiving the first RTP packet, "narrow" the socket listening address
to the received "to" address.
(3.)  Have the RTP sent to my primary address to begin with, not the
socket address as for PJSIP transport.
(4.)  Update the rtp engine to be able to have multiple socket pairs and
switch between them as the remote side does.

The first has various disadvantages as I understand from Joshua.  Most
of them over my head.  The advantage is that the source address would be
(more) deterministic upon sending RTP.  This can be done by passing the
transport address to the RTP instance, presumably similar to what
chan_sip does.  This would in some cases break things like signaling on
IPv4 and RTP on IPv6 if the PJSIP transport is not bound to ANY.  This
was, as I understood it, one of Joshua's bigger concerns.

The second option has the advantage that, unless the address to which
the remote side sends changes, things should just work.  This can be
implemented by creating a new socket, binding it to the more specific
address and then using dup2() to replace the old socket file descriptor,
before closing the newly created file descriptor.  It can be returned
to "ANY" in a similar manner if required.  RTCP ports will need to be
re-bound as well.
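
The dup2() approach can be sketched as follows.  This is an
illustration in Python rather than Asterisk's C, and the function names
open_any_socket and narrow_socket are made up for the example.  It
assumes both sockets set SO_REUSEADDR, which (on Linux at least) is what
lets the specific bind coexist with the ANY bind on the same UDP port
for the brief moment before the swap.  Whether the real RTP sockets set
that option is something I have not checked.

```python
import os
import socket


def open_any_socket(port=0):
    """Open a UDP socket bound to IPv4 ANY, as a stand-in for the RTP socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Assumption: address reuse is enabled so a narrower bind can be
    # made on the same port while this socket is still open.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("0.0.0.0", port))
    return sock


def narrow_socket(sock, local_ip):
    """Rebind sock's file descriptor to local_ip on the same port."""
    port = sock.getsockname()[1]
    replacement = socket.socket(sock.family, socket.SOCK_DGRAM)
    try:
        replacement.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        replacement.bind((local_ip, port))
        # dup2() closes the old socket and makes the original
        # descriptor number refer to the replacement socket.
        os.dup2(replacement.fileno(), sock.fileno(), inheritable=False)
    finally:
        # The extra descriptor is no longer needed after the swap.
        replacement.close()


if __name__ == "__main__":
    rtp = open_any_socket()
    narrow_socket(rtp, "127.0.0.1")
    print(rtp.getsockname())
```

Anything holding the descriptor number keeps working unchanged, which is
the attraction of this approach.  One caveat: datagrams already queued
on the old socket at the moment of the swap are discarded with it.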

This should probably be a configurable option either way, and one could
add a transport option "bind_rtp_to_transport_address", and/or a
"narrow_rtp_address" (the latter would make no sense if the former is
active, unless the bind address is an ANY of sorts).  These can be
implemented in conjunction or separately.

The third option basically involves binding the socket to ANY and
pretending to send data to the known addresses for the peer and using
those addresses in the SDP (if we've seen SDP for the conversation
already, those addresses; otherwise the remote address of the SIP
communication).  This would potentially break a number of things, and
is thus likely not a serious option.  For example, if we're sending an
INVITE to a web-socket transport, the web-socket connection may have
been proxied, in which case its remote address isn't actually where the
remote side is: when proxying via httpd/apache to localhost:8088,
asterisk sees 127.0.0.1 as the "remote".

I'm tending towards option 2.  This would perhaps also have a side
effect of minimizing attack surface for things like RTP bleed.

I suspect this has not come to light before since most setups are likely
to only have a single IPv4 and single IPv6 global address, or in the
case of multi-homing would have one on each interface with the kernel
RPF filter getting rid of traffic from a source other than where it
would route back to, basically forcing an IP match based on route-based
address selection.

Joshua suggested that all use-cases be explored and documented before
any coding on this starts, which I think is a good idea.  I'd be happy
to drive that process, but I'd need to understand where this should be
documented.  So in this respect this email serves as a request for
pointers.

DISCLAIMER:  As I've realized, I'm no SIP expert, and anything beyond
what's available in chan_sip currently is for me a massive learning
curve.  A challenge I'm quite enjoying.

For further explanation, my setup is explained below.  This perhaps just
gives more background information to the problem I'm experiencing, and
may or may not be useful to other people reading this.

My setup is a bit convoluted (but no more so than required for my
needs).  I run multiple asterisk instances on a single host.  For each
instance I assign a unique IP to the host (one IPv4 and one IPv6, where
the IPv6 is of the form pre:fix::i.p.v.4; I have a /64 prefix delegated
for this purpose).  Currently IPv6 is NOT advertised in DNS
until such time as I can get everything else working.
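
For anyone unfamiliar with the pre:fix::i.p.v.4 notation, this small
Python sketch shows how the IPv4 address lands in the low 32 bits of the
IPv6 address.  The helper name embed_ipv4 is made up for this example.
The prefix and address are the ones from my test instance.

```python
import ipaddress


def embed_ipv4(prefix, ipv4):
    """Place an IPv4 address in the low 32 bits of an IPv6 prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen > 96:
        raise ValueError("prefix must leave at least 32 host bits")
    v4 = ipaddress.IPv4Address(ipv4)
    # OR the 32-bit IPv4 value into the otherwise zero host part.
    return ipaddress.IPv6Address(int(net.network_address) | int(v4))


if __name__ == "__main__":
    # 197.96.209.1 is c5.60.d1.01 in hex, hence ...::c560:d101
    print(embed_ipv4("2c0f:f720:0:2::/64", "197.96.209.1"))
```

This is why the same address shows up as 2c0f:f720:0:2::197.96.209.1 in
the config and as 2c0f:f720:0:2::c560:d101 in the ip output.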

On the HOST itself I thus have the following addresses assigned:

    inet 197.96.209.251/24 brd 197.96.209.255 scope global bond0
    inet6 2c0f:f720:0:2:21e:67ff:fea0:671e/64 scope global dynamic mngtmpaddr

My system has these IPs assigned for my asterisk test instance:

    inet 197.96.209.1/32 scope global bond0
    inet6 2c0f:f720:0:2::c560:d101/128 scope global

IPv6 address selection works differently than IPv4 in the case of ANY,
but I suspect (untested) the same problem will occur.  For IPv4 the
problem lies in:

197.96.209.0/24 proto kernel scope link src 197.96.209.251
default via 197.96.209.252 metric 6

So when the default route is selected, the default src for the local LAN
applies, which is .251.  I do have a mechanism that can work around this,
which I call rtdaemon.  It's basically given a pcap filter, and it will
dynamically add routes to the routing table to influence the source
address selection, eg:

ip ro ad 165.16.203.126/32 via 197.96.209.252 src 197.96.209.1
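
Spelled out in full, the route above has the following shape.  This is
only a sketch of how such a command could be assembled (it is not
rtdaemon's actual code, and source_route_argv is a made-up name).  It
builds the argument list without executing anything.

```python
import ipaddress


def source_route_argv(peer_ip, gateway, source):
    """Build an 'ip route add' argv pinning the source address for one peer."""
    # Parsing with ipaddress rejects malformed input before it gets
    # anywhere near a command line.
    peer = ipaddress.IPv4Address(peer_ip)
    gw = ipaddress.IPv4Address(gateway)
    src = ipaddress.IPv4Address(source)
    return ["ip", "route", "add", f"{peer}/32",
            "via", str(gw), "src", str(src)]


if __name__ == "__main__":
    print(" ".join(source_route_argv(
        "165.16.203.126", "197.96.209.252", "197.96.209.1")))
```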

I'd prefer to avoid 1500+ routes in my routing table if possible, which
is what I currently have on systems where that is deployed (completely
different use case, and the below "concern" doesn't apply).

Assuming that 165.16.203.126 only needs to communicate with a single IP
address on my side this works.  Unfortunately ... I really am starting
to develop a severe distaste for NAT and ISPs that won't bother giving
their clients publicly routable IPs, but I do understand the IPv4
depletion problem too so won't be too harsh on them.

My PJSIP config has ten transports declared (IPv4+IPv6) x (udp, tcp,
tls, ws, wss), of which at the moment I'm only using IPv4 udp + tcp.
I'll only post the UDP and TCP ones here:

[pjsip-udp](!)
type=transport
protocol=udp
allow_reload=yes

[pjsip-tcp](!)
type=transport
protocol=tcp
allow_reload=yes

[pjsip-4]
local_net=192.168.0.0/16
local_net=10.0.0.0/8
local_net=172.16.0.0/12

[pjsip-udp6](pjsip-udp)
bind=[2c0f:f720:0:2::197.96.209.1]:5060

[pjsip-tcp6](pjsip-tcp)
bind=[2c0f:f720:0:2::197.96.209.1]:5060

[pjsip-udp4](pjsip-udp,pjsip-4)
bind=197.96.209.1:5060

[pjsip-tcp4](pjsip-tcp,pjsip-4)
bind=197.96.209.1:5060

chan_sip is only bound to the IPv4 address:

udpbindaddr=197.96.209.1:5059
tcpbindaddr=197.96.209.1:5059

So in my use case things are actually pretty simple:

I always want exactly two candidate addresses for any given instance: 
197.96.209.1 for IPv4, or 2c0f:f720:0:2::197.96.209.1 for IPv6.  ANY is
not an option due to address selection at kernel routing level picking
the wrong addresses unless I manipulate the routing table, which will
break (existing) use cases where I've got contact from the same external
address to multiple addresses on my side.