[asterisk-dev] PJSIP and RTP address selection

Thu Sep 13 17:00:04 CDT 2018

On Tue, Sep 11, 2018 at 1:51 PM, Jaco Kroon <jaco at uls.co.za> wrote:
> Hi,
>
> I've got a scenario where (when using PJSIP; chan_sip does what I
> expect) PJSIP will advertise one address in the SDP during a
> conversation but then start transmitting from another.  In my case PJSIP
> is advertising 197.96.209.1 in the SDP, but 197.96.209.251 is being used
> to send.
>
> I can manipulate that by altering the IPv4 routing table to influence
> address selection.
>
> This is because, when using PJSIP, the RTP socket is bound to ANY
> ([::] specifically, so that both IPv4 and IPv6 will function).  chan_sip
> on the other hand has the RTP port bound to the same address as the
> transport.  After discussion with Joshua on IRC it became clear that the
> PJSIP behaviour may be preferred in many cases, and that things are
> plainly more complicated than one would hope.

Ugh.  This sounds like it's in the belly of the address selection code
of PJSIP and squarely in Josh's territory.

> I have two potential fixes (and two that I don't think are practical
> options, but might be with knowledge I don't have), both with
> advantages and disadvantages:
>
> 1.  Bind the socket against the advertised address.

That seems interesting, although I'm not sure what that means in a
multi-homed world with multiple address/media streams (IPv4 + IPv6).
Also, I wonder how this works with ICE/STUN/TURN across many
interfaces and address families.  Multi-home is hard to get right for
all scenarios.  I can't help but wonder if instead of binding to the
wildcard address we should be explicitly binding to each
interface/address and making our own source address selection rather
than letting the kernel decide.  Sometimes the kernel will decide in a
way that surprises you and I think that's what you're hitting.
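
To see which source address the kernel would pick for a given
destination, a small Python sketch like the one below can help.  It
relies on the fact that connect() on a UDP socket sends no packets and
only fixes the local address.  The function name kernel_source_for and
the use of port 9 are illustrative choices, not anything from Asterisk
or PJSIP.

```python
import socket


def kernel_source_for(dest, port=9):
    """Return the IPv4 source address the kernel selects for `dest`.

    connect() on a UDP socket transmits nothing; it only makes the
    kernel run route lookup and source address selection, which
    getsockname() then reports.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((dest, port))
        return s.getsockname()[0]


# Hypothetical usage on the host described in this thread:
#   kernel_source_for("165.16.203.126")
# If that reports .251 while the SDP advertises .1, the mismatch
# discussed here is reproduced.
```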

> 2.  Upon receiving the first rtp, "narrow" the socket listening address
> to the received "to" address.

That also doesn't seem unreasonable, but I'd rather hear what Josh
thinks since he spent lots of time with his head in this code.

> (3.)  Have the RTP sent to my primary address to begin with, not the
> socket address as for PJSIP transport.
> (4.)  Update the rtp engine to be able to have multiple socket pairs and
> switch between them as the remote side does.

That seems "most right", and matches my ideal solution from above.  But
then again, I'm curious how it would affect our ICE/STUN/TURN stack.

> The first has various disadvantages as I understand from Joshua.  Most
> of them over my head.  The advantage is that the source address would be
> (more) deterministic upon sending RTP.  This can be done by passing the
> transport address to rtp instance, presumably similar to what chan_sip
> does.  This would in some cases break things like signaling on ipv4 and
> rtp on ipv6 if pjsip transport is not bound to ANY.  This was as I
> understood one of Joshua's bigger concerns.

Yeah....

> The second option has the advantage that unless the address to which the
> remote side sends changes things should just work.  This can be
> implemented by creating a new socket, binding it to the more specific
> address and then using dup2() to replace the old socket file descriptor,
> before closing the newly created file descriptor.  It can be returned
> to "ANY" in a similar manner if required.  RTCP ports will need to be
> re-bound as well.
>
> This should probably be a configurable option either way, and one could
> add a transport option "bind_rtp_to_transport_address", and/or a
> "narrow_rtp_address" (the latter would make no sense if the former is
> active, unless the bind address is an ANY of sorts).  These can be
> implemented in conjunction or separately.

I'd hate having to add another option for this behavior.  It seems
like there should be a path forward that gets most of the right cases
most of the time without it being an optional behavior.

> The third option basically involves binding the socket to ANY and
> pretending to send data to the known addresses for the peer and using
> those addresses in the SDP (if we've seen SDP for the conversation
> already, those addresses, otherwise for the remote address of the SIP
> communication) - this would break a number of things potentially, thus
> likely not a serious option.  For example, if we're sending an INVITE to
> a web-socket transport, then potentially the web-socket connection has
> been proxied and the remote address of the web socket connection isn't
> actually where the remote side is: when proxying via
> httpd/apache to localhost:8088, asterisk sees 127.0.0.1 as the
> "remote".
>
> I'm tending towards option 2.  This would perhaps also have a side
> effect of minimizing attack surface for things like RTP bleed.

It might be the lowest friction way forward (without rewriting the
RTP/ICE/STUN/TURN layers).
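
For reference, the dup2() trick described for option 2 can be sketched
in Python roughly as follows.  This is only an illustration under
assumptions: it is IPv4-only, the helper names open_wildcard_udp and
narrow_socket are made up here, and it sets SO_REUSEADDR on both
sockets so that (on Linux) the specific bind can succeed while the
wildcard socket still holds the port.  It is not how res_rtp_asterisk
does anything today.

```python
import os
import socket


def open_wildcard_udp(port=0):
    """Open a UDP socket bound to the IPv4 wildcard address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("0.0.0.0", port))
    return s


def narrow_socket(sock, local_addr):
    """Rebind `sock` to `local_addr`, keeping its port and fd number.

    A second socket is bound to the specific address, then dup2()
    replaces the original descriptor with it.  The temporary
    descriptor is closed afterwards; the original fd number now
    refers to the narrowed socket.
    """
    port = sock.getsockname()[1]
    narrowed = socket.socket(sock.family, sock.type)
    try:
        narrowed.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        narrowed.bind((local_addr, port))
        os.dup2(narrowed.fileno(), sock.fileno())
    finally:
        narrowed.close()
```

One thing to keep in mind: any datagrams still queued on the wildcard
socket at the moment of the dup2() are dropped along with it, so a real
implementation would have to tolerate losing a packet or two there.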

> I suspect this has not come to light before since most setups are likely
> to only have a single IPv4 and single IPv6 global address, or in the
> case of multi-homing would have one on each interface with the kernel
> RPF filter getting rid of traffic from a source other than where it
> would route back to, basically forcing an IP match based on route-based
> address selection.

Multiple IPv4 addresses are not very common among non-carriers.

> Joshua suggested that before coding on this is started all use-cases
> should be explored and documented, which I think is a good idea.  I'd be
> happy to drive that process, I'd however need to understand where this
> should be documented.  So in this respect this email serves as a
> request for pointers.

+1

> DISCLAIMER:  As I've realized I'm no SIP expert and anything beyond
> what's available in chan_sip currently is for me a massive learning
> curve.  A challenge I'm quite enjoying.
>
> For further explanation, my setup is explained below.  This perhaps just
> gives more background information to the problem I'm experiencing, and
> may or may not be useful to other people reading this.
>
> My setup is a bit convoluted (but no more so than required for my
> needs).  I do run multiple asterisk instances on a single host.  For
> each instance I assign a unique IP to the host (one IPv4 and one IPv6,
> where the IPv6 is of the form pre:fix::i.p.v.4; I have a /64 prefix
> delegated for this purpose).  Currently IPv6 is NOT advertised in DNS
> until such time as I can get everything else working.
>
> On the HOST I thus have the following addresses assigned for the host:
>
>     inet 197.96.209.251/24 brd 197.96.209.255 scope global bond0
>     inet6 2c0f:f720:0:2:21e:67ff:fea0:671e/64 scope global dynamic mngtmpaddr
>
> My system has these IPs assigned for my asterisk test instance:
>
>     inet 197.96.209.1/32 scope global bond0
>     inet6 2c0f:f720:0:2::c560:d101/128 scope global
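
As a quick check of the pre:fix::i.p.v.4 scheme, the mapping from
197.96.209.1 to the ::c560:d101 suffix above can be reproduced with
Python's ipaddress module.  The helper name embed_ipv4 is made up for
this illustration.

```python
import ipaddress


def embed_ipv4(prefix, ipv4):
    """Place an IPv4 address in the low 32 bits of an IPv6 prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen > 96:
        raise ValueError("prefix leaves fewer than 32 bits for the IPv4 part")
    low_bits = int(ipaddress.IPv4Address(ipv4))
    return ipaddress.IPv6Address(int(net.network_address) | low_bits)


# 197.96.209.1 is c5.60.d1.01 in hex, hence the c560:d101 suffix.
addr = embed_ipv4("2c0f:f720:0:2::/64", "197.96.209.1")
print(addr)
```

The dotted form 2c0f:f720:0:2::197.96.209.1 used in the bind= lines
further down parses to the same address.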
>
> IPv6 address selection works differently than IPv4 in the case of ANY,
> but I suspect (untested) the same problem will occur.  For IPv4 the
> problem lies in:
>
> 197.96.209.0/24 proto kernel scope link src 197.96.209.251
> default via 197.96.209.252 metric 6
>
> So when the default route is selected, the default src for the local LAN
> applies, which is .251.  I do have a mechanism that can work around this,
> which I call rtdaemon.  It's basically given a pcap filter, and it will
> dynamically add routes to the routing table to influence the source
> address selection, eg:
>
> ip ro ad 165.16.203.126/32 via 197.96.209.252 src 197.96.209.1
>
> I'd prefer to avoid 1500+ routes in my routing table if possible, which
> is what I currently have on systems where that is deployed (completely
> different use case, and the below "concern" doesn't apply).
>
> Assuming that 165.16.203.126 only needs to communicate with a single IP
> address on my side this works.  Unfortunately ... I really am starting
> to develop a severe distaste for NAT and ISPs that won't bother giving
> their clients publicly routable IPs, but I do understand the IPv4
> depletion problem too so won't be too harsh on them.
>
> My PJSIP config has ten transports declared (IPv4+IPv6) x (udp, tcp,
> tls, ws, wss), of which at the moment I'm only using IPv4 udp + tcp,
> I'll only post the UDP and TCP ones here:
>
> [pjsip-udp](!)
> type=transport
> protocol=udp
> allow_reload=yes
>
> [pjsip-tcp](!)
> type=transport
> protocol=tcp
> allow_reload=yes
>
> [pjsip-4]
> local_net=192.168.0.0/16
> local_net=10.0.0.0/8
> local_net=172.16.0.0/12
>
> [pjsip-udp6](pjsip-udp)
> bind=[2c0f:f720:0:2::197.96.209.1]:5060
>
> [pjsip-tcp6](pjsip-tcp)
> bind=[2c0f:f720:0:2::197.96.209.1]:5060
>
> [pjsip-udp4](pjsip-udp,pjsip-4)
> bind=197.96.209.1:5060
>
> [pjsip-tcp4](pjsip-tcp,pjsip-4)
> bind=197.96.209.1:5060
>
> chan_sip is only bound to the IPv4 address:
>
> udpbindaddr=197.96.209.1:5059
> tcpbindaddr=197.96.209.1:5059
>
> So in my use case things are actually pretty simple:
>
> I always want exactly two candidate addresses for any given instance:
> 197.96.209.1 for IPv4, or 2c0f:f720:0:2::197.96.209.1 for IPv6.  ANY is
> not an option due to address selection at kernel routing level picking
> the wrong addresses unless I manipulate the routing table, which will
> break (existing) use cases where I've got contact from the same external
> address to multiple addresses on my side.

Yeah, I don't see a great way around the kernel address selection
problem without dropping the wildcard binding approach and doing
individual binding on the required interfaces.
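
To make the individual-binding idea concrete, here is a minimal Python
sketch, assuming one UDP socket per local address and a lookup keyed on
the address advertised in the SDP.  The names bind_per_address and
send_from are hypothetical, and nothing here reflects existing Asterisk
code; it only shows that binding explicitly takes source address
selection away from the kernel.

```python
import ipaddress
import socket


def bind_per_address(addresses, port=0):
    """Bind one UDP socket per local address; return {address: socket}."""
    socks = {}
    for addr in addresses:
        version = ipaddress.ip_address(addr).version
        family = socket.AF_INET6 if version == 6 else socket.AF_INET
        s = socket.socket(family, socket.SOCK_DGRAM)
        s.bind((addr, port))
        socks[addr] = s
    return socks


def send_from(socks, advertised, payload, dest):
    """Send `payload` to `dest` from the socket bound to `advertised`.

    Because that socket is bound to a specific address, the packet
    leaves with that source address regardless of the routing table's
    preferred src.
    """
    return socks[advertised].sendto(payload, dest)
```

In the setup from this thread, the keys would be 197.96.209.1 and
2c0f:f720:0:2::197.96.209.1, with the socket chosen to match whichever
address went into the SDP.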

Maybe one of your alternatives could get us a little further down the
road though.

Best wishes!

-- 
Matthew Fredrickson
Digium - A Sangoma Company | Asterisk Project Lead
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA