[asterisk-dev] bridge_unreal: An alternative approach to Local/Unreal channel optimization

Matthew Jordan mjordan at digium.com
Sun Mar 9 21:39:36 CDT 2014


On Sat, Mar 8, 2014 at 1:19 PM, Joshua Colp <jcolp at digium.com> wrote:
> Greetings everyone on this glorious weekend!
>
> I've had an idea bouncing around my head for the past many months on an
> alternative approach for optimizing Local/Unreal channels. This morning
> everything finally clicked and I put it together[1] (I'm still working on
> it/tweaking it, but it DOES work).
>
> The traditional approach has been to collapse the chain of Local channels
> down until you are left with the minimum amount required. Unfortunately this
> can be rather complex and error prone as you need to go through the entire
> chain and then figure out the best way to accomplish this (keeping in mind
> juggling multiple locks and potentially multiple bridges). You also end up
> needing to give information when this happens so consumers know what is
> going on.

In any of the code bases, this is a difficult and complex thing to do. In
12, while I'm not sure we made the problem worse, we certainly didn't make
it any better.

In order to optimize, each Local channel half first has to determine whether
it can optimize at all. If a half is in a bridge with multiple participants,
there are two ways it can: a bridge merge or a bridge swap. (A merge puts
two multi-party bridges together; a swap moves a single participant into
another bridge, whether single or multi-party.) If it can, it then has to
synchronize with the other half, lock both bridges that the halves are in -
including all of the participants (via the bridge lock) - then move a lot of
channels around.
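
The merge-versus-swap choice described above can be sketched roughly as follows. This is a deliberately simplified illustration, not Asterisk's actual logic: the enum, the function name and the participant-count rule are all hypothetical, and it ignores locking, synchronization between the halves, and anything that would prohibit optimization outright.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action names; these are not Asterisk's real identifiers. */
enum optimize_action {
    OPTIMIZE_NONE,   /* nothing can be done */
    OPTIMIZE_SWAP,   /* move a lone participant into the other bridge */
    OPTIMIZE_MERGE   /* combine two multi-party bridges */
};

/*
 * Pick an action from the number of participants in the bridge each Local
 * half sits in. Both counts include the Local half itself, so a count of 2
 * means "the Local half plus exactly one other channel".
 */
enum optimize_action choose_optimization(size_t chan_bridge_count,
                                         size_t peer_bridge_count)
{
    /* A half that is alone in its bridge has nobody to optimize towards. */
    if (chan_bridge_count < 2 || peer_bridge_count < 2) {
        return OPTIMIZE_NONE;
    }
    /* One side has a single real participant: swap it into the other bridge. */
    if (chan_bridge_count == 2 || peer_bridge_count == 2) {
        return OPTIMIZE_SWAP;
    }
    /* Both sides are multi-party: the bridges have to be merged. */
    return OPTIMIZE_MERGE;
}
```

Under these assumptions, a two-party bridge on either side leads to a swap, and a merge is only chosen when both bridges are multi-party. The hard part - taking both bridge locks and moving the channels - is exactly what this sketch leaves out.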

To date, the Local channel optimization test - which collapses 150 Local
channels - is the number one failing test in the test suite. Subtle timing
problems produce equally subtle failures. While I'm confident we'll get to
the bottom of all the edge cases, it is very, very, very complex.

We eliminated the vast majority of masquerades - but this particular
operation is, in many ways, just as nasty.

> The bridge_unreal approach doesn't do this. It aims to optimize the path for
> frames traveling through the chain, allowing them to skip intermediary hops
> where they don't need to go through. This results in a very similar
> situation for the frames but does not move/change/alter/hangup the
> intermediary channels involved.

It's important to point out that optimization's goal was never the removal
of the channel. If anything, nuking Local channels has - in my opinion -
always made life more difficult for everyone, not easier.

The goal was performance - minimize the frame path. If I'm picturing this
correctly, this doesn't *quite* optimize as efficiently as completely
removing the Local channels - but it may still be sufficient.

Real-01       Local-02;1         Local-02;2         Real-03
------>     <------------->    <------------>    <-------
       \   /               \  /              \  /
        -B0-               -NLB-             -B1-
            Real-03                   Real-01

In this case - and this is assuming I understand the proposed Native Local
Bridge correctly! - Local-02;1 has as its actual destination target
Real-03, while Local-02;2 has as its actual destination Real-01. When B0
pushes a frame to Local-02;1, Local-02;1 knows that it should just pass it
on to its destination. Rather than passing to its bridge, it writes
directly to Real-03. The same happens in reverse for Real-03 to Local-02;2.

Creating a chain of these works by the real 'endpoints' getting passed down
the chain of Local channels via control frames.
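
A minimal sketch of that frame path, under heavy assumptions: the `struct hop` type, its fields and all three functions below are hypothetical stand-ins rather than anything from the bridge_unreal branch, only one direction of the chain is modelled, and the "control frame" is reduced to a plain walk back along the chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a channel in the chain; not an Asterisk type. */
struct hop {
    const char *name;
    int is_local;          /* 1 for a Local channel half, 0 for a real channel */
    struct hop *next;      /* next hop for frames flowing forward */
    struct hop *prev;      /* previous hop; endpoint information flows this way */
    struct hop *far_end;   /* real channel at the forward end, NULL until learned */
    int frames_seen;       /* number of frames this hop has handled */
};

/* Link an array of hops into a chain, first element to last. */
void link_chain(struct hop *hops, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        hops[i].prev = (i > 0) ? &hops[i - 1] : NULL;
        hops[i].next = (i + 1 < n) ? &hops[i + 1] : NULL;
    }
}

/*
 * Stand-in for the control frames: starting from a real endpoint, walk back
 * through the chain and let every Local half store that endpoint. The walk
 * stops at the first non-Local channel (or at the start of the chain).
 */
void announce_far_end(struct hop *real)
{
    struct hop *h;

    assert(real != NULL && !real->is_local);
    for (h = real->prev; h != NULL && h->is_local; h = h->prev) {
        h->far_end = real;
    }
}

/*
 * Deliver one frame starting at `h` and return how many hops handled it.
 * A Local half that knows its far end writes there directly; otherwise it
 * falls back to the next hop, so frames still flow before optimization.
 */
int deliver_frame(struct hop *h)
{
    int hops = 0;

    while (h != NULL) {
        h->frames_seen++;
        hops++;
        if (!h->is_local) {
            break; /* reached a real channel */
        }
        h = h->far_end ? h->far_end : h->next;
    }
    return hops;
}
```

With a chain of four Local halves ending in Real-03, a frame pushed to the first half touches all five hops before `announce_far_end()` runs and only two afterwards: the first Local half and Real-03. None of the intermediary channels is moved or hung up; they simply stop seeing frames.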

There are two issues I can see with this - one minor, one maybe not.
(1) There's a small amount of work here that occurs by the Local channel
passing the frame on to its destination channel. It's minor, but it would
be slightly more work than what occurs during today's optimization.
(2) More seriously: I wonder if the destination shouldn't be a channel but
a bridge. The above optimization cannot work for multi-party bridges: there
is no single channel destination. Today's optimization does work in that
scenario via a bridge swap - the single party on one end gets swapped with
the Local channel in the multi-party bridge. This really is a minor case -
the idea of optimizing channels into multi-party bridges is admittedly
ridiculously new - but it may be useful to think through this use case.
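
To make point (2) concrete, here is one way a destination could be either a single channel or a whole bridge. It is purely illustrative: the tagged union, the fixed-size participant array and every name below are hypothetical, and real bridges involve locking and mixing technologies that are ignored here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARTICIPANTS 8

/* Hypothetical, heavily simplified types; not Asterisk's real structures. */
struct channel {
    const char *name;
    int frames_received;
};

struct bridge {
    struct channel *participants[MAX_PARTICIPANTS];
    size_t count;
};

enum target_kind { TARGET_NONE, TARGET_CHANNEL, TARGET_BRIDGE };

/* A destination that is either one channel or an entire bridge. */
struct target {
    enum target_kind kind;
    union {
        struct channel *channel;
        struct bridge *bridge;
    } u;
};

/* Add a channel to a bridge; returns 0 on success, -1 if the bridge is full. */
int bridge_join(struct bridge *b, struct channel *c)
{
    if (b->count >= MAX_PARTICIPANTS) {
        return -1;
    }
    b->participants[b->count++] = c;
    return 0;
}

/*
 * Write one frame to the target and return how many channels received it.
 * For a bridge target, `skip` is the Local half being bypassed: it is still
 * a participant of that bridge but must not get the frame itself.
 */
int target_write(const struct target *t, const struct channel *skip)
{
    size_t i;
    int delivered = 0;

    switch (t->kind) {
    case TARGET_CHANNEL:
        t->u.channel->frames_received++;
        return 1;
    case TARGET_BRIDGE:
        for (i = 0; i < t->u.bridge->count; i++) {
            struct channel *p = t->u.bridge->participants[i];

            if (p != skip) {
                p->frames_received++;
                delivered++;
            }
        }
        return delivered;
    case TARGET_NONE:
    default:
        return 0;
    }
}
```

In the two-party case this degenerates to the single-channel write. In the multi-party case the frame fans out to everyone in the far bridge except the bypassed Local half, which is roughly what today's bridge swap achieves by moving the channel instead.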

> It does this by passing each far end channel through the entire chain with
> each intermediary hop storing them and the next hop in the chain examining
> and forwarding them on over and over. Once this completes each end has the
> channel that is at the far end and is able to queue frames onto it directly,
> bypassing the intermediary hops. This happens over time (less than a second,
> I'm not talking minutes here) but leads to eventual optimization. Even in a
> compromised optimized state frames will still flow as expected.
>
> This also works perfectly fine when a hop uses /n and wishes to remain in
> the path of frames. Each side of that hop will optimize themselves and skip
> any intermediary hops. (Although, since channels stick around... when would
> you need to use /n? Hrm...)
>

I would think you'd need it if you had a hook that needed the audio on that
Local channel - such as a MixMonitor.

In general, I prefer this approach over our current one, for a couple of reasons:

(1) Things are now consistent. The relationship between Local channels is
now explicit as opposed to implicit, and the events that are raised in
relation to them are now the same as all others.

(2) Things are simpler - or at least, complex in the same areas as 'normal'
channels. It's easy to be complex and fast (current Local channel
optimization) and also dead wrong (deadlocks, ref leaks, and other
badness). I think Local channels end up being wrong more than we'd like to
admit. Because this relies on the bridging core to get things right, if the
bridging core is right, Local channels will be right.

One other major downside to this: it will change our model of Local
channels in a breaking fashion. Some events will no longer occur (such as
the optimization begin/end events), and new ones will definitely start to
happen (BridgeEnter/Leave messages for the new native bridge). That may be
worth doing - but this isn't a habit we should get into.

Matt

-- 
Matthew Jordan
Digium, Inc. | Engineering Manager
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: http://digium.com & http://asterisk.org