[asterisk-dev] Asterisk Media Performance

Mon Jan 6 11:54:53 CST 2014

Greetings everyone,

I've been doing some profiling over the past few days on Asterisk 12 
with a focus on media. I've uncovered some stuff but before I get into 
things let's go back in time a bit... to when things were simpler.

In the old days media formats were represented using a bit field. This 
was fast but had a few limitations. The number of formats was capped by 
the number of bits, and a format could not carry any attribute 
information. It was strictly a "this is ulaw". This was changed and 
ast_format was created, which is a structure that contains additional 
information. Additionally, ast_format_cap was created to act as a 
container for formats, and another mechanism was added so that logic 
performing format negotiation could be registered. The code was changed 
throughout the codebase to use this strategy, but unfortunately this 
came at a cost...
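
For anyone who never saw the old representation, here is a rough sketch 
of what a bit field approach looks like. The names, the bit assignments 
and the 64-bit width are made up for illustration and are not the old 
Asterisk definitions.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define SKETCH_FORMAT_ULAW (UINT64_C(1) << 0)
#define SKETCH_FORMAT_ALAW (UINT64_C(1) << 1)
#define SKETCH_FORMAT_GSM  (UINT64_C(1) << 2)

/* One bit per format. A 64-bit mask (assumed here) can never describe
 * more than 64 formats, and there is nowhere to put attributes. */
typedef uint64_t sketch_format_mask;

/* Joint capabilities of two endpoints: a single AND. */
static inline sketch_format_mask sketch_joint(sketch_format_mask a,
	sketch_format_mask b)
{
	return a & b;
}

/* Membership test: a single AND plus a compare against zero. */
static inline int sketch_has_format(sketch_format_mask caps,
	sketch_format_mask format)
{
	return (caps & format) != 0;
}
```

Copying is an integer assignment and comparing is ==, which is why 
nobody had to think about the cost.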

1. Copying formats is no longer cheap.

The ast_format structure is approximately 320 bytes in size. This 
unfortunately stems from the fact that the structure is not a reference 
counted astobj2 object. You can't attach other data to the structure 
with the expectation that it will be disposed of, so instead the 
structure reserves a reasonable amount of room inline for storing 
arbitrary attribute information.

You might think... "gee, we don't copy formats that often". You would be 
incorrect. Copying formats can be done 5 or more times when passing a 
media frame through Asterisk.
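
To put a rough number on that, here is a back-of-the-envelope sketch. 
The structure is a stand-in padded to 320 bytes and is not the real 
ast_format layout, and the 50 frames per second figure assumes 20 ms 
packetization.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_ATTR_BYTES 316

/* Stand-in for a by-value format: an id plus inline attribute storage. */
struct sketch_format {
	int id;
	unsigned char attr[SKETCH_ATTR_BYTES];
};

/* A by-value copy moves the whole structure every time. */
static void sketch_format_copy(struct sketch_format *dst,
	const struct sketch_format *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* Bytes moved per second purely by format copies, in one direction. */
static unsigned long sketch_bytes_copied_per_second(unsigned long channels,
	unsigned long frames_per_second, unsigned long copies_per_frame)
{
	return channels * frames_per_second * copies_per_frame
		* sizeof(struct sketch_format);
}
```

With 100 channels, 50 frames per second and 5 copies per frame this 
works out to 8,000,000 bytes per second in each direction, all of it 
spent copying data that almost never changes.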

2. Comparing formats is no longer cheap.

The act of comparing two formats now includes a hash table lookup to 
determine if additional logic should be called for a deeper attribute 
comparison.

Yet again you might think... "gee, we don't compare formats that often". 
Again, you would be incorrect. Code throughout Asterisk is reactive to 
format changes in many places. Instead of the ingress indicating that 
the incoming format has changed, multiple places each make that 
determination by doing a format comparison. This can occur 4 or more 
times when passing a media frame through Asterisk.

The lookup has to be done at comparison time because you can't attach 
the underlying logic to the format itself, since no guarantee exists 
that it would be disposed of.
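
The shape of the problem looks roughly like the sketch below. All names 
are hypothetical, and a linear search over a tiny registry stands in for 
the hash table. The point is only that the lookup happens on every 
comparison of two formats with the same id, even when there turn out to 
be no attributes to compare.

```c
#include <stddef.h>
#include <string.h>

struct sketch_format {
	int id;
	unsigned char attr[8];
};

typedef int (*sketch_attr_cmp)(const struct sketch_format *a,
	const struct sketch_format *b);

struct sketch_interface {
	int id;
	sketch_attr_cmp cmp;
};

/* Attribute comparison for the one format that registered logic. */
static int sketch_cmp_attr_bytes(const struct sketch_format *a,
	const struct sketch_format *b)
{
	return memcmp(a->attr, b->attr, sizeof(a->attr)) == 0;
}

/* Registry of per-format logic, kept outside the format itself. */
static const struct sketch_interface sketch_registry[] = {
	{ 2, sketch_cmp_attr_bytes },
};

static unsigned long sketch_lookups; /* counts registry lookups */

static sketch_attr_cmp sketch_find_interface(int id)
{
	size_t i;

	sketch_lookups++;
	for (i = 0; i < sizeof(sketch_registry) / sizeof(sketch_registry[0]); i++) {
		if (sketch_registry[i].id == id) {
			return sketch_registry[i].cmp;
		}
	}
	return NULL;
}

/* Returns 1 if equal, 0 otherwise. */
static int sketch_format_equal(const struct sketch_format *a,
	const struct sketch_format *b)
{
	sketch_attr_cmp cmp;

	if (a->id != b->id) {
		return 0;
	}
	cmp = sketch_find_interface(a->id); /* paid on every same-id compare */
	return cmp ? cmp(a, b) : 1;
}
```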

3. Finding associated information using a format is no longer cheap.

As discussed above comparing formats is not cheap so using them as a key 
for something hit constantly is expensive. This happens in the RTP layer 
when going between RTP payload and Asterisk format.
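
One possible way to make that mapping cheap, sketched below, is to keep 
a pair of directly indexed tables per RTP instance so that neither 
direction needs a format comparison. This is an illustration only and 
not necessarily what I changed in my testing. It assumes each format can 
be identified by a small integer codec id, which is a hypothetical 
addition, and all names are made up.

```c
#define SKETCH_RTP_MAX_PT 128 /* RTP payload types are 7 bits: 0..127 */

enum sketch_codec {
	SKETCH_CODEC_NONE = 0,
	SKETCH_CODEC_ULAW,
	SKETCH_CODEC_GSM,
	SKETCH_CODEC_ALAW,
	SKETCH_CODEC_MAX
};

struct sketch_rtp_map {
	enum sketch_codec pt_to_codec[SKETCH_RTP_MAX_PT];
	int codec_to_pt[SKETCH_CODEC_MAX];
};

static void sketch_rtp_map_init(struct sketch_rtp_map *map)
{
	int i;

	for (i = 0; i < SKETCH_RTP_MAX_PT; i++) {
		map->pt_to_codec[i] = SKETCH_CODEC_NONE;
	}
	for (i = 0; i < SKETCH_CODEC_MAX; i++) {
		map->codec_to_pt[i] = -1;
	}
}

/* Returns 0 on success, -1 if either value is out of range. */
static int sketch_rtp_map_set(struct sketch_rtp_map *map, int pt,
	enum sketch_codec codec)
{
	if (pt < 0 || pt >= SKETCH_RTP_MAX_PT
		|| codec <= SKETCH_CODEC_NONE || codec >= SKETCH_CODEC_MAX) {
		return -1;
	}
	map->pt_to_codec[pt] = codec;
	map->codec_to_pt[codec] = pt;
	return 0;
}

/* Payload type to codec: one bounds check and one array read. */
static enum sketch_codec sketch_rtp_codec_for_pt(
	const struct sketch_rtp_map *map, int pt)
{
	if (pt < 0 || pt >= SKETCH_RTP_MAX_PT) {
		return SKETCH_CODEC_NONE;
	}
	return map->pt_to_codec[pt];
}

/* Codec to payload type: same cost, -1 if unmapped. */
static int sketch_rtp_pt_for_codec(const struct sketch_rtp_map *map,
	enum sketch_codec codec)
{
	if (codec <= SKETCH_CODEC_NONE || codec >= SKETCH_CODEC_MAX) {
		return -1;
	}
	return map->codec_to_pt[codec];
}
```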

4. Code assumes the above is cheap.

A lot of code is written under the assumption that the above operations 
are cheap, and thus some things are done in an unoptimized fashion. 
Instead of setting up data structures with the correct information once 
when allocating them, the same information is set over and over. The 
format modules are an example of this. When reading from a ulaw file 
each individual frame results in a format copy, instead of the format 
being copied once when the file is opened. The same goes for 
translation. Every time a translated frame is returned an 
ast_format_copy happens.
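
The "copy once when the file is opened" pattern can be sketched as 
follows. The structures and function names are hypothetical stand-ins 
and are not the real format module API. The stream owns a single frame 
whose format is set at open time, and the read path only updates the 
data pointer and length.

```c
#include <stddef.h>
#include <string.h>

struct sketch_format {
	int id;
	unsigned char attr[316];
};

struct sketch_frame {
	struct sketch_format format;
	const unsigned char *data;
	size_t datalen;
};

struct sketch_stream {
	struct sketch_frame frame; /* reused for every read */
	const unsigned char *buf;
	size_t len;
	size_t pos;
};

static unsigned long sketch_copies; /* counts format copies */

static void sketch_format_copy(struct sketch_format *dst,
	const struct sketch_format *src)
{
	sketch_copies++;
	*dst = *src;
}

/* The only place the format is copied. */
static void sketch_stream_open(struct sketch_stream *s,
	const struct sketch_format *fmt, const unsigned char *buf, size_t len)
{
	memset(s, 0, sizeof(*s));
	sketch_format_copy(&s->frame.format, fmt);
	s->buf = buf;
	s->len = len;
}

/* Returns the next frame, or NULL at end of data. No format copy here. */
static struct sketch_frame *sketch_stream_read(struct sketch_stream *s,
	size_t max_bytes)
{
	size_t left = s->len - s->pos;

	if (left == 0) {
		return NULL;
	}
	s->frame.data = s->buf + s->pos;
	s->frame.datalen = left < max_bytes ? left : max_bytes;
	s->pos += s->frame.datalen;
	return &s->frame;
}
```

However many frames are read, the copy counter stays at one.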

Put all this together and you have an impact on CPU consumption when 
passing frames, and even when setting up calls since that involves 
format negotiation.

To give some sense of just how much of an impact all this makes: on my 
quad core machine with 100 simultaneous channels doing a Playback 
(without transcoding but with bidirectional media) I was able to reduce 
the CPU usage by 8-10% by reducing format copies, reducing format 
comparisons, and tweaking RTP<->Format lookups. That was without 
completely changing everything, which I'm sure would reduce CPU usage 
even more.

So how can we improve this and make it better?

1. Make ast_format an ao2 object

I think what needs to happen is that ast_format becomes an immutable ao2 
reference counted object. Copying becomes bumping the reference count 
and returning it.

Additional related data can then be attached with a guarantee that it 
will be disposed of. This can include an actual list of attributes, and 
a pointer to the format negotiator. As a result operations become faster 
and memory usage goes down.
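
As a rough illustration of the idea, here is a minimal sketch in plain 
C. It deliberately does not use the astobj2 API. The names are 
hypothetical, the reference count is not thread safe, and a real version 
would rely on astobj2 for allocation, locking and destruction.

```c
#include <stdlib.h>

struct sketch_format;

typedef int (*sketch_attr_cmp)(const struct sketch_format *a,
	const struct sketch_format *b);

struct sketch_format {
	int refcount;
	int id;
	sketch_attr_cmp cmp; /* negotiation/compare logic, attached once */
	void *attr;          /* owned; freed when the last reference goes */
};

/* Takes ownership of attr on success. Returns NULL on allocation failure. */
static struct sketch_format *sketch_format_create(int id, sketch_attr_cmp cmp,
	void *attr)
{
	struct sketch_format *f = malloc(sizeof(*f));

	if (!f) {
		return NULL;
	}
	f->refcount = 1;
	f->id = id;
	f->cmp = cmp;
	f->attr = attr;
	return f;
}

/* "Copying" a format: bump the count and hand back the same pointer. */
static struct sketch_format *sketch_format_ref(struct sketch_format *f)
{
	f->refcount++;
	return f;
}

/* Returns 1 if this dropped the last reference and freed the format. */
static int sketch_format_unref(struct sketch_format *f)
{
	if (--f->refcount > 0) {
		return 0;
	}
	free(f->attr);
	free(f);
	return 1;
}

/* Same pointer means same format; otherwise no registry lookup is needed. */
static int sketch_format_equal(const struct sketch_format *a,
	const struct sketch_format *b)
{
	if (a == b) {
		return 1;
	}
	if (a->id != b->id) {
		return 0;
	}
	return a->cmp ? a->cmp(a, b) : 1;
}
```

Because the object is immutable, the common case of comparing a format 
against a reference to itself collapses to a pointer comparison.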

2. Audit format usage

Times have changed, and so has what we can do with Asterisk. We need to 
look at how we are using formats internally and improve/optimize/change 
that usage. A perfect example is the format module case I described 
earlier. Even though copying an ast_format would become cheap there is 
no need to do it on every read frame. Everything format related should 
be fast.

3. Be less reactive

We need to determine something has changed once (such as the format of 
incoming media) and notify everything else involved. Reacting using the 
same (or potentially more expensive) comparison logic at different 
points in the chain is not needed.
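
A sketch of what "determine once, notify everyone" could look like is 
below. Everything here is hypothetical, including the use of a plain 
integer id for the format and the translator listener. The ingress does 
the only comparison per frame and calls registered listeners when the 
format actually changes.

```c
#include <stddef.h>

#define SKETCH_MAX_LISTENERS 4

typedef void (*sketch_format_changed_cb)(int new_format_id, void *data);

struct sketch_ingress {
	int current_id; /* -1 until the first frame arrives */
	size_t count;
	struct {
		sketch_format_changed_cb cb;
		void *data;
	} listeners[SKETCH_MAX_LISTENERS];
};

static void sketch_ingress_init(struct sketch_ingress *in)
{
	in->current_id = -1;
	in->count = 0;
}

/* Returns 0 on success, -1 if the listener table is full. */
static int sketch_ingress_subscribe(struct sketch_ingress *in,
	sketch_format_changed_cb cb, void *data)
{
	if (in->count == SKETCH_MAX_LISTENERS) {
		return -1;
	}
	in->listeners[in->count].cb = cb;
	in->listeners[in->count].data = data;
	in->count++;
	return 0;
}

/* Called for every incoming frame: one comparison, made in one place. */
static void sketch_ingress_frame(struct sketch_ingress *in, int frame_format_id)
{
	size_t i;

	if (frame_format_id == in->current_id) {
		return;
	}
	in->current_id = frame_format_id;
	for (i = 0; i < in->count; i++) {
		in->listeners[i].cb(frame_format_id, in->listeners[i].data);
	}
}

/* Example listener: something downstream that must react to a change. */
struct sketch_translator {
	int format_id;
	unsigned long rebuilds;
};

static void sketch_translator_on_change(int new_format_id, void *data)
{
	struct sketch_translator *t = data;

	t->format_id = new_format_id;
	t->rebuilds++;
}
```

Downstream code then never compares formats on the media path. It only 
does work when it is told something changed.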

All of the above spans the entire code base. It's big. What we gain 
though is a faster media architecture that is slimmed down. I think it's 
worth it.

What about everyone else? What do you think?

Side note: After reading this back I've discovered I have a theme 
without realizing it! If you can do something only once, do it only once.

Cheers,

-- 
Joshua Colp
Digium, Inc. | Senior Software Developer
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at:  www.digium.com  & www.asterisk.org