[asterisk-dev] Asterisk Media Performance
Joshua Colp
jcolp at digium.com
Mon Jan 6 11:54:53 CST 2014
Greetings everyone,
I've been doing some profiling over the past few days on Asterisk 12
with a focus on media. I've uncovered some stuff but before I get into
things let's go back in time a bit... to when things were simpler.
In the old times media formats were represented using a bit field. This
was fast but had a few limitations. We were limited in how many formats
there could be, and they did not carry any attribute information. It was
strictly a "this is ulaw". This was changed and ast_format was created,
which is a structure that contains additional information. Additionally,
ast_format_cap was created to act as a container, and another mechanism
was added to allow logic that performs format negotiation to be
registered. The code was changed throughout the codebase to use this
strategy, but unfortunately this came at a cost...
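To make the old approach concrete, here is a minimal sketch of formats as
bits in a word. The names, the bit values, and the 64-bit width are
assumptions for illustration, not the actual historical definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical format bits; names and values are illustrative only. */
#define FMT_ULAW (UINT64_C(1) << 0)
#define FMT_ALAW (UINT64_C(1) << 1)
#define FMT_GSM  (UINT64_C(1) << 2)

/* A capability set is the OR of the format bits it contains. */
typedef uint64_t fmt_caps;

/* One bit per format: the word width caps how many formats can exist. */
static uint64_t fmt_bit(unsigned int index)
{
    assert(index < 64); /* no room for a 65th format */
    return UINT64_C(1) << index;
}

/* Comparing two formats is a single integer comparison. */
static int fmt_equal(uint64_t a, uint64_t b)
{
    return a == b;
}

/* Finding the formats both sides support is a single bitwise AND. */
static fmt_caps fmt_joint(fmt_caps ours, fmt_caps theirs)
{
    return ours & theirs;
}
```

Everything here is cheap, but a bit has no room for attributes, so
"this is ulaw" is all it can say.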
1. Copying formats is no longer cheap.
The ast_format structure is approximately 320 bytes in size. This
unfortunately stems from the fact that the structure is not a reference
counted astobj2 object. You can't attach other data to the structure
with the expectation that it will be disposed of, so the structure itself
reserves a reasonable amount of room for storing arbitrary attribute
information.
You might think... "gee, we don't copy formats that often". You would be
incorrect. Copying formats can be done 5 or more times when passing a
media frame through Asterisk.
2. Comparing formats is no longer cheap.
The act of comparing two formats now includes a hash table lookup to
determine if additional logic should be called for a deeper attribute
comparison.
Yet again you might think... "gee, we don't compare formats that often".
Yet again you would be incorrect. Code throughout Asterisk is reactive in
many places to format changes. Instead of the ingress indicating that the
incoming format has changed, multiple places make the determination
themselves by doing a format comparison. This can occur 4 or more times
when passing a media frame through Asterisk.
The reason the lookup is done when the comparison is done is because you
can't attach the underlying logic to the format itself, since no
guarantee exists it would be disposed of.
3. Finding associated information using a format is no longer cheap.
As discussed above, comparing formats is not cheap, so using them as a key
for something hit constantly is expensive. This happens in the RTP layer
when going between RTP payload and Asterisk format.
4. Code assumes the above is cheap.
A lot of code is written under the assumption that the above operations
are cheap, and thus some things are done in an unoptimized fashion.
Instead of setting up data structures with the correct information once,
when allocating them, the code sets that information over and over. The
format modules are an example of this. When reading from a ulaw file each
individual frame results in a format copy, instead of the format being
copied once when the file is opened. This also applies to translation.
Every time a translated frame is returned an ast_format_copy happens.
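The file reading case above can be sketched as follows. This is a
simplified illustration and not the actual format module code: struct
format, struct file_reader, and every function name are hypothetical
stand-ins, with a by-value format of roughly 320 bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a large by-value format (roughly 320 bytes). */
struct format {
    int id;
    char attributes[316];
};

struct frame {
    struct format fmt;
    const unsigned char *data;
    size_t len;
};

struct file_reader {
    struct frame frame;   /* reused for every read */
    unsigned long copies; /* counts format copies, for illustration */
};

static void format_copy(struct format *dst, const struct format *src,
                        unsigned long *copies)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(*dst));
    (*copies)++;
}

/* Unoptimized: the format is stamped onto the frame on every read. */
static struct frame *read_frame_per_read(struct file_reader *r,
                                         const struct format *fmt)
{
    format_copy(&r->frame.fmt, fmt, &r->copies);
    return &r->frame;
}

/* Optimized: stamp the format once when the file is opened... */
static void reader_open(struct file_reader *r, const struct format *fmt)
{
    memset(r, 0, sizeof(*r));
    format_copy(&r->frame.fmt, fmt, &r->copies);
}

/* ...and leave it alone on each read. */
static struct frame *read_frame_preset(struct file_reader *r)
{
    return &r->frame;
}
```

With the first variant the number of copies grows with the number of
frames read. With the second it stays at one per opened file.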
Put all this together and you have an impact on CPU consumption when
passing frames, and even when setting up calls since that involves
format negotiation.
To give some scope of just how much of an impact all this makes: on my
quad core machine with 100 simultaneous channels doing a Playback
(without transcoding but with bidirectional media) I was able to reduce
the CPU usage by 8-10% by reducing format copies, reducing format
comparisons, and tweaking RTP<->Format lookups. This was without
completely changing everything, which I'm sure would reduce CPU usage
even more.
So how can we improve this and make it better?
1. Make ast_format an ao2 object
I think what needs to happen is that ast_format becomes an immutable ao2
reference counted object. Copying becomes bumping the reference count
and returning it.
Additional related data can be attached with a guarantee that it will be
disposed of. This can include an actual list of attributes, and a
pointer to the format negotiator. As a result operations become faster
to do and memory usage goes down.
2. Audit format usage
Times have changed, and so has what we can do with Asterisk. We need to
look at how we are using formats internally and improve/optimize/change
that usage. A perfect example is the format module one I used previously.
Even though copying an ast_format would become cheap there is no need to
do it on every read frame. Everything format related should be fast.
3. Be less reactive
We need to determine once that something has changed (such as the format
of incoming media) and notify everything else involved. Reacting using
the same (or potentially more expensive) comparison logic at different
points in the chain is not needed.
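Here is a rough sketch of how points 1 and 3 could fit together. It uses
a plain C reference count and is not the astobj2 API. All names are
hypothetical, the count is not atomic (real code would need locking or
atomic operations), and the pointer comparison assumes one shared object
per distinct format, which is my simplification.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical immutable, reference counted format; not astobj2. */
struct format {
    int refcount;
    int codec_id;
};

static struct format *format_alloc(int codec_id)
{
    struct format *f = malloc(sizeof(*f));
    if (f) {
        f->refcount = 1;
        f->codec_id = codec_id;
    }
    return f;
}

/* "Copying" is just taking another reference to the same object. */
static struct format *format_ref(struct format *f)
{
    assert(f != NULL);
    f->refcount++;
    return f;
}

static void format_unref(struct format *f)
{
    if (f && --f->refcount == 0) {
        free(f);
    }
}

#define MAX_LISTENERS 4

typedef void (*format_changed_cb)(struct format *new_fmt, void *data);

struct stream {
    struct format *read_fmt; /* owned reference, NULL until first frame */
    format_changed_cb cbs[MAX_LISTENERS];
    void *cb_data[MAX_LISTENERS];
    int num_cbs;
};

static int stream_subscribe(struct stream *s, format_changed_cb cb, void *data)
{
    if (s->num_cbs == MAX_LISTENERS) {
        return -1;
    }
    s->cbs[s->num_cbs] = cb;
    s->cb_data[s->num_cbs] = data;
    s->num_cbs++;
    return 0;
}

/* Ingress: detect a format change once, then tell everyone involved. */
static void stream_ingest(struct stream *s, struct format *incoming)
{
    int i;

    if (s->read_fmt == incoming) {
        return; /* same shared object, nothing changed */
    }
    format_unref(s->read_fmt);
    s->read_fmt = format_ref(incoming);
    for (i = 0; i < s->num_cbs; i++) {
        s->cbs[i](incoming, s->cb_data[i]);
    }
}

/* Example listener: counts how many change notifications it received. */
static void count_changes(struct format *new_fmt, void *data)
{
    (void)new_fmt;
    (*(int *)data)++;
}
```

Passing 100 frames of the same format through stream_ingest notifies the
listener once. Nothing downstream has to compare formats on its own.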
All of the above spans the entire code base. It's big. What we gain
though is a faster media architecture that is slimmed down. I think it's
worth it.
What about everyone else? What do you think?
Side note: After reading this back I've discovered I have a theme
without realizing it! If you can do something only once, do it only once.
Cheers,
--
Joshua Colp
Digium, Inc. | Senior Software Developer
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: www.digium.com & www.asterisk.org