[asterisk-dev] GPU Audio Codec Transcoding within Asterisk PBX

Fri Jan 2 11:06:07 CST 2009

On Thu, 2009-01-01 at 21:46 -0500, Joseph Benden wrote:
> - CUDA requires the same thread, in a multi-threaded application,  
> operate all aspects of the GPU. Multiple threads may create many  
> different contexts with CUDA; however, performance will decrease as  
> they contend with each other.

Good work.
I do think that CUDA has good potential for increasing the transcoding
capacity of an Asterisk server.

I have been working a bit on echo cancellation (EC, mainly OSLEC) and
G.729 on CUDA, and the main stumbling block I found when trying to use
CUDA from within Asterisk is the one above.
The rest of the issues you mention arise because (as with SIMD) most of
the algorithms/code need to be rewritten in certain places, mainly due to
the fact that you need to "align" data and processes to make use of the
maximum bandwidth available (and to fit CUDA's way of doing things).
In general I tend to think of a GPU as a very "large" SIMD
co-processor.
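
To illustrate what "aligning" the data can mean, here is a minimal
Python sketch of one possible host-side layout. It assumes one GPU
thread per channel (an assumption for illustration, not necessarily how
a G.729 or OSLEC port would be split up). Frames are interleaved so that,
at each sample step, neighbouring threads read neighbouring elements,
and the channel count is padded to a multiple of the warp size.
pack_sample_major and its parameters are hypothetical names.

```python
WARP_SIZE = 32  # threads that execute in lockstep on NVIDIA GPUs


def pack_sample_major(frames, pad_value=0):
    """Interleave equal-length per-channel frames into one flat buffer.

    Sample s of channel c lands at index s * padded_channels + c, so
    channels c and c + 1 sit next to each other for every sample step.
    """
    if not frames:
        raise ValueError("need at least one frame")
    frame_len = len(frames[0])
    if any(len(f) != frame_len for f in frames):
        raise ValueError("all frames must have the same length")

    # Round the channel count up to a whole number of warps.
    padded_channels = -(-len(frames) // WARP_SIZE) * WARP_SIZE

    flat = [pad_value] * (frame_len * padded_channels)
    for c, frame in enumerate(frames):
        for s, sample in enumerate(frame):
            flat[s * padded_channels + c] = sample
    return flat, padded_channels
```

The padding slots hold dummy samples; a real kernel would either ignore
them or compute on them and discard the result.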

For the EC there is an added problem: you need to access the CUDA
API from kernel space, and I haven't figured out how to do that yet.
(OSLEC is rather easy to test in user space.)

The best solution I found is to have a daemon-like process that accepts
transcoding requests coming from Asterisk (or elsewhere) and handles
all scheduling to/from the CUDA API, creating work packages
that are tailored to the GPU(s) you have.
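
As an illustration only (this is not the actual daemon), the idea can be
sketched in Python: many caller threads submit frames, and a single
worker thread, the only one that would ever touch the GPU, collects them
into batches. TranscodeDispatcher, max_batch, max_wait and
reverse_batch are hypothetical names; reverse_batch is a stand-in
for the real CUDA call and does no transcoding.

```python
import queue
import threading


def reverse_batch(frames):
    """Stand-in for the GPU call: returns each frame byte-reversed."""
    return [bytes(reversed(f)) for f in frames]


class TranscodeDispatcher:
    """Funnels requests from many threads to one worker thread."""

    def __init__(self, transcode_batch, max_batch=64, max_wait=0.005):
        self._transcode_batch = transcode_batch
        self._max_batch = max_batch
        self._max_wait = max_wait  # seconds to wait for each extra request
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, frame):
        """Callable from any thread; blocks until the result is ready."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((frame, reply))
        return reply.get()

    def close(self):
        """Ask the worker to stop and wait for it to finish."""
        self._requests.put(None)
        self._worker.join()

    def _run(self):
        while True:
            item = self._requests.get()  # block until there is work
            if item is None:
                return
            batch = [item]
            stop = False
            # Keep collecting until the batch is full or the queue goes quiet.
            while len(batch) < self._max_batch:
                try:
                    nxt = self._requests.get(timeout=self._max_wait)
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            # The only place where the GPU would be used, always on this thread.
            results = self._transcode_batch([frame for frame, _ in batch])
            for (_, reply), out in zip(batch, results):
                reply.put(out)
            if stop:
                return
```

A caller would do something like d = TranscodeDispatcher(reverse_batch),
then d.submit(frame) from each channel thread and d.close() at shutdown.
The sketch has no error handling: if the batch function raises, the
waiting callers are never answered.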
There is a "break-even point", i.e. the overhead added by the daemon
+ CUDA API is rather large if you only have to handle a small number of
channels, but as the number of channels increases it gets much better.
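
A rough way to reason about that break-even point is a simple linear
cost model: the CPU path costs a fixed amount per channel, while the GPU
path pays a fixed overhead per batch plus a smaller amount per channel.
The sketch below assumes this model. The function names are hypothetical
and the numbers in the usage note are invented for illustration, not
measurements.

```python
def cpu_cost(channels, cpu_per_channel):
    """Time to transcode all channels on the CPU (same unit as the inputs)."""
    return channels * cpu_per_channel


def gpu_cost(channels, overhead, gpu_per_channel):
    """Time on the GPU path: fixed daemon + CUDA overhead, plus per-channel work."""
    return overhead + channels * gpu_per_channel


def break_even_channels(overhead, cpu_per_channel, gpu_per_channel):
    """Channel count at which both paths cost the same.

    Above this count the GPU path is cheaper under the linear model.
    """
    if gpu_per_channel >= cpu_per_channel:
        raise ValueError("GPU path never catches up under this model")
    return overhead / (cpu_per_channel - gpu_per_channel)
```

With made-up figures of 2000 us overhead, 100 us per channel on the CPU
and 20 us per channel on the GPU, break_even_channels(2000, 100, 20)
puts the crossover at 25 channels: below that the daemon is a net loss,
above it the gain grows with every added channel.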

-- 
Stelios S. Koroneos

Digital OPSiS - Embedded Intelligence

Tel +30 210 9858296 Ext 100
Fax +30 210 9858298
http://www.digital-opsis.com