[asterisk-dev] OpenCL for improved performance transcoding and conferencing

Wed Sep 22 05:05:34 CDT 2010

  On 09/22/2010 12:48 PM, Chris Coleman wrote:
> Hey Asterisk developers,
>
> I love Asterisk, thanks for making it.
>
> Just got my first system up and running, I'm on PBX-in-a-Flash 1.7.5.5
> on FreePBX 2.8.0.3 on Asterisk 1.4.35.
>
> The biggest drain on CPU cycles in Asterisk appears to be transcoding
> and conferencing.
>
> I'm connected to the internet on a deluxe DSL connection with only
> 640Kbps of upload bandwidth.
>
> To reserve as much bandwidth as possible for browsing, email, and
> attachments, I want to run VOIP codecs that consume around 16-20 Kbps
> per channel instead of 64 Kbps, with up to 4 simultaneous channels
> active (going up to sipgate and Google Voice)... a total of 64-80 Kbps
> instead of 256 Kbps.
>
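The bandwidth figures quoted above can be checked with a few lines of Python. This is only a minimal sketch: it assumes the per-channel rates are codec payload bitrates and ignores IP/UDP/RTP header overhead, as the estimate in the post does. The function name `total_kbps` is made up for illustration.

```python
# Sketch of the bandwidth estimate quoted above. Assumption: the rates are
# codec payload bitrates only; IP/UDP/RTP header overhead is not counted.

def total_kbps(channels: int, per_channel_kbps: int) -> int:
    """Aggregate upstream bandwidth for a number of simultaneous calls."""
    return channels * per_channel_kbps

UPLINK_KBPS = 640   # DSL upload figure from the post
CHANNELS = 4        # simultaneous calls from the post

for label, rate in [("low-bitrate codec, low end", 16),
                    ("low-bitrate codec, high end", 20),
                    ("64 Kbps codec", 64)]:
    used = total_kbps(CHANNELS, rate)
    print(f"{label}: {used} Kbps used, {UPLINK_KBPS - used} Kbps left")
```

Real per-call usage on the wire will be higher than these payload-only totals once packet headers are included.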
> My hardware platform is running 24/7 so it's a 35-40W energy-saving
> Intel Atom-based PBX server on mini-ITX, and you reach the limit of
> simultaneous transcoded calls pretty quickly.
>
> A quick search shows that OpenCL has been available on Linux for about
> a year now.
>
> You probably already know very well about OpenCL -- the open standard
> Open Computing Language, originally from Apple, that lets you use any
> supported general-purpose GPU for much faster math computing...
Sometimes faster. Often slower. It depends how well the problem fits the 
hardware.
> Dual-core Atom D510 mini-ITX motherboards with the OpenCL-compatible GPU
> Nvidia G210 (ION) are available for a slight cost increase over boards
> without the ION. For example: the Jetway NC98.
>
> Theoretically the GPU is 10x more efficient at transcoding math than
> the CPU... so we could get up to 10x as many transcoded and
> conferenced channels...
>
> ....resulting in lower energy consumption, and higher number of
> transcoded + conferenced channels, per PBX server, before hitting the
> limit of CPU math processing horsepower.
CUDA is far more mature than OpenCL right now. Have a look around the 
internet at attempts to use CUDA to accelerate audio and speech codecs. 
The results aren't good. I have only seen one attempt - a poorly 
performing MP3 encoder - where the developers didn't get disheartened 
and abandon work before completion.
> Does anyone else think it'd be brilliant of the * developers to update
> the transcoding and conferencing code for 2010, to use OpenCL (which
> uses any available, compatible GPU) for absolutely awesome performance ??
>
>
Awesome? Really? You have supporting evidence?

The nVidia Fermi may change things quite a bit, as it is a more 
general-purpose compute engine than the earlier GPUs. I still haven't seen 
anything impressive for any computation in the ballpark of a speech 
codec, though. I bought a GTX460 card a couple of weeks ago to do some 
experiments, but I'm struggling to find the time to work on it.

The Toshiba SpursEngine card from Leadtek is certainly capable of 
accelerating speech codecs. The now-defunct HowlerTech company was using 
them. Not too expensive. Fairly low power. The snag is they seem to be a 
dead end. Leadtek supply a 32-bit Linux SDK, but say a 64-bit SDK will 
not appear. To me, that says abandonware.

Steve