[asterisk-dev] OpenCL for improved performance, transcoding and conferencing

Chris Coleman chris at espacenetworks.com
Fri Sep 24 23:34:44 CDT 2010


> Message: 1
> Date: Fri, 24 Sep 2010 21:27:34 +0800
> From: Steve Underwood <steveu at coppice.org>
> Subject: Re: [asterisk-dev] OpenCL for improved performance
> 	transcoding and conferencing
> To: asterisk-dev at lists.digium.com
> Message-ID: <4C9CA746.6020309 at coppice.org>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>    On 09/24/2010 06:02 PM, Chris Coleman wrote:
>    
>> Steve, thanks for the input.
>>
>> You encouraged me to delve deeper.
>>
>> So, I did, and have some good news.
>>
>> There is a company in the UK that makes and sells EXACTLY the kind of
>> thing I'm talking about.
>>
>> It is a general purpose GPU, on a PCIe card, with a module for asterisk,
>> made to accelerate and offload computation for transcoding and
>> conferencing !!
>>
>> The general-purpose GPU it uses is the IBM Cell processor, the same
>> chip as in the Playstation 3.
>>
>> They talk about power savings, and supporting something like 460 channels
>> of transcoding, e.g. from GSM to G.729, without bringing the CPU to its
>> knees transcoding the audio, because the GPU is SO MUCH better suited to
>> the math of transcoding.
>>
>> Here is the source I'm quoting:
>>
>> http://www.youtube.com/watch?v=0dnFD_vaJ6s
>>
>> Would like to have the opinion of the group.
>>
>> Maybe someone feels up to the challenge of implementing some test code....
>>      
> Howler are out of business, but they didn't make that card. It's
> available from Leadtek. The Windows and Linux SDK is free, and you can
> download it and experiment with the potential of the Cell processor for
> speeding up algorithms. I bought one a few months ago to experiment
> with, and it's fairly easy to achieve interesting levels of performance.
> Sadly...
>
> - the Linux SDK is 32 bit only
>
> - a 64 bit Linux SDK will not be made available
>
> - the kernel driver module is supplied as object code, so it can only be
> run with supported kernels (a couple of RHEL/Centos revisions)
>
> - source code is not available for most of the SDK, so 64 bit support
> can't be developed by the user.
>
> So, at the end of the day, the whole thing looks like a dead end.
>
> The Cell is *nothing* like an nVidia or ATI GPU. It is a far more
> general purpose compute engine. It's much closer to the currently stalled
> Larrabee project at Intel. It is a very good platform for things like
> G.729. A quad core Xeon can easily do more G.729 channels than the Cell
> based chip (actually a Toshiba SpursEngine chip) on these cards.
> However, the card takes <20 W, and working alongside the main quad core
> CPU it is capable of achieving a pretty reasonable balance.
>
> Steve
>    

Steve, again I really appreciate the insight.

It sounds like this Leadtek board I discovered is the same one you'd 
been referring to.  Good stuff...



Then I had a question: just how much more math performance do you get on 
the ION GPU vs. the CPU?



A quick search turns up roughly 2.1 GigaFLOPS for the Atom's built-in math 
unit, versus about 50 GigaFLOPS for the DirectX 10.1/CUDA/OpenCL-enabled 
GT218 graphics chip, aka the Nvidia ION GP-GPU.

At its 1.66 GHz clock speed, that means the Atom can crunch only about 
1-1.5 floating-point operations per clock cycle, which is why the Atom is 
so easy to saturate and bog down when transcoding and/or conferencing.

The ION can crunch about 30 per CPU clock cycle... freeing up the CPU to 
do other work.
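The per-clock figures above work out as a simple division of peak GFLOPS by clock speed. Here's a quick back-of-the-envelope check using the numbers quoted in this thread (peak marketing figures, not measured throughput):

```python
# Back-of-the-envelope check of the FLOPS-per-clock claims above.
# The GFLOPS and clock figures are the ones quoted in this thread,
# i.e. peak numbers, not measurements.

ATOM_GFLOPS = 2.1      # Atom built-in math unit, peak
ION_GFLOPS = 50.0      # GT218 / Nvidia ION, peak
ATOM_CLOCK_GHZ = 1.66  # Atom clock speed

def flops_per_cycle(gflops, clock_ghz):
    """Peak floating-point operations completed per CPU clock cycle."""
    return gflops / clock_ghz

print(flops_per_cycle(ATOM_GFLOPS, ATOM_CLOCK_GHZ))  # ~1.3 per cycle
print(flops_per_cycle(ION_GFLOPS, ATOM_CLOCK_GHZ))   # ~30 per cycle
```

So the "about 30 per CPU clock" claim is just 50 GFLOPS divided by the Atom's 1.66 GHz clock.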

That 50 GigaFLOPS of the ION (or of any compatible compute device that 
OpenCL can detect) looks like a pretty darn attractive compute engine to 
tap into... and it would be a waste of computing resources, energy, and 
carbon footprint NOT to.
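Detecting those OpenCL-capable devices is the first step any offload module would take. Below is a minimal sketch of that enumeration step, assuming the third-party PyOpenCL binding; the helper name `list_opencl_devices` is mine, not anything from Asterisk, and the function simply returns an empty list when no OpenCL runtime or driver is present:

```python
# Sketch: enumerate every compute device OpenCL can see, with its
# advertised parallelism, via the third-party PyOpenCL binding.
# Degrades to an empty list when PyOpenCL or an OpenCL driver is absent.

def list_opencl_devices():
    try:
        import pyopencl as cl
    except ImportError:
        return []  # PyOpenCL not installed
    devices = []
    try:
        for platform in cl.get_platforms():
            for dev in platform.get_devices():
                devices.append({
                    "name": dev.name,
                    "compute_units": dev.max_compute_units,
                    "clock_mhz": dev.max_clock_frequency,
                })
    except cl.Error:
        pass  # no usable OpenCL platform/driver
    return devices

if __name__ == "__main__":
    for d in list_opencl_devices():
        print("{name}: {compute_units} compute units @ {clock_mhz} MHz".format(**d))
```

On an ION box this should report the GT218 alongside any CPU-side OpenCL implementation, which is exactly the "any compatible computing unit" detection mentioned above.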



I did a search and found that this EXACT issue was brought up 3.5 years 
ago on this list, in March 2007 -- using a GP-GPU for codecs/conferencing 
with the Nvidia 8800GT.

http://lists.digium.com/pipermail/asterisk-dev/2007-March/026431.html


I would be curious to see that code -- or a more updated version, using 
TODAY's GP-GPU libraries to talk to the ION, a GP-GPU chip two years 
newer -- and see how it performs.

Only then will we REALLY have the answer to the question: how much will 
the asterisk community benefit from the 50 GigaFLOPS of free GP-GPU 
horsepower offered by the Nvidia ION?

It's hard to know, until you try it out and see...




Chris



