[asterisk-users] Use the NEW ulaw/alaw codecs (slower, but cleaner)

Matthew Fredrickson creslin at digium.com
Mon Nov 17 17:54:26 CST 2008


Steve Underwood wrote:
> Matthew Fredrickson wrote:
>> Actually, with the way caching is done on nearly all modern processors, 
>> it is debatable whether or not a lookup table is the optimal way to do 
>> the conversion, at least for a codec as simple as ulaw or alaw. 
>> In fact, the time it takes to service a cache miss can easily ruin 
>> the performance of a single-element table lookup.  And if you have 
>> large tables (such as the linear to ulaw or alaw table), the tradeoff 
>> of having to service a cache miss versus a few cached instructions 
>> executing at native CPU clock speed makes it almost a no-brainer 
>> (IMHO).
>>
>> You'll pay a cache miss the first time you run the routine, but the 
>> instructions running the routine will take up much less CPU cache space 
>> than the lookup tables, decreasing the likelihood of them being evicted 
>> (whereas the lookup table, taking up a lot more space, has a much better 
>> chance of causing a cache miss whenever you access it).
>>
>> Obviously, if you're running on a CPU with no cache, a lookup table is 
>> a good way to do it.  I'm just saying that very few of the machines 
>> running Asterisk have processors without caches.
>>
>> Matthew Fredrickson
>> Digium, Inc.
>>   
> In spandsp I do the G.711 conversions algorithmically. Most modern 
> processors have a "where is the top 1" instruction, and that reduces the 
> calculations to something very fast. When I first did this it was a lot 
> slower than a lookup if I tested it on its own, but faster in a real 
> workload where the cache was working hard. That was in the days of 256k 
> caches, though. Now that the latest Intels have 12M, the picture may be 
> different. That 12M is L3 cache, which is a lot slower than the small L1 
> cache, but I suspect it may mean the lookup approach is as good as 
> calculation with any workload.
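
For anyone following along, here is a minimal sketch of what an
algorithmic linear to ulaw conversion built around a "where is the top 1"
operation can look like. It is not taken from spandsp or Asterisk; the
names top_bit(), linear_to_ulaw(), ULAW_BIAS and ULAW_CLIP are made up for
illustration, and it assumes GCC or Clang for __builtin_clz (with a plain
loop as the fallback elsewhere).

```c
#include <stdint.h>

#define ULAW_BIAS 0x84   /* 132, added before the segment is located */
#define ULAW_CLIP 32635  /* largest magnitude that survives the bias */

/* Index of the highest set bit; x must be non-zero. */
int top_bit(unsigned int x)
{
#if defined(__GNUC__) || defined(__clang__)
    /* Assumes a 32-bit unsigned int; typically one instruction. */
    return 31 - __builtin_clz(x);
#else
    int bit = 0;
    while (x >>= 1)
        bit++;
    return bit;
#endif
}

/* Encode one 16-bit linear sample as a ulaw byte, with no table. */
uint8_t linear_to_ulaw(int16_t pcm)
{
    int sample = pcm;
    int sign = 0;

    if (sample < 0) {
        sample = -sample;
        sign = 0x80;
    }
    if (sample > ULAW_CLIP)
        sample = ULAW_CLIP;
    sample += ULAW_BIAS;

    /* After the bias the top bit is between 7 and 14, so the
       segment number (exponent) is between 0 and 7. */
    int exponent = top_bit((unsigned int)sample) - 7;
    int mantissa = (sample >> (exponent + 3)) & 0x0F;

    /* ulaw bytes are stored inverted. */
    return (uint8_t)~(sign | (exponent << 4) | mantissa);
}
```

The whole routine is a handful of instructions and touches no data memory
beyond the sample itself, which is the property being argued about above.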

Or (in continuation of the email I just sent), the better the chances of 
it fitting in L1 (or even L2) cache, the quicker it's going to run :-) 
Maybe that's a better way to look at it.
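
To put rough numbers on the "fitting in cache" point, here is a small
sketch of the table-driven alternative. Everything in it is an assumption
for illustration: the table is indexed by the top 14 bits of the sample
(so the 2 low bits are dropped), the names are hypothetical, and the
64-byte cache line used in the comments is just a common figure, not a
measurement of any particular CPU.

```c
#include <stddef.h>
#include <stdint.h>

#define LIN2MU_INDEX_BITS 14                   /* assumed index width */
#define LIN2MU_ENTRIES (1u << LIN2MU_INDEX_BITS)

static uint8_t lin2mu_table[LIN2MU_ENTRIES];   /* 16384 bytes of data */

/* Reference encoder, used only to fill the table. */
static uint8_t encode_ulaw(int sample)
{
    int sign = 0;
    int exponent = 7;

    if (sample < 0) {
        sample = -sample;
        sign = 0x80;
    }
    if (sample > 32635)
        sample = 32635;
    sample += 0x84;

    /* Scan down from bit 14 to find the segment. */
    for (int mask = 0x4000; (sample & mask) == 0; mask >>= 1)
        exponent--;

    return (uint8_t)~(sign | (exponent << 4) |
                      ((sample >> (exponent + 3)) & 0x0F));
}

/* Fill the table once at startup. */
void build_lin2mu_table(void)
{
    for (unsigned int i = 0; i < LIN2MU_ENTRIES; i++) {
        int sample = (int)(i << 2);
        if (sample >= 32768)
            sample -= 65536;   /* upper half of the index is negative */
        lin2mu_table[i] = encode_ulaw(sample);
    }
}

/* One memory access per sample; fast only if that line is cached. */
uint8_t lookup_ulaw(int16_t sample)
{
    return lin2mu_table[(uint16_t)sample >> 2];
}

/* Number of cache lines a block of the given size spans. */
size_t cache_lines_spanned(size_t bytes, size_t line_size)
{
    return (bytes + line_size - 1) / line_size;
}
```

With 64-byte lines, cache_lines_spanned(sizeof lin2mu_table, 64) comes to
256 lines, any of which can be evicted between calls, whereas the code
for a calculated conversion would normally span only a few lines.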

Matthew Fredrickson
Digium, Inc.


