[asterisk-dev] Hoard Memory Allocator

Sun Jun 10 20:08:24 MST 2007

On Mon, 2007-06-04 at 10:23 -0500, Russell Bryant wrote:
> I saw this library mentioned in an article I read recently.  Here is a 
> copy/paste description from their web site:
> 
> http://www.hoard.org/
> 
> "The Hoard memory allocator is a fast, scalable, and memory-efficient 
> memory allocator. It runs on a variety of platforms, including Linux, 
> Solaris, and Windows. Hoard is a drop-in replacement for malloc() that 
> can dramatically improve application performance, especially for 
> multithreaded programs running on multiprocessors. No change to your 
> source is necessary. Just link it in or set just one environment 
> variable (see Using Hoard for more information)."
> 
> People running Asterisk on Solaris have reported significant scalability 
> improvements by using the mtmalloc library.  For example:
> 
> http://www.thrallingpenguin.com/articles/asterisk-solaris.htm
> 
> The Hoard web site claims that hoard is even better than mtmalloc.  If 
> anyone is interested in doing some load testing, I would really be 
> interested in seeing what kind of improvements this would do for an 
> Asterisk system.
> 

I would imagine that such a memory allocator would provide a marked
performance improvement, but it would be nice to see some benchmarks
against the Asterisk code base.
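
As a starting point before anyone load tests Asterisk itself, here is a
rough sketch of a multithreaded malloc()/free() stress loop.  It is not
Asterisk code and says nothing about Asterisk's real allocation pattern;
the names (run_alloc_benchmark, worker, BENCH_MAIN) and the sizes and
thread counts are all made up for illustration.  The idea is to run the
same binary once with the stock allocator and once with a replacement
allocator preloaded through the environment variable its documentation
describes, then compare wall-clock times.

```c
/* Hypothetical allocator stress test; all names and sizes are assumptions. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_THREADS 64
#define LIVE_SLOTS  64   /* blocks each thread keeps allocated at once */

struct worker_args {
    long iterations;     /* allocations performed by this thread */
    size_t max_size;     /* upper bound on a single allocation, in bytes */
    unsigned int seed;   /* per-thread state for a tiny LCG */
};

static void *worker(void *p)
{
    struct worker_args *a = p;
    unsigned char *live[LIVE_SLOTS] = { 0 };

    for (long i = 0; i < a->iterations; i++) {
        /* Simple linear congruential generator; no shared state. */
        a->seed = a->seed * 1103515245u + 12345u;
        size_t size = 1 + (size_t)(a->seed >> 16) % a->max_size;
        size_t slot = (size_t)i % LIVE_SLOTS;

        free(live[slot]);                 /* free(NULL) is a no-op */
        live[slot] = malloc(size);
        if (live[slot] != NULL) {
            memset(live[slot], 0xA5, size);   /* touch the memory */
            assert(live[slot][size - 1] == 0xA5);
        }
    }
    for (size_t s = 0; s < LIVE_SLOTS; s++)
        free(live[s]);
    return NULL;
}

/* Returns elapsed wall-clock seconds, or -1.0 on bad input or error. */
double run_alloc_benchmark(int nthreads, long iterations, size_t max_size)
{
    pthread_t tids[MAX_THREADS];
    struct worker_args args[MAX_THREADS];
    struct timespec start, end;
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS ||
        iterations < 0 || max_size == 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < nthreads; i++) {
        args[i].iterations = iterations;
        args[i].max_size = max_size;
        args[i].seed = 1u + (unsigned int)i;
        if (pthread_create(&tids[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (started != nthreads)
        return -1.0;
    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

#ifdef BENCH_MAIN
int main(int argc, char **argv)
{
    int nthreads = argc > 1 ? atoi(argv[1]) : 4;
    double secs = run_alloc_benchmark(nthreads, 1000000L, 512);

    if (secs < 0.0) {
        fprintf(stderr, "benchmark failed\n");
        return 1;
    }
    printf("%d threads: %.3f s\n", nthreads, secs);
    return 0;
}
#endif
```

Built with something like "gcc -O2 -pthread -DBENCH_MAIN bench.c", this
gives one number per thread count.  A synthetic loop like this mostly
measures lock contention inside the allocator, so it can only hint at
what a loaded Asterisk box would see.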

Along the same lines, it would be worth compiling Asterisk with
compilers built specifically for the underlying chip architecture.  By
that I mean a compiler such as those from Intel or Sun, which will
optimize more for the underlying architecture than plain gcc will.

I know there was some discussion at AstriDevCon about Asterisk on Sun
systems using Sun compilers.  Stephen Uhler could speak to this better
than I could.

Back when I was building Beowulf HPC clusters, we would see almost 100%
efficiency, in terms of FLOPS from a given CPU, when running HPL
compiled with Intel compilers on Xeon-based systems, versus 60-70% with
plain gcc on a Xeon system.

All sorts of games can be played when benchmarking code.  It really
comes down to whether a given architecture or compiler is a better fit
for your code and for what you're trying to accomplish.

Additionally, at some point you reach diminishing returns.  The
question has to be asked: is it more economical to throw more/faster
hardware at the problem than to spend hundreds of hours trying to
super-optimize the code?

In the end, you have to pick the right tool(s) for the job and ask if
the desired result(s) can be obtained from the given code base on the
given hardware.

-Curt