[Asterisk-Dev] potentially nasty bugs in string pools usage

Luigi Rizzo rizzo at icir.org
Fri Jan 6 12:22:17 MST 2006


On Fri, Jan 06, 2006 at 11:44:50AM -0600, Kevin P. Fleming wrote:
> Luigi Rizzo wrote:
> 
> > To come to the problem: i kind of understand the motivation for
> > using a single pool -- this way, one does not have to call
> > a malloc() for each string in a large object e.g. sip_pvt and the like.
> 
> Right, and we also don't have to waste 75% of the structure allocated 
> space in the normal case where not all the fields are used or are 
> populated with shorter strings.

FWIW, for that you could just use regular malloc'ed strings.

> >  - at least in the current implementation, setting a string field
> >    may cause a realloc(), which in turn may change the base addresses
> >    of all string fields in the group, which in turn means that
> >    we cannot safely use these string fields by reference if
> >    something calls ast_string_field_set() in the middle.
> 
> Agreed. I will work on a solution for this... I have some thoughts on
> ways we can avoid it.

ok. but please also write down some commments on who is allowed to work
on the allocator and when. With malloc, you rely on malloc's implementation;
with a different allocator, the allocator needs to rely on some other lock,
but this cannot be the structure's lock because not all structs have one,
and even for those who do, it is not clear (and not documented) which
strings are protected by the lock and which are not.

> How can you avoid calling snprintf twice? There is no way to have any 

of course sometimes you can't, so you do what you have to do.
But anyways, as long as you don't evaluate the arguments twice,
this is only a performance issue.

For performance, you could just vsnprintf() into a reasonably large
local buffer (say 1k instead of 1 byte as you do now), and then
copy the result. The copy is a lot cheaper than re-parsing the
format string and running expensive conversions e.g. from numbers
to digits (anyways, rerunning the snprintf involves copying as
well).  Probably a size of 1024 will cover 99% of the cases.

If you overflow the local buffer, you can still decide to re-run
the vsnprintf, or truncate the string, on a case-by-case fashion,
by having an extra 'size' argument in the function. 0 or some special
value could be used for 'unbounded', 

BTW i don't understad why you want to use a macro for
ast_string_field_index_build() when you can just make it a var-adic
function which then calls vsnprintf as many times as needed.
You have to go through the var-adic function anyways.

See chan_sip.c::append_history_full() for an example, or the ast_log() stuff.

> > Given all the above, i do not feel very safe in using a single pool
> > of memory for all strings.
> 
> It's not a single pool for all strings, it's a pool per structure instance.

i know, i did mean for structure instance.

> For a short time it was implemented this way while I worked on the API 
> itself; it worked, but it was slower and obviously used more memory. I'd 

i don't buy how it individually malloc'ed strings could use more
memory than a single pool. unless the malloc in your system has an
unreasonably large minimum chunk size. Can you explain that ?

> rather try to work out the issues with the current implementation, since 
> this is a development branch it's allowed to be 'broken' while we find 
> the best solution.

i wish this statement would apply even to my patches with non-doxigen
comments :) http://bugs.digium.com/view.php?id=6101

But seriously: as usual, this is your baby so you pick the strategy
you see fit.

I was just commenting that since some of the bugs introduced by this
change could be rather obscure (see the realloc one and the locking
issues), personally i would first try to push the new interface in
as many places as possible, and at a later time (or in parallel,
if you like) address the performance issues underneath.
I'd be happy with a 'safe_stringfield' mode where strings are
malloced individually, and a default mode where you use your
allocator. So when a bug comes out you try to see if it is also
present in the safe mode and this can help a lot in debugging.

	cheers
	luigi



More information about the asterisk-dev mailing list