[Asterisk-Dev] potentially nasty bugs in string pools usage

Luigi Rizzo rizzo at icir.org
Thu Jan 5 05:28:44 MST 2006


first of all, despite the subject, i think that the new approach
to handle string is a great step forward, because it hides the
implementation details and a lot of trouble in playing with fixed
or variable-size strings.

What i don't understand/like is the mechanism used for handling
storage, i.e. the use of a single pool of memory for all strings
in the same object, because it has the potential of causing extremely
nasty and subtle bugs in the code, as i will show below.

Nevertheless, because the approach hides implementation
details, i think this can be fixed with reasonable ease.

To come to the problem: i kind of understand the motivation for
using a single pool -- this way, one does not have to call
a malloc() for each string in a large object e.g. sip_pvt and the like.

However, look at the potentially nasty side effects:

 - at least in the current implementation, setting a string field
   may cause a realloc(), which in turn may change the base addresses
   of all string fields in the group, which in turn means that
   we cannot safely use these string fields by reference if
   something calls ast_string_field_set() in the middle.

	For instance, this specific bug is present in
	chan_sip.c::initreqprep(), where at some point the
	code saves a pointer l = p->fromuser, but then
	we can have a call to ast_string_field_set(p, fromname, n)
	thus invalidating the previous value of l, which however
	is used a few lines below.

   I haven't checked if there are other places like this, but this was
   just an example to show that even the author of this API will
   have a hard time using it because of this kind of subtle side effects;

 - at least in the current implementation, ast_string_field_index_build()
   ends up calling s[n]printf() twice, which is very prone to
   errors e.g. when computing the arguments has side effects,
   e.g. x++ . Additionally, s[n]printf() is an expensive function
   so one would rather not have to pay the price twice.
   This can be optimized a bit (e.g. by keeping track, for each
   string field, of the available buffer space, and trying to reuse
   it if possible, so only one s[n]printf is needed.
   But the problem with side effects is actually aggravated because
   at this point one wouldn't know if the argument has been evaluated
   once or twice.

 - at least in the current implementation, freeing a string
   does not allow to recover the storage (this can be overcome,
   but i doubt we want to do that);

Given all the above, i do not feel very safe in using a single pool
of memory for all strings.
As a first step, I'd rather change the allocator so that
each string is allocated independently, and pay the price for the
extra malloc's() while the new macros start to get used in the code.
At a later time, we can always replace the allocator underneath with
one that may use pools if needed, but trying to address the
correctness problem related to realloc and ast_string_field_index_build(),
and maybe even to memory efficiency if we really want to.

comments welcome, as usual.

cheers
luigi



More information about the asterisk-dev mailing list