[Asterisk-Dev] potentially nasty bugs in string pools usage
Kevin P. Fleming
kpfleming at digium.com
Fri Jan 6 10:44:50 MST 2006
Luigi Rizzo wrote:
> To come to the problem: i kind of understand the motivation for
> using a single pool -- this way, one does not have to call
> a malloc() for each string in a large object e.g. sip_pvt and the like.
Right, and we also don't have to waste 75% of the structure allocated
space in the normal case where not all the fields are used or are
populated with shorter strings.
> - at least in the current implementation, setting a string field
> may cause a realloc(), which in turn may change the base addresses
> of all string fields in the group, which in turn means that
> we cannot safely use these string fields by reference if
> something calls ast_string_field_set() in the middle.
Agreed. I will work on a solution for this... I have some thoughts on
ways we can avoid it.
> - at least in the current implementation, ast_string_field_index_build()
> ends up calling s[n]printf() twice, which is very prone to
> errors e.g. when computing the arguments has side effects,
> e.g. x++ . Additionally, s[n]printf() is an expensive function
> so one would rather not have to pay the price twice.
> This can be optimized a bit (e.g. by keeping track, for each
> string field, of the available buffer space, and trying to reuse
> it if possible, so only one s[n]printf is needed.
> But the problem with side effects is actually aggravated because
> at this point one wouldn't know if the argument has been evaluated
> once or twice.
This is very true, but the problem can be solved by making the macro
pass the arguments to a function instead, so they won't be evaluated
more than once. I'll take care of that in a few minutes.
How can you avoid calling snprintf twice? There is no way to have any
clue how much space the resulting formatted string is going to require.
Knowing how much space the string field is already using won't help,
because it might be the first time it has been set. I could see how
storing the known buffer size could help in the sense that we could call
snprintf() on the existing buffer and hope it works out... I'll think
about that.
> - at least in the current implementation, freeing a string
> does not allow to recover the storage (this can be overcome,
> but i doubt we want to do that);
Correct. The vast majority of the usage of this API will be for strings
that are set once (or twice) in the lifetime of the structure.
> Given all the above, i do not feel very safe in using a single pool
> of memory for all strings.
It's not a single pool for all strings, it's a pool per structure instance.
> As a first step, I'd rather change the allocator so that
> each string is allocated independently, and pay the price for the
> extra malloc's() while the new macros start to get used in the code.
> At a later time, we can always replace the allocator underneath with
> one that may use pools if needed, but trying to address the
> correctness problem related to realloc and ast_string_field_index_build(),
> and maybe even to memory efficiency if we really want to.
For a short time it was implemented this way while I worked on the API
itself; it worked, but it was slower and obviously used more memory. I'd
rather try to work out the issues with the current implementation, since
this is a development branch it's allowed to be 'broken' while we find
the best solution.
More information about the asterisk-dev
mailing list