[Asterisk-Dev] potentially nasty bugs in string pools usage
Steve Murphy
murf at e-tools.com
Thu Jan 5 06:13:17 MST 2006
On Thu, 2006-01-05 at 05:51 -0700, asterisk-dev-request at lists.digium.com
wrote:
> first of all, despite the subject, i think that the new approach
> to handle string is a great step forward, because it hides the
> implementation details and a lot of trouble in playing with fixed
> or variable-size strings.
>
> What i don't understand/like is the mechanism used for handling
> storage, i.e. the use of a single pool of memory for all strings
> in the same object, because it has the potential of causing extremely
> nasty and subtle bugs in the code, as i will show below.
>
> Nevertheless, because the approach hides implementation
> details, i think this can be fixed with reasonable ease.
>
> To come to the problem: i kind of understand the motivation for
> using a single pool -- this way, one does not have to call
> a malloc() for each string in a large object e.g. sip_pvt and the
> like.
>
> However, look at the potentially nasty side effects:
>
> - at least in the current implementation, setting a string field
> may cause a realloc(), which in turn may change the base addresses
> of all string fields in the group, which in turn means that
> we cannot safely use these string fields by reference if
> something calls ast_string_field_set() in the middle.
>
> For instance, this specific bug is present in
> chan_sip.c::initreqprep(), where at some point the
> code saves a pointer l = p->fromuser, but then
> we can have a call to ast_string_field_set(p, fromname, n)
> thus invalidating the previous value of l, which however
> is used a few lines below.
>
> I haven't checked if there are other places like this, but this was
> just an example to show that even the author of this API will
> have a hard time using it because of this kind of subtle side
> effects;
>
> - at least in the current implementation,
> ast_string_field_index_build()
> ends up calling s[n]printf() twice, which is very prone to
> errors e.g. when computing the arguments has side effects,
> e.g. x++ . Additionally, s[n]printf() is an expensive function
> so one would rather not have to pay the price twice.
> This can be optimized a bit (e.g. by keeping track, for each
> string field, of the available buffer space, and trying to reuse
> it if possible, so only one s[n]printf is needed.
> But the problem with side effects is actually aggravated because
> at this point one wouldn't know if the argument has been evaluated
> once or twice.
>
> - at least in the current implementation, freeing a string
> does not allow to recover the storage (this can be overcome,
> but i doubt we want to do that);
>
> Given all the above, i do not feel very safe in using a single pool
> of memory for all strings.
> As a first step, I'd rather change the allocator so that
> each string is allocated independently, and pay the price for the
> extra malloc's() while the new macros start to get used in the code.
> At a later time, we can always replace the allocator underneath with
> one that may use pools if needed, but trying to address the
> correctness problem related to realloc and
> ast_string_field_index_build(),
> and maybe even to memory efficiency if we really want to.
I've written code that uses a central pool for string storage. The pool
area is a linked list of 1 meg memory blocks. Each string is registered
via a hash table. If a new string to be entered is bigger than what will
fit in the memory pool, a new block is allocated, and it is put there.
No realloc, no changing of all the string addresses. And not that hard
to implement.
In the application I did this for, I found a 10:1 ratio of commonality
of strings entered. Thus, the savings on heap space of about 90%... your
mileage may vary considerably, but in general, string pools like this
are great, they reduce memory requirements, they speed things up by not
calling malloc/free as much, they reduce the complexity of your
interface, as far as copying a string vs. storing the address of the
string type issues.
murf
--
Steve Murphy <murf at e-tools.com>
Electronic Tools Company
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 3317 bytes
Desc: not available
Url : http://lists.digium.com/pipermail/asterisk-dev/attachments/20060105/6c970eee/smime.bin
More information about the asterisk-dev
mailing list