[asterisk-dev] A problem with Hash Tables...

Sat Sep 8 02:04:43 CDT 2007

On Fri, Sep 07, 2007 at 09:53:39PM -0600, Steve Murphy wrote:
> Luigi--
> 
> I have a big Hash Table problem, and your ast_obj2 is part of it! ;)
> You see, I've developed a competing Hash table implementation (I call
> them hashtabs), and now I have to decide whether to scrap mine, merge
> the neat stuff in mine into yours, or vice-versa, or let the two coexist
> peaceably.
> So, perhaps you can help me with this baffling situation!

Hi,
first of all, do you have a pointer to your hashtab code (is it in
the datastructs branch ?), and especially to the performance test
code where astobj2 is much slower, so we can see where is the
performance problem).

The main goal of astobj2 was to be as safe as possible in presence
of concurrent usage - this in turn focused on two parts:
1. making sure, as much as possible, that we never follow a stale
   pointer (as it could happen if e.g.  objects are deleted from a
   container or destroyed by another thread);
2. reducing the risk of deadlock, by not requiring to grab object locks
   on certain operations, e.g. when using iterators.

Refcounting is a key feature to implement these two features, so
it is certainly something that we should not lose, even if expensive
(because it involves atomic ops, each of which costs maybe 200 clock
cycles or more, as it has to punch through all caches).

Coming to the limitations of astobj2 that you mention,
i think they can be retrofitted relatively easily:
- Container resizing is relatively straightforward and you are
  probably right that is is a useful function to have - even if
  astobj2 objects cannot be resized, one can always allocate the
  bucket array independently, same as it is done for individual
  bucket entries). Indeed, this is definitely one thing worth doing.
- certainly the use of doubly linked lists for
  buckets is straightforward, and it will improve delete performance.
  But, given the memory overhead it brings in, I would like to
  see some performance numbers to see how much it helps.

As for iterators and the list of all objects, the issue is
a lot more complex and i am not sure that your hashtabs are correct.

Again, astobj2 has been designed so that iterators
do not depend on the status of the container, nor they need to
hold any lock while they exist. If, as in your case

http://svn.digium.com/view/asterisk/team/murf/datastructs/include/asterisk/hashtab.h?revision=80538&view=markup

the iterator has only a pointer to the bucket, you are in trouble
because the bucket could disappear from the container, and the iterator
does not know, and next time you use it you will reference invalid memory
(in fact, i wonder how you did not hit this problem in your tests).

This is the reason why astobj2 iterators are more complex, and have
object id and container version number, and only point to the current
object (and even that, protected by a version number!),
never to the 'next' one whose fate is unknown, nor to a 'bucket'
which is a container element which also may disappear while we
are busy on one object.

Of course this is more expensive to manage, and if one object is removed
from a cointainer while an iterator is on it, finding the next one
is more complex. This is not to say that our technique based on
version numbers is the only possible one - we could do something
more effective, but still expensive.

All of this said, i am certainly very interested to understand why
astobj2 performance is so much worse in this case you mention below
(of course to be repeated once you have fixed the vulnerability in
your code, otherwise there is nothing to compare):

> Mine is noticeably faster in some situations. For a test set of 10
> independent threads, each doing 100,000 random operations on the hashtab
> (which slowly builds from 0 to ~220K objects), mine takes about 2.17
> sec, and ast_obj2 takes about 3.32 sec. With a 1% mix of complete
> traversals, mine goes to 52 secs, and ast_obj2 takes about 15 minutes. I

and certainly open to suggestions on how to implement resizable containers
(considering the impact this operation may have on iterators).

	cheers
	luigi

[1] a note about iterators (which is probably worth adding to the
    source code, as it is tricky to understand in some parts)

In astobj2, we designed iterators so they can be manipulated without
holding any lock on the container or creating a refcount to it.
The rule of the game are that in order to create an iterator you
need a reference to a container, but other than that you can dispose
of the iterator without telling anyone; and that when ao2_iterator_next()
returns an object, it also gives you a refcount to the object.

In order to implement this, iterators must be very careful in playing
with pointers. They can store a pointer to the container as it will not
disappear as long as we keep a reference to it. They can store a pointer
to the current object, _but_ this must be protected by a "container version"
field to make sure that we reference it only if we are 100% sure that
the pointer is still valid. Furthermore, this requires locking/unlocking
the cointaner every time we iterate on it.

When the current object is removed from the container (and as far
as i can tell, this is a case that your code is unable to handle),
astobj2 cannot follow pointers anymore, and can only scan the list
in the current bucket (or the full list, if we decide to implement
one) to find which element was next in the container. This is where
the 'object_id' comes handy. This is an expensive operation, hopefully
an infrequent one (because this sort of operation can be done more
efficiently using ao2_callback()), but having a double linked
list is of no help here - what we would need to reduce complexity
would be a tree where, given an object_id, we can find in O(log N)
time the next object.

An alternatives to make iterators more efficient after a deletion
of the current object would be to give iterators a refcount to the
current bucket entry, so even if the object is deleted from the
container, the bucket entry remains alive. But this
would require iterators to be explicitly released.
I am surely interested in evaluating this approach if it can be
beneficial in terms of performance - it still requires
to lock/unlock the container on each ao2_iterator_next(),
but removes the need for a list traversal in case the current
object is deleted, and perhaps removes the need for the
'object_id' aka 'version' field in the iterator.

	cheers again
	luigi