[asterisk-dev] bug 6002, new Algorithm for merge_contexts_and_delete

Thu Mar 1 13:19:50 MST 2007

On Thu, Mar 01, 2007 at 12:01:52PM -0700, Steve Murphy wrote:
> I want folks to review this before I implement it. Hate to do all this
> work,
> and end up with a lemon.
> 
> Please read bug 6002, which was introduced by rizzo last year, and try
> to understand the shortcomings of the current merge_contexts_and_delete,
> when it comes to regcontext; and see if the below algorithm will meet
> the needs.
> 
> I've posted the text below to bug 6002, but I think it would be good to
> have the community as a whole review the proposed algorithm, and let me
> know if it's flawed.
> 
> (Also, it won't hurt to look at bug 8574-- Call processing stops while
> reloading
> large dialpan)
> 
> OK, been looking at the code. I have to add one more requirement to
> merge_contexts_and_delete: that it hold the locks for less than a frame
> time. Since the freeing and destruction of list elements is the most
> time-consuming part of the operation, (or, at least, WAS), I propose
> this algorithm:
> 
> 4 lists are involved:
>     1. the list of contexts to merge into the dialplan (extcontexts)
>     2. the existing contexts (contexts)
>     3. a list containing extens to free
>     4. a list containing just contexts to free
> 
> The algorithm would go something like this:
>     1. get the conlock & hintlock
>     2. preserve the watchers as before
>     3. traverse the dp, and unlink all exten/prio that match registrar.
> Do Not
>        remove any contexts (yet). Unlink them from the contexts list,
> and
>        link them instead to the list of extens to free.
>     4. traverse the dp again, and for any empty contexts, that match
> registrar,
>        unlink from contexts, and link to the contexts to free list. this
> and #3
>        might be tied into a single traversal.
>     5. Now, for each context in extcontexts, search for a match in
> contexts.
>          (THIS MAKES ME NERVOUS IF THE DP IS BIG!)
>        if found:
>          either the context or something in it has a different
> registrar.
>          go thru the contexts entry, and relink the exten/prios into the
>          matching extcontext's entry. If there are exten
>          collisions, (THIS SEARCH MAKES ME NERVOUS IF THE DP IS BIG!) 
>          then take all the contexts prios for that exten, and insert
> them into
>          the collided extcontexts exten. Issue a warning only if there
> are
>          collisions. Keep the extcontexts version. After all the prios
> are
>          merged, then put the contexts exten into the free list.
> 
>          Now, Move the now empty contexts' context into the free context
> list.
>          Then move the merged context into the contexts list from the
>          extcontext list. 
>        if not found:
>          link the context from extcontexts to contexts. This is the
> "quick and 
>          easy" path.
> 
>     6. Restore the watchers, as is now being done.
>     
>     7. Unlock the above locks,
> 
>     8. Destroy the stuff in the to-be-freed lists
> 
>     9. Return.
> 
> This will keep the regcontexts. If the dp is big, or the extcontexts
> big, this
> operation will run dangerously slow! Having O(1) search times for all 3
> types of search (context, exten, prio) would be a big plus.
> 
> Will this be sufficient? The key is getting part 5 right.

pardon me if i am too generic but have not looked at the actual code for
a while.

It seems that the changes to the code are a bit extensive so it would
be a good thing to do them the right way, and for this i mean
keep the lock for a constant time independently of the size of the
data structure.

The general approach that i have seen used in these cases (and i
mention it because the astobj2 code i have been developing last
year already implements a lot of this, so it might be something
reusable):

+ add reference counts to individual objects to make sure that
  deletions can be done without causing panics;

+ add a version number to the high-level structures that you want
  to manipulate (either the whole dialplan or just a single context).
  Let's assume a single context is our high-level data structure.

+ when you want to 'update' a context, don't keep it locked while you
  perform a potentially long update with traversals and removals.
  Rather, do the following:
   snapshot(context_name) {
        dialplan.lock()
        head = dialplan.head(context_name);
        version = dialplan.version(context_name);
        dialplan.unlock();
        return <head, version>
   }

   replace(context_name, old_version, new_list) {
        dialplan.lock();
        if (old_version != dialplan.version(context_name)) {
                dialplan.unlock();
                free(new_list);
                return FAIL;
        }
        old_head = dialplan.head(context_name);
        dialplan.head(context_name) = new_list;
        dialplan.version(context_name) += 1;
        dialplan.unlock();
        free(old_head);
        return SUCCESS;
   }

   merge_contexts_and_delete(context_name, arguments) {
        for (i = 0; i < MAX_RETRIES; i++) {
                <head, version> = snapshot(context_name);
                temp = make_copy(head)
                new_list = expensive_merge(temp, arguments);
                if (replace(context_name, version, new_list) == SUCCESS)
                        break;
        }
   }

As you see the expensive op is outside the lock, and you let the
calls proceed while an update is in progress.

Note that, as i said, astobj2() already implements refcounts so if
you want to use it to implement the individual lists of entries
that make up a context, that is relatively easy. In fact, it even
supports navigation on lists that change dynamically.
All the rest (version numbers etc.) is rather trivial to implement.

The length of a critical section here is O(1) and the retries only
occur when there are concurrent requests to update a dialplan,
which i don't think occur too often (i may be wrong, but probably
this is limited to the regexten case which involve single entries
so it is likely that we can make special cases for these situations.

how does it sounds ?

	cheers
	luigi