[asterisk-dev] Dialplan proposal-- Killing bug 6002
Steve Murphy
murf at digium.com
Tue Feb 26 12:49:30 CST 2008
For those interested in core dialplan issues, read on...
Bug 6002, submitted ages (aeons) ago by Luigi, reports that
duplicate contexts could end up in the dialplan...
Matt O'Gorman made a change long ago in response, that forced
extensions.conf and extensions.ael to load early, which spared
us all from the consequences of this bug, for the most part.
But Luigi wanted the bug kept open and fixed properly, as the window
for failure is still there.
And since 6002 is one of our older, more revered/feared/hated/complex
issues, we wanted to finally see an end to it, and I was asked to
review the issue and see what it would take to finally kill and
bury it. (It was also assigned to me way back when...)
So finally, it's time to pull out the bug spray and fly swatter.
As far as I can tell, there are 3 places where a duplicate context
can be created:
1. AEL module
2. extensions.conf module
3. app_queue module
All the other modules that want to create a context check for
an existing context first.
By advancing the first two in the load order, we have been spared
problems for the most part, but problems remain, because it is possible
to enter extensions and priorities into contexts registrar'd by other
modules. These tend to get deleted on reloads, and this is not good.
Plus, Luigi pointed out that you can force the order of loading, which
can lead to unexpected problems.
AEL & extensions.conf need to be able to specify contexts that might
already exist, in order to add to them, especially automatically
generated contexts, like regcontext in the channel drivers (sip, iax,
etc).
During a dialplan reload (ael reload, extensions reload, etc), in order
to avoid long lockups where the dialplan cannot be accessed, the
ast_merge_contexts_and_delete() function was written, basically to
delete all contexts (and everything in them) registrar'd to the current
module, and then append the proposed dialplan to whatever might remain.
It was fairly fast, and has sufficed except in extremely huge dialplans.
The goals of any changes are:
(a) no two contexts will have the same name
(b) the speed of ast_merge_contexts_and_delete is not compromised
(c) the order of loading makes no difference
But, with the advent of hash-table based dialplans, any mucking with
ast_merge_contexts_and_delete() has to be done carefully, and with
forethought.
Thus, I reveal my designs to all of you, and request that you
cast your eyeballs upon them, and see if I forgot any major
concern.
New Principles to Guide the Dialplan:
1. All modules that request to form a context, do so if the
context does not already exist, including AEL and PBX_CONFIG.
They all do the find_or_create thing. No more existsokay.
If a context already exists, it is added to (or [re]used).
2. Ownership of a context can be recorded via a
refcount; when a module would have created
a context, it will inc its refcount.
When a module would have destroyed a context,
it will dec its refcount. Contexts will only be deleted
if they are empty. It is no crime to have empty contexts.
Apps may need to add priorities in realtime, and it helps
performance to have the context already in existence.
(I imagine/guess)...
3. Instead of deleting whole contexts by registrar, we will
now delete priorities and extensions by registrar.
The context will be deleted only if the refcount is zero.
This might seem slower, but really, as long as we do all
the work, then lock the contexts and swap in a new dialplan,
we will be okay.
4. Since merge_and_delete will have to be re-engineered to a
degree, not only for this bug, but because of the use of
hash-tables that has been added, we can also get a bit of
performance increase, by saving the free operations until
after the new dialplan is unlocked.
5. Extens/priorities have a registrar tag associated with them.
While more than one module might lay claim to a context, only
one module can own a priority or extension. If a module wants
to add an extension that already exists, the new extension/priority
is given the new registrar, and the old registrar's ownership is
lost. You can only execute one instruction at a time...
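To make principles 1, 2, 3 and 5 concrete, here is a minimal sketch in C. Everything in it is hypothetical: the sk_ names, the structs, and the choice to take a reference on every find-or-create are stand-ins for illustration, not the real Asterisk structures or API. It also assumes one reading of principles 2 and 3, namely that a context goes away only when its refcount is zero AND it is empty.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; NOT the real Asterisk structs. */
struct sk_exten {
    char name[32];
    int priority;
    const char *registrar;      /* exactly one owner (principle 5) */
    struct sk_exten *next;
};

struct sk_context {
    char name[32];
    int refcount;               /* modules claiming the context (principle 2) */
    struct sk_exten *extens;
    struct sk_context *next;
};

/* Principle 1: reuse the context if it exists, else create it.
 * Either way the caller takes one reference. */
static struct sk_context *sk_find_or_create(struct sk_context **list, const char *name)
{
    struct sk_context *c;

    for (c = *list; c; c = c->next) {
        if (strcmp(c->name, name) == 0) {
            c->refcount++;
            return c;
        }
    }
    c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    strncpy(c->name, name, sizeof(c->name) - 1);
    c->refcount = 1;
    c->next = *list;
    *list = c;
    return c;
}

/* Principle 5: adding an exten/priority that already exists hands
 * ownership to the new registrar instead of creating a duplicate. */
static int sk_add_exten(struct sk_context *c, const char *name, int prio, const char *registrar)
{
    struct sk_exten *e;

    for (e = c->extens; e; e = e->next) {
        if (e->priority == prio && strcmp(e->name, name) == 0) {
            e->registrar = registrar;
            return 0;
        }
    }
    e = calloc(1, sizeof(*e));
    if (!e)
        return -1;
    strncpy(e->name, name, sizeof(e->name) - 1);
    e->priority = prio;
    e->registrar = registrar;
    e->next = c->extens;
    c->extens = e;
    return 0;
}

/* Principle 3: delete by registrar at the exten/priority level, drop the
 * caller's reference, and free the context only if it is now unreferenced
 * and empty. Returns 1 if the context was freed, 0 otherwise. */
static int sk_release(struct sk_context **list, struct sk_context *c, const char *registrar)
{
    struct sk_exten **pe = &c->extens;

    while (*pe) {
        if (strcmp((*pe)->registrar, registrar) == 0) {
            struct sk_exten *dead = *pe;
            *pe = dead->next;
            free(dead);
        } else {
            pe = &(*pe)->next;
        }
    }
    assert(c->refcount > 0);
    if (--c->refcount > 0 || c->extens)
        return 0;
    for (struct sk_context **pc = list; *pc; pc = &(*pc)->next) {
        if (*pc == c) {
            *pc = c->next;
            break;
        }
    }
    free(c);
    return 1;
}
```

With this shape, load order stops mattering: whichever module asks first creates the context, and later modules just add a reference and their own extens.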
Proposed Flow of merge_and_delete:
1. The configuration is read, and a proposed new dialplan is formed,
both in hash tables, and in the linked list format.
2. The contexts are read-locked (allowing the pbx to still keep
processing the dialplan).
Since nothing in the active dialplan is changing, there is no
need (yet) to muck with the hints.
3. We traverse the "old" list, and seek info to copy into the new:
   1. Contexts whose registrar is not that of the current module will
      get a duplicate context struct in the proposed dialplan.
      The same is done for any context whose registrar DOES match,
      but whose refcount is not 1.
   2. Any exten/prio not matching the current registrar will
      be copied into the proposed dialplan. If there is a
      conflict, then issue a warning as appropriate and ignore
      the item in the old dialplan.
4. Obtain a write lock on the dialplan, and the hints, and do the
hint-store thing.
5. Save the old dialplan hashtab and list pointers, and set them to
the proposed dialplan.
6. Restore the hint data and release the locks.
7. Destroy the old dialplan lists and hashtabs.
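The carry-over in step 3 can be sketched roughly as below, in C. This is only an illustration under assumptions: the flat array, the sk_ names and the warning text are hypothetical, the real dialplan uses linked lists and hash tables, and the duplication of context structs themselves is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SK_MAX 16   /* hypothetical fixed capacity, to keep the sketch short */

struct sk_entry {               /* one exten/priority inside one context */
    const char *context;
    const char *exten;
    int priority;
    const char *registrar;
};

struct sk_plan {
    struct sk_entry entries[SK_MAX];
    int count;
};

/* Return the index of an entry with the same context/exten/priority, or -1. */
static int sk_find(const struct sk_plan *p, const struct sk_entry *e)
{
    for (int i = 0; i < p->count; i++) {
        const struct sk_entry *x = &p->entries[i];
        if (x->priority == e->priority &&
            strcmp(x->context, e->context) == 0 &&
            strcmp(x->exten, e->exten) == 0)
            return i;
    }
    return -1;
}

/* Copy every old entry NOT owned by the reloading registrar into the
 * proposed plan. On a collision, warn and ignore the old item.
 * Returns the number of collisions. */
static int sk_carry_over(const struct sk_plan *old, struct sk_plan *proposed, const char *registrar)
{
    int conflicts = 0;

    assert(old && proposed && registrar);
    for (int i = 0; i < old->count; i++) {
        const struct sk_entry *e = &old->entries[i];

        if (strcmp(e->registrar, registrar) == 0)
            continue;           /* superseded by the freshly read config */
        if (sk_find(proposed, e) >= 0) {
            fprintf(stderr, "WARNING: %s@%s prio %d from '%s' conflicts; ignoring old item\n",
                    e->exten, e->context, e->priority, e->registrar);
            conflicts++;
            continue;
        }
        if (proposed->count < SK_MAX)   /* sketch only: silently drops when full */
            proposed->entries[proposed->count++] = *e;
    }
    return conflicts;
}
```

Note that none of this touches the active dialplan, so it can run under the read lock from step 2.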
Here's an interesting fact about this algorithm-- It will
definitely take longer to accomplish than the current method,
because of collision checking, etc. --- BUT: it will only
write-lock the dialplan for some small number of microseconds,
which is much better than the current method.
Am I missing something?
--
Steve Murphy
Software Developer
Digium