[asterisk-dev] [Code Review]: Deadlocks dealing with dialplan hints during reload.
rmudgett
reviewboard at asterisk.org
Fri Jul 8 16:23:47 CDT 2011
> On July 8, 2011, 4:01 p.m., David Vossel wrote:
> > I've looked over the code and functionally I don't see any problems. I'm still having a bit of trouble wrapping my head around why we need the new context_merge_lock though. If locking order inversion is taking place I don't understand how creating a new lock helps the situation. Can you clarify in a little more detail how this helps?
The ast_merge_contexts_and_delete() function restructures the contexts and hints which is why nobody else should be doing anything with those containers when it is doing its job. Part of what ast_merge_contexts_and_delete() does is to remove all watcher callbacks from an old hint and then put them back into a new merged hint.
Since handle_statechange() cannot notify the watchers with the normal locks held because of deadlocks, a new lock is needed to ensure that those two functions do not interfere with each other. The other functions use the hints container lock to do the same thing.
- rmudgett
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1313/#review3844
-----------------------------------------------------------
On July 8, 2011, 11:01 a.m., rmudgett wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1313/
> -----------------------------------------------------------
>
> (Updated July 8, 2011, 11:01 a.m.)
>
>
> Review request for Asterisk Developers, Russell Bryant and David Vossel.
>
>
> Summary
> -------
>
> There are two remaining different deadlocks reported dealing with dialplan
> hints.
>
> The deadlock in ASTERISK-17666 is caused by invalid locking order in
> ast_remove_hint(). The hints container must be locked before the hint
> object.
>
> The deadlock in ASTERISK-17760 is caused by a catch-22 situation in
> handle_statechange(). The deadlock is caused by not having the conlock
> before calling the watcher callbacks. Unfortunately, having that lock
> causes a different deadlock as reported in ASTERISK-16961.
>
> * Fixed ast_remove_hint() locking order.
>
> * Made handle_statechange() no longer call the watcher callbacks holding
> any locks that matter.
>
> * Made hint ao2 destructor do the watcher callbacks for extension
> deactivation to guarantee that they get called.
>
> * Fixed hint reference leak in ast_add_hint() if the callback container
> constructor failed.
>
> * Fixed hint reference leak in complete_core_show_hint() for every hint it
> found for CLI tab completion.
>
> * Adjusted locking in ast_merge_contexts_and_delete() for safety.
>
> * Added context_merge_lock to prevent ast_merge_contexts_and_delete() and
> handle_statechange() from interfering with each other.
>
>
> This addresses bugs ASTERISK-17666 and ASTERISK-17760.
> https://issues.asterisk.org/jira/browse/ASTERISK-17666
> https://issues.asterisk.org/jira/browse/ASTERISK-17760
>
>
> Diffs
> -----
>
> /branches/1.8/main/pbx.c 327043
>
> Diff: https://reviewboard.asterisk.org/r/1313/diff
>
>
> Testing
> -------
>
> * Did some calls that changed the state of some hints.
> * Reloaded the dialplan.
>
>
> Thanks,
>
> rmudgett
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20110708/1ec71b95/attachment.htm>
More information about the asterisk-dev
mailing list