[asterisk-dev] [Code Review]: Deadlocks dealing with dialplan hints during reload.

rmudgett reviewboard at asterisk.org
Fri Jul 8 16:23:47 CDT 2011



> On July 8, 2011, 4:01 p.m., David Vossel wrote:
> > I've looked over the code and functionally I don't see any problems.  I'm still having a bit of trouble wrapping my head around why we need the new context_merge_lock though.  If locking order inversion is taking place I don't understand how creating a new lock helps the situation.  Can you clarify in a little more detail how this helps?

The ast_merge_contexts_and_delete() function restructures the contexts and hints which is why nobody else should be doing anything with those containers when it is doing its job.  Part of what ast_merge_contexts_and_delete() does is to remove all watcher callbacks from an old hint and then put them back into a new merged hint.

Since handle_statechange() cannot notify the watchers with the normal locks held because of deadlocks, a new lock is needed to ensure that those two functions do not interfere with each other.  The other functions use the hints container lock to do the same thing.


- rmudgett


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1313/#review3844
-----------------------------------------------------------


On July 8, 2011, 11:01 a.m., rmudgett wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1313/
> -----------------------------------------------------------
> 
> (Updated July 8, 2011, 11:01 a.m.)
> 
> 
> Review request for Asterisk Developers, Russell Bryant and David Vossel.
> 
> 
> Summary
> -------
> 
> There are two remaining different deadlocks reported dealing with dialplan
> hints.
> 
> The deadlock in ASTERISK-17666 is caused by invalid locking order in
> ast_remove_hint().  The hints container must be locked before the hint
> object.
> 
> The deadlock in ASTERISK-17760 is caused by a catch-22 situation in
> handle_statechange().  The deadlock is caused by not having the conlock
> before calling the watcher callbacks.  Unfortunately, having that lock
> causes a different deadlock as reported in ASTERISK-16961.
> 
> * Fixed ast_remove_hint() locking order.
> 
> * Made handle_statechange() no longer call the watcher callbacks holding
> any locks that matter.
> 
> * Made hint ao2 destructor do the watcher callbacks for extension
> deactivation to guarantee that they get called.
> 
> * Fixed hint reference leak in ast_add_hint() if the callback container
> constructor failed.
> 
> * Fixed hint reference leak in complete_core_show_hint() for every hint it
> found for CLI tab completion.
> 
> * Adjusted locking in ast_merge_contexts_and_delete() for safety.
> 
> * Added context_merge_lock to prevent ast_merge_contexts_and_delete() and
> handle_statechange() from interfering with each other.
> 
> 
> This addresses bugs ASTERISK-17666 and ASTERISK-17760.
>     https://issues.asterisk.org/jira/browse/ASTERISK-17666
>     https://issues.asterisk.org/jira/browse/ASTERISK-17760
> 
> 
> Diffs
> -----
> 
>   /branches/1.8/main/pbx.c 327043 
> 
> Diff: https://reviewboard.asterisk.org/r/1313/diff
> 
> 
> Testing
> -------
> 
> * Did some calls that changed the state of some hints.
> * Reloaded the dialplan.
> 
> 
> Thanks,
> 
> rmudgett
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20110708/1ec71b95/attachment.htm>


More information about the asterisk-dev mailing list