[asterisk-dev] Locking, coding guidelines addition

Fri Jun 27 17:14:02 CDT 2008

On Fri, Jun 27, 2008 at 01:17:53PM -0500, Russell Bryant wrote:
> After doing some code review and responding to changes to this list,  
> and then the following discussion on the #asterisk-dev channel, it has  
> occurred to me that the fundamental locking rules for the boundary  
> between channel drivers and the Asterisk core have not sufficiently  
> been documented.  Thus, I propose the following addition to the coding  
> guidelines document.
> 
> I welcome any feedback.  If anything is not clear, I will do my best  
> to clarify it.  If anything seems wrong, then we have an even bigger  
> problem, because this describes how the code is written today.  This  
> is essentially an attempt to document some of what has come out of the  
> evolution of the Asterisk code base, not what I would consider ideal.
> 
> ---------------------------------------------------------
> --- Locking
> ---------------------------------------------------------
> 
> A) Locking Fundamentals
> 
> Asterisk is a heavily multithreaded application.  It makes extensive  
> use of locking to ensure safe access to shared resources between  
> different threads.
> 
> When more that one lock is involved in a given code path, there is the  
> potential for deadlocks.  A deadlock occurs when a thread is stuck  
> waiting for a resource that it will never acquire.  Here is a classic  
> example of a deadlock:
> 
>    Thread 1                           Thread 2
>    ------------                           ------------
>    Holds Lock A                   Holds Lock B
>    Waiting for Lock B          Waiting for Lock A
> 
> In this case, there is a deadlock between threads 1 and 2.  However,  
> this deadlock would have been avoided if both threads 1 and 2 had  
> agreed that one must acquire Lock A before Lock B, or the other way  
> around.
> 
> We have now arrived at the fundamental rule for dealing with multiple  
> locks.
> 
>     Thou must not violate the rules of locking order.
> 
> The agreed upon locking order depends on the code, but when using  
> multiple locks, an order _must_ be established, and once established,  
> must be respected.
> 
> 
> B) Minding the boundary between channel drivers and the Asterisk core
> 
> The #1 cause of deadlocks in Asterisk is by not properly following the  
> locking rules that exist at the boundary between channel drivers and  
> the Asterisk core.  The Asterisk core allocates an ast_channel, and  
> channel drivers allocate technology specific private data that is  
> associated with an ast_channel.  Typically, both the ast_channel and  
> technology pvt (private) data have their own lock.  There are _many_  
> code paths that require both objects to be locked.
> 
> The locking order in this situation is the following:
> 
>     1) ast_channel
>     2) technology pvt data
> 
> Channel drivers implement the ast_channel_tech interface to provide a  
> channel implementation for Asterisk.  Most of the channel_tech  
> interface callbacks are called with the associated ast_channel  
> locked.  When accessing technology pvt data, the pvt can be locked  
> directly.
> 
> The complication arises in the opposite code path.  Consider the  
> following situation:
> 
>     1) A message comes in over the "network"
>     2) The CD (channel driver) monitor thread receives the message
>     3) The CD associates the message with a pvt struct and locks the  
> pvt
>     4) While processing the message, the CD must do something that  
> requires
>          locking the ast_channel
> 
> This is the point that must be handled carefully.  Given the specified  
> locking order, it seems as if the following psuedo-code would be  
> sufficient:
> 
>       unlock(pvt);
>       lock(ast_channel);
>       lock(pvt);

Putting on my Captain Obvious mask:

(The resulting code is not fundementally different from the code here)

If we can just give up the lock, why was it held in the first place?

-- 
               Tzafrir Cohen
icq#16849755              jabber:tzafrir.cohen at xorcom.com
+972-50-7952406           mailto:tzafrir.cohen at xorcom.com
http://www.xorcom.com  iax:guest at local.xorcom.com/tzafrir