[asterisk-dev] Locking, coding guidelines addition
Tzafrir Cohen
tzafrir.cohen at xorcom.com
Fri Jun 27 17:14:02 CDT 2008
On Fri, Jun 27, 2008 at 01:17:53PM -0500, Russell Bryant wrote:
> After doing some code review and responding to changes to this list,
> and then the following discussion on the #asterisk-dev channel, it has
> occurred to me that the fundamental locking rules for the boundary
> between channel drivers and the Asterisk core have not sufficiently
> been documented. Thus, I propose the following addition to the coding
> guidelines document.
>
> I welcome any feedback. If anything is not clear, I will do my best
> to clarify it. If anything seems wrong, then we have an even bigger
> problem, because this describes how the code is written today. This
> is essentially an attempt to document some of what has come out of the
> evolution of the Asterisk code base, not what I would consider ideal.
>
> ---------------------------------------------------------
> --- Locking
> ---------------------------------------------------------
>
> A) Locking Fundamentals
>
> Asterisk is a heavily multithreaded application. It makes extensive
> use of locking to ensure safe access to shared resources between
> different threads.
>
> When more that one lock is involved in a given code path, there is the
> potential for deadlocks. A deadlock occurs when a thread is stuck
> waiting for a resource that it will never acquire. Here is a classic
> example of a deadlock:
>
> Thread 1 Thread 2
> ------------ ------------
> Holds Lock A Holds Lock B
> Waiting for Lock B Waiting for Lock A
>
> In this case, there is a deadlock between threads 1 and 2. However,
> this deadlock would have been avoided if both threads 1 and 2 had
> agreed that one must acquire Lock A before Lock B, or the other way
> around.
>
> We have now arrived at the fundamental rule for dealing with multiple
> locks.
>
> Thou must not violate the rules of locking order.
>
> The agreed upon locking order depends on the code, but when using
> multiple locks, an order _must_ be established, and once established,
> must be respected.
>
>
> B) Minding the boundary between channel drivers and the Asterisk core
>
> The #1 cause of deadlocks in Asterisk is by not properly following the
> locking rules that exist at the boundary between channel drivers and
> the Asterisk core. The Asterisk core allocates an ast_channel, and
> channel drivers allocate technology specific private data that is
> associated with an ast_channel. Typically, both the ast_channel and
> technology pvt (private) data have their own lock. There are _many_
> code paths that require both objects to be locked.
>
> The locking order in this situation is the following:
>
> 1) ast_channel
> 2) technology pvt data
>
> Channel drivers implement the ast_channel_tech interface to provide a
> channel implementation for Asterisk. Most of the channel_tech
> interface callbacks are called with the associated ast_channel
> locked. When accessing technology pvt data, the pvt can be locked
> directly.
>
> The complication arises in the opposite code path. Consider the
> following situation:
>
> 1) A message comes in over the "network"
> 2) The CD (channel driver) monitor thread receives the message
> 3) The CD associates the message with a pvt struct and locks the
> pvt
> 4) While processing the message, the CD must do something that
> requires
> locking the ast_channel
>
> This is the point that must be handled carefully. Given the specified
> locking order, it seems as if the following psuedo-code would be
> sufficient:
>
> unlock(pvt);
> lock(ast_channel);
> lock(pvt);
Putting on my Captain Obvious mask:
(The resulting code is not fundementally different from the code here)
If we can just give up the lock, why was it held in the first place?
--
Tzafrir Cohen
icq#16849755 jabber:tzafrir.cohen at xorcom.com
+972-50-7952406 mailto:tzafrir.cohen at xorcom.com
http://www.xorcom.com iax:guest at local.xorcom.com/tzafrir
More information about the asterisk-dev
mailing list