[asterisk-dev] Locking, coding guidelines addition

Stephen Davies stephen.l.davies at gmail.com
Thu Jul 3 00:47:22 CDT 2008


Hi,

Call me a troublemaker, or maybe Captain Obvious' cousin, but
locking/deadlock issues have been behind 90% or more of the issues I've got
involved in, bug reported, or fixed, over the last few years.   So Russell:
I think _most_ Asterisk issues are to do with this problem, and we _do_ have
an even bigger problem.  (Maybe that's because I do lots of IAX2, and iax2
has really suffered from these issues since the worker thread stuff was put
in).

Unfortunately its an architectural problem.

I think the discussion reveals that the magic trylock while loop looks neato
but doesn't really solve the problem.  Tzafrir and Luigi pointed out some
flaws.

Mutexes and all this stuff are safe _provided_ that a locking order is
respected.  If locking order isn't respected, you're always on the bleeding
edge where some innocuous change tips something else over the edge and,
voila, a deadlock.

This is all harder in Asterisk because locks are help sometimes over large
areas of code - making it hard to see all the possible code paths.

So it seems to me that the nettle must be grasped, a strict locking order
defined, and the code needs to be gradually adjusted to match that
definition.  I don't see any other way.

In your example of an incoming call for a channel - well, the channel must
lock the ast_channel before the pvt structure.  (or, remove the pvt lock as
Steve suggested).

It would also help to clean up the coding style so that the "scope" of a
held lock is more clearly visible in the code structure.  This helps
comprehension.

By the way - a locking order is a tricky thing.  It needs to be defined
*within* a class of object too.  Like: if a thread needs to lock more than
one channel - there must be a defined order between the channels, too.

I note also that introducing astobj2 into chan_iax2 also had an initial bad
effect on stability, showing that there are dragons down that road, too.  I
guess they are in the requirement to reference count.

Regards,
Steve
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.digium.com/pipermail/asterisk-dev/attachments/20080703/fc19f19c/attachment.htm 


More information about the asterisk-dev mailing list