[asterisk-dev] [Code Review] 4247: DEBUG_THREADS: Fix regression and lock tracking initialization problems.

rmudgett reviewboard at asterisk.org
Fri Dec 12 17:31:42 CST 2014


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/4247/
-----------------------------------------------------------

(Updated Dec. 12, 2014, 11:31 p.m.)


Status
------

This change has been marked as submitted.


Review request for Asterisk Developers.


Changes
-------

Committed in revision 429539


Bugs: ASTERISK-19463, ASTERISK-22455 and ASTERISK-24614
    https://issues.asterisk.org/jira/browse/ASTERISK-19463
    https://issues.asterisk.org/jira/browse/ASTERISK-22455
    https://issues.asterisk.org/jira/browse/ASTERISK-24614


Repository: Asterisk


Description
-----------

This patch started with David Lee's patch at
https://reviewboard.asterisk.org/r/2826/ and also fixes a regression
introduced by the ASTERISK-22455 patch.

The initialization of a mutex's lock tracking structure was not protected
in a critical section.  This is fine for any mutex that is explicitly
initialized, but a static mutex may have its lock tracking double
initialized if multiple threads attempt the first lock simultaneously.

* Added a global mutex to properly serialize initialization of the lock
tracking structure.  The painful global lock can be mitigated by adding a
double-checked lock flag, as discussed on the original review request.

* Defer lock tracking initialization until first use.

* Don't be "helpful" and initialize an uninitialized lock when
DEBUG_THREADS is enabled.  Debug code is not supposed to fix or change
normal code behavior.  We don't need a lock initialization race that would
force a re-setup of lock tracking.  Lock tracking already handles
initialization on first use.

* Properly handle allocation failures of the lock tracking structure.

* No need to initialize tracking data in __ast_pthread_mutex_destroy()
just to turn around and destroy it.
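
To illustrate the first-use initialization described above, here is a
minimal sketch, not the code from the patch.  All names (struct
lock_track, struct tracked_mutex, track_init_lock, get_lock_track(),
tracked_lock(), tracked_unlock()) are hypothetical stand-ins.  It assumes
that a lock whose tracking data cannot be allocated simply runs untracked;
the patch may handle that failure differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the lock tracking data. */
struct lock_track {
    int reentrancy;
};

/* Hypothetical wrapper: a mutex plus lazily allocated tracking data. */
struct tracked_mutex {
    pthread_mutex_t mutex;
    struct lock_track *track;   /* NULL until first use */
};

/* One global mutex serializes first-use initialization for every lock. */
static pthread_mutex_t track_init_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return the tracking data, allocating it on first use.
 * Returns NULL if the allocation fails. */
struct lock_track *get_lock_track(struct tracked_mutex *t)
{
    struct lock_track *track;

    assert(t != NULL);
    pthread_mutex_lock(&track_init_lock);
    if (!t->track) {
        /* Only one thread can reach this point at a time, so a static
         * mutex cannot have its tracking data initialized twice. */
        t->track = calloc(1, sizeof(*t->track));
    }
    track = t->track;
    pthread_mutex_unlock(&track_init_lock);
    return track;
}

int tracked_lock(struct tracked_mutex *t)
{
    struct lock_track *track = get_lock_track(t);
    int res = pthread_mutex_lock(&t->mutex);

    if (!res && track) {
        track->reentrancy++;    /* protected by t->mutex */
    }
    return res;
}

int tracked_unlock(struct tracked_mutex *t)
{
    assert(t != NULL);
    if (t->track) {
        t->track->reentrancy--; /* still holding t->mutex */
    }
    return pthread_mutex_unlock(&t->mutex);
}
```

This sketch takes the global lock on every call to get_lock_track(), which
is the cost the double-checked flag mentioned above would reduce.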


The regression introduced by ASTERISK-22455 is the result of manipulating
a pthread_mutex_t struct outside of the pthread library code.  The
pthread_mutex_t struct seems to have a global linked list pointer member
that can get changed by other threads.  Therefore, saving and restoring
the contents of a pthread_mutex_t struct is a bad thing.

Thanks to Thomas Airmont for finding this obscure regression.

* Don't overwrite the struct ast_lock_track.reentr_mutex member to restore
tracking data in __ast_cond_wait() and __ast_cond_timedwait().  The
pthread_mutex_t struct must be treated as a read-only opaque variable.
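
One way to keep the mutex out of the save/restore path is to hold the
copyable tracking fields in their own struct and copy only that.  The
sketch below assumes such a layout; apart from reentr_mutex, the names
(struct track_data, struct lock_track, track_suspend(), track_restore(),
MAX_REENTRANCY) are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_REENTRANCY 10   /* hypothetical limit for this sketch */

/* Plain data only, so it is safe to copy by value. */
struct track_data {
    int reentrancy;
    int lineno[MAX_REENTRANCY];
};

struct lock_track {
    struct track_data data;
    pthread_mutex_t reentr_mutex;   /* opaque: never copied or overwritten */
};

/* Save the tracking data and clear it, as a condition wait would do
 * when it gives up the lock. */
void track_suspend(struct lock_track *lt, struct track_data *saved)
{
    assert(lt != NULL && saved != NULL);
    pthread_mutex_lock(&lt->reentr_mutex);
    *saved = lt->data;
    memset(&lt->data, 0, sizeof(lt->data));
    pthread_mutex_unlock(&lt->reentr_mutex);
}

/* Put the saved tracking data back once the lock is held again.
 * Only the data member is written; reentr_mutex is left untouched. */
void track_restore(struct lock_track *lt, const struct track_data *saved)
{
    assert(lt != NULL && saved != NULL);
    pthread_mutex_lock(&lt->reentr_mutex);
    lt->data = *saved;
    pthread_mutex_unlock(&lt->reentr_mutex);
}
```

Copying the whole struct lock_track instead would also copy reentr_mutex,
which is the kind of manipulation described above as the cause of the
regression.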


Miscellaneous other items fixed by this patch:

* Match ast_suspend_lock_info() with ast_restore_lock_info() in
__ast_cond_timedwait().

* Made some uninitialized lock sanity checks return EINVAL and try a
DO_THREAD_CRASH.

* Fix bad canlog initialization expressions.


NOTE: The first diff on this review is the unmodified
https://reviewboard.asterisk.org/r/2826/ patch for comparison with the
updated patch.


Diffs
-----

  /branches/1.8/main/lock.c 429174 
  /branches/1.8/include/asterisk/lock.h 429174 

Diff: https://reviewboard.asterisk.org/r/4247/diff/


Testing
-------

Without the patch on v1.8, I repeatedly ran the testsuite masquerade
supertest and it died after an hour or two.  With the patch, it ran over
the weekend without a problem.

Since the DEBUG_THREADS locking issues on Asterisk startup
(ASTERISK-19463) have been a hard problem to reproduce, I propose we set up
Bamboo to run the TestSuite with DEBUG_THREADS enabled on the
http://svn.asterisk.org/svn/asterisk/team/rmudgett/debug_threads branch
nightly for a few weeks.


Thanks,

rmudgett
