[Asterisk-code-review] sched: Don't allow ast_sched_del to deadlock ast_sched_runq from same... (...asterisk[13])

Richard Mudgett asteriskteam at digium.com
Wed May 29 17:07:10 CDT 2019


Richard Mudgett has posted comments on this change. ( https://gerrit.asterisk.org/c/asterisk/+/10991 )

Change subject: sched: Don't allow ast_sched_del to deadlock ast_sched_runq from same thread
......................................................................


Patch Set 2: Code-Review-1

(5 comments)

https://gerrit.asterisk.org/#/c/10991/2/include/asterisk/sched.h 
File include/asterisk/sched.h:

https://gerrit.asterisk.org/#/c/10991/2/include/asterisk/sched.h@84 
PS2, Line 84: 			usleep(1); \
Since id is a variable that could theoretically change on us while we are attempting to delete the scheduled item, we need to refresh the _id value in the loop after the usleep().

while (...) {
  usleep(1);
  _id = id;
}

I do think it is bad if id changes while we are trying to delete the scheduled item.  But paranoia. :)


https://gerrit.asterisk.org/#/c/10991/2/main/sched.c 
File main/sched.c:

https://gerrit.asterisk.org/#/c/10991/2/main/sched.c@121 
PS2, Line 121: 	pthread_t currently_executing_on_thread_id;
A better member name with doxygen:
/*! Valid while currently_executing is not NULL */
pthread_t executing_thread_id;


https://gerrit.asterisk.org/#/c/10991/2/main/sched.c@630 
PS2, Line 630: 	} else if (con->currently_executing_on_thread_id == pthread_self()) {
             : 		/* We might trample on deleted memory at this point. Not good,
             : 		 * but it's better than a deadlock.
             : 		 * Thou shalt not reschedule things from a scheduled callback!
             : 		 */
             : 		ast_log(LOG_ERROR,
             : 			"BUG! Trying to delete sched %d from the same callback %p (sched %d). "
             : 			"Ignoring so we don't deadlock\n",
             : 			id, con->currently_executing->callback, con->currently_executing->sched_id->id);
             : 		ast_log_backtrace();
             : 		/* We'll return -1 below, because s is NULL. The caller
             : 		 * will rightly assume that the unscheduling failed. */
This is the wrong place for this check.  You are restricting a valid scenario.  It is permissible for a callback to delete other scheduled items.  It cannot delete itself because that is a deadlock and there is a defined way for a scheduled callback to delete itself.

At this point we could be trying to delete a scheduled item that simply doesn't exist or the scheduled callback has already deleted itself because it returned zero.


https://gerrit.asterisk.org/#/c/10991/2/main/sched.c@642 
PS2, Line 642: 	} else if (con->currently_executing && (id == con->currently_executing->sched_id->id)) {
Within this if clause we have determined that we are trying to delete the scheduled item while it is running.  If the scheduled item is trying to delete itself using this function then that is when you should trigger your error message with backtrace and ignore the deletion to prevent the deadlock.

} else if (con->currently_executing && (id == ...)) {
   if (con->currently_executing_on_thread_id == pthread_self()) {
      /* The scheduled callback is trying to delete itself.  Not good as that is a deadlock. */
      ast_log(LOG_ERROR,
         "BUG! Trying to delete sched %d from within the callback %p.  Ignoring so we don't deadlock\n",
         id, con->currently_executing->callback);
      ast_log_backtrace();
      /* We'll return -1 below because s is NULL.  The caller will rightly assume that the unscheduling failed. */
   } else {
       ... what was previously in the if clause ...
   }
}


https://gerrit.asterisk.org/#/c/10991/2/main/sched.c@796 
PS2, Line 796: 		con->currently_executing_on_thread_id = 0;
This is not needed when the ast_sched_del() check is fixed.



-- 
To view, visit https://gerrit.asterisk.org/c/asterisk/+/10991
To unsubscribe, or for help writing mail filters, visit https://gerrit.asterisk.org/settings

Gerrit-Project: asterisk
Gerrit-Branch: 13
Gerrit-Change-Id: Ic26777fa0732725e6ca7010df17af77a012aa856
Gerrit-Change-Number: 10991
Gerrit-PatchSet: 2
Gerrit-Owner: Walter Doekes <walter+asterisk at wjd.nu>
Gerrit-Reviewer: Friendly Automation
Gerrit-Reviewer: Joshua Colp <jcolp at digium.com>
Gerrit-Reviewer: Richard Mudgett <rmudgett at digium.com>
Gerrit-Reviewer: Walter Doekes <walter+asterisk at wjd.nu>
Gerrit-Reviewer: mattf <creslin at digium.com>
Gerrit-Comment-Date: Wed, 29 May 2019 22:07:10 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-code-review/attachments/20190529/13252a19/attachment.html>


More information about the asterisk-code-review mailing list