[asterisk-commits] mmichelson: branch 1.6.0 r108739 - in /branches/1.6.0: ./ channels/chan_sip.c

SVN commits to the Asterisk project asterisk-commits at lists.digium.com
Fri Mar 14 11:54:36 CDT 2008


Author: mmichelson
Date: Fri Mar 14 11:54:36 2008
New Revision: 108739

URL: http://svn.digium.com/view/asterisk?view=rev&rev=108739
Log:
Merged revisions 108738 via svnmerge from 
https://origsvn.digium.com/svn/asterisk/trunk

................
r108738 | mmichelson | 2008-03-14 11:52:51 -0500 (Fri, 14 Mar 2008) | 41 lines

Merged revisions 108737 via svnmerge from 
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r108737 | mmichelson | 2008-03-14 11:44:08 -0500 (Fri, 14 Mar 2008) | 33 lines

Fix a race condition in the SIP packet scheduler which could cause a crash.

chan_sip uses the scheduler API to schedule retransmission of reliable
packets (such as INVITEs). When a retransmission comes due, the packet is
removed from the scheduler and retrans_pkt is called. Meanwhile, if a response
to the previously transmitted packet is received, then when we ACK the
response, we remove the packet from the scheduler and free the packet.

The problem is that both the ACK function and retrans_pkt attempt to acquire the
same lock at the beginning of the function call. This means that if the ACK function
acquires the lock first, then it will free the packet which retrans_pkt is about to
read from and write to. The result is a crash.

The solution:

1. If the ACK function fails to remove the packet from the scheduler and the retransmit
   id of the packet is not -1 (meaning that we have not reached the maximum number of 
   retransmissions) then release the lock and yield so that retrans_pkt may acquire the
   lock and operate.

2. Make absolutely certain that the ACK function does not recursively lock the lock in
   question. If it does, then releasing the lock will do no good, since retrans_pkt will
   still be unable to acquire the lock.
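
The two steps above can be illustrated with a minimal standalone sketch. It is
not the chan_sip code: toy_pkt, toy_dialog, toy_sched_del, toy_retrans_pkt,
toy_ack and run_demo are hypothetical names, and the in_flight flag is an
assumed stand-in for "the scheduler has already dequeued this entry and is
about to run the callback". Only the unlock / usleep / relock loop mirrors the
patch.

```c
#define _DEFAULT_SOURCE /* for usleep() under strict -std modes */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-ins; not the real sip_pkt / sip_pvt structures. */
struct toy_pkt {
	int retransid;     /* -1 once no further retransmission is scheduled */
	bool in_flight;    /* assumed flag: entry dequeued, callback pending */
	int retrans_count; /* written by the retransmit callback */
};

struct toy_dialog {
	pthread_mutex_t lock; /* plays the role of p's lock */
	struct toy_pkt *pkt;
};

static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;

/* Toy scheduler delete: 0 if the entry was still queued, -1 if the
 * scheduler had already dequeued it to run the callback. */
static int toy_sched_del(struct toy_pkt *pkt)
{
	int res;

	pthread_mutex_lock(&sched_lock);
	res = pkt->in_flight ? -1 : 0;
	pthread_mutex_unlock(&sched_lock);
	return res;
}

/* Toy retransmit callback: takes the dialog lock, then touches the packet. */
static void *toy_retrans_pkt(void *arg)
{
	struct toy_dialog *d = arg;

	pthread_mutex_lock(&d->lock); /* blocks while the ACK path holds it */
	d->pkt->retrans_count++;      /* would crash if the packet were freed */
	d->pkt->retransid = -1;       /* toy: this was the last retransmission */
	pthread_mutex_lock(&sched_lock);
	d->pkt->in_flight = false;
	pthread_mutex_unlock(&sched_lock);
	pthread_mutex_unlock(&d->lock);
	return NULL;
}

/* Toy ACK path, called with d->lock held exactly once (not recursively).
 * Returns the retransmit count observed just before freeing the packet. */
static int toy_ack(struct toy_dialog *d)
{
	struct toy_pkt *cur = d->pkt;
	int count;

	/* If the delete fails and retransid is not -1, the callback must be
	 * waiting on the lock: release it, yield, and try again. */
	while (cur->retransid > -1 && toy_sched_del(cur)) {
		pthread_mutex_unlock(&d->lock);
		usleep(1);
		pthread_mutex_lock(&d->lock);
	}
	count = cur->retrans_count;
	d->pkt = NULL;
	free(cur);
	return count;
}

/* Starts with the lock held and the callback already pending, which is
 * the situation that used to crash. */
int run_demo(void)
{
	struct toy_dialog d;
	pthread_t tid;
	int count;

	pthread_mutex_init(&d.lock, NULL);
	d.pkt = malloc(sizeof(*d.pkt));
	if (!d.pkt)
		return -1;
	d.pkt->retransid = 1;
	d.pkt->in_flight = true;
	d.pkt->retrans_count = 0;

	pthread_mutex_lock(&d.lock);
	if (pthread_create(&tid, NULL, toy_retrans_pkt, &d) != 0) {
		pthread_mutex_unlock(&d.lock);
		free(d.pkt);
		return -1;
	}
	count = toy_ack(&d);
	pthread_mutex_unlock(&d.lock);
	pthread_join(tid, NULL);
	assert(d.pkt == NULL); /* the ACK path unlinked and freed the packet */
	pthread_mutex_destroy(&d.lock);
	return count;
}
```

In this sketch the ACK path frees the packet only after the pending callback
has run, so run_demo() returns 1. If toy_ack were entered with the lock held
recursively, the single unlock in the loop would not actually release it and
the callback could never proceed, which is the reason for step 2.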

(closes issue #12098)
Reported by: wegbert
(closes issue #12089)
Reported by: PTorres
Patches:
      12098-putnopvutv3.patch uploaded by putnopvut (license 60)
Tested by: jvandal


........

................

Modified:
    branches/1.6.0/   (props changed)
    branches/1.6.0/channels/chan_sip.c

Propchange: branches/1.6.0/
------------------------------------------------------------------------------
Binary property 'trunk-merged' - no diff available.

Modified: branches/1.6.0/channels/chan_sip.c
URL: http://svn.digium.com/view/asterisk/branches/1.6.0/channels/chan_sip.c?view=diff&rev=108739&r1=108738&r2=108739
==============================================================================
--- branches/1.6.0/channels/chan_sip.c (original)
+++ branches/1.6.0/channels/chan_sip.c Fri Mar 14 11:54:36 2008
@@ -2935,13 +2935,12 @@
 	return res;
 }
 
-/*! \brief Acknowledges receipt of a packet and stops retransmission */
+/*! \brief Acknowledges receipt of a packet and stops retransmission 
+ * called with p locked*/
 static void __sip_ack(struct sip_pvt *p, int seqno, int resp, int sipmethod)
 {
 	struct sip_pkt *cur, *prev = NULL;
 	const char *msg = "Not Found";	/* used only for debugging */
-
-	sip_pvt_lock(p);
 
 	/* If we have an outbound proxy for this dialog, then delete it now since
 	  the rest of the requests in this dialog needs to follow the routing.
@@ -2964,20 +2963,39 @@
 				if (sipdebug)
 					ast_debug(4, "** SIP TIMER: Cancelling retransmit of packet (reply received) Retransid #%d\n", cur->retransid);
 			}
-			AST_SCHED_DEL(sched, cur->retransid);
+			/* This odd section is designed to thwart a 
+			 * race condition in the packet scheduler. There are
+			 * two conditions under which deleting the packet from the
+			 * scheduler can fail.
+			 *
+			 * 1. The packet has been removed from the scheduler because retransmission
+			 * is being attempted. The problem is that if the packet is currently attempting
+			 * retransmission and we are at this point in the code, then that MUST mean
+			 * that retrans_pkt is waiting on p's lock. Therefore we will relinquish the
+			 * lock temporarily to allow retransmission.
+			 *
+			 * 2. The packet has reached its maximum number of retransmissions and has
+			 * been permanently removed from the packet scheduler. If this is the case, then
+			 * the packet's retransid will be set to -1. The atomicity of the setting and checking
+			 * of the retransid to -1 is ensured since in both cases p's lock is held.
+			 */
+			while (cur->retransid > -1 && ast_sched_del(sched, cur->retransid)) {
+				sip_pvt_unlock(p);
+				usleep(1);
+				sip_pvt_lock(p);
+			}
 			UNLINK(cur, p->packets, prev);
 			dialog_unref(cur->owner);
 			ast_free(cur);
 			break;
 		}
 	}
-	sip_pvt_unlock(p);
 	ast_debug(1, "Stopping retransmission on '%s' of %s %d: Match %s\n",
 		p->callid, resp ? "Response" : "Request", seqno, msg);
 }
 
 /*! \brief Pretend to ack all packets
- * maybe the lock on p is not strictly necessary but there might be a race */
+ * called with p locked */
 static void __sip_pretend_ack(struct sip_pvt *p)
 {
 	struct sip_pkt *cur = NULL;
@@ -9085,9 +9103,11 @@
 		/* Unlink us, destroy old call.  Locking is not relevant here because all this happens
 		   in the single SIP manager thread. */
 		p = r->call;
+		sip_pvt_lock(p);
 		p->needdestroy = 1;
 		/* Pretend to ACK anything just in case */
-		__sip_pretend_ack(p); /* XXX we need p locked, not sure we have */
+		__sip_pretend_ack(p);
+		sip_pvt_unlock(p);
 
 		/* decouple the two objects */
 		/* p->registry == r, so r has 2 refs, and the unref won't take the object away */



