<html>

<head>

    <base href="https://wiki.asterisk.org/wiki">

            <link rel="stylesheet" href="/wiki/s/2042/1/7/_/styles/combined.css?spaceKey=TOP&amp;forWysiwyg=true" type="text/css">

    </head>

<body style="background: white;" bgcolor="white" class="email-body">

<div id="pageContent">

<div id="notificationFormat">

<div class="wiki-content">

<div class="email">

    <h2><a href="https://wiki.asterisk.org/wiki/display/TOP/SIP+Component+Threading+Model">SIP Component Threading Model</a></h2>

    <h4>Page <b>edited</b> by             <a href="https://wiki.asterisk.org/wiki/display/~kpfleming">Kevin P. Fleming</a>

    </h4>

        <div id="versionComment">

        <b>Comment:</b>

        lots of little edits here and there<br />

    </div>

        <br/>

                         <h4>Changes (12)</h4>

                                 
<div id="page-diffs">

                    <table class="diff" cellpadding="0" cellspacing="0">

    
            <tr><td class="diff-snipped" >...<br></td></tr>

            <tr><td class="diff-unchanged" >{warning} <br> <br></td></tr>

            <tr><td class="diff-changed-lines" >h1. The current model <span class="diff-added-words"style="background-color: #dfd;">and its problems</span> <br></td></tr>

            <tr><td class="diff-unchanged" > <br></td></tr>

            <tr><td class="diff-changed-lines" >The PJSIP manager creates a single thread that constantly services queued events (e.g. incoming SIP messages). <span class="diff-deleted-words"style="color:#999;background-color:#fdd;text-decoration:line-through;">The biggest issue is that</span> <span class="diff-added-words"style="background-color: #dfd;">Unfortunately,</span> the time taken to process certain types of incoming SIP messages, like INVITEs, can <span class="diff-deleted-words"style="color:#999;background-color:#fdd;text-decoration:line-through;">take an</span> <span class="diff-added-words"style="background-color: #dfd;">be</span> inordinately <span class="diff-changed-words">long<span class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;"> time</span>.</span> Specifically, remote procedure calls in the INVITE-handling code cause the thread to wait a long time. The waiting period is long enough that the queue of unhandled messages may grow faster than our code can handle them. <br></td></tr>

            <tr><td class="diff-unchanged" > <br></td></tr>

            <tr><td class="diff-changed-lines" >In addition, the <span class="diff-deleted-words"style="color:#999;background-color:#fdd;text-decoration:line-through;">Ice-induced</span> <span class="diff-added-words"style="background-color: #dfd;">Ice-invoked</span> operations run in multiple threads, not in the same thread that the PJSIP manager uses for presenting incoming messages. This means that there is contention for resources common to the Ice threads and the PJSIP manager thread. <br></td></tr>

            <tr><td class="diff-unchanged" > <br>h1. Potential Methods of improvement <br></td></tr>

            <tr><td class="diff-snipped" >...<br></td></tr>

            <tr><td class="diff-unchanged" >h3. More AMI <br> <br></td></tr>

            <tr><td class="diff-changed-lines" >One potential fix for the problem is to use AMI for all RPCs in the code that handles incoming messages. Using AMI means that the thread is not stuck waiting, thus the growing message queue may be serviced. AMI is already used in some places in the SIP <span class="diff-added-words"style="background-color: #dfd;">component</span> code, especially for operations that are known to potentially block for a long time. AMI has some disadvantages though: <br></td></tr>

            <tr><td class="diff-unchanged" > <br>* Using AMI can decrease the readability of the code, especially for operations that make multiple RPCs. <br></td></tr>

            <tr><td class="diff-changed-lines" >* Object lifetime <span class="diff-added-words"style="background-color: #dfd;">management</span> can be a <span class="diff-deleted-words"style="color:#999;background-color:#fdd;text-decoration:line-through;">pitfall.</span> <span class="diff-added-words"style="background-color: #dfd;">complicated.</span> <br></td></tr>

            <tr><td class="diff-unchanged" >* Adding AMI does nothing to alleviate any resource contention issues. If anything, it may increase the number of threads potentially vying for the same resources. <br> <br></td></tr>

            <tr><td class="diff-snipped" >...<br></td></tr>

            <tr><td class="diff-unchanged" >h3. Funnel tasks into a thread pool <br> <br></td></tr>

            <tr><td class="diff-unchanged" >Using a thread pool would have similar advantages to using multiple threads for handling PJSIP events. That is, if a single thread is blocked, there may be other threads currently available to handle the task. What&#39;s different though is that since we will be delegating the work ourselves, we can distribute work based on application-level concepts. As an example, a single thread can be responsible for handling all tasks related to a specific session. This can eliminate some of the contention between threads. In addition, tasks that originate from PJSIP and tasks that originate from Ice can use the same thread pool, once again leading to less contention between threads. <br></td></tr>

            <tr><td class="diff-unchanged" > <br></td></tr>

            <tr><td class="diff-deleted-lines" style="color:#999;background-color:#fdd;text-decoration:line-through;">The issue about implementing a thread pool is that currently there is no thread pool implementation in Asterisk SCF. Ice and Boost do not contain thread pool implementations that Asterisk SCF could use either. There is a thread pool proposal [here|Thread Pools], but it has yet to be finalized. Use of a thread pool also is complicated because there are lots of ways one could potentially be used. <br> <br></td></tr>

            <tr><td class="diff-unchanged" >h3. Conclusions <br> <br></td></tr>

            <tr><td class="diff-changed-lines" >To maximize efficiency of the SIP component, it seems that the use of a thread pool to place tasks into logical areas is a good start to making <span class="diff-deleted-words"style="color:#999;background-color:#fdd;text-decoration:line-through;">the SIP</span> <span class="diff-added-words"style="background-color: #dfd;">it</span> component run more smoothly. As new RPCs are added to SIP components, a decision can be made regarding the feasibility of making the operation use AMI or not. <br></td></tr>

            <tr><td class="diff-unchanged" > <br>h1. An investigation of locks in PJSIP <br></td></tr>

            <tr><td class="diff-snipped" >...<br></td></tr>

            <tr><td class="diff-unchanged" ># The transaction layer passes the message up to the user agent layer. <br># The user agent layer locks the *user agent layer lock* to search for a matching dialog. <br></td></tr>

            <tr><td class="diff-changed-lines" ># Once the dialog is found, the &quot;user agent layer <span class="diff-changed-words">lock<span class="diff-added-chars"style="background-color: #dfd;">\</span>*</span> is released, and the *dialog lock* is acquired. <br></td></tr>

            <tr><td class="diff-changed-lines" ># The dialog layer then passes the message to the next layer up. In the case of <span class="diff-deleted-words"style="color:#999;background-color:#fdd;text-decoration:line-through;">media</span> <span class="diff-added-words"style="background-color: #dfd;">INVITE-created</span> sessions, this is the INVITE session module of PJSIP, followed then by our PJSipSessionModule. In other cases, there may not be another layer between the user agent layer and us. The important thing to note is that there are no additional locks acquired during this time since data at this point is protected by the *dialog lock*. <br></td></tr>

            <tr><td class="diff-unchanged" > <br>This specific sequence was chosen because it encounters the most locks and illustrates where in the chain of calls the specific locks are acquired. <br></td></tr>

            <tr><td class="diff-snipped" >...<br></td></tr>

            <tr><td class="diff-unchanged" >* The *user agent layer lock* is acquired when a dialog is registered or looked up. <br> <br></td></tr>

            <tr><td class="diff-deleted-lines" style="color:#999;background-color:#fdd;text-decoration:line-through;">h1. Thread pool usage with SIP <br></td></tr>

            <tr><td class="diff-added-lines" style="background-color: #dfd;">h1. Thread pool usage in the SIP component(s) <br></td></tr>

            <tr><td class="diff-unchanged" > <br>h3. Incoming messages <br></td></tr>

            <tr><td class="diff-snipped" >...<br></td></tr>

            <tr><td class="diff-unchanged" >h5. The drawback of both methods <br> <br></td></tr>

            <tr><td class="diff-changed-lines" >Both methods have a common flaw. Incoming message callbacks in PJSIP modules always take a {{pjsip_rx_data}} pointer (referred to as rdata from here on) as a parameter. This structure is large and contains all relevant data pertaining to the message. The problem lies in an optimization <span class="diff-deleted-words"style="color:#999;background-color:#fdd;text-decoration:line-through;">of PJSIP&#39;s.</span> <span class="diff-added-words"style="background-color: #dfd;">present in PJSIP.</span> For any given PJSIP transport, there is only a single rdata structure. This rdata structure is reused on each incoming SIP message. In a single-threaded environment, this is just fine. If we dispatch a message to a separate thread, though, it means that we cannot simply pass an rdata pointer to the dispatched thread. Instead, we have to be safe and do a deep copy of the structure. [Here|http://www.pjsip.org/pjsip/docs/html/structpjsip__rx__data.htm] is a breakdown of the rdata structure. There are four structures that make up the rdata. <br></td></tr>

            <tr><td class="diff-unchanged" > <br># Transport data ({{tp_data}}). This is information about the transport itself. This can be shallow copied since the transport data is constant for all messages. <br></td></tr>

            <tr><td class="diff-snipped" >...<br></td></tr>

    
            </table>

    </div>                            <h4>Full Content</h4>

                    <div class="notificationGreySide">

        <div class='panelMacro'><table class='warningMacro'><colgroup><col width='24'><col></colgroup><tr><td valign='top'><img src="/wiki/images/icons/emoticons/forbidden.gif" width="16" height="16" align="absmiddle" alt="" border="0"></td><td><b>Achtung!</b><br />This is a work in progress.</td></tr></table></div>


<h1><a name="SIPComponentThreadingModel-Thecurrentmodelanditsproblems"></a>The current model and its problems</h1>


<p>The PJSIP manager creates a single thread that constantly services queued events (e.g. incoming SIP messages). Unfortunately, the time taken to process certain types of incoming SIP messages, like INVITEs, can be inordinately long. Specifically, remote procedure calls in the INVITE-handling code cause the thread to wait a long time. The waiting period is long enough that the queue of unhandled messages may grow faster than our code can handle them.</p>


<p>In addition, the Ice-invoked operations run in multiple threads, not in the same thread that the PJSIP manager uses for presenting incoming messages. This means that there is contention for resources common to the Ice threads and the PJSIP manager thread.</p>


<h1><a name="SIPComponentThreadingModel-PotentialMethodsofimprovement"></a>Potential Methods of improvement</h1>


<h3><a name="SIPComponentThreadingModel-MoreAMI"></a>More AMI</h3>


<p>One potential fix for the problem is to use AMI for all RPCs in the code that handles incoming messages. Using AMI means that the thread is not stuck waiting, thus the growing message queue may be serviced. AMI is already used in some places in the SIP component code, especially for operations that are known to potentially block for a long time. AMI has some disadvantages though:</p>


<ul>

        <li>Using AMI can decrease the readability of the code, especially for operations that make multiple RPCs.</li>

        <li>Object lifetime management can be complicated.</li>

        <li>Adding AMI does nothing to alleviate any resource contention issues. If anything, it may increase the number of threads potentially vying for the same resources.</li>

</ul>


<h3><a name="SIPComponentThreadingModel-ListenforSIPmessagesinmultiplethreads."></a>Listen for SIP messages in multiple threads</h3>


<p>PJSIP's event-handling functions are thread safe, so they may be called from multiple threads at once. This means several threads can handle incoming SIP messages concurrently. If one thread is blocking, there may be another thread ready to handle the message. The problem with this approach is that there is no application-level logic behind which thread handles which message. As with AMI, the result is more concurrent threads attempting to access the same resources.</p>


<h3><a name="SIPComponentThreadingModel-Funneltasksintoathreadpool"></a>Funnel tasks into a thread pool</h3>


<p>Using a thread pool would have similar advantages to using multiple threads for handling PJSIP events. That is, if a single thread is blocked, there may be other threads currently available to handle the task. What's different though is that since we will be delegating the work ourselves, we can distribute work based on application-level concepts. As an example, a single thread can be responsible for handling all tasks related to a specific session. This can eliminate some of the contention between threads. In addition, tasks that originate from PJSIP and tasks that originate from Ice can use the same thread pool, once again leading to less contention between threads.</p>
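<p>The idea of pinning all tasks for one session to one thread can be sketched as follows. This is only a minimal illustration under stated assumptions: <tt>KeyedThreadPool</tt>, <tt>workerFor()</tt> and <tt>enqueue()</tt> are hypothetical names, not part of Asterisk SCF, Ice, or PJSIP, and hashing a session key to pick a worker is just one possible distribution policy.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: every worker owns its own queue, and all tasks that
// carry the same session key are queued to the same worker.
class KeyedThreadPool
{
    struct Worker
    {
        std::mutex mutex;
        std::condition_variable wake;
        std::deque<std::function<void()>> tasks;
        bool stopping = false;
        std::thread thread;
    };

public:
    explicit KeyedThreadPool(std::size_t workerCount) : mWorkers(workerCount)
    {
        assert(workerCount > 0);
        for (Worker& worker : mWorkers)
        {
            Worker* w = &worker; // stable address: the vector is never resized
            w->thread = std::thread([w] { run(*w); });
        }
    }

    ~KeyedThreadPool()
    {
        for (Worker& w : mWorkers)
        {
            {
                std::lock_guard<std::mutex> lock(w.mutex);
                w.stopping = true;
            }
            w.wake.notify_one();
        }
        for (Worker& w : mWorkers)
        {
            w.thread.join(); // each worker drains its queue before exiting
        }
    }

    // Index of the worker that handles every task for this session key.
    std::size_t workerFor(const std::string& sessionKey) const
    {
        return std::hash<std::string>()(sessionKey) % mWorkers.size();
    }

    void enqueue(const std::string& sessionKey, std::function<void()> task)
    {
        Worker& w = mWorkers[workerFor(sessionKey)];
        {
            std::lock_guard<std::mutex> lock(w.mutex);
            w.tasks.push_back(std::move(task));
        }
        w.wake.notify_one();
    }

private:
    static void run(Worker& w)
    {
        for (;;)
        {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(w.mutex);
                w.wake.wait(lock, [&w] { return w.stopping || !w.tasks.empty(); });
                if (w.tasks.empty())
                {
                    return; // stopping was requested and the queue is empty
                }
                task = std::move(w.tasks.front());
                w.tasks.pop_front();
            }
            task(); // run outside the queue lock so enqueue() is never blocked by a task
        }
    }

    std::vector<Worker> mWorkers;
};
```

<p>Because tasks for one key run sequentially on one worker, state that belongs only to that session needs no further locking in this sketch. Tasks originating from PJSIP and from Ice could both call <tt>enqueue()</tt> with the same key.</p>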


<h3><a name="SIPComponentThreadingModel-Conclusions"></a>Conclusions</h3>


<p>To maximize efficiency of the SIP component, it seems that using a thread pool to place tasks into logical areas is a good start to making it run more smoothly. As new RPCs are added to SIP components, a decision can be made regarding the feasibility of making the operation use AMI or not.</p>


<h1><a name="SIPComponentThreadingModel-AninvestigationoflocksinPJSIP"></a>An investigation of locks in PJSIP</h1>


<h3><a name="SIPComponentThreadingModel-AnoverviewofanincomingPJSIPmessage"></a>An overview of an incoming PJSIP message</h3>


<p>A message starts out within the PJSIP endpoint. The PJSIP endpoint contains a list of all registered PJSIP modules. The endpoint iterates through the list of modules, passing the message to each one until a module reports that it has handled the message. What's important to note are the major locks that are encountered during this process.</p>


<p>Assume that a message arrives that belongs to an in-dialog transaction. Here is a breakdown of the steps:</p>


<ol>

        <li>The endpoint receives the message and passes it to the transaction layer.</li>

        <li>Transaction layer locks the <b>transaction layer lock</b> to search for a matching transaction.</li>

        <li>Once the transaction is found, the <b>transaction layer lock</b> is released, and the <b>transaction lock</b> is acquired.</li>

        <li>The transaction layer passes the message up to the user agent layer.</li>

        <li>The user agent layer locks the <b>user agent layer lock</b> to search for a matching dialog.</li>

        <li>Once the dialog is found, the <b>user agent layer lock</b> is released, and the <b>dialog lock</b> is acquired.</li>

        <li>The dialog layer then passes the message to the next layer up. In the case of INVITE-created sessions, this is the INVITE session module of PJSIP, followed then by our PJSipSessionModule. In other cases, there may not be another layer between the user agent layer and us. The important thing to note is that there are no additional locks acquired during this time since data at this point is protected by the <b>dialog lock</b>.</li>

</ol>
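<p>The lock hand-over described in the steps above (take the layer-wide lock only for the lookup, release it, then take the per-object lock) can be sketched like this. The sketch mirrors the description only; <tt>Dialog</tt>, <tt>DialogLayer</tt> and their members are hypothetical names and do not reflect PJSIP's actual internals.</p>

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical per-dialog state guarded by its own lock (the "dialog lock").
struct Dialog
{
    std::mutex lock;
    int messagesHandled = 0;
};

// Hypothetical stand-in for a layer that owns a table of dialogs.
class DialogLayer
{
public:
    void registerDialog(const std::string& id)
    {
        // Registration takes the layer-wide lock.
        std::lock_guard<std::mutex> layer(mLayerLock);
        mDialogs[id] = std::make_shared<Dialog>();
    }

    // Returns true if a matching dialog handled the message.
    bool onMessage(const std::string& id)
    {
        std::shared_ptr<Dialog> dlg = find(id);
        if (!dlg)
        {
            return false;
        }
        // The layer lock is already released here; only this dialog is locked.
        std::lock_guard<std::mutex> dialogLock(dlg->lock);
        ++dlg->messagesHandled;
        return true;
    }

    int handledCount(const std::string& id)
    {
        std::shared_ptr<Dialog> dlg = find(id);
        if (!dlg)
        {
            return 0;
        }
        std::lock_guard<std::mutex> dialogLock(dlg->lock);
        return dlg->messagesHandled;
    }

private:
    // Lookup holds the layer-wide lock only for the duration of the search.
    std::shared_ptr<Dialog> find(const std::string& id)
    {
        std::lock_guard<std::mutex> layer(mLayerLock);
        auto it = mDialogs.find(id);
        return it == mDialogs.end() ? nullptr : it->second;
    }

    std::mutex mLayerLock;
    std::map<std::string, std::shared_ptr<Dialog>> mDialogs;
};
```

<p>In this arrangement the layer-wide lock is contended by every thread that looks up or registers an object, while a per-object lock is contended only by threads working on that same object.</p>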


<p>This specific sequence was chosen because it encounters the most locks and illustrates where in the chain of calls the specific locks are acquired.</p>


<h3><a name="SIPComponentThreadingModel-AnoverviewofanoutgoingPJSIPmessage"></a>An overview of an outgoing PJSIP message</h3>


<p>Well...it's the opposite of an incoming one. As with the incoming message example, we'll show an outgoing message that belongs to both an existing dialog and an existing transaction.</p>


<ol>

        <li>Our application gives the signal to send a message. It will either use a function provided by a layer between us and the dialog layer (like the INVITE session module) or a function provided by the user agent layer.</li>

        <li>Once within the user agent layer, the <b>dialog lock</b> is acquired, and the message is passed down to the transaction layer.</li>

        <li>Once within the transaction layer, the <b>transaction lock</b> is acquired, and the message is passed down to the transport layer.</li>

</ol>


<p>What happens beyond this point is unimportant to us. Again, take note of the locks that are acquired in this sequence.</p>


<h3><a name="SIPComponentThreadingModel-Othertimeslocksareheld"></a>Other times locks are held</h3>


<ul>

        <li>The <b>dialog lock</b> is acquired in most pjsip_dlg_*() API calls.</li>

        <li>The <b>transaction lock</b> is acquired in most pjsip_tsx_*() API calls.</li>

        <li>The <b>transaction layer lock</b> is acquired when a transaction is registered or looked up.</li>

        <li>The <b>user agent layer lock</b> is acquired when a dialog is registered or looked up.</li>

</ul>


<h1><a name="SIPComponentThreadingModel-ThreadpoolusageintheSIPcomponent%28s%29"></a>Thread pool usage in the SIP component(s)</h1>


<h3><a name="SIPComponentThreadingModel-Incomingmessages"></a>Incoming messages</h3>


<p>A single thread will handle incoming messages. At some point during the steps shown in the previous section, we will distribute the message out to the thread pool. There are a couple of places that this can be done.</p>


<h5><a name="SIPComponentThreadingModel-Method1%3ABetweentheEndpointandTransactionLayer"></a>Method 1: Between the Endpoint and Transaction Layer</h5>


<p><span class="image-wrap" style=""><img src="/wiki/download/attachments/12550192/Method1.png?version=1&amp;modificationDate=1298394519751" style="border: 0px solid black" /></span></p>


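<p>We would write a PJSIP module whose priority places it ahead of the transaction layer, so that the endpoint calls it first (in PJSIP, a numerically lower priority value means the module is called earlier). This module would need to have a copy of the list of all registered PJSIP modules that the endpoint would call after it. Here is how the module would operate:</p>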


<ol>

        <li>The endpoint calls into our module.</li>

        <li>Our module determines which thread the message should go to.</li>

        <li>Our module dispatches the message to the appropriate thread, and returns PJ_TRUE to the PJSIP endpoint.</li>

        <li>The dispatched thread passes the message to each registered module, starting with the transaction layer, until either all modules have been called into or one returns PJ_TRUE.</li>

</ol>


<p>With this method, dialog and transaction locks are only taken within their own thread, so they never actually have to contend with other threads. The transaction layer lock and user agent layer lock, though, will still be in contention since transactions and dialogs are looked up and registered in all threads. The toughest parts with this method are 1) maintaining the list of registered PJSIP modules, and 2) determining which thread to dispatch a message to. PJSIP offers methods for looking up specific transactions and dialogs based on message data, but finding items like registrations or publications will not be easy.</p>
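<p>The last step of Method 1, in which the dispatched thread walks the remaining modules until one of them claims the message, might look roughly like the sketch below. <tt>Message</tt>, <tt>Module</tt> and <tt>passToModules()</tt> are hypothetical names for illustration; a real module would receive PJSIP's rdata and return <tt>PJ_TRUE</tt> or <tt>PJ_FALSE</tt> rather than a C++ <tt>bool</tt>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified view of an incoming message.
struct Message
{
    std::string callId;
    std::string method;
};

// A module returns true when it has handled the message.
using Module = std::function<bool(const Message&)>;

// Pass the message to each module in order until one handles it.
// Returns the index of the handling module, or -1 if none did.
int passToModules(const std::vector<Module>& modules, const Message& msg)
{
    for (std::size_t i = 0; i < modules.size(); ++i)
    {
        if (modules[i](msg))
        {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

<p>The list passed to <tt>passToModules()</tt> corresponds to the copy of the registered module list that our module would have to maintain, which is one of the harder parts of this method.</p>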


<h5><a name="SIPComponentThreadingModel-Method2%3ABetweentheuseragentlayerandourcode"></a>Method 2: Between the user agent layer and our code</h5>


<p><span class="image-wrap" style=""><img src="/wiki/download/attachments/12550192/Method2.png?version=1&amp;modificationDate=1298393568943" style="border: 0px solid black" /></span></p>


<p>In this version, each PJSIP module we write has its own thread pool. As a message reaches one of our modules, we determine if we are the appropriate module to handle the message. If so, we dispatch the message to our thread pool.</p>


<p>With this method, the dialog and transaction locks are taken in the PJSIP thread before the message reaches our module, not in the thread to which the message is dispatched. PJSIP could, however, be modified to simply not take the dialog or transaction locks any more, since all action taken on those objects will always happen in their own thread.</p>


<h5><a name="SIPComponentThreadingModel-Thedrawbackofbothmethods"></a>The drawback of both methods</h5>


<p>Both methods have a common flaw. Incoming message callbacks in PJSIP modules always take a <tt>pjsip_rx_data</tt> pointer (referred to as rdata from here on) as a parameter. This structure is large and contains all relevant data pertaining to the message. The problem lies in an optimization present in PJSIP. For any given PJSIP transport, there is only a single rdata structure. This rdata structure is reused on each incoming SIP message. In a single-threaded environment, this is just fine. If we dispatch a message to a separate thread, though, it means that we cannot simply pass an rdata pointer to the dispatched thread. Instead, we have to be safe and do a deep copy of the structure. <a href="http://www.pjsip.org/pjsip/docs/html/structpjsip__rx__data.htm" class="external-link" rel="nofollow">Here</a> is a breakdown of the rdata structure. There are four structures that make up the rdata.</p>


<ol>

        <li>Transport data (<tt>tp_data</tt>). This is information about the transport itself. This can be shallow copied since the transport data is constant for all messages.</li>

        <li>Packet info (<tt>pkt_info</tt>). This is data about the packet itself, including the source IP address and port, the raw text of the message, and the time the packet arrived. This information will need to be deep copied since it will be overwritten by the transport on reception of the next packet. This is easy, though, since there are no pointers in this structure. A memcpy will do the trick just fine.</li>

        <li>Message info (<tt>msg_info</tt>). This is the fun one, but luckily it's not as bad as it looks. We can set the <tt>msg_buffer</tt> and <tt>len</tt> fields ourselves based on the <tt>pkt_info</tt>. Then, there's a handy function in the PJSIP parser called <tt>pjsip_parse_rdata()</tt> we can call to set the rest of the pointers to point into the appropriate sections of the message.</li>

        <li>Endpoint info (<tt>endpt_info</tt>). This contains module-specific data. Since this is an array of void pointers, it will be impossible to copy all of the data properly. If we go with method 2, then we will absolutely need to copy two pieces of information from here. Specifically, we'll need to copy the transaction and dialog from here. Since we know the size of the structures to copy and we can find the module ids of the transaction layer and user agent layer, this is not hard to do. If we go with method 1, then copying this data is not necessary since the transaction and user agent layers will not have placed their data in the <tt>endpt_info</tt> yet and will do so once they are called into.</li>

</ol>
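<p>The copy strategy for the packet info and message info can be illustrated with simplified stand-in structures. The types below are hypothetical and much smaller than the real <tt>pjsip_rx_data</tt>; they only show why a pointer into the reused buffer must be re-pointed into the copy. In the real structure, the remaining message pointers would then be rebuilt by re-parsing, as described above.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the packet info and message info parts of rdata.
struct PacketInfo
{
    char packet[4000];   // raw message text, overwritten on every reception
    std::size_t len;     // number of valid bytes in packet
    int srcPort;
};

struct MessageInfo
{
    const char* msgBuffer; // points into a PacketInfo::packet buffer
    std::size_t len;
};

struct RxData
{
    PacketInfo pktInfo;
    MessageInfo msgInfo;
};

// Copy what the dispatched thread needs so that it no longer depends on the
// transport's shared structure.
void copyForDispatch(const RxData& shared, RxData& copy)
{
    // The packet info holds no pointers, so a plain memcpy is a deep copy.
    std::memcpy(&copy.pktInfo, &shared.pktInfo, sizeof(PacketInfo));

    // The message info must refer to the copied buffer, not the shared one.
    copy.msgInfo.msgBuffer = copy.pktInfo.packet;
    copy.msgInfo.len = copy.pktInfo.len;
}
```

<p>After <tt>copyForDispatch()</tt> returns, the transport is free to overwrite the shared structure with the next packet without affecting the dispatched thread's copy.</p>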


<h3><a name="SIPComponentThreadingModel-OriginatingfromIce"></a>Originating from Ice</h3>


<p>If a SIP operation originates from Ice (say via a SipSession method call), then the operation is much simpler. On session creation, the <tt>start()</tt> operation is dispatched to a particular thread, and the ID of the thread is saved as a member of the session. Further operations on the session can be dispatched to the same thread. If an operation requires a value to be returned, then AMD can be used to return a value appropriately. Operations for other types of SIP subcomponents will use similar strategies.</p>


<p>Unlike with <tt>pjsip_rx_data</tt>, <tt>pjsip_tx_data</tt> is not a reused structure; a new one is created for each outgoing message.</p>

    </div>


</div>

</div>

</div>

</div>

</body>

</html>