[asterisk-bugs] [JIRA] (ASTERISK-25645) res_rtp_asterisk: Lock inversion

Dade Brandon (JIRA) noreply at issues.asterisk.org
Wed Aug 17 11:05:57 CDT 2016


    [ https://issues.asterisk.org/jira/browse/ASTERISK-25645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=231861#comment-231861 ] 

Dade Brandon commented on ASTERISK-25645:
-----------------------------------------

@hexanol

We aren't using Asterisk 13+, so I've neither verified that the issue in here was actually present, nor looked in to resolving what would be the pjproject issue of locking around user-space code.  I personally don't think this bug exists since the person who opened this issue was using a trial and error approach to locating the commit that began causing is problem, and was using a weird use-case of making several calls "to/from the same endpoint" along with "inband signalling" (I assume he means DTMF, since inband means in the audio stream, and I doubt he means he's interpreting busy signals etc).  It probably would have been a lesser evil for the most users of Asterisk-13 + DTLS to deal with his perils, rather than reverting this patch, unless it was a lot larger than the limited case he described.

You could try something else that's more of a middle-ground... reverting whatever the Asterisk-13 equivalent of commit 1ad827 is, which is the commit that introduced the delay problem.  My patch was to try and resolve that more responsibly.   Here is the one line that was removed from 1ad827, which caused a delay to be introduced on the dtls negotiation: (your line numbers will differ)
{quote}{noformat}
@@ -1997,8 +2029,6 @@ static int __rtp_recvfrom(struct ast_rtp_instance *instance, void *buf, size_t s
        }

 #ifdef HAVE_OPENSSL_SRTP
{noformat}{color:red}{noformat}-       dtls_srtp_check_pending(instance, rtp, rtcp);{noformat}{color}{noformat}

        /* If this is an SSL packet pass it to OpenSSL for processing. RFC section for first byte value:
         * https://tools.ietf.org/html/rfc5764#section-5.1.2 */

{noformat}{quote}


Putting that back in is still the wrong way to fix this.  The problem is that the DTLS Hello from the client isn't responded to by Asterisk due to the client's remote IP address not being stored on an internal structure that dtls_srtp_check_pending is dependent on existing, so it buffers the response rather than sending it immediately.   My patch made it flush the buffer when the remote IP is set, however this previous code worked because the next UDP packet received on the same port (ie stun or rtp) would trigger the buffer flush.   The reason the connection delay is consistent, is because the client will re-try the ssl handshake if no response is received after a certain amount of time, which for Chrome happens after one second.  

> res_rtp_asterisk: Lock inversion
> --------------------------------
>
>                 Key: ASTERISK-25645
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25645
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Resources/res_rtp_asterisk
>            Reporter: Steve Davies
>         Attachments: deadlocked_threads.txt, experimental_anti_deadlock, unlock_ice_before_callback
>
>
> Reported by Steve Davies on asterisk-dev:
> commit 5e6b1476a087407a052f007d326c504cfeefebe7
> ASTERISK-25614
> 2 code paths which approximate the following will cause a lock-inversion deadlock:
> approximate call orders are:
> a)
> pj_timer_heap_poll (PJ_LOCK)
> ast_rtp_on_ice_complete
> ast_rtp_instance_set_remote_address
> remote_address_set
> ast_rtp_remote_address_set
> (DTLS_LOCK)
> ...
> b)
> ast_pbx...
> app_dial
> bridge...
> read
> rtp_read
> ...
> __rtp_recvfrom
> (DTLS_LOCK)
> dtls_srtp_check_pending
> __rtp_sendto
> pj_ice_sess_send_data
> (PJ_LOCK)



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list