[asterisk-bugs] [JIRA] (ASTERISK-24651) [patch] Fix race condition in DTLS
Thomas Guebels (JIRA)
noreply at issues.asterisk.org
Fri Feb 13 03:47:35 CST 2015
[ https://issues.asterisk.org/jira/browse/ASTERISK-24651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=224900#comment-224900 ]
Thomas Guebels commented on ASTERISK-24651:
-------------------------------------------
Hi,
I'm seeing the same kind of crashes as in ASTERISK-24131 when calling from Asterisk to JsSIP/Chrome, and even the latest fixes in libssl did not change anything.
However, I tested the attached patch and it fixes the crashes for me, so there's definitely something interesting in all the locking that it adds.
I'm still trying to find out what is happening, but apparently preventing dtls_srtp_check_pending (which can be launched from the channel thread, the scheduler thread or the pj worker thread) from running concurrently with itself, or concurrently with dtls_perform_handshake, seems to do the trick.
Also note that I can reproduce these crashes with only one call running at a time.
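
To illustrate the kind of serialization described above, here is a minimal sketch using plain pthreads. It is not the attached patch and not Asterisk code: struct dtls_state, check_pending, perform_handshake and run_demo are hypothetical stand-ins, and the "pending" counter only simulates data waiting to be flushed. The idea is that the same mutex is taken by both entry points, so three threads can call them without ever seeing a half-updated state.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for per-stream DTLS state (not Asterisk's struct). */
struct dtls_state {
    pthread_mutex_t lock;  /* guards every field below */
    size_t pending;        /* simulated bytes waiting to be sent */
    size_t flushed;        /* simulated bytes already handed off */
};

/* Stand-in for dtls_srtp_check_pending: drain pending data under the lock. */
static size_t check_pending(struct dtls_state *s)
{
    size_t n;

    pthread_mutex_lock(&s->lock);
    n = s->pending;
    s->flushed += n;
    s->pending = 0;
    pthread_mutex_unlock(&s->lock);
    return n;
}

/* Stand-in for dtls_perform_handshake: queue data under the same lock. */
static void perform_handshake(struct dtls_state *s, size_t bytes)
{
    pthread_mutex_lock(&s->lock);
    s->pending += bytes;
    pthread_mutex_unlock(&s->lock);
}

struct worker_arg {
    struct dtls_state *state;
    int rounds;
};

/* Each worker plays the role of one of the three calling threads. */
static void *worker(void *arg)
{
    struct worker_arg *wa = arg;
    int i;

    for (i = 0; i < wa->rounds; i++) {
        perform_handshake(wa->state, 1);
        check_pending(wa->state);
    }
    return NULL;
}

/* Run three concurrent workers and return the total number of flushed bytes. */
size_t run_demo(int rounds)
{
    struct dtls_state s = { .pending = 0, .flushed = 0 };
    struct worker_arg wa = { &s, rounds };
    pthread_t threads[3];
    size_t total;
    int i, rc;

    rc = pthread_mutex_init(&s.lock, NULL);
    assert(rc == 0);
    for (i = 0; i < 3; i++) {
        rc = pthread_create(&threads[i], NULL, worker, &wa);
        assert(rc == 0);
    }
    for (i = 0; i < 3; i++) {
        pthread_join(threads[i], NULL);
    }
    check_pending(&s);  /* flush anything still queued */
    total = s.flushed;
    pthread_mutex_destroy(&s.lock);
    return total;
}
```

With the lock in place, every queued byte is flushed exactly once, so run_demo(rounds) returns 3 * rounds. Removing the lock calls makes the two counters racy, which is the simulated analogue of the corruption discussed here.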
> [patch] Fix race condition in DTLS
> ----------------------------------
>
> Key: ASTERISK-24651
> URL: https://issues.asterisk.org/jira/browse/ASTERISK-24651
> Project: Asterisk
> Issue Type: Bug
> Security Level: None
> Components: Resources/res_rtp_asterisk
> Affects Versions: 11.15.0
> Reporter: Badalian Vyacheslav
> Severity: Critical
> Attachments: fix_dtls_race_conditions.diff
>
>
> Your code has a race condition. It corrupts the {{SSL *}} struct and the {{BIO *}}.
> Before applying the patch for testing, you need to:
> 1. Apply ASTERISK-24650
> 2. Get the latest OpenSSL from the OpenSSL_1_0_1-stable branch at https://github.com/openssl/openssl/tree/OpenSSL_1_0_1-stable (we fixed DTLS issues in it) and configure it (for CentOS 6) with {{./config --prefix=/usr --openssldir=/etc/pki/tls shared no-ssl2 zlib enable-camellia enable-seed enable-tlsext enable-rfc3779 enable-cms enable-md2 no-mdc2 no-rc5 no-ec2m no-gost --with-krb5-flavor=MIT --with-krb5-dir=/usr}}. Then run make, make tests, make install.
> ! Patch tested on a production server.
> ! The added mutex was tested with debug_threads. No deadlocks.
> Without the fix, Asterisk crashes after 5-100 calls under heavy concurrent load. With the patch applied, 100 000 calls complete normally.
> The race condition occurs in 2 situations:
> 1. One thread changes or breaks the {{SSL}} struct while SSL code is executing. I added a mutex to protect it. I also set the {{BIO}} buffers to NULL after freeing the {{SSL}} struct. This helps detect an SSL that has already been freed and avoids writing to its {{BIO}}. {{SSL_free}} frees all assigned BIO buffers, so there is no need to call BIO_free (it may cause a double free).
> 2. Big trouble with the {{dtls_srtp_check_pending}} code... I split the retransmission schedule into RTP and RTCP to fix race conditions...
> Ready to discuss or review the patch.
> Please don't be too strict with me; I am sharing this work on my own initiative.
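
The ownership rule in point 1 of the quoted description can be sketched as follows. This is an illustration only, not the attached patch: fake_ssl and fake_bio are hypothetical stand-ins for the OpenSSL types so the example compiles without libssl, and dtls_details and its functions are invented names. The assumption being modelled is that freeing the session also frees the BIOs attached to it, so the caller's cached BIO pointers must be cleared rather than freed again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the OpenSSL types. */
struct fake_bio { size_t written; };
struct fake_ssl { struct fake_bio *rbio; struct fake_bio *wbio; };

/* Hypothetical per-stream container that caches the BIO pointers. */
struct dtls_details {
    struct fake_ssl *ssl;        /* owner of both BIOs */
    struct fake_bio *read_bio;   /* borrowed from ssl, never freed directly */
    struct fake_bio *write_bio;  /* borrowed from ssl, never freed directly */
};

/* Like the behaviour described for SSL_free: releases the attached BIOs too. */
static void fake_ssl_free(struct fake_ssl *ssl)
{
    if (!ssl) {
        return;
    }
    free(ssl->rbio);
    free(ssl->wbio);
    free(ssl);
}

/* Returns 0 on success, -1 on allocation failure. */
int dtls_details_init(struct dtls_details *d)
{
    assert(d != NULL);
    d->ssl = calloc(1, sizeof(*d->ssl));
    if (!d->ssl) {
        d->read_bio = d->write_bio = NULL;
        return -1;
    }
    d->ssl->rbio = calloc(1, sizeof(*d->ssl->rbio));
    d->ssl->wbio = calloc(1, sizeof(*d->ssl->wbio));
    if (!d->ssl->rbio || !d->ssl->wbio) {
        fake_ssl_free(d->ssl);
        d->ssl = NULL;
        d->read_bio = d->write_bio = NULL;
        return -1;
    }
    d->read_bio = d->ssl->rbio;
    d->write_bio = d->ssl->wbio;
    return 0;
}

/* Free the session once, then clear the borrowed pointers (no second free). */
void dtls_details_teardown(struct dtls_details *d)
{
    assert(d != NULL);
    fake_ssl_free(d->ssl);
    d->ssl = NULL;
    d->read_bio = NULL;
    d->write_bio = NULL;
}

/* Returns 0 if data was written, -1 if the session is already gone. */
int dtls_details_write(struct dtls_details *d, size_t len)
{
    assert(d != NULL);
    if (!d->ssl || !d->write_bio) {
        return -1;  /* NULL pointers make a freed session detectable */
    }
    d->write_bio->written += len;
    return 0;
}
```

Because the pointers are cleared, a late write after teardown is rejected instead of touching freed memory, and calling teardown twice is harmless. In threaded code both the teardown and the write check would additionally need to run under the same mutex.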