[asterisk-bugs] [JIRA] (ASTERISK-24651) [patch] Fix race condition in DTLS
Badalian Vyacheslav (JIRA)
noreply at issues.asterisk.org
Wed Jan 21 01:50:34 CST 2015
[ https://issues.asterisk.org/jira/browse/ASTERISK-24651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=224586#comment-224586 ]
Badalian Vyacheslav commented on ASTERISK-24651:
------------------------------------------------
I can explain in detail if you can put me in touch with a Russian-speaking developer. It is very hard for me to explain this in detail in English.
But I will try with Google Translate.
1. You do not take care of the safety of the SSL structure. As a result, one Asterisk thread changes it while OpenSSL code is executing in another thread. I added a lot of mutexes just to try to prevent simultaneous access to the SSL structures.
2. Your scheduler launches a second thread. It calls a callback function that checks both RTP and RTCP, and each check triggers the timer. The result is a cascading effect.
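Point 1 can be illustrated with a minimal sketch. This is not the code from the patch: `struct dtls_session` and its functions are hypothetical names, and a plain counter stands in for the mutable state inside a real `SSL` structure. The only point is that every path touching the shared state takes the same mutex first.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for per-session DTLS state (a real one would hold an SSL pointer). */
struct dtls_session {
    pthread_mutex_t lock;   /* serializes every access to the state below */
    long handshake_steps;   /* stands in for the mutable SSL internals */
};

static void dtls_session_init(struct dtls_session *s)
{
    pthread_mutex_init(&s->lock, NULL);
    s->handshake_steps = 0;
}

/* Every code path that touches the state takes the lock first. */
static void dtls_session_step(struct dtls_session *s)
{
    pthread_mutex_lock(&s->lock);
    s->handshake_steps++;
    pthread_mutex_unlock(&s->lock);
}

static void *dtls_session_worker(void *arg)
{
    struct dtls_session *s = arg;
    for (int i = 0; i < 100000; i++) {
        dtls_session_step(s);
    }
    return NULL;
}

/* Runs two threads against one session and returns the final counter. */
static long dtls_session_demo(void)
{
    struct dtls_session s;
    pthread_t a, b;

    dtls_session_init(&s);
    pthread_create(&a, NULL, dtls_session_worker, &s);
    pthread_create(&b, NULL, dtls_session_worker, &s);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_mutex_destroy(&s.lock);
    return s.handshake_steps;
}
```

With the lock in place, `dtls_session_demo()` always ends with both threads' updates counted. Without it, the two threads would race on the shared field, which is the same kind of unsynchronized access described in point 1.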
Along the way, there were a few other things that I thought needed to be corrected to improve stability.
Destroy, for example, can be invoked by another thread, and we are not protected against this. I had to insert many checks into rtp_engine, because otherwise it accesses structures that have already been freed.
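The kind of check described here can be sketched as follows. This is an illustration only, not the rtp_engine code: `struct rtp_demo` and its functions are hypothetical, and an `int` stands in for the DTLS/SSL state. Destroy clears the pointer under the lock, and every user tests the pointer under the same lock before touching it.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for an RTP instance; dtls_state plays the role of the SSL pointer. */
struct rtp_demo {
    pthread_mutex_t lock;
    int *dtls_state;        /* NULL once the DTLS state has been destroyed */
};

static int rtp_demo_init(struct rtp_demo *r)
{
    if (pthread_mutex_init(&r->lock, NULL) != 0) {
        return -1;
    }
    r->dtls_state = calloc(1, sizeof(*r->dtls_state));
    return r->dtls_state ? 0 : -1;
}

/* Frees the state under the lock and clears the pointer so later callers can tell. */
static void rtp_demo_destroy_dtls(struct rtp_demo *r)
{
    pthread_mutex_lock(&r->lock);
    free(r->dtls_state);    /* free(NULL) is a no-op, so a second destroy is harmless */
    r->dtls_state = NULL;
    pthread_mutex_unlock(&r->lock);
}

/* Returns the number of packets handled so far, or -1 if the state is already gone. */
static int rtp_demo_handle_packet(struct rtp_demo *r)
{
    int result = -1;

    pthread_mutex_lock(&r->lock);
    if (r->dtls_state != NULL) {
        result = ++(*r->dtls_state);
    }
    pthread_mutex_unlock(&r->lock);
    return result;
}
```

This only protects the state that hangs off the instance. The instance itself (and its mutex) still has to outlive every thread that may call into it, for example through reference counting.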
There is a similar fault with BIO: when the SSL structure is freed, you do not need to free the BIO as well. All of the related BIO cleanup has been removed.
Unfortunately, my patch does not solve all of the problems. It fixes most of them, and in the other cases it tries to prevent a crash.
Before the patch Asterisk was crashing every 15 minutes; now we have 2-3 crashes a month. Unfortunately, the code that works with OpenSSL is spread across many places, and some crashes still come from the same cause: the SSL structure has already been freed while another thread is still using it.
A global solution requires rewriting a large piece of the code that works with the SSL structure so that multiple threads cannot access it simultaneously. It needs to be made thread safe.
Hopefully Google Translate managed to convey my thoughts well enough to be understood)
> [patch] Fix race condition in DTLS
> ----------------------------------
>
> Key: ASTERISK-24651
> URL: https://issues.asterisk.org/jira/browse/ASTERISK-24651
> Project: Asterisk
> Issue Type: Bug
> Security Level: None
> Components: Resources/res_rtp_asterisk
> Affects Versions: 11.15.0
> Reporter: Badalian Vyacheslav
> Severity: Critical
> Attachments: fix_dtls_race_conditions.diff
>
>
> Your code has a race condition. It breaks the {{SSL *}} struct and {{BIO *}}.
> Before applying the patch for testing, you need to:
> 1. Apply ASTERISK-24650
> 2. Get the latest OpenSSL from the OpenSSL_1_0_1-stable branch https://github.com/openssl/openssl/tree/OpenSSL_1_0_1-stable (we fixed DTLS issues in it) and configure it (for CentOS 6): {{./config --prefix=/usr --openssldir=/etc/pki/tls shared no-ssl2 zlib enable-camellia enable-seed enable-tlsext enable-rfc3779 enable-cms enable-md2 no-mdc2 no-rc5 no-ec2m no-gost --with-krb5-flavor=MIT --with-krb5-dir=/usr}}. Then make, make tests, make install.
> ! Patch tested on a production server.
> ! Added mutexes tested with debug_threads. No deadlocks.
> Without the fix, Asterisk crashes after 5-100 calls under heavy concurrent load. With the patch applied, 100 000 calls complete normally.
> The race condition occurs in 2 situations:
> 1. One thread changes or breaks the {{SSL}} struct while SSL code is executing. Added a mutex to protect it. Also set the {{BIO}} buffers to NULL after freeing the {{SSL}} struct. This helps to detect an SSL that is already freed and to avoid writing to the {{BIO}}. {{SSL_free}} frees all assigned BIO buffers, so there is no need to call BIO_free (it may cause a double free).
> 2. Big trouble with the {{dtls_srtp_check_pending}} code... I split the retransmission schedule into RTP and RTCP to fix race conditions...
> Ready to discuss or review the patch.
> Please do not be too strict with me; I am sharing my work on my own initiative.
--
This message was sent by Atlassian JIRA
(v6.2#6252)