[asterisk-bugs] [JIRA] (ASTERISK-18345) [patch] sips connection dropped by asterisk with a large INVITE
Elazar Broad (JIRA)
noreply at issues.asterisk.org
Thu Jul 31 11:50:56 CDT 2014
[ https://issues.asterisk.org/jira/browse/ASTERISK-18345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Elazar Broad updated ASTERISK-18345:
------------------------------------
Attachment: tcptls_pollv2.diff
This version removes the separate while loop for the read and instead continues the main while loop, which already implements the timeout.
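tcptls_pollv2.diff itself is not reproduced in this message, so the following is only a generic sketch of the pattern described above: a single loop that owns the timeout, where a read that would block simply goes back to the top of that loop instead of spinning in a nested one. The helper names now_ms and read_with_deadline are hypothetical and are not taken from the patch or from Asterisk.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: current time in milliseconds on a monotonic clock. */
static long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long)ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Hypothetical helper: read until `want` bytes have arrived, the peer
 * closes, or timeout_ms elapses. Returns the number of bytes read
 * (possibly fewer than `want`) or -1 on a real error. */
static ssize_t read_with_deadline(int fd, char *buf, size_t want, int timeout_ms)
{
    long deadline = now_ms() + timeout_ms;
    size_t got = 0;

    while (got < want) {
        long left = deadline - now_ms();
        if (left <= 0) {
            break;                      /* overall timeout expired */
        }

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int res = poll(&pfd, 1, (int)left);
        if (res < 0) {
            if (errno == EINTR) {
                continue;               /* interrupted: retry in the same loop */
            }
            return -1;
        }
        if (res == 0) {
            break;                      /* poll timed out */
        }

        ssize_t n = read(fd, buf + got, want - got);
        if (n > 0) {
            got += (size_t)n;
        } else if (n == 0) {
            break;                      /* peer closed the connection */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) {
            continue;                   /* would block: back to the timed loop */
        } else {
            return -1;
        }
    }
    return (ssize_t)got;
}
```

Because the deadline is computed once, outside the loop, repeated "would block" results cannot extend the total wait beyond timeout_ms.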
> [patch] sips connection dropped by asterisk with a large INVITE
> ---------------------------------------------------------------
>
> Key: ASTERISK-18345
> URL: https://issues.asterisk.org/jira/browse/ASTERISK-18345
> Project: Asterisk
> Issue Type: Bug
> Security Level: None
> Components: Channels/chan_sip/TCP-TLS
> Affects Versions: SVN, 1.8.4, 11.4.0, 11.5.0
> Reporter: Stephane Chazelas
> Attachments: tcptls_poll.diff, tcptls_pollv2.diff, tlsBigSDPdebug.patch, tlsBigSDP.patch, tls_read_fix_try1_1.8.11.1.diff, tls_read_fix_try2_1.8.11.1.diff, tls_read_fix_try3_1.8.11.1.diff, tls_read.patch
>
>
> When using jitsi (http://jitsi.org) (the Debian amd64 build) as a sip-tls extension, one can see the SSL connection to asterisk being dropped during registration (abnormally, but that seems to be due to ASTERISK-18342), and placing calls doesn't work.
> I first thought it was an SSL method issue, as jitsi doesn't seem to support SSLv3 or TLSv1, and I was able to make it work by using a MitM that proxied the connection through socat: jitsi was able to talk to socat OK and socat to asterisk OK.
> But it looks more like a timing/nondeterministic issue. I then had a look at the code, added a little logging and found out that the connection was closed because of fgets() returning NULL in _sip_tcp_helper_thread().
> I then added logging to ssl_read() to see if SSL_read() ever failed, but it doesn't, so I don't understand how that fgets() could return EOF/error in that case. Then, I had a hard time understanding that business of need_poll/after_poll.
> If I understand correctly, tcptls_session->fd is the network socket that carries the encrypted data and other SSL out-of-band stuff and has been made non-blocking, and tcptls_session->f is a stream created with funopen(tcptls_session->ssl, ssl_read, ssl_write, NULL, ssl_close) (or the fopencookie Linux equivalent). Polls are made on the fd before doing the fgets() that eventually calls SSL_read(). That sounds to me like a recipe for catastrophe, deadlocks and the like, but I have to admit I have not understood/seen the design fully.
> I still don't get how fgets() can return NULL here but I tried to bring the need_poll/after_poll trick further by doing:
> {code}
> @@ -2659,7 +2637,7 @@ static void *_sip_tcp_helper_thread(stru
>      * TLS layer */
>     if (!tcptls_session->ssl || need_poll) {
>         need_poll = 0;
> -       after_poll = 1;
> +       after_poll++;
>         res = ast_wait_for_input(tcptls_session->fd, timeout);
>         if (res < 0) {
>             ast_debug(2, "SIP TCP server :: ast_wait_for_input returned %d\n", res);
> @@ -2674,7 +2654,7 @@ static void *_sip_tcp_helper_thread(stru
>     ast_mutex_lock(&tcptls_session->lock);
>     if (!fgets(buf, sizeof(buf), tcptls_session->f)) {
>         ast_mutex_unlock(&tcptls_session->lock);
> -       if (after_poll) {
> +       if (after_poll > 1) {
>             goto cleanup;
>         } else {
>             need_poll = 1;
> {code}
> and it fixed the issue.
> So, there's something definitely wrong though I couldn't tell exactly what.
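For readers trying to picture the arrangement described in the quoted report, here is a minimal, generic sketch of how fgets() can return NULL on a stdio stream layered over a non-blocking descriptor without the peer having closed anything. It uses a plain pipe and fdopen() rather than OpenSSL and funopen()/fopencookie(), so it is only an analogy and not a diagnosis of this bug. The names open_nonblocking_reader, wait_readable, try_read_line and the line_status enum are hypothetical and do not exist in Asterisk.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

enum line_status { LINE_OK, LINE_RETRY, LINE_CLOSED };

/* Hypothetical helper: create a pipe, make its read end non-blocking,
 * and wrap that end in a stdio stream. Returns NULL on failure. */
static FILE *open_nonblocking_reader(int pipefd[2])
{
    if (pipe(pipefd) != 0) {
        return NULL;
    }
    int flags = fcntl(pipefd[0], F_GETFL, 0);
    if (flags < 0 || fcntl(pipefd[0], F_SETFL, flags | O_NONBLOCK) != 0) {
        return NULL;
    }
    return fdopen(pipefd[0], "r");
}

/* Hypothetical helper: 1 if fd has an event pending, 0 on timeout,
 * -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int res = poll(&pfd, 1, timeout_ms);
    return res > 0 ? 1 : res;
}

/* Hypothetical helper: one fgets() attempt that separates "no data yet"
 * from "stream is finished". A NULL return alone cannot tell them apart. */
static enum line_status try_read_line(FILE *f, char *buf, size_t len)
{
    errno = 0;
    if (fgets(buf, (int)len, f)) {
        return LINE_OK;
    }
    if (!feof(f) && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        clearerr(f);            /* drop the error flag so a retry can read */
        return LINE_RETRY;      /* caller should poll the fd and try again */
    }
    return LINE_CLOSED;         /* real EOF or a hard error */
}
```

With an empty pipe, try_read_line() reports LINE_RETRY; after a write and a successful wait_readable(), the same call returns the line; once the writer closes, it reports LINE_CLOSED. Whether anything comparable happens inside ssl_read() here remains an open question, since the report says SSL_read() was never seen to fail.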