[asterisk-dev] [Code Review] 3653: chan_sip: (Optionally) poll even on first part of TLS message

rmudgett reviewboard at asterisk.org
Thu Jul 31 11:34:36 CDT 2014


> On June 27, 2014, 9:42 a.m., Matt Jordan wrote:
> > trunk/channels/chan_sip.c, line 2755
> > <https://reviewboard.asterisk.org/r/3653/diff/2/?file=59976#file59976line2755>
> >
> >     So, I had to go back and do some research to see if this solution was correct - and I think it is a step in the right direction, but may still be missing some key pieces to properly handling large messages (or multiple messages) received over a TLS socket.
> >     
> >     Examining _sip_tcp_helper_thread, a socket event may not always have tripped the file descriptor when sip_tls_read has been called. Another way in which the method can be called is when tcptls_session->overflow_buf has data within it. This buffer is used to store the remaining bytes when multiple SIP messages are read in a single read from a socket; the buffer holds the bytes that were read from the previous read operation. When that occurs, a previous ast_poll will not necessarily have been called, and it is up to the sip_*_read function to use the contents of the buffer as the beginning of the next message.
> >     
> >     However, this clearly does not occur with sip_tls_read. The sip_tcp_read is the only function that handles this case; sip_tls_read does not append data to or otherwise use tcptls_session->overflow_buf.
> >     
> >     The question then becomes: should sip_tls_read be cognizant of overflow? If so, it should be appending data into the overflow buffer, in which case this patch's assumption that ast_poll has already been called and a fd has been tripped is wrong.
> >     
> >     If, however, the messages are guaranteed to be framed, then the current implementation of not bothering with the overflow buffer is correct, and you can assume that ast_poll already detected a tripped fd prior to sip_tls_read.
> >     
> >     I think, unfortunately, that you cannot expect that the latter is the case. This is for two reasons:
> >     (1) Looking at the documentation for SSL_Read (https://www.openssl.org/docs/ssl/SSL_read.html):
> >     
> >     "SSL_read() works based on the SSL/TLS records. The data are received in records (with a maximum record size of 16kB for SSLv3/TLSv1). Only when a record has been completely received, it can be processed (decryption and check of integrity). Therefore data that was not retrieved at the last call of SSL_read() can still be buffered inside the SSL layer and will be retrieved on the next call to SSL_read(). If num is higher than the number of bytes buffered, SSL_read() will return with the bytes buffered. If no more bytes are in the buffer, SSL_read() will trigger the processing of the next record. Only when the record has been received and processed completely, SSL_read() will return reporting success. At most the contents of the record will be returned. As the size of an SSL/TLS record may exceed the maximum packet size of the underlying transport (e.g. TCP), it may be necessary to read several packets from the transport layer before the record is complete and SSL_read() can succeed."
> >     
> >     Ideally, the tcptls code already handles that for us (although, in point #2, we'll see that it doesn't). However, the fact that a single read operation is not sufficient to request all data (which is part of the poll + try again code in sip_tls_read) also would make me think that - for sufficiently fast sending of data - you will get more than a single message when you complete the reading of the first message. This would imply that sip_tls_read *should* be making use of the overflow buffer in the same fashion as sip_tcp_read, and thus you cannot assume directly that ast_poll has already been called when this function is called. That changes the patch substantially.
> >     
> >     (2) Worse, you cannot assume that SSL_Read has finished reading the buffer. When tcptls_stream_read is called (as a result of the fgets call), SSL_Read may return SSL_ERROR_WANT_READ. Due to the 'shared' nature of the the stream, chan_sip has to set stream->exclusive_input to 0, which means we'll return EAGAIN as opposed to calling SSL_Read a subsequent time. sip_tls_read currently will not handle this situation correctly; I think your change may help slightly in that it will force a second poll, but that is still no guarantee that all data will be read. This does mean however that we are not necessarily sure that we have read an entire message in sip_tls_read, and hence a 'retry once' poll may not be sufficient regardless to read all of the data from the socket.
> 
> wdoekes wrote:
>     > This would imply that sip_tls_read *should* be making use of the overflow buffer in
>     > the same fashion as sip_tcp_read, and thus you cannot assume directly that ast_poll
>     > has already been called when this function is called.
>     
>     Pointing to: https://reviewboard.asterisk.org/r/2123/
>     
>     > [...] The patch specifically targets TCP connections and not TLS. TLS connections
>     > were not reported as having the issue, plus changing TLS would be a much more
>     > invasive operation.
>     > 
>     > In my opinion, we should remove the use of FILE handles altogether in the TCP/TLS
>     > code [...]
>     
>     Yes. There still is fgets() in chan_sip.c, it should be killed.
> 
> rmudgett wrote:
>     With the recent security fix that affected TCP/TLS with HTTP, eliminating the fgets() usage is much easier.  The SSL_read/SSL_write functions are used correctly (or used more correctly :) ) with that change.  It should be fairly straight forward now to change the code to use the sip_tcp_read instead and remove sip_tls_read with its weird need_poll/after_poll flags.  The sip_tcp_read uses the mentioned tcptls_session->overflow_buf since that overflow buffer was created specifically for the sip_tcp_read function.  Even better would be to change chan_sip to not use the tcptls_session->f pointer at all and call the ast_tcptls_server_read() and ast_tcptls_server_write() functions instead.
> 
> ebroad wrote:
>     I added a patch to the issue in Jira which replaces sip_tls_read() and sip_tcp_read() with a single sip_tcptls_read(), which is a copy of sip_tcp_read(), except that it utilizes ast_tcptls_server_read(). My only concern is the while loop which handles EAGAIN, it could block infinitely, any suggestions on how to handle this? I am considering adding a counter that breaks the loop once it is equal to timeout..
> 
> rmudgett wrote:
>     In this case the EAGAIN can be interpreted as we didn't get any data.  Return and let the caller poll/wait for more data to come in.  When you're ready put that patch on this review as the next revision.
> 
> ebroad wrote:
>     Thanks! To clarify, get rid of the loop in sip_tcptls_read() and move it(or something like it) to *_sip_tcp_helper_thread()?

After looking at the sip_tcp_read() function when you get an EAGAIN just do a continue that goes to the top of the main while loop in sip_tcp_read().  The loop is already there with a timeout since you are replacing the recv() with the ast_tcptls_server_read() call.


- rmudgett


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3653/#review12371
-----------------------------------------------------------


On June 20, 2014, 9:06 a.m., Alexander Traud wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/3653/
> -----------------------------------------------------------
> 
> (Updated June 20, 2014, 9:06 a.m.)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Bugs: ASTERISK-18345
>     https://issues.asterisk.org/jira/browse/ASTERISK-18345
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> With some large SDP, a *second* poll is required on the first part of a TLS message.
> 
> The current code did not poll a second time because the variable need_poll was inited with yes (1). That poll was a no-operation because there was a socket event already (which mandates fgets without poll). In the current code, poll returned immediately, fgets returned NULL, after_poll was yes (1), sip_tls_read returned failed (-1), _sip_tcp_helper_thread went to cleanup, called ast_tcptls_close_session_file, which closed the TLS connection.
> 
> The proposed patch, reads the gets the first message. If that failed, it does poll. This fixed all large SDP issues with SIP over TLS which I faced.
> 
> I am aware there were changes committed to tcptls.c just recently (revision 415907). Anyway, let us fix this bug as well.
> 
> 
> Diffs
> -----
> 
>   trunk/channels/chan_sip.c 416319 
> 
> Diff: https://reviewboard.asterisk.org/r/3653/diff/
> 
> 
> Testing
> -------
> 
> Asterisk 12.3
> 
> 
> Thanks,
> 
> Alexander Traud
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20140731/3ea5bf5e/attachment-0001.html>


More information about the asterisk-dev mailing list