<html>

 <body>

  <div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">

   <table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">

    <tr>

     <td>

      This is an automatically generated e-mail. To reply, visit:

      <a href="https://reviewboard.asterisk.org/r/3653/">https://reviewboard.asterisk.org/r/3653/</a>

     </td>

    </tr>

   </table>

   <br />

 <p>On June 28th, 2014, 8:03 a.m. UTC, <b>Alexander Traud</b> wrote:</p>

 <blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">> hence a 'retry once' poll may not be sufficient regardless to read all of the data from the socket.

I am not sure, I understand you guys. Just to clarify my intentions: The proposed patch is not about to fix/resolve all issues in TLS reading. Actually, I think the latest changes in tcptls.c do this already. However here, this patch tries to workaround one bug in the existing code. Our current code does:

1. ast_wait_for_input (no operation, from my point of view)

2. fgets

3. (optionally) ast_wait_for_input

4. (optionally) fgets

… and so on.

Here, the proposed patch changes this to

1. fgets

2. (optionally) ast_wait_for_input 

3. (optionally) fgets 

4. (optionally) ast_wait_for_input

… and so on.

The released code (as of Asterisk 12.3.2) fails in my corner case (see the appended bug), because the underlying SSL_read returned SSL_ERROR_WANT_READ. Therefore in step 2, fgets returned -1 already, therefore the while loop is exited with the failure code -1.

Yes, theoretically, fgets could return -1 more than once. Therefore a retry-once *might* not be sufficient. However, this is not what this patch is about to fix. I do not face that particular issue (retry-n required), nor do I try to solve that. Until someone offers a patch which introduces a retry-n-times, I would like to see this patch to pass.</pre>

 </blockquote>

 <p>On June 28th, 2014, 8:21 p.m. UTC, <b>Matt Jordan</b> wrote:</p>

 <blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I agree that the current situation is not good for large packets coming across a TLS connection.

Generally, however, when we fix a problem, we really do try to fix it for good. Sometimes we don't succeed - but the goal is usually to make sure that when an issue gets closed, it stays closed for good.

I'll grant that there are times this rule gets bent - particularly when there are major restructuring issues that would have to occur to fix a problem. Sometimes, then, a band-aid is sufficient. That probably was the case here once upon a time, however, several security vulnerabilities later, and the vast majority of the restructuring has already been done. The TCP/TLS structure supports the concept of overflow; the underlying tcptls layer has been structured better to handle the reading; the chan_sip code needs to be updated.

I don't think this patch resolves the issue. It may fix it for some people - maybe even most people - but all that means is that someone else will run into it again, and the problem is now more insidious: we "fixed it", without actually fixing it. Now they come to the bug report, it's closed, and they open up another bug report. More bug marshalling ensues, trying to understand how we fixed it and its still broken. Or we leave the bug report open, and we magically "fix" it for some people without them knowing. Neither solution is good.

If we're going to make it so that reading large packets over a TLS socket works, I'd prefer to do it correctly, or not at all.</pre>

 </blockquote>

<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">> When we fix a problem, we really do try to fix it for good.

Just for your information:

Revision 415907 (part of Asterisk 12.4.0 and Asterisk 11.11) changed the behaviour of this bug: There is no console message at all, anymore (with or without this patch here). I thought, revision 415907 covers the case more-than-one-SSL_ERROR_WANT_READ already.

> they come to the bug report

Just for your information:

It took me three weeks from that bug to that report.

> It may fix it for some people - maybe even most people.

Just for your information:

https://code.google.com/p/csipsimple/issues/detail?id=2280

https://code.google.com/p/csipsimple/issues/detail?id=2614

https://code.google.com/p/csipsimple/issues/detail?id=2678

https://groups.google.com/d/topic/csipsimple-dev/V1xu4rAEYHA

https://groups.google.com/d/msg/csipsimple-users/O-tzZmGuO8s/0p5QTCyKHfcJ

http://lists.digium.com/pipermail/asterisk-users/2014-June/283636.html

> There still is fgets() in chan_sip.c, it should be killed.

> It should be fairly straight forward now to change the code to use the sip_tcp_read.

This review was not closed, yet. Is something expected from me?

I do not have the equipment for a more-than-one-SSL_ERROR_WANT_READ.</pre>

<br />

<p>- Alexander</p>

<br />

<p>On June 20th, 2014, 2:06 p.m. UTC, Alexander Traud wrote:</p>

<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: url('https://reviewboard.asterisk.org/static/rb/images/review_request_box_top_bg.ab6f3b1072c9.png'); background-position: left top; background-repeat: repeat-x; border: 1px black solid;">

 <tr>

  <td>

<div>Review request for Asterisk Developers.</div>

<div>By Alexander Traud.</div>

<p style="color: grey;"><i>Updated June 20, 2014, 2:06 p.m.</i></p>

<div style="margin-top: 1.5em;">

 <b style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Bugs: </b>

 <a href="https://issues.asterisk.org/jira/browse/ASTERISK-18345">ASTERISK-18345</a>

</div>

<div style="margin-top: 1.5em;">

 <b style="color: #575012; font-size: 10pt;">Repository: </b>

Asterisk

</div>

<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>

 <table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">

 <tr>

  <td>

   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">With some large SDP, a *second* poll is required on the first part of a TLS message.

The current code did not poll a second time because the variable need_poll was inited with yes (1). That poll was a no-operation because there was a socket event already (which mandates fgets without poll). In the current code, poll returned immediately, fgets returned NULL, after_poll was yes (1), sip_tls_read returned failed (-1), _sip_tcp_helper_thread went to cleanup, called ast_tcptls_close_session_file, which closed the TLS connection.

The proposed patch, reads the gets the first message. If that failed, it does poll. This fixed all large SDP issues with SIP over TLS which I faced.

I am aware there were changes committed to tcptls.c just recently (revision 415907). Anyway, let us fix this bug as well.</pre>

  </td>

 </tr>

</table>

<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>

<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">

 <tr>

  <td>

   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Asterisk 12.3</pre>

  </td>

 </tr>

</table>

<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>

<ul style="margin-left: 3em; padding-left: 0;">

 <li>trunk/channels/chan_sip.c <span style="color: grey">(416319)</span></li>

</ul>

<p><a href="https://reviewboard.asterisk.org/r/3653/diff/" style="margin-left: 3em;">View Diff</a></p>

  </td>

 </tr>

</table>

  </div>

 </body>

</html>