[asterisk-bugs] [Asterisk 0016299]: [patch] pedantic sip checking needed to generate valid messages (but broken)

Mon Dec 21 14:09:31 CST 2009

A NOTE has been added to this issue. 
====================================================================== 
https://issues.asterisk.org/view.php?id=16299 
====================================================================== 
Reported By:                wdoekes
Assigned To:                dvossel
====================================================================== 
Project:                    Asterisk
Issue ID:                   16299
Category:                   Channels/chan_sip/General
Reproducibility:            always
Severity:                   trivial
Priority:                   normal
Status:                     acknowledged
Target Version:             1.6.1.13
Asterisk Version:           SVN 
JIRA:                       SWP-451 
Regression:                 No 
Reviewboard Link:            
SVN Branch (only for SVN checkouts, not tarball releases):  trunk 
SVN Revision (number only!):  
Request Review:              
====================================================================== 
Date Submitted:             2009-11-21 15:03 CST
Last Modified:              2009-12-21 14:09 CST
====================================================================== 
Summary:                    [patch] pedantic sip checking needed to generate
valid messages (but broken)
Description: 
In function 'initreqprep' in channels/chan_sip.c, the following code can be
found:

if (sip_cfg.pedanticsipchecking) {
  ast_uri_encode(n, tmp_n, sizeof(tmp_n), 0);
  n = tmp_n;
  ast_uri_encode(l, tmp_l, sizeof(tmp_l), 0);
  l = tmp_l;
}
<...snip...>
snprintf(from, sizeof(from), "\"%s\" <sip:%s@%s>;tag=%s", n, l, d,
p->tag);

The function ast_uri_encode encodes chars < 32 and > 127 as %HH hex
escapes. Perhaps one should replace that test with
((signed char)*ptr < 32) ;-)

A couple of problems (all minor):
- ast_uri_encode forgets to escape % and 0x7F (RFC2396 2.4.2 and 2.4.3)
- ast_uri_encode does not escape <, >, @ and some other characters that
'l' would've liked to be escaped
- 'n' is not supposed to be hex-escaped (RFC4475 3.1.1.5 writes """The
display name portion of the To and From header fields is "%Z%45". Note that
this is not the same as %ZE.""")
- 'n' does however like the double-quote to be escaped, by a backslash
- ast_uri_decode is called on entire messages, not on already broken up
parts
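
To illustrate the first two points, here is a minimal sketch of a stricter
encoder. It is not the actual ast_uri_encode; the names uri_encode_strict
and is_unreserved are hypothetical. It assumes the simplest safe policy:
percent-encode every byte outside the RFC2396 "unreserved" set, which
covers %, 0x7F, <, > and @. That escapes more than the SIP user part
strictly requires, but over-escaping is harmless.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* RFC 2396 "unreserved" = alphanum | mark. Everything else gets escaped. */
static int is_unreserved(unsigned char c)
{
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
        (c >= '0' && c <= '9'))
        return 1;
    /* strchr() also matches the terminating NUL, so exclude it explicitly */
    return c != '\0' && strchr("-_.!~*'()", c) != NULL;
}

/* Hypothetical helper: writes a NUL-terminated, percent-encoded copy of
 * 'in' into 'out'. Output is truncated at a whole character or a whole
 * %HH triplet, never in the middle of an escape. */
static char *uri_encode_strict(const char *in, char *out, size_t outlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (outlen == 0)
        return out;

    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;
        if (is_unreserved(c)) {
            if (o + 1 >= outlen)
                break;              /* no room for char + NUL */
            out[o++] = (char)c;
        } else {
            if (o + 3 >= outlen)
                break;              /* no room for %HH + NUL */
            out[o++] = '%';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0F];
        }
    }
    out[o] = '\0';
    return out;
}
```

Working on unsigned char avoids the signed/unsigned comparison question
altogether.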

Browsing through chan_sip.c, I see pedanticsipchecking used in these
cases:
- allow blanks between the header key and the colon
- allow multiline sip headers
- compare the from-tag/to-tag/branches as well instead of only the
call-id
- check that a packet really is for us (handle_incoming)
- encode/decode reserved characters

In my humble opinion, creating valid output (correctly encoding illegal
characters) should not depend on a flag that is reported as being 'slow'.
And, although less relevant to me in this case, decoding valid hex-escapes
from peers does not sound like too much to ask, either.

What to do?
- I can easily write a patch that fixes my minor issue: always -- not
dependent on the pedanticsipchecking -- run a s/"/\\"/g (instead of
ast_uri_encode) on the name part in the From.
- I can also easily fix ast_uri_encode to escape %, 0x7f and the others as
mentioned in RFC2396 2.4.3.
- Fixing all ast_uri_decode calls to operate only after the data has been
broken up is a bit more tedious, so I can't promise I'll do that.

Regards,
Walter Doekes
OSSO B.V.
====================================================================== 

---------------------------------------------------------------------- 
 (0115598) wdoekes (reporter) - 2009-12-21 14:09
 https://issues.asterisk.org/view.php?id=16299#c115598 
---------------------------------------------------------------------- 
dvossel:

I mean that the display name portion of the caller id ('n' being the
variable pointing to that in the particular piece of code) should not be
percent-encoded. Examples of display names:
- ABC => "ABC" (obviously, the double quotes being optional but
recommended) 
- A%C => "A%C" (do not percent-encode the %)
- A"C => "A\"C" (do backslash-escape the ")
- A<C => "A<C" (here, the double quotes are not optional)
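
The four examples above can be produced by a small quoting routine. This
is only a sketch; quote_display_name is a hypothetical name, not an
existing Asterisk function. It always adds the double quotes and, going
slightly beyond the examples, also backslash-escapes the backslash
itself, since the RFC3261 quoted-string grammar requires that. Control
characters such as CR and LF are not filtered here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes in as a SIP quoted-string into out.
 * Only '"' and '\' are backslash-escaped; '%', '<' and friends are
 * copied untouched. Truncates at a whole character if out is too small,
 * but always closes the quote. */
static char *quote_display_name(const char *in, char *out, size_t outlen)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (outlen < 3) {               /* need room for "" plus NUL */
        if (outlen > 0)
            out[0] = '\0';
        return out;
    }

    out[o++] = '"';
    for (; *in; in++) {
        size_t need = strchr("\"\\", *in) ? 2 : 1;
        /* keep two bytes in reserve: closing quote and NUL */
        if (o + need + 2 > outlen)
            break;
        if (need == 2)
            out[o++] = '\\';
        out[o++] = *in;
    }
    out[o++] = '"';
    out[o] = '\0';
    return out;
}
```

In initreqprep this would replace the ast_uri_encode call on 'n' and run
regardless of pedanticsipchecking; the snprintf format would then drop
its own \" around the first %s.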

oej:

Your comments about UTF8 and ASCII do not make much sense to me:
- ASCII is the 7bit subset of UTF8. If the domain part is ASCII, it might
still need to be encoded if characters that delimit the domain in SIP are
allowed in a domain name (the bad example being ">" which is not allowed in
a domain name). Further, I am not sure that domains really need to be
limited to ASCII only (see IDN domain names).
- You do not *have* to handle UTF8 if you do not want to(*). You can pass
it along from the config files and out to the wire without knowing that
you're dealing with UTF8. Only if you're dealing with multiple encodings or
if you have to process character string lengths (as opposed to byte string
lengths) do you *need* to explicitly work with UTF8.

(*)You really should guard against overlong encodings. But as you're not
processing the UTF8, you won't be affected by them. But a piece of code
that trusts your output might choke on it / be exploitable through it. 
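
For reference, a guard against overlong encodings can be quite small. The
sketch below uses a hypothetical function name, utf8_has_overlong, and
only flags overlong 2-, 3- and 4-byte forms. It is not a full UTF8
validator: it does not check truncated sequences, surrogates, or the
obsolete 5- and 6-byte forms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the NUL-terminated string contains
 * an overlong UTF-8 form, 0 otherwise.
 *   - lead bytes 0xC0 and 0xC1 can only encode code points < 0x80
 *   - 0xE0 followed by 0x80..0x9F encodes a code point < 0x800
 *   - 0xF0 followed by 0x80..0x8F encodes a code point < 0x10000 */
static int utf8_has_overlong(const char *str)
{
    const unsigned char *s = (const unsigned char *)str;

    assert(str != NULL);
    for (; *s; s++) {
        /* s[1] is at worst the terminating NUL, so reading it is safe */
        if (*s == 0xC0 || *s == 0xC1)
            return 1;
        if (*s == 0xE0 && s[1] >= 0x80 && s[1] <= 0x9F)
            return 1;
        if (*s == 0xF0 && s[1] >= 0x80 && s[1] <= 0x8F)
            return 1;
    }
    return 0;
}
```

For example, the bytes C0 AF are an overlong encoding of "/", which a
naive downstream path check might miss.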

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2009-12-21 14:09 wdoekes        Note Added: 0115598                          
======================================================================