[asterisk-bugs] [Asterisk 0016299]: [patch] pedantic sip checking needed to generate valid messages (but broken)

Wed Dec 23 03:18:47 CST 2009

A NOTE has been added to this issue. 
====================================================================== 
https://issues.asterisk.org/view.php?id=16299 
====================================================================== 
Reported By:                wdoekes
Assigned To:                dvossel
====================================================================== 
Project:                    Asterisk
Issue ID:                   16299
Category:                   Channels/chan_sip/General
Reproducibility:            always
Severity:                   trivial
Priority:                   normal
Status:                     acknowledged
Target Version:             1.6.1.13
Asterisk Version:           SVN 
JIRA:                       SWP-451 
Regression:                 No 
Reviewboard Link:            
SVN Branch (only for SVN checkouts, not tarball releases):  trunk 
SVN Revision (number only!):  
Request Review:              
====================================================================== 
Date Submitted:             2009-11-21 15:03 CST
Last Modified:              2009-12-23 03:18 CST
====================================================================== 
Summary:                    [patch] pedantic sip checking needed to generate
valid messages (but broken)
Description: 
In function 'initreqprep' in channels/chan_sip.c, the following code can be
found:

if (sip_cfg.pedanticsipchecking) {
  ast_uri_encode(n, tmp_n, sizeof(tmp_n), 0);
  n = tmp_n;
  ast_uri_encode(l, tmp_l, sizeof(tmp_l), 0);
  l = tmp_l;
}
<...snip...>
snprintf(from, sizeof(from), "\"%s\" <sip:%s@%s>;tag=%s", n, l, d,
p->tag);

The function ast_uri_encode encodes chars < 32 and > 127 -- perhaps one
should replace that with ((signed char)*ptr < 32) ;-) -- as %HH hex
escapes.

A couple of problems (all minor):
- ast_uri_encode forgets to escape % and 0x7F (RFC2396 2.4.2 and 2.4.3)
- ast_uri_encode does not escape <, >, @ and some other characters that
'l' would've liked to be escaped
- 'n' is not supposed to be hex-escaped (RFC4475 3.1.1.5 writes """The
display name portion of the To and From header fields is "%Z%45". Note that
this is not the same as %ZE.""")
- 'n' does however like the double-quote to be escaped, by a backslash
- ast_uri_decode is called on entire messages, not on already broken up
parts

Browsing through chan_sip.c, I see pedanticsipchecking used in these
cases:
- allow blanks between the header key and the colon
- allow multiline sip headers
- compare the from-tag/to-tag/branches as well instead of only the
call-id
- check that a packet really is for us (handle_incoming)
- encode/decode reserved characters

In my humble opinion, I don't think creating valid output (correctly
encoding illegal characters) should be enabled only by a flag that is
reported as being 'slow'. And, not as relevant to me in this case, but
decoding valid hex-escapes from peers does not sound like too much to ask,
either.

What to do?
- I can easily write a patch that fixes my minor issue: always -- not
dependent on the pedanticsipchecking -- run a s/"/\\"/g (instead of
ast_uri_encode) on the name part in the From.
- I can also easily fix ast_uri_encode to escape %, 0x7f and the others as
mentioned in RFC2396 2.4.3.
- Fixing all ast_uri_decode to operate first after the data has been
broken up is a bit more tedious, so I can't promise I'll do that.

Regards,
Walter Doekes
OSSO B.V.
====================================================================== 

---------------------------------------------------------------------- 
 (0115712) Nick_Lewis (reporter) - 2009-12-23 03:18
 https://issues.asterisk.org/view.php?id=16299#c115712 
---------------------------------------------------------------------- 
Thanks for the info on IDNA

I think that the most internationalization friendly character set that can
practically be supported should be used internally for caller ID names. The
chan_techs should be responsible if necessary for converting them to the
character set supported by the protocols for the techs. I guess that the
conversion could be simply to replace unrepresentable characters with a
"?". Since the caller id name may be used in the dialplan I guess that
extensions.conf would need to support the full character set or at least
have some way to represent all characters perhaps using % escaping 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2009-12-23 03:18 Nick_Lewis     Note Added: 0115712                          
======================================================================