[asterisk-bugs] [Asterisk 0016299]: [patch] pedantic sip checking needed to generate valid messages (but broken)
Asterisk Bug Tracker
noreply at bugs.digium.com
Wed Dec 9 05:29:13 CST 2009
A NOTE has been added to this issue.
======================================================================
https://issues.asterisk.org/view.php?id=16299
======================================================================
Reported By: wdoekes
Assigned To: dvossel
======================================================================
Project: Asterisk
Issue ID: 16299
Category: Channels/chan_sip/General
Reproducibility: always
Severity: trivial
Priority: normal
Status: acknowledged
Target Version: 1.6.1.13
Asterisk Version: SVN
JIRA: SWP-451
Regression: No
Reviewboard Link:
SVN Branch (only for SVN checkouts, not tarball releases): trunk
SVN Revision (number only!):
Request Review:
======================================================================
Date Submitted: 2009-11-21 15:03 CST
Last Modified: 2009-12-09 05:29 CST
======================================================================
Summary: [patch] pedantic sip checking needed to generate
valid messages (but broken)
Description:
In function 'initreqprep' in channels/chan_sip.c, the following code can be
found:
if (sip_cfg.pedanticsipchecking) {
ast_uri_encode(n, tmp_n, sizeof(tmp_n), 0);
n = tmp_n;
ast_uri_encode(l, tmp_l, sizeof(tmp_l), 0);
l = tmp_l;
}
<...snip...>
snprintf(from, sizeof(from), "\"%s\" <sip:%s@%s>;tag=%s", n, l, d,
p->tag);
The function ast_uri_encode encodes chars < 32 and > 127 -- perhaps one
should replace that with ((signed char)*ptr < 32) ;-) -- as %HH hex
escapes.
A couple of problems (all minor):
- ast_uri_encode forgets to escape % and 0x7F (RFC2396 2.4.2 and 2.4.3)
- ast_uri_encode does not escape <, >, @ and some other characters that
'l' would've liked to be escaped
- 'n' is not supposed to be hex-escaped (RFC4475 3.1.1.5 writes """The
display name portion of the To and From header fields is "%Z%45". Note that
this is not the same as %ZE.""")
- 'n' does however like the double-quote to be escaped, by a backslash
- ast_uri_decode is called on entire messages, not on already broken up
parts
Browsing through chan_sip.c, I see pedanticsipchecking used in these
cases:
- allow blanks between the header key and the colon
- allow multiline sip headers
- compare the from-tag/to-tag/branches as well instead of only the
call-id
- check that a packet really is for us (handle_incoming)
- encode/decode reserved characters
In my humble opinion, I don't think creating valid output (correctly
encoding illegal characters) should be enabled only by a flag that is
reported as being 'slow'. And, not as relevant to me in this case, but
decoding valid hex-escapes from peers does not sound like too much to ask,
either.
What to do?
- I can easily write a patch that fixes my minor issue: always -- not
dependent on the pedanticsipchecking -- run a s/"/\\"/g (instead of
ast_uri_encode) on the name part in the From.
- I can also easily fix ast_uri_encode to escape %, 0x7f and the others as
mentioned in RFC2396 2.4.3.
- Fixing all ast_uri_decode to operate first after the data has been
broken up is a bit more tedious, so I can't promise I'll do that.
Regards,
Walter Doekes
OSSO B.V.
======================================================================
----------------------------------------------------------------------
(0114971) Nick_Lewis (reporter) - 2009-12-09 05:29
https://issues.asterisk.org/view.php?id=16299#c114971
----------------------------------------------------------------------
May I suggest that the basic char building blocks of the ABNF are built in
to chan_sip.c as bitmapped defined constants so that comparison can be done
quickly and efficiently to determine whether escaping is required. The
outgoing char would be compared with the defined constants by shifting 0x01
by value of the char.
(If bitmaps have to be kept to 32 bits then individual bitmaps can be made
for ascii ranges 32-63, 64-95, and 96-127. The lower 6 bits of the outgoing
char can be used as the shift index and the upper 2 bits of the outgoing
char can be used as the table index.)
I think the basic ABNF char building blocks are:
* ALPHA,
* DIGIT,
* HEX,
* LHEX,
* alphanum,
* reserved,
* mark,
* separators,
* token-unreserved, where token = 1*( alphanum / token-unreserved )
* word-unreserved, where word = 1*( alphanum / word-unreserved )
* user-unreserved,
* password-unreserved, where password = *( unreserved / escaped /
password-unreserved )
* param-unreserved,
* hnv-unreserved,
* uric-no-slash-unreserved, where uric-no-slash = unreserved / escaped /
uric-no-slash-unreserved
* pchar-unreserved, where pchar = unreserved / escaped / pchar-unreserved
* scheme-unreserved, where scheme = ALPHA *( ALPHA / DIGIT /
scheme-unreserved )
* reg-name-unreserved, where reg-name = 1*( unreserved / escaped /
reg-name-unreserved )
For example the defined constant SCHEME_UNRESERVED = 114349209288704
(or SCHEME_UNRESERVED[1] = 26624 if bitmaps are limited to 32 bits)
The basic ABNF char building blocks can be combined by bitwise ORing the
defined constants.
For example user-char where user = 1*( user-char ) can be expressed as:
user-char = ( unreserved / escaped / user-unreserved ) and the pseudo
code to determine whether escaping is needed could be:
while (user_char) {
if ((1 << user_char) | UNRESERVED | ESCAPED | USER_UNRESERVED) {
asis_char(user_char, &user_output);
} else {
escape_char(user_char, &user_output);
}
userchar++
}
Issue History
Date Modified Username Field Change
======================================================================
2009-12-09 05:29 Nick_Lewis Note Added: 0114971
======================================================================
More information about the asterisk-bugs
mailing list