[Asterisk-Dev] IAX spec: Text formats and character sets

Olle E. Johansson oej at edvina.net
Thu Apr 28 09:02:22 MST 2005


Steve Underwood wrote:
> Olle E. Johansson wrote:
> 
>> Steve Underwood wrote:
>>
>>> Hi,
>>>
>>> I raised this with Mark ages ago, when I started putting Chinese into 
>>> IAX2 messages. I thought it should be specified that all text is 
>>> Unicode in UTF-8 form, but he seemed pretty indifferent to specifying 
>>> anything.
>>>
>>> There is no need to have ASCII + UTF-8. ASCII is a subset of UTF-8, 
>>> so they are fully compatible. It's only when you have 8-bit sets, like 
>>> the PC ones, that compatibility is an issue. Just define that all 
>>> strings in IAX2 are UTF-8, and that is the end of it.
>>>
>> ...yes, I'll admit that is an easy way out. But we still need to handle
>> conversion to ISO-8859-1 caller IDs, find a way to do pattern 
>> matching, and work out how to use "." and "@" in IAX to call SIP URIs. 
>> There are many things to consider. (The @ in an IAX2 dialstring 
>> separates extension from context...)
> 
> 
> Caller IDs are normally ASCII, not ISO8859-1. The other characters are 
> no problem. UTF-8 will pass them through without trouble. UTF-8 is 
> highly compatible with ASCII. URIs are kind of nasty, as they were not 
> internationalised from day one.
No - "Ö" in ISO-8859-1 is a single byte (0xD6), but in UTF-8 it is a 
two-byte sequence (0xC3 0x96). So ASCII passes through unchanged, but 
nothing else does. And we still have the function in Asterisk that 
strips characters from dial strings that are essential in SIP, as well 
as the use of the @ character as a separator.
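
To make the byte-level difference concrete, here is a minimal sketch in Python (not Asterisk's C code). The helper names latin1_to_utf8 and is_valid_utf8 are made up for this illustration and do not exist in Asterisk or IAX2.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper)."""
    return raw.decode("iso-8859-1").encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


latin1_name = "Ö".encode("iso-8859-1")   # b'\xd6', one byte
utf8_name = "Ö".encode("utf-8")          # b'\xc3\x96', two bytes

# Pure ASCII is byte-identical in both encodings.
assert "1234".encode("ascii") == "1234".encode("utf-8")

# The ISO-8859-1 byte on its own is not valid UTF-8, so it has to be converted.
assert not is_valid_utf8(latin1_name)
assert latin1_to_utf8(latin1_name) == utf8_name
```

So a receiver that assumes UTF-8 cannot simply pass an ISO-8859-1 name through; one side has to know the source encoding and convert.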

I believe a lot of people on IRC confirmed that CID names are 
really ISO-8859-1, but that may apply outside the US. We don't have 
them here in Sweden, so I don't really know.

So, even if we change all strings to UTF-8, we still have to change 
extension handling quite a lot to have a transparent solution.
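
The "@" problem can be sketched the same way. This is a deliberately naive Python illustration, assuming only the "extension@context" convention mentioned above; split_dialstring is a hypothetical name, not Asterisk's actual parser.

```python
def split_dialstring(dialstring: str) -> tuple:
    """Naively split 'extension@context' at the first '@' (hypothetical)."""
    exten, _, context = dialstring.partition("@")
    return exten, context


# The intended use: extension 100 in context "default".
assert split_dialstring("100@default") == ("100", "default")

# A SIP-style address is split the same way, so the domain part
# ends up being treated as a context name.
assert split_dialstring("user@example.com") == ("user", "example.com")
```

Whatever encoding the strings use, a separator that is also a legal character in the thing being dialed needs either escaping or a different syntax.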

/O



