[Asterisk-Dev] IAX spec: Text formats and character sets

Fri Apr 29 16:32:35 MST 2005

On Apr 29, 2005, at 2:37 PM, Michael Giagnocavo wrote:

>>> Yeah, I guess "agreed on UTF-8" was too strong a phrase. People have
>>> said ISO 8859-1 is still in use by certain systems. I don't think
>>> anyone is arguing that IAX shouldn't use Unicode.
>>>
>>> I'm guessing that the main issue would be whether to use UTF-8 or
>>> UCS-2. It really depends on how much text is going to be non-ASCII.
>>> If you think that Asian-language users will be big consumers of IAX,
>>> then UTF-8 would be more of a burden (processing as well as size).
>>>
>> UCS-2 is totally brain dead. That is a non-starter. UCS-4 is too
>> bulky. Only UTF-8 makes any sense, and it's ASCII compatible. Using
>> UTF-8 is a no-brainer.
>>
>
> Well, then, is it agreed? :) And I don't see why UCS-2 is "brain
> dead", except for backwards compatibility (which is a good enough
> point, and what I pointed out in my original post). If you do a lot
> of work in non-ASCII, UTF-8 doesn't provide an advantage in space. As
> for the higher Unicode codepoints (above the 16-bit ones), I'm not
> aware of any systems actually using them, so UCS-4 doesn't help
> things anyway.

UCS-2 is brain-dead because it is a fixed 16-bit encoding: it can only
represent code points up to U+FFFF, so it isn't big enough to hold all
of the Unicode characters currently in use. So you'd need to use UTF-16
instead of UCS-2, but then you're back to a variable-length character
encoding; at that point you might as well just drop back to UTF-8.
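
To make the size and range argument concrete, here is a rough Python
sketch that compares how many bytes a few sample characters take in
each encoding, and whether they fit in a single 16-bit unit. The helper
names (encoded_sizes, fits_in_ucs2) and the sample characters are made
up for illustration; nothing here comes from the IAX spec itself.

```python
# Illustrative sketch only: helper names and sample characters are
# assumptions chosen for this example, not part of any IAX code.

def encoded_sizes(text):
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The "-be" variants are used so no byte-order mark is counted.
        "utf-16": len(text.encode("utf-16-be")),
        "utf-32": len(text.encode("utf-32-be")),
    }


def fits_in_ucs2(text):
    """True if every code point is at most U+FFFF (one 16-bit unit).

    Simplification: this does not reject the surrogate range
    U+D800..U+DFFF, which is reserved and not valid as characters.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


samples = {
    "ASCII letter": "A",                  # U+0041
    "Latin-1 letter": "\u00e9",           # U+00E9
    "CJK ideograph (BMP)": "\u4e2d",      # U+4E2D
    "CJK Extension B": "\U00020000",      # U+20000, outside the BMP
}

for label, ch in samples.items():
    print(label, encoded_sizes(ch), "fits in UCS-2:", fits_in_ucs2(ch))

# A code point above U+FFFF becomes a surrogate pair in UTF-16,
# i.e. two 16-bit units, which is why UTF-16 is variable-length.
print("\U00020000".encode("utf-16-be").hex())
```

For BMP-only CJK text UTF-16 is indeed smaller (2 bytes against 3 per
character), which is the space point made in the quoted message. Once
a character falls outside the BMP, though, UTF-16 and UTF-8 both need
4 bytes, and plain UCS-2 cannot represent it at all.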

Scott