[Asterisk-Dev] IAX spec: Text formats and character sets

Steve Underwood steveu at coppice.org
Sat Apr 30 03:05:01 MST 2005


Michael Giagnocavo wrote:

>>UCS-2 is totally brain dead. That is a non-starter. UCS-4 is too bulky. 
>>Only UTF-8 makes any sense, and it's ASCII compatible. Using UTF-8 is a 
>>no-brainer.
>
>Well, then, is it agreed? :). And I don't see why UCS-2 is "brain dead",
>except for backwards compatibility (which is a good enough point, and what I
>pointed out in my original post). If you do a lot of work in non-ASCII,
>UTF-8 doesn't provide an advantage in space. As for the higher Unicode
>code points (above the 16-bit ones), I'm not aware of any systems actually
>using them, so UCS-4 doesn't help things anyway.
Unicode extends way beyond 16 bits these days. When they dumped the last 
48,000 Hanzi into Unicode, UCS-2 became obsolete. What people actually use 
now is UTF-16, and the term UCS-2 is fading out. UTF-16 has to encode 
anything above 16 bits as a surrogate pair, so it is a variable-length 
encoding: all the complexity of UTF-8, and none of the benefits (no ASCII 
compatibility, and byte order matters). Microsoft now seems stuck with 
this stupidity, due to their usual lack of foresight. SMS and some other 
phone-related stuff also use it. Fortunately, few other things do.
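
To make the surrogate pair point concrete, here is a minimal Python sketch. 
The helper names utf16_units and is_surrogate are made up for this 
illustration and are not part of IAX or Asterisk. It prints the 16-bit units 
that UTF-16 needs for one character inside the 16-bit range and one outside 
it.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of 16-bit integers."""
    data = text.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_surrogate(unit):
    """True if a 16-bit unit is half of a surrogate pair (U+D800..U+DFFF)."""
    return 0xD800 <= unit <= 0xDFFF


# U+4E2D fits in 16 bits; U+20000 (first code point of CJK Extension B) does not.
for ch in ("\u4e2d", "\U00020000"):
    units = utf16_units(ch)
    print(f"U+{ord(ch):04X}:",
          [f"{u:04X}" for u in units],
          "UTF-8 bytes:", len(ch.encode("utf-8")))
```

U+4E2D comes out as the single unit 4E2D, while U+20000 comes out as the 
pair D840 DC00, so code that assumes one 16-bit unit per character breaks 
on it.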

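For Asian users, UTF-8 basically makes text files about 50% bigger than 
they used to be: most CJK characters take three bytes in UTF-8, against 
two in the legacy double-byte encodings (Big5, GB2312, Shift-JIS and 
friends). However, that is just the way things are, and everyone is 
slowly learning to live with it.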
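
The size difference is easy to check. A rough Python sketch follows, 
assuming Big5 as the legacy encoding and a two-character sample string. Both 
are picked only for illustration, and encoded_sizes is a hypothetical helper 
name.

```python
def encoded_sizes(text, encodings=("big5", "utf-8", "utf-16-be")):
    """Map each encoding name to the number of bytes text occupies in it."""
    return {enc: len(text.encode(enc)) for enc in encodings}


# Two common Hanzi (U+4E2D, U+6587), written as escapes to keep this ASCII.
cjk = "\u4e2d\u6587"
ascii_text = "hello"

print(encoded_sizes(cjk))         # Big5 and UTF-16: 2 bytes/char, UTF-8: 3
print(encoded_sizes(ascii_text))  # Big5 and UTF-8: 1 byte/char, UTF-16: 2
```

For the Hanzi sample UTF-8 needs 6 bytes against 4 in Big5, which is the 
50% growth mentioned above. For plain ASCII, UTF-8 is the same size as the 
legacy encoding and UTF-16 is twice as big.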

Regards,
Steve



