[Asterisk-Dev] IAX spec: Text formats and character sets
Steve Underwood
steveu at coppice.org
Sat Apr 30 03:05:01 MST 2005
Michael Giagnocavo wrote:
>>UCS-2 is totally brain dead. That is a non-starter. UCS-4 is too bulky.
>>Only UTF-8 makes any sense, and it's ASCII compatible. Using UTF-8 is a
>>no-brainer.
>>
>>
>
>Well, then, is it agreed? :). And I don't see why UCS-2 is "brain dead",
>except for backwards compatibility (which is a good enough point, and what I
>pointed out in my original post). If you do a lot of work in non-ASCII,
>UTF-8 doesn't provide an advantage in space. As far as the higher Unicode
>codepoints (above the 16-bit ones), I'm not aware of any systems actually
>using them, so UCS-4 doesn't help things anyways.
>
>
Unicode extends way beyond 16 bits these days. When they dumped the last
48,000 Hanzi into Unicode, UCS-2 became obsolete. It is really UTF-16
these days, and the term UCS-2 is fading out. That means UTF-16 has all
the complexity of UTF-8 (variable-length sequences), and none of the
benefits. Microsoft now seems stuck with this stupidity, due to their
usual lack of foresight. SMS and some other phone-related stuff also use
it. Fortunately, few other things do.
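
To illustrate the point about UTF-16 being variable-length, here is a minimal
Python sketch (not from the original post). It takes U+20000, a Hanzi from the
supplementary planes, and shows that UTF-16 needs two 16-bit code units (a
surrogate pair) for it. The helper name utf16_code_units is made up for this
example.

```python
def utf16_code_units(text):
    """Return the 16-bit code units of text when encoded as big-endian UTF-16."""
    raw = text.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


if __name__ == "__main__":
    hanzi = "\U00020000"  # a CJK ideograph beyond the 16-bit range
    units = utf16_code_units(hanzi)
    # One character, but two code units: a high and a low surrogate.
    print([hex(u) for u in units])      # ['0xd840', '0xdc00']
    print(len(hanzi.encode("utf-8")))   # 4 bytes in UTF-8
```

Code that assumes one 16-bit unit per character, as UCS-2 did, would split
this character in half.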
For Asian users, UTF-8 basically makes all files 50% bigger than they
used to be. However, that is just the way things are and everyone is
slowly learning to live with it.
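
The 50% figure can be checked with a short Python sketch, assuming "used to
be" refers to a legacy double-byte encoding such as Big5 (that choice is an
assumption, as is the helper name encoded_sizes). Common Hanzi take 2 bytes
each in Big5 or UTF-16, and 3 bytes each in UTF-8.

```python
def encoded_sizes(text, encodings=("big5", "utf-16-be", "utf-8")):
    """Return a dict mapping each encoding name to the encoded size in bytes."""
    return {name: len(text.encode(name)) for name in encodings}


if __name__ == "__main__":
    sample = "\u4e2d\u6587"  # two common Hanzi ("Chinese language")
    sizes = encoded_sizes(sample)
    print(sizes)  # {'big5': 4, 'utf-16-be': 4, 'utf-8': 6}
    # Growth when moving from the legacy encoding to UTF-8.
    print(sizes["utf-8"] / sizes["big5"])  # 1.5
```

Real files that mix in ASCII (markup, digits, spaces) will grow by less than
this, since ASCII stays at one byte in both Big5 and UTF-8.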
Regards,
Steve