[Asterisk-Dev] IAX spec: Text formats and character sets
Scott Laird
scott at sigkill.org
Fri Apr 29 16:32:35 MST 2005
On Apr 29, 2005, at 2:37 PM, Michael Giagnocavo wrote:
>>> Yea, I guess agreed on UTF-8 was too strong a phrase. People have said
>>> ISO8859-1 is still in use by certain systems. I don't think anyone is
>>> arguing that IAX shouldn't use Unicode.
>>>
>>> I'm guessing that the main issue would be to use UTF-8 or UCS-2. It
>>> really depends how much text is going to be non-ASCII. If you think
>>> that Asian language users will be big consumers of IAX, then UTF-8
>>> would be more of a burden (processing as well as size).
>>>
>> UCS-2 is totally brain dead. That is a non-starter. UCS-4 is too bulky.
>> Only UTF-8 makes any sense, and it's ASCII compatible. Using UTF-8 is a
>> no-brainer.
>>
>
> Well, then, is it agreed? :). And I don't see why UCS-2 is "brain dead",
> except for backwards compatibility (which is a good enough point, and
> what I pointed out in my original post). If you do a lot of work in
> non-ASCII, UTF-8 doesn't provide an advantage in space. As far as the
> higher Unicode codepoints (above the 16-bit ones), I'm not aware of any
> systems actually using them, so UCS-4 doesn't help things anyways.
UCS-2 is brain-dead because it isn't big enough to hold all of the
Unicode characters currently in use. So you'd need to use UTF-16
instead of UCS-2, but then you're back to multiple-length character
encodings; at that point you might as well just drop back to UTF-8.
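
To make that concrete, here is a rough Python sketch (not anything from the
IAX spec; the helper names fits_in_ucs2, utf16_units and encoded_sizes are
made up for illustration). It checks whether a character fits in a single
16-bit unit and compares the encoded size in each encoding:

```python
def fits_in_ucs2(ch: str) -> bool:
    """True if the code point fits in one 16-bit unit (i.e. is in the BMP).

    Hypothetical helper; it does not special-case the surrogate range.
    """
    return ord(ch) <= 0xFFFF


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units UTF-16 needs for this one character."""
    # Encode without a BOM so only the character itself is counted.
    return len(ch.encode("utf-16-be")) // 2


def encoded_sizes(text: str) -> dict:
    """Byte length of the text in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-be")),
        "utf-32": len(text.encode("utf-32-be")),
    }


if __name__ == "__main__":
    samples = {
        "ASCII A": "A",
        "Latin e-acute (U+00E9)": "\u00e9",
        "CJK (U+4E2D)": "\u4e2d",
        "musical G clef (U+1D11E)": "\U0001d11e",
    }
    for label, ch in samples.items():
        print(label, fits_in_ucs2(ch), utf16_units(ch), encoded_sizes(ch))
```

The G clef is above U+FFFF, so it cannot be stored in UCS-2 at all and takes
two 16-bit units in UTF-16. The CJK sample is where UTF-16 wins on size
(2 bytes against 3 in UTF-8), which is the space argument made earlier in
the thread.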
Scott