[Asterisk-Dev] IAX spec: Text formats and character sets
Steve Underwood
steveu at coppice.org
Sat Apr 30 03:07:16 MST 2005
Michael Giagnocavo wrote:
>>Michael Giagnocavo wrote:
>>
>>
>>>Hmm, you're right. That doesn't look bad at all.
>>>
>>>But... what about for comparisons and other Unicode operations? Do the
>>>libraries available support some UTF-8 version of strcmp, strchr,
>>>strcasecmp, etc.?
>>>
>>>
>>>
>>Some of them are easy (strcmp, for example). Most of them are harder,
>>because they either need to know character boundaries, or need case
>>mappings (strcasecmp, for example). Any function that searches for a
>>'char' in a string also won't work if the character being searched for
>>is a multi-byte one.
>>
>>
>
>Not even strcmp works, because Unicode lets you represent the same
>character with different sequences of code points, and the results are
>still considered equal. Say, a Latin o with an accent mark, which can be a
>single precomposed code point or an o followed by a combining accent. Using
>wide char internally solves these issues, and is most likely faster,
>depending on the data.
>
>
Too right. Look at IBM's internationalisation classes for Unicode (ICU). It
takes megabytes of code to compare two strings properly.
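
As a rough illustration of the accented-o case above (a sketch only, not
anything from the IAX spec or from Asterisk), here is the problem in Python
using the standard unicodedata module. The helper names bytewise_equal and
canonical_equal are made up for this example, and NFC is just one reasonable
choice of normalization form.

```python
import unicodedata

# Two spellings of the same visible character, "o with acute accent".
precomposed = "\u00f3"   # single code point U+00F3
decomposed = "o\u0301"   # "o" followed by U+0301 COMBINING ACUTE ACCENT


def bytewise_equal(a: str, b: str) -> bool:
    """Hypothetical helper: roughly what strcmp() == 0 would see on UTF-8 data."""
    return a.encode("utf-8") == b.encode("utf-8")


def canonical_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare after normalizing both sides to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(bytewise_equal(precomposed, decomposed))
    print(canonical_equal(precomposed, decomposed))
    # The precomposed form is more than one byte in UTF-8, so a search
    # that takes a single 'char' (strchr style) cannot look for it.
    print(len(precomposed.encode("utf-8")))
```

Normalization only covers canonical equivalence. Case-insensitive and
locale-aware comparison need case mappings and collation tables on top of
that, which is where the bulk of a library like IBM's goes.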
Regards,
Steve