[Asterisk-Dev] IAX spec: Text formats and character sets

Olle E. Johansson oej at edvina.net
Sat Apr 30 02:22:12 MST 2005


Kevin P. Fleming wrote:
> Kristian Nielsen wrote:
> 
>> Well, it is easy to implement our own strncpy_utf8() that copies only up
>> to and including the last utf-8 character not going over the maximum
>> specified byte length. Then we could also fix it to actually
>> zero-terminate the copy (strncpy() doesn't always zero-terminate the
>> destination as I am _sure_ everyone remembers :-).
> 
> 
> I think 'easy' is an overstatement here. Any function that does this 
> needs to understand the _entire_ UTF-8 space to know which characters 
> are multibyte, and how many bytes they take up. This is not trivial, 
> although it's also not very complicated... just some tables and keeping 
> track of where you are so you can backtrack if needed.
> 
> The bigger issue is the performance hit this function will cause... if 
> we do it at all, it will have to be compile-time selectable as to 
> whether it uses raw strncpy() or utf8strncpy().
That is no good. If you actually read my proposal, that's why I am 
separating the strings, so that we have one fast ASCII dial string 
(extension) and one UTF-8 string (alphaextension), and the same dual 
pair for caller IDs and caller ID names. That way, we will always know 
when we have UTF-8 and when we have "plain" strings (if those even exist).

/O


