[Asterisk-Dev] IAX spec: Text formats and character sets

Fri Apr 29 00:59:19 MST 2005

"Olle E. Johansson" <oej at edvina.net> writes:

> Kristian Nielsen wrote:

> > One problem is that strncpy(), which is used extensively in Asterisk,
> > cannot be used on utf8 strings. The reason is that it might truncate the
> > string in the middle of a multibyte character, leaving invalid utf-8
> > data in the destination string.
> 
> Good observation! Thank you. What do we use instead?

Well, it is easy to implement our own strncpy_utf8() that copies only up
to and including the last complete utf-8 character that still fits
within the maximum specified byte length. Then we could also fix it to
always zero-terminate the copy (strncpy() doesn't always zero-terminate
the destination, as I am _sure_ everyone remembers :-).
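
As a minimal sketch of the idea above (not existing Asterisk code): the
function name strncpy_utf8() comes from this mail, but the signature,
the return value and the assumption that the source is already valid
utf-8 are illustrative choices only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, where size is the total size of
 * dst in bytes including the terminating zero. If src does not fit, cut it
 * at a utf-8 character boundary. Assumes src is valid utf-8.
 * Returns the number of bytes copied, not counting the terminating zero. */
static size_t strncpy_utf8(char *dst, const char *src, size_t size)
{
    size_t len, n;

    if (size == 0)
        return 0;

    len = strlen(src);
    n = len < size - 1 ? len : size - 1;

    /* src[n] is the first byte we will NOT copy. If it is a continuation
     * byte (10xxxxxx), we are in the middle of a character, so back up
     * until the cut falls on the first byte of a character. */
    if (n < len) {
        while (n > 0 && ((unsigned char)src[n] & 0xC0) == 0x80)
            n--;
    }

    memcpy(dst, src, n);
    dst[n] = '\0';   /* always zero-terminate, unlike strncpy() */
    return n;
}
```

Unlike strncpy(), this sketch does not pad the rest of the destination
with zeros, and it calls strlen() on the whole source, so it is only
suitable for zero-terminated input.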

But there are other functions with similar problems (snprintf(), ...).
It is a pain having to go through all of Asterisk, examining every
bounded string operation to understand whether it accesses utf-8 data or
not, and fixing it if it does. I just can't think of an easier way right
now :-(
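
For the snprintf() case, one possible approach is a wrapper that lets
vsnprintf() truncate as usual and then drops a trailing incomplete
character. This is only a sketch: snprintf_utf8() is a hypothetical
name, not an existing Asterisk or libc function, and it assumes that the
formatted text is valid utf-8 and that vsnprintf() behaves as in C99.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: like snprintf(), but if the output was truncated,
 * remove a trailing incomplete utf-8 character.
 * Returns the number of bytes stored in dst, not counting the zero. */
static size_t snprintf_utf8(char *dst, size_t size, const char *fmt, ...)
{
    va_list ap;
    int needed;
    size_t n, start, want;
    unsigned char lead;

    if (size == 0)
        return 0;

    va_start(ap, fmt);
    needed = vsnprintf(dst, size, fmt, ap);
    va_end(ap);

    if (needed < 0) {               /* output error */
        dst[0] = '\0';
        return 0;
    }
    if ((size_t)needed < size)      /* everything fit, nothing to fix */
        return (size_t)needed;

    /* Truncated: dst holds size - 1 bytes. Skip back over continuation
     * bytes (10xxxxxx) to find the lead byte of the last character. */
    n = size - 1;
    start = n;
    while (start > 0 && ((unsigned char)dst[start - 1] & 0xC0) == 0x80)
        start--;

    if (start == 0) {
        n = 0;                      /* empty, or only stray continuation bytes */
    } else {
        start--;                    /* index of the lead byte */
        lead = (unsigned char)dst[start];
        if ((lead & 0xE0) == 0xC0)
            want = 2;
        else if ((lead & 0xF0) == 0xE0)
            want = 3;
        else if ((lead & 0xF8) == 0xF0)
            want = 4;
        else
            want = 1;               /* ASCII, or an invalid lead byte */
        if (n - start < want)
            n = start;              /* last character is incomplete: drop it */
    }

    dst[n] = '\0';
    return n;
}
```

Note that the return value differs from snprintf(): it is the length
actually stored, so callers that rely on the "would have written" count
to detect truncation would need a different interface.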

 - Kristian.

-- 
Kristian Nielsen   kn at sifira.dk
Development Manager, Sifira A/S