[asterisk-dev] Unicode in Text frames - how to fix?

Mon Apr 30 02:34:42 MST 2007

On 30 Apr 2007, at 11:28, Tim Panton wrote:

> Asterisk's handling of text frames does not support Unicode.
>
> We discovered this by accident: our Java IAX stack sends IAX text frames
> in Unicode (ASCII is deprecated in Java) without a terminating '\0' byte.
> The IAX draft RFC (link) says that text frames should be in Unicode.
> Asterisk, however, requires (but doesn't test for) a '\0' byte as the
> traditional C end-of-string marker, and determines the length of the
> text string with strlen(data).
>
> Although we found it in the case of IAX text frames, it looks like
> this is a general problem.
>
> At first glance it looks easy to fix: just add a length attribute to
> the text frame. However, this would change the channel API, so it isn't
> to be done lightly.
>
> Other options would be:
> 	1) Change the IAX RFC to state that text frames are null-terminated
> 	   ASCII and reject any packets that aren't (i.e. drop Unicode).
> 	2) Carry the Unicode by encoding it in some way (as in HTML) and
> 	   mandate this.
> 	3) ??? ideas ????
>
As Mark said, the IAX draft is the specification. Now it's up to the
developer community to make sure that the Asterisk implementation
supports the draft, which it currently does not.
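
To make the length problem concrete, here is a minimal C sketch of
treating a text payload as a byte buffer with an explicit length instead
of calling strlen() on it. This is not Asterisk's actual code: the
function name copy_text_payload() and its parameters are hypothetical,
and it assumes the receiver already knows the payload length from the
frame it parsed.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Asterisk: copy a text payload of known
 * length into a caller-supplied buffer and append the terminating '\0'
 * that C string functions expect. The sender is not required to send one.
 *
 * Returns the number of payload bytes copied, or -1 if an argument is
 * invalid or the payload plus terminator does not fit in the buffer.
 */
static long copy_text_payload(const unsigned char *data, size_t datalen,
                              char *out, size_t outsize)
{
    if (data == NULL || out == NULL || outsize == 0)
        return -1;
    if (datalen >= outsize)      /* no room left for the terminator */
        return -1;

    memcpy(out, data, datalen);  /* length comes from the frame, not strlen() */
    out[datalen] = '\0';
    return (long)datalen;
}
```

With something along these lines the payload bytes are never read past
datalen, so a sender that omits the '\0' no longer causes an overrun, and
multi-byte UTF-8 sequences pass through unchanged.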

We really need to take a deeper look into character sets, both for
text frames and for Caller ID names. SIP Caller ID names (display names)
are also UTF-8, like the IAX protocol. The ZAP Caller ID names are not,
so we will need transcoding between these character sets.
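
As a rough illustration of what such transcoding involves, here is a
small C sketch that converts a single-byte string to UTF-8. The thread
does not say which character set the ZAP Caller ID names use, so
ISO-8859-1 is purely an assumption here, and latin1_to_utf8() is a
hypothetical name, not an existing Asterisk function.

```c
#include <stddef.h>

/*
 * Hypothetical helper: transcode inlen bytes of ISO-8859-1 (an assumption
 * made for this sketch) into NUL-terminated UTF-8.
 *
 * Returns the number of UTF-8 bytes written, not counting the terminator,
 * or -1 if an argument is invalid or the output buffer is too small.
 */
static long latin1_to_utf8(const unsigned char *in, size_t inlen,
                           char *out, size_t outsize)
{
    size_t i;
    size_t o = 0;

    if (in == NULL || out == NULL || outsize == 0)
        return -1;

    for (i = 0; i < inlen; i++) {
        unsigned char c = in[i];

        if (c < 0x80) {
            /* ASCII range: one byte, unchanged */
            if (o + 1 >= outsize)
                return -1;
            out[o++] = (char)c;
        } else {
            /* 0x80..0xFF: code point equals byte value, two UTF-8 bytes */
            if (o + 2 >= outsize)
                return -1;
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }

    out[o] = '\0';
    return (long)o;
}
```

The reverse direction is lossy, since most Unicode characters have no
single-byte equivalent, so a real implementation would also need a policy
for characters that cannot be represented.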

/O