[asterisk-dev] MySQL Realtime charset

Philipp Kempgen philipp.kempgen at amooma.de
Sun May 13 07:35:39 MST 2007


Philipp Kempgen wrote:

> Tilghman Lesher wrote:
> 
>> On Sunday 13 May 2007, Philipp Kempgen wrote:
>>> I couldn't find any code in res_config_mysql.c which cares about
>>> setting the right charset / collation for the MySQL connection.
>>>
>>> Shouldn't there be some queries like
>>> SET NAMES latin1
>>> SET collation_connection=latin1_general_ci
>>> or
>>> SET NAMES utf8
>>> SET collation_connection=utf8_general_ci
>>> in mysql_reconnect() right before the return 1; (there might be
>>> native functions to do that) ?
>>>
>>> Please correct me if I'm wrong.
>> I believe those are set per-table at table creation time.
> 
> Correct. The DB admin sets those per server, per table and per
> column. Those settings tell MySQL how to store the character
> data you put into a field (I'm simplifying things a bit).
> But that's not the whole story.
> 
>> In any
>> case, trying to assume what everybody would want is a recipe for
>> getting it wrong.

BTW: *Every* connection to MySQL has a charset and a collation.
If you don't set them explicitly, you cannot be sure they are the
ones you expect.
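
To make the suggestion concrete, here is a minimal sketch of how the
statement could be built from a configured charset name. The helper
build_set_names() is hypothetical and not part of res_config_mysql.c;
it only assembles the query text and rejects names containing anything
other than letters, digits and underscores. Sending it (for example via
mysql_real_query() after a successful connect) is left out. The MySQL
C API also offers mysql_set_character_set(), which may well be the
better choice where it is available.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (an assumption, not existing Asterisk code):
 * writes "SET NAMES <charset>" into buf.
 * Returns 0 on success, -1 if the charset name is empty, contains
 * characters other than [A-Za-z0-9_], or the result does not fit. */
static int build_set_names(char *buf, size_t len, const char *charset)
{
	static const char prefix[] = "SET NAMES ";
	const char *p;

	if (!buf || !charset || !*charset)
		return -1;

	/* Accept only plain identifier characters so that nothing else
	 * can be smuggled into the statement. */
	for (p = charset; *p; p++) {
		if (!isalnum((unsigned char) *p) && *p != '_')
			return -1;
	}

	/* sizeof(prefix) already counts the terminating NUL byte. */
	if (strlen(charset) + sizeof(prefix) > len)
		return -1;

	snprintf(buf, len, "%s%s", prefix, charset);
	return 0;
}
```

Called with "utf8" this yields the string "SET NAMES utf8"; a name such
as "utf8; DROP" is rejected. Which charset to pass would have to come
from a config option, since that is exactly the thing that cannot be
guessed.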

> I think that's not true. Asterisk needs to tell MySQL the charset
> in which *Asterisk* expects strings to be returned and in which it
> will send strings. (MySQL will do the conversion automatically.)
> 
> http://dev.mysql.com/doc/refman/5.1/en/charset-connection.html
> 
> Otherwise MySQL might return UTF-8 data when Asterisk assumes it
> is ISO-8859-1 or the other way round.

Let's assume the database admin chose to define the callerid
column as ISO-8859-1 but Asterisk expects the character data
returned to be UTF-8. Asterisk needs to tell the MySQL client
lib that it wants the string to be converted to UTF-8 no matter
which encoding MySQL uses internally to store the string.
If the charset is not set, the encoding of the returned string
depends on the default connection charset, which might happen to
be latin1 / sjis / whatever. Thus the string might contain bytes
above 0x7F which do not form valid UTF-8 sequences.
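
To illustrate the last point, here is a small sketch of a structural
UTF-8 check. The function name looks_like_utf8() is made up for this
example and it is deliberately simplified: it verifies lead bytes and
continuation bytes but does not reject every overlong form or
surrogate code points.

```c
#include <stddef.h>

/* Returns 1 if the len bytes at s are structurally well-formed UTF-8,
 * otherwise 0. Simplified check: overlong 3/4-byte forms and
 * surrogates are not rejected. */
static int looks_like_utf8(const unsigned char *s, size_t len)
{
	size_t i = 0;

	while (i < len) {
		unsigned char c = s[i];
		size_t extra;

		if (c < 0x80)
			extra = 0;	/* plain ASCII */
		else if (c >= 0xC2 && c <= 0xDF)
			extra = 1;	/* lead byte of a 2-byte sequence */
		else if (c >= 0xE0 && c <= 0xEF)
			extra = 2;	/* lead byte of a 3-byte sequence */
		else if (c >= 0xF0 && c <= 0xF4)
			extra = 3;	/* lead byte of a 4-byte sequence */
		else
			return 0;	/* stray continuation or invalid lead byte */

		if (extra > len - i - 1)
			return 0;	/* sequence is cut off */
		i++;

		/* Every following byte must look like 10xxxxxx. */
		while (extra--) {
			if ((s[i] & 0xC0) != 0x80)
				return 0;
			i++;
		}
	}
	return 1;
}
```

The letter "ü" is the single byte 0xFC in ISO-8859-1, which this check
rejects, whereas its UTF-8 encoding 0xC3 0xBC passes. So a latin1
string handed to code that expects UTF-8 is not merely displayed
wrongly, it is malformed input.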

Regards,
  Philipp

-- 
amooma GmbH - Bachstr. 126 - 56566 Neuwied - http://www.amooma.de
     Let's use IT to solve problems and not to create new ones.
           Asterisk? -> http://www.das-asterisk-buch.de

Managing director: Stefan Wintermeyer
Commercial register: Neuwied B 14998

