[asterisk-dev] MySQL Realtime charset
Philipp Kempgen
philipp.kempgen at amooma.de
Sun May 13 07:35:39 MST 2007
Philipp Kempgen wrote:
> Tilghman Lesher wrote:
>
>> On Sunday 13 May 2007, Philipp Kempgen wrote:
>>> I couldn't find any code in res_config_mysql.c which cares about
>>> setting the right charset / collation for the MySQL connection.
>>>
>>> Shouldn't there be some queries like
>>> SET NAMES latin1
>>> SET collation_connection=latin1_general_ci
>>> or
>>> SET NAMES utf8
>>> SET collation_connection=utf8_general_ci
>>> in mysql_reconnect() right before the return 1; (there might be
>>> native functions to do that) ?
>>>
>>> Please correct me if I'm wrong.
>> I believe those are set per-table at table creation time.
>
> Correct. The DB admin sets those per server, per table and per
> column. Those settings tell MySQL how to store the character
> data you put into a field (I'm simplifying things a bit).
> But that's not the whole story.
>
>> In any
>> case, trying to assume what everybody would want is a recipe for
>> getting it wrong.
BTW: *Every* connection to MySQL has a charset and a collation.
If you don't set them explicitly, you cannot be sure they are the
ones you expect.
> I think that's not true. Asterisk needs to tell MySQL the charset
> in which *Asterisk* expects strings to be returned and in which it
> will send strings. (MySQL will do the conversion automatically.)
>
> http://dev.mysql.com/doc/refman/5.1/en/charset-connection.html
>
> Otherwise MySQL might return UTF-8 data when Asterisk assumes it
> is ISO-8859-1 or the other way round.
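To make the suggestion concrete, here is a minimal sketch of how the
statement could be assembled before it is sent on a fresh connection.
build_set_names() and is_safe_identifier() are hypothetical helpers of
mine, not existing code in res_config_mysql.c, and where the charset
and collation names would come from (e.g. a config option) is an
assumption. The MySQL C API also has mysql_set_character_set(), which
may well be the better choice over a hand-built query.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: accept only non-empty names made of [A-Za-z0-9_],
 * so that a configured value cannot smuggle extra SQL into the query. */
static int is_safe_identifier(const char *s)
{
    if (s == NULL || *s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        if (!isalnum((unsigned char)*s) && *s != '_')
            return 0;
    }
    return 1;
}

/* Hypothetical helper: write "SET NAMES <charset> COLLATE <collation>"
 * into buf. Returns 0 on success, -1 if a name is rejected or buf is
 * too small. */
int build_set_names(char *buf, size_t size,
                    const char *charset, const char *collation)
{
    int n;

    if (!is_safe_identifier(charset) || !is_safe_identifier(collation))
        return -1;
    n = snprintf(buf, size, "SET NAMES %s COLLATE %s", charset, collation);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

The resulting string would then be passed to mysql_query() right after
the connection is established, so that every later query and result
uses the charset Asterisk expects.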
Let's assume the database admin chose to define the callerid
column as ISO-8859-1 but Asterisk expects the character data
returned to be UTF-8. Asterisk needs to tell the MySQL client
lib that it wants the string to be converted to UTF-8 no matter
which encoding MySQL uses internally to store the string.
If the charset is not set, the string returned depends on
the default charset, which might happen to be latin1 / sjis /
whatever. Thus the string might contain bytes above 0x7F that
do not form valid UTF-8 sequences.
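To illustrate the problem, here is a small, self-contained sketch of a
UTF-8 well-formedness check. is_valid_utf8() is a hypothetical function
written for this mail, not something that exists in Asterisk. It shows
that a latin1-encoded string with umlauts is rejected when treated as
UTF-8, while the same text encoded as UTF-8 passes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if the len bytes at s are
 * well-formed UTF-8 (no stray or missing continuation bytes, no
 * overlong forms, no surrogates, nothing above U+10FFFF). */
bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t n;          /* continuation bytes expected after c */
        unsigned long cp;  /* code point being decoded */
        unsigned long min; /* smallest code point allowed for this length */

        if (c < 0x80) {    /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            n = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            n = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;  /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < n)
            return false;  /* sequence is cut off at the end of the buffer */

        for (size_t k = 1; k <= n; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;  /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;  /* overlong form, out of range, or surrogate */

        i += n + 1;
    }
    return true;
}
```

For example, the latin1 bytes 47 72 FC DF 65 ("Grüße") fail this check
because 0xFC is not a valid UTF-8 lead byte, whereas the UTF-8 bytes
47 72 C3 BC C3 9F 65 for the same word pass.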
Regards,
Philipp
--
amooma GmbH - Bachstr. 126 - 56566 Neuwied - http://www.amooma.de
Let's use IT to solve problems and not to create new ones.
Asterisk? -> http://www.das-asterisk-buch.de
Managing director: Stefan Wintermeyer
Commercial register: Neuwied B 14998