[asterisk-bugs] [JIRA] (ASTERISK-30500) Caller name corruption in encodings other than UTF-8
Basil Mi (JIRA)
noreply at issues.asterisk.org
Tue Apr 25 12:59:03 CDT 2023
[ https://issues.asterisk.org/jira/browse/ASTERISK-30500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=261837#comment-261837 ]
Basil Mi commented on ASTERISK-30500:
-------------------------------------
It might not be the best idea to unconditionally replace characters in the caller name.
The characters may be correct, just not in the expected encoding (not UTF-8).
Some thoughts:
Call "ast_utf8_replace_invalid_chars" only if the string does not match any of the common/known encodings. If the caller name is valid in one of those encodings (win-1251, koi-8 and so on), do not correct it and leave further conversion to the user.
Or briefly: if the caller name is found to be valid win-1251, do not call "ast_utf8_replace_invalid_chars". :-)
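A rough illustration of that check, written in Python rather than Asterisk's C for brevity. The helper names and the candidate list are hypothetical, and "valid in a legacy encoding" is approximated here as "decodes to printable text", which is an assumption and not something Asterisk does:

```python
# Hypothetical sketch, not Asterisk code.
LEGACY_ENCODINGS = ("cp1251", "koi8_r")  # assumed list of candidate encodings


def is_valid_utf8(raw: bytes) -> bool:
    """True if raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def plausible_legacy_encoding(raw: bytes):
    """Return the first candidate encoding in which raw decodes to
    printable text, or None if no candidate fits."""
    for enc in LEGACY_ENCODINGS:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # some byte is undefined in this code page
        if text.isprintable():
            return enc
    return None


def sanitize_caller_name(raw: bytes) -> bytes:
    """Replace invalid UTF-8 only when no legacy encoding looks plausible."""
    if is_valid_utf8(raw):
        return raw  # nothing to replace
    if plausible_legacy_encoding(raw) is not None:
        return raw  # leave the bytes alone; the dialplan can run ICONV
    # Fall back to U+FFFD substitution for input that fits nothing.
    return raw.decode("utf-8", errors="replace").encode("utf-8")
```

Single-byte code pages accept almost any byte sequence, so a check like this can only ever be a heuristic.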
When calling the "broken device", we apply the inverse transformation UTF-8->WIN-1251: {code} Set(CALLERID(name)=${ICONV(UTF-8,WINDOWS-1251,${CALLERID(name)})});{code}
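For illustration, the two conversions and the effect of replacing invalid sequences before converting can be sketched in Python. Here errors="replace" stands in for "ast_utf8_replace_invalid_chars" (an assumption; the exact number of U+FFFD characters Asterisk emits may differ), and the function names are hypothetical:

```python
# Hypothetical sketch, not Asterisk code.

def to_utf8(raw: bytes, source_encoding: str = "cp1251") -> bytes:
    """Incoming direction: legacy bytes -> UTF-8 (like ICONV(WINDOWS-1251,UTF-8,...))."""
    return raw.decode(source_encoding).encode("utf-8")


def from_utf8(raw_utf8: bytes, target_encoding: str = "cp1251") -> bytes:
    """Outgoing direction: UTF-8 -> legacy bytes (like ICONV(UTF-8,WINDOWS-1251,...))."""
    return raw_utf8.decode("utf-8").encode(target_encoding)


def replace_invalid_utf8(raw: bytes) -> bytes:
    """Stand-in for the replacement step: invalid sequences become U+FFFD."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


# A caller name as a WINDOWS-1251 device would put it on the wire.
wire = "\u0418\u0432\u0430\u043d".encode("cp1251")

# Converting first keeps the name, and the round trip is lossless.
converted = to_utf8(wire)
round_trip = from_utf8(converted)

# Replacing first discards the original bytes, so a later conversion
# has nothing left to recover.
replaced = replace_invalid_utf8(wire).decode("utf-8")
```

In this sketch `round_trip` equals `wire`, while `replaced` consists only of U+FFFD characters.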
A ticket to Panasonic Corp.? :-) I think it would take them years to solve the problem. This is a large family of hardware PBXs and proprietary telephone sets for them (for example https://www.kx-td.com/telephone-systems/).
> Caller name corruption in encodings other than UTF-8
> ----------------------------------------------------
>
> Key: ASTERISK-30500
> URL: https://issues.asterisk.org/jira/browse/ASTERISK-30500
> Project: Asterisk
> Issue Type: Bug
> Security Level: None
> Components: Resources/res_pjsip
> Affects Versions: 18.17.0
> Environment: FreeBSD 13.2
> Reporter: Basil Mi
> Assignee: Unassigned
> Severity: Major
> Attachments: sip-capture.txt, win-1251_text_example_1.txt
>
>
> After this change: ASTERISK-27830
> ===================================
> {quote}
> 2023-02-16 10:05 +0000 [1ddfb7551a] George Joseph <gjoseph at sangoma.com>
> * res_pjsip: Replace invalid UTF-8 sequences in callerid name
> * Added a new function ast_utf8_replace_invalid_chars() to
> utf8.c that copies a string replacing any invalid UTF-8
> sequences with the Unicode specified U+FFFD replacement
> character. For example: "abc\xffdef" becomes "abc\uFFFDdef".
> Any UTF-8 compliant implementation will show that character
> as a � character.
> * Updated res_pjsip:set_id_from_hdr() to use
> ast_utf8_replace_invalid_chars and print a warning if any
> invalid sequences were found during the copy.
> * Updated stasis_channels:ast_channel_publish_varset to use
> ast_utf8_replace_invalid_chars and print a warning if any
> invalid sequences were found during the copy.
> ASTERISK-27830
> {quote}
> ===================================
> Some legacy devices transmit the caller name in encodings other than UTF-8. For example, the Panasonic KX-TDE600 PBX uses WINDOWS-1251, and this is not configurable.
> In this case we use the ICONV function in the dialplan to convert the caller name to UTF-8 (for incoming calls), and vice versa (for outgoing calls).
> The new function {color:red} "ast_utf8_replace_invalid_chars" {color} turns the caller name into "�" characters before it can be converted to UTF-8 in the dialplan.
> Users see “�������” on their devices instead of the valid caller name.
> This logic worked for almost 10 years and broke on 18.17.0.
> We need to be able to turn off the replacement of invalid UTF-8 sequences (e.g. from config), or to be able to apply ICONV before the replacement (before the call to {color:red} "ast_utf8_replace_invalid_chars" {color}).
--
This message was sent by Atlassian JIRA
(v6.2#6252)