[asterisk-bugs] [JIRA] (ASTERISK-20167) UTF-8 cyrillic characters in voicemail email subject cause subject corruption

Walter Doekes (JIRA) noreply at issues.asterisk.org
Wed Nov 14 03:53:45 CST 2012


    [ https://issues.asterisk.org/jira/browse/ASTERISK-20167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=199706#comment-199706 ] 

Walter Doekes commented on ASTERISK-20167:
------------------------------------------

P.S. http://www.ietf.org/rfc/rfc2047.txt
{noformat}
   An 'encoded-word' may not be more than 75 characters long, including
   'charset', 'encoding', 'encoded-text', and delimiters.  If it is
   desirable to encode more text than will fit in an 'encoded-word' of
   75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may
   be used.

   While there is no limit to the length of a multiple-line header
   field, each line of a header field that contains one or more
   'encoded-word's is limited to 76 characters.
...
   Some character sets use code-switching techniques to switch between
   "ASCII mode" and other modes.  If unencoded text in an 'encoded-word'
   contains a sequence which causes the charset interpreter to switch
   out of ASCII mode, it MUST contain additional control codes such that
   ASCII mode is again selected at the end of the 'encoded-word'.  (This
   rule applies separately to each 'encoded-word', including adjacent
   'encoded-word's within a single header field.)
{noformat}

I'd call a multibyte UTF-8 sequence "mode switching" in the sense of that last paragraph, so breaking an encoded-word in the middle of a character is indeed illegal.
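
For illustration only, here is a minimal Python sketch of an encoder that starts a new encoded-word only on a character boundary. It is not the app_voicemail code and not the attached patch. The names encode_header_words, q_encode_char and decode_word are made up for this example, and the conservative safe-character set is an assumption. RFC 2047 section 5 also states the rule directly: each encoded-word must represent an integral number of characters.

```python
import quopri

PREFIX, SUFFIX = "=?UTF-8?Q?", "?="
# Conservative set of characters left unencoded (an assumption for this sketch).
SAFE = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!*+-/")


def q_encode_char(ch):
    """Q-encode one whole character; a multibyte character yields several =XX groups."""
    if ch == " ":
        return "_"
    if ch in SAFE:
        return ch
    return "".join("=%02X" % b for b in ch.encode("utf-8"))


def encode_header_words(text, header_name="Subject"):
    """Split text into encoded-words, never breaking inside a character."""
    # The first line also carries "Subject: "; each line may be at most 76 chars.
    limit = 76 - len(header_name) - 2
    words, current = [], ""
    for ch in text:
        piece = q_encode_char(ch)
        if current and len(PREFIX) + len(current) + len(piece) + len(SUFFIX) > limit:
            words.append(PREFIX + current + SUFFIX)
            current = ""
            limit = 75  # continuation lines: one leading space plus a 75-char word
        current += piece
    if current:
        words.append(PREFIX + current + SUFFIX)
    return words


def decode_word(word):
    """Decode a single encoded-word strictly; raises if it splits a character."""
    payload = word[len(PREFIX):-len(SUFFIX)]
    return quopri.decodestring(payload.encode("ascii"), header=True).decode("utf-8")


subject = '[PBX]: Сообщение от "anonymous" <anonymous> в Monday, July 23, 2012 at 11:45:46 PM'
words = encode_header_words(subject)
print("Subject: " + "\r\n ".join(words))
```

Because the split is decided per character rather than per byte, every encoded-word decodes on its own, which is what the RFC text above asks for.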
                
> UTF-8 cyrillic characters in voicemail email subject cause subject corruption
> -----------------------------------------------------------------------------
>
>                 Key: ASTERISK-20167
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-20167
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Applications/app_voicemail
>    Affects Versions: 1.8.8.2
>         Environment: Linux myhost.mydomain 2.6.18-308.11.1.el5 #1 SMP Tue Jul 10 08:49:28 EDT 2012 i686 i686 i386 GNU/Linux
> Cent-OS 5.8
>            Reporter: Arcadiy Ivanov
>         Attachments: issueA20167_break_early_for_q_encoding.patch
>
>
> This has been happening ever since 1.4.x.
> ========
> In voicemail.conf:
> emailsubject=[PBX]: Сообщение от ${VM_CALLERID} в ${VM_DATE}
> ========
> The emails arrive with the following subject:
> [PBX]: Сообще�в Monday, July 23, 2012 at 11:45:46 PM
> ========
> The subject should appear as follows:
> [PBX]: Сообщение от "anonymous" <anonymous> в Monday, July 23, 2012 at 11:45:46 PM
> ========
> The raw subject header as it appears in the email message is:
> Subject: =?UTF-8?Q?=5BPBX=5D=3A_=D0=A1=D0=BE=D0=BE=D0=B1=D1=89=D0=B5=D0?=
>  =?UTF-8?Q?=BD=D0=B8=D0=B5_=D0=BE=D1=82_=22anonymous=22_=3Canonymous=3E_?=
>  =?UTF-8?Q?=D0=B2_Monday=2C_July_23=2C_2012_at_11=3A45=3A46_PM?=
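
The raw header above can be checked with a short Python sketch (illustrative only; payload_bytes is a made-up helper name). It decodes each encoded-word separately, as a strict RFC 2047 reader is entitled to do, and then decodes the concatenated bytes for comparison. It makes no claim about how any particular mail client handles the header.

```python
import quopri

# The three encoded-words exactly as they appear in the reported header.
RAW_WORDS = [
    "=?UTF-8?Q?=5BPBX=5D=3A_=D0=A1=D0=BE=D0=BE=D0=B1=D1=89=D0=B5=D0?=",
    "=?UTF-8?Q?=BD=D0=B8=D0=B5_=D0=BE=D1=82_=22anonymous=22_=3Canonymous=3E_?=",
    "=?UTF-8?Q?=D0=B2_Monday=2C_July_23=2C_2012_at_11=3A45=3A46_PM?=",
]


def payload_bytes(word):
    """Return the raw bytes carried by one '=?charset?Q?payload?=' word."""
    charset, encoding, payload = word[2:-2].split("?", 2)
    assert charset.upper() == "UTF-8" and encoding.upper() == "Q"
    return quopri.decodestring(payload.encode("ascii"), header=True)


# Decoding word by word: the first word ends with a lone 0xD0 lead byte and
# the second starts with its orphaned 0xBD continuation byte.
per_word = [payload_bytes(w).decode("utf-8", errors="replace") for w in RAW_WORDS]

# Decoding the concatenated bytes recovers the intended subject.
joined = b"".join(payload_bytes(w) for w in RAW_WORDS).decode("utf-8")

for text in per_word:
    print(repr(text))
print(joined)
```

The first word decodes to text ending in U+FFFD and the second to text starting with U+FFFD, because the two bytes of the letter "н" (0xD0 0xBD) were split across the word boundary.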
