[asterisk-bugs] [JIRA] (ASTERISK-29000) internationalization: UTF-8 character in channel variables causes crashes

Sean Bright (JIRA) noreply at issues.asterisk.org
Thu Oct 22 10:58:36 CDT 2020


    [ https://issues.asterisk.org/jira/browse/ASTERISK-29000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=252504#comment-252504 ] 

Sean Bright edited comment on ASTERISK-29000 at 10/22/20 10:58 AM:
-------------------------------------------------------------------

I wonder whether this issue has already been resolved by the following commit:

2020-07-13 15:06 +0000 [e9e441c399]  Sean Bright <sean.bright at gmail.com>

	* utf8.c: Add UTF-8 validation and utility functions

	  There are various places in Asterisk - specifically in regards to
	  database integration - where having some kind of UTF-8 validation would
	  be beneficial. This patch adds:

	  * Functions to validate that a given string contains only valid UTF-8
	    sequences.

	  * A function to copy a string (similar to ast_copy_string) stopping when
	    an invalid UTF-8 sequence is encountered.

	  * A UTF-8 validator that allows for progressive validation.

	  All of this is based on the excellent UTF-8 decoder by Björn Höhrmann.
	  More information is available here:

	      https://bjoern.hoehrmann.de/utf-8/decoder/dfa/

	  The API was written in such a way that should allow us to replace the
	  implementation later should we determine that we need something more
	  comprehensive.

	  Change-Id: I3555d787a79e7c780a7800cd26e0b5056368abf9
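
To make the first bullet of that commit message concrete, here is a minimal sketch of a well-formedness check in the sense of RFC 3629. It is an illustration only: sketch_utf8_is_valid is a hypothetical name, not a function from utf8.c, and the code is a plain hand-written loop rather than the table-driven DFA decoder the commit is based on. It rejects stray continuation bytes, truncated sequences, overlong encodings, surrogates, and code points above U+10FFFF.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the Asterisk API): returns 1 if the
 * first len bytes of s are well-formed UTF-8 per RFC 3629, 0 otherwise.
 */
static int sketch_utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        unsigned long cp, min;
        size_t need, k;

        if (c < 0x80) {                 /* plain ASCII byte */
            i++;
            continue;
        } else if (c >= 0xC2 && c <= 0xDF) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if (c >= 0xE0 && c <= 0xEF) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if (c >= 0xF0 && c <= 0xF4) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                   /* stray continuation or invalid lead byte */
        }

        if (len - i - 1 < need) {
            return 0;                   /* sequence is cut off by end of input */
        }
        for (k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) {
                return 0;               /* expected a 10xxxxxx continuation byte */
            }
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        /* Reject overlong forms, UTF-16 surrogates, and out-of-range values. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return 0;
        }
        i += need + 1;
    }
    return 1;
}
```

Assuming the `<E8>` in the reporter's log stands for a lone 0xE8 byte (Latin-1 "è"), the string "Simon\xE8" fails this check because 0xE8 opens a three-byte sequence that never completes, whereas the properly encoded "Simon\xC3\xA8" passes.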


was (Author: gmza):
[Text identical to the edited comment above.]

> internationalization: UTF-8 character in channel variables causes crashes
> -------------------------------------------------------------------------
>
>                 Key: ASTERISK-29000
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-29000
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/General
>    Affects Versions: 16.11.1
>         Environment: Asterisk 16.11.1, Ubuntu 18.04.4 LTS
>            Reporter: Gregory Massel
>            Severity: Minor
>
> An unexpected UTF-8 character in a channel variable will cause Asterisk either to segfault or to generate a harmless backtrace, depending on circumstances.
> E.g., when using Set(), a harmless backtrace:
> [Jul 21 08:29:12] VERBOSE[1283][C-000022ef] pbx.c: Executing [s@swvpbx-sub-get-pin-auth:4] Set("PJSIP/mongenaglodge205-00005993", "PIN_CALLER_ID=Simon<E8>") in new stack
> [Jul 21 08:29:12] ERROR[1283][C-000022ef] json.c: Error building JSON from '{s: s, s: s}': Invalid UTF-8 string.
> [Jul 21 08:29:12] ERROR[1283][C-000022ef] : Got 13 backtrace records
> # 0: [0x563b7793a45f] asterisk json.c:613 ast_json_vpack()
> # 1: [0x563b7793a361] asterisk json.c:596 ast_json_pack()
> # 2: [0x563b779e0384] asterisk stasis_channels.c:831 ast_channel_publish_varset()
> # 3: [0x563b77983c3a] asterisk pbx_variables.c:1118 pbx_builtin_setvar_helper()
> # 4: [0x563b77983e64] asterisk pbx_variables.c:1154 pbx_builtin_setvar()
> # 5: [0x563b7797811f] asterisk pbx_app.c:492 pbx_exec()
> # 6: [0x563b77961a1f] asterisk pbx.c:2947 pbx_extension_helper()
> # 7: [0x563b77965eb7] asterisk pbx.c:4197 ast_spawn_extension()
> # 8: [0x563b77966c6b] asterisk pbx.c:4371 __ast_pbx_run()
> # 9: [0x563b779685e8] asterisk pbx.c:4696 pbx_thread()
> #10: [0x563b77a0a05e] asterisk utils.c:1249 dummy_start()
> #11: [0x7fe7554e16db] libpthread.so.0 pthread_create.c:463 start_thread()
> #12: [0x7fe7546d2a3f] libc.so.6 clone.S:97 clone()
> However, I have previously had scenarios where the UTF-8 character was pulled into the "callerid=" clause in a pjsip.conf endpoint and, when dialling that endpoint, Asterisk segfaulted.
> This is minor, as I've mitigated it by stripping all UTF-8 characters. In the long term, however, it would be better for Asterisk to ignore, strip, or otherwise handle these characters rather than segfault.
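
The "strip" option the reporter asks for could look roughly like the sketch below. It is an assumption-laden illustration, not Asterisk code: sketch_utf8_seq_len and sketch_utf8_strip_copy are hypothetical names, and the policy chosen here (silently drop every byte that is not part of a well-formed UTF-8 sequence) is only one of the behaviours mentioned in the report. The copy always NUL-terminates when the destination has room and never splits a multi-byte sequence at the end of the buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: length (1 to 4) of the well-formed UTF-8 sequence
 * starting at s, given n available bytes, or 0 if none starts there.
 */
static size_t sketch_utf8_seq_len(const unsigned char *s, size_t n)
{
    unsigned char c;
    unsigned long cp, min;
    size_t need, k;

    if (n == 0) {
        return 0;
    }
    c = s[0];
    if (c < 0x80) {
        return 1;
    } else if (c >= 0xC2 && c <= 0xDF) {
        need = 1; cp = c & 0x1F; min = 0x80;
    } else if (c >= 0xE0 && c <= 0xEF) {
        need = 2; cp = c & 0x0F; min = 0x800;
    } else if (c >= 0xF0 && c <= 0xF4) {
        need = 3; cp = c & 0x07; min = 0x10000;
    } else {
        return 0;                       /* not a valid lead byte */
    }
    if (n - 1 < need) {
        return 0;                       /* truncated sequence */
    }
    for (k = 1; k <= need; k++) {
        if ((s[k] & 0xC0) != 0x80) {
            return 0;
        }
        cp = (cp << 6) | (s[k] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;                       /* overlong, surrogate, or out of range */
    }
    return need + 1;
}

/*
 * Hypothetical helper: copies src into dst (capacity size), dropping bytes
 * that are not part of a well-formed UTF-8 sequence. Returns the number of
 * bytes written, not counting the terminating NUL.
 */
static size_t sketch_utf8_strip_copy(char *dst, const char *src, size_t size)
{
    const unsigned char *in = (const unsigned char *)src;
    size_t n = strlen(src), i = 0, out = 0;

    if (size == 0) {
        return 0;
    }
    while (i < n) {
        size_t len = sketch_utf8_seq_len(in + i, n - i);

        if (len == 0) {
            i++;                        /* drop one offending byte and resync */
            continue;
        }
        if (out + len >= size) {
            break;                      /* no room for this sequence plus NUL */
        }
        memcpy(dst + out, in + i, len);
        out += len;
        i += len;
    }
    dst[out] = '\0';
    return out;
}
```

Under the assumption that `<E8>` in the log above is a single 0xE8 byte, passing "Simon\xE8" through sketch_utf8_strip_copy yields "Simon", which a JSON encoder that insists on valid UTF-8 would then accept.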



--
This message was sent by Atlassian JIRA
(v6.2#6252)


