<p>Sean Bright has uploaded this change for <strong>review</strong>.</p><p><a href="https://gerrit.asterisk.org/c/asterisk/+/14681">View Change</a></p><pre style="font-family: monospace,monospace; white-space: pre-wrap;">utf8.c: Add UTF-8 validation and utility functions<br><br>There are various places in Asterisk - specifically with regard to<br>database integration - where having some kind of UTF-8 validation would<br>be beneficial. This patch adds:<br><br>* Functions to validate that a given string contains only valid UTF-8<br> sequences.<br><br>* A function to copy a string (similar to ast_copy_string) stopping when<br> an invalid UTF-8 sequence is encountered.<br><br>* A UTF-8 validator that allows for progressive validation.<br><br>All of this is based on the excellent UTF-8 decoder by Björn Höhrmann.<br>More information is available here:<br><br> https://bjoern.hoehrmann.de/utf-8/decoder/dfa/<br><br>The API was written in a way that should allow us to replace the<br>implementation later should we determine that we need something more<br>comprehensive.<br><br>Change-Id: I3555d787a79e7c780a7800cd26e0b5056368abf9<br>---<br>A include/asterisk/utf8.h<br>M main/asterisk.c<br>A main/utf8.c<br>3 files changed, 570 insertions(+), 0 deletions(-)<br><br></pre><pre style="font-family: monospace,monospace; white-space: pre-wrap;">git pull ssh://gerrit.asterisk.org:29418/asterisk refs/changes/81/14681/1</pre><pre style="font-family: monospace,monospace; white-space: pre-wrap;"><span>diff --git a/include/asterisk/utf8.h b/include/asterisk/utf8.h</span><br><span>new file mode 100644</span><br><span>index 0000000..431f118</span><br><span>--- /dev/null</span><br><span>+++ b/include/asterisk/utf8.h</span><br><span>@@ -0,0 +1,188 @@</span><br><span style="color: hsl(120, 100%, 40%);">+/*</span><br><span style="color: hsl(120, 100%, 40%);">+ * Asterisk -- An open source telephony toolkit.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: 
hsl(120, 100%, 40%);">+ * Copyright (C) 2020, Sean Bright</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * Sean Bright <sean.bright@gmail.com></span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * See http://www.asterisk.org for more information about</span><br><span style="color: hsl(120, 100%, 40%);">+ * the Asterisk project. Please do not directly contact</span><br><span style="color: hsl(120, 100%, 40%);">+ * any of the maintainers of this project for assistance;</span><br><span style="color: hsl(120, 100%, 40%);">+ * the project provides a web site, mailing lists and IRC</span><br><span style="color: hsl(120, 100%, 40%);">+ * channels for your use.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * This program is free software, distributed under the terms of</span><br><span style="color: hsl(120, 100%, 40%);">+ * the GNU General Public License Version 2. See the LICENSE file</span><br><span style="color: hsl(120, 100%, 40%);">+ * at the top of the source tree.</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*! 
\file</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief UTF-8 information and validation functions</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+#ifndef ASTERISK_UTF8_H</span><br><span style="color: hsl(120, 100%, 40%);">+#define ASTERISK_UTF8_H</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Check if a zero-terminated string is valid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param str The zero-terminated string to check</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \retval 0 if the string is not valid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ * \retval Non-zero if the string is valid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+int ast_utf8_is_valid(const char *str);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Check if the first \a size bytes of a string are valid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * Similar to \a ast_utf8_is_valid() but checks the first \a size bytes or until</span><br><span style="color: hsl(120, 100%, 40%);">+ * a zero byte is reached, whichever comes first.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span 
style="color: hsl(120, 100%, 40%);">+ * \param str The string to check</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param size The number of bytes to evaluate</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \retval 0 if the string is not valid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ * \retval Non-zero if the string is valid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+int ast_utf8_is_validn(const char *str, size_t size);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Copy a string safely ensuring valid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * This is similar to \a ast_copy_string, but it will only copy valid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ * sequences from the source string into the destination buffer. 
If an invalid</span><br><span style="color: hsl(120, 100%, 40%);">+ * UTF-8 sequence is encountered, or the available space in the destination</span><br><span style="color: hsl(120, 100%, 40%);">+ * buffer is exhausted in the middle of an otherwise valid UTF-8 sequence, the</span><br><span style="color: hsl(120, 100%, 40%);">+ * destination buffer will be truncated to ensure that it only contains valid</span><br><span style="color: hsl(120, 100%, 40%);">+ * UTF-8.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param dst The destination buffer.</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param src The source string</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param size The size of the destination buffer</span><br><span style="color: hsl(120, 100%, 40%);">+ * \return Nothing.</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+void ast_utf8_copy_string(char *dst, const char *src, size_t size);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+enum ast_utf8_validation_result {</span><br><span style="color: hsl(120, 100%, 40%);">+ /*! \brief The consumed sequence is valid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * The bytes consumed thus far by the validator represent a valid sequence of</span><br><span style="color: hsl(120, 100%, 40%);">+ * UTF-8 bytes. If additional bytes are fed into the validator, it can</span><br><span style="color: hsl(120, 100%, 40%);">+ * transition into either \a AST_UTF8_INVALID or \a AST_UTF8_UNKNOWN</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+ AST_UTF8_VALID,</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ /*! 
\brief The consumed sequence is invalid UTF-8</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * The bytes consumed thus far by the validator represent an invalid sequence</span><br><span style="color: hsl(120, 100%, 40%);">+ * of UTF-8 bytes. Feeding additional bytes into the validator will not</span><br><span style="color: hsl(120, 100%, 40%);">+ * change its state.</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+ AST_UTF8_INVALID,</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ /*! \brief The validator is in an intermediate state</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * The validator is in the process of validating a multibyte UTF-8 sequence</span><br><span style="color: hsl(120, 100%, 40%);">+ * and requires additional data to be fed into it to determine validity. If</span><br><span style="color: hsl(120, 100%, 40%);">+ * additional bytes are fed into the validator, it can transition into either</span><br><span style="color: hsl(120, 100%, 40%);">+ * \a AST_UTF8_VALID or \a AST_UTF8_INVALID. 
If you have no additional data</span><br><span style="color: hsl(120, 100%, 40%);">+ * to feed into the validator the UTF-8 sequence is invalid.</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+ AST_UTF8_UNKNOWN,</span><br><span style="color: hsl(120, 100%, 40%);">+};</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Opaque type for UTF-8 validator state.</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+struct ast_utf8_validator;</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Create a new UTF-8 validator</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param[out] validator The validator instance</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \retval 0 on success</span><br><span style="color: hsl(120, 100%, 40%);">+ * \retval -1 on failure</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+int ast_utf8_validator_new(struct ast_utf8_validator **validator);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Feed a zero-terminated string into the UTF-8 validator</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span 
style="color: hsl(120, 100%, 40%);">+ * \param validator The validator instance</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param data The zero-terminated string to feed into the validator</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \return The \ref ast_utf8_validation_result indicating the current state of</span><br><span style="color: hsl(120, 100%, 40%);">+ * the validator.</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+enum ast_utf8_validation_result ast_utf8_validator_feed(</span><br><span style="color: hsl(120, 100%, 40%);">+ struct ast_utf8_validator *validator, const char *data);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Feed a string into the UTF-8 validator</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * Similar to \a ast_utf8_validator_feed but will stop feeding in data if a zero</span><br><span style="color: hsl(120, 100%, 40%);">+ * byte is encountered or \a size bytes have been read.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param validator The validator instance</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param data The string to feed into the validator</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param size The number of bytes to feed into the validator</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \return The \ref ast_utf8_validation_result indicating the current state of</span><br><span style="color: hsl(120, 100%, 40%);">+ * the validator.</span><br><span 
style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+enum ast_utf8_validation_result ast_utf8_validator_feedn(</span><br><span style="color: hsl(120, 100%, 40%);">+ struct ast_utf8_validator *validator, const char *data, size_t size);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Get the current UTF-8 validator state</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param validator The validator instance</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \return The \ref ast_utf8_validation_result indicating the current state of</span><br><span style="color: hsl(120, 100%, 40%);">+ * the validator.</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+enum ast_utf8_validation_result ast_utf8_validator_state(</span><br><span style="color: hsl(120, 100%, 40%);">+ struct ast_utf8_validator *validator);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Reset the state of a UTF-8 validator</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * Resets the provided UTF-8 validator to its initial state so that it can be</span><br><span style="color: hsl(120, 100%, 40%);">+ * reused.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param validator The validator instance to reset</span><br><span style="color: hsl(120, 
100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+void ast_utf8_validator_reset(</span><br><span style="color: hsl(120, 100%, 40%);">+ struct ast_utf8_validator *validator);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Destroy a UTF-8 validator</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \param validator The validator instance to destroy</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+void ast_utf8_validator_destroy(struct ast_utf8_validator *validator);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*!</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief Register UTF-8 tests</span><br><span style="color: hsl(120, 100%, 40%);">+ * \since 13.36.0, 16.13.0, 17.7.0</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * Does nothing unless TEST_FRAMEWORK is defined.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \return Always returns 0</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+int ast_utf8_init(void);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+#endif /* ASTERISK_UTF8_H */</span><br><span>diff --git a/main/asterisk.c b/main/asterisk.c</span><br><span>index 794a9ac..1061fd0 100644</span><br><span>--- a/main/asterisk.c</span><br><span>+++ b/main/asterisk.c</span><br><span>@@ -242,6 +242,7 @@</span><br><span> #include "asterisk/media_cache.h"</span><br><span> #include 
"asterisk/astdb.h"</span><br><span> #include "asterisk/options.h"</span><br><span style="color: hsl(120, 100%, 40%);">+#include "asterisk/utf8.h"</span><br><span> </span><br><span> #include "../defaults.h"</span><br><span> </span><br><span>@@ -4065,6 +4066,7 @@</span><br><span> check_init(ast_json_init(), "libjansson");</span><br><span> ast_ulaw_init();</span><br><span> ast_alaw_init();</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_utf8_init();</span><br><span> tdd_init();</span><br><span> callerid_init();</span><br><span> ast_builtins_init();</span><br><span>diff --git a/main/utf8.c b/main/utf8.c</span><br><span>new file mode 100644</span><br><span>index 0000000..9b4b459</span><br><span>--- /dev/null</span><br><span>+++ b/main/utf8.c</span><br><span>@@ -0,0 +1,380 @@</span><br><span style="color: hsl(120, 100%, 40%);">+/*</span><br><span style="color: hsl(120, 100%, 40%);">+ * Asterisk -- An open source telephony toolkit.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * Copyright (C) 2020, Sean Bright</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * Sean Bright <sean.bright@gmail.com></span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * See http://www.asterisk.org for more information about</span><br><span style="color: hsl(120, 100%, 40%);">+ * the Asterisk project. 
Please do not directly contact</span><br><span style="color: hsl(120, 100%, 40%);">+ * any of the maintainers of this project for assistance;</span><br><span style="color: hsl(120, 100%, 40%);">+ * the project provides a web site, mailing lists and IRC</span><br><span style="color: hsl(120, 100%, 40%);">+ * channels for your use.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * This program is free software, distributed under the terms of</span><br><span style="color: hsl(120, 100%, 40%);">+ * the GNU General Public License Version 2. See the LICENSE file</span><br><span style="color: hsl(120, 100%, 40%);">+ * at the top of the source tree.</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*! \file</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * \brief UTF-8 information and validation functions</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*** MODULEINFO</span><br><span style="color: hsl(120, 100%, 40%);">+ <support_level>core</support_level></span><br><span style="color: hsl(120, 100%, 40%);">+***/</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+#include "asterisk.h"</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+#include "asterisk/utils.h"</span><br><span style="color: hsl(120, 100%, 40%);">+#include "asterisk/utf8.h"</span><br><span style="color: hsl(120, 100%, 40%);">+#include "asterisk/test.h"</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*</span><br><span style="color: hsl(120, 100%, 40%);">+ * BEGIN THIRD PARTY 
CODE</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * Copyright (c) 2008-2010 Björn Höhrmann <bjoern@hoehrmann.de></span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * Permission is hereby granted, free of charge, to any person obtaining a copy</span><br><span style="color: hsl(120, 100%, 40%);">+ * of this software and associated documentation files (the "Software"), to deal</span><br><span style="color: hsl(120, 100%, 40%);">+ * in the Software without restriction, including without limitation the rights</span><br><span style="color: hsl(120, 100%, 40%);">+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell</span><br><span style="color: hsl(120, 100%, 40%);">+ * copies of the Software, and to permit persons to whom the Software is</span><br><span style="color: hsl(120, 100%, 40%);">+ * furnished to do so, subject to the following conditions:</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * The above copyright notice and this permission notice shall be included in all</span><br><span style="color: hsl(120, 100%, 40%);">+ * copies or substantial portions of the Software.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR</span><br><span style="color: hsl(120, 100%, 40%);">+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,</span><br><span style="color: hsl(120, 100%, 40%);">+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE</span><br><span style="color: hsl(120, 100%, 40%);">+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER</span><br><span style="color: hsl(120, 100%, 40%);">+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,</span><br><span style="color: hsl(120, 100%, 40%);">+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE</span><br><span style="color: hsl(120, 100%, 40%);">+ * SOFTWARE.</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * See http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ for details.</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+#define UTF8_ACCEPT 0</span><br><span style="color: hsl(120, 100%, 40%);">+#define UTF8_REJECT 12</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+static const uint8_t utf8d[] = {</span><br><span style="color: hsl(120, 100%, 40%);">+ /* The first part of the table maps bytes to character classes</span><br><span style="color: hsl(120, 100%, 40%);">+ * to reduce the size of the transition table and create bitmasks. 
*/</span><br><span style="color: hsl(120, 100%, 40%);">+ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,</span><br><span style="color: hsl(120, 100%, 40%);">+ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,</span><br><span style="color: hsl(120, 100%, 40%);">+ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,</span><br><span style="color: hsl(120, 100%, 40%);">+ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,</span><br><span style="color: hsl(120, 100%, 40%);">+ 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,</span><br><span style="color: hsl(120, 100%, 40%);">+ 7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, 7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,</span><br><span style="color: hsl(120, 100%, 40%);">+ 8,8,2,2,2,2,2,2,2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,</span><br><span style="color: hsl(120, 100%, 40%);">+ 10,3,3,3,3,3,3,3,3,3,3,3,3,4,3,3, 11,6,6,6,5,8,8,8,8,8,8,8,8,8,8,8,</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ /* The second part is a transition table that maps a combination</span><br><span style="color: hsl(120, 100%, 40%);">+ * of a state of the automaton and a character class to a state. 
*/</span><br><span style="color: hsl(120, 100%, 40%);">+ 0,12,24,36,60,96,84,12,12,12,48,72, 12,12,12,12,12,12,12,12,12,12,12,12,</span><br><span style="color: hsl(120, 100%, 40%);">+ 12, 0,12,12,12,12,12, 0,12, 0,12,12, 12,24,12,12,12,12,12,24,12,24,12,12,</span><br><span style="color: hsl(120, 100%, 40%);">+ 12,12,12,12,12,12,12,24,12,12,12,12, 12,24,12,12,12,12,12,12,12,24,12,12,</span><br><span style="color: hsl(120, 100%, 40%);">+ 12,12,12,12,12,12,12,36,12,36,12,12, 12,36,12,12,12,12,12,36,12,36,12,12,</span><br><span style="color: hsl(120, 100%, 40%);">+ 12,36,12,12,12,12,12,12,12,12,12,12,</span><br><span style="color: hsl(120, 100%, 40%);">+};</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+#if 0</span><br><span style="color: hsl(120, 100%, 40%);">+/* We can bring this back if we need the codepoint? */</span><br><span style="color: hsl(120, 100%, 40%);">+static uint32_t inline decode(uint32_t *state, uint32_t *codep, uint32_t byte) {</span><br><span style="color: hsl(120, 100%, 40%);">+ uint32_t type = utf8d[byte];</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ *codep = (*state != UTF8_ACCEPT) ?</span><br><span style="color: hsl(120, 100%, 40%);">+ (byte & 0x3fu) | (*codep << 6) :</span><br><span style="color: hsl(120, 100%, 40%);">+ (0xff >> type) & (byte);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ *state = utf8d[256 + *state + type];</span><br><span style="color: hsl(120, 100%, 40%);">+ return *state;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+#endif</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+static uint32_t inline decode(uint32_t *state, uint32_t byte) {</span><br><span style="color: hsl(120, 100%, 40%);">+ uint32_t type = 
utf8d[byte];</span><br><span style="color: hsl(120, 100%, 40%);">+ *state = utf8d[256 + *state + type];</span><br><span style="color: hsl(120, 100%, 40%);">+ return *state;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+/*</span><br><span style="color: hsl(120, 100%, 40%);">+ * END THIRD PARTY CODE</span><br><span style="color: hsl(120, 100%, 40%);">+ *</span><br><span style="color: hsl(120, 100%, 40%);">+ * See copyright notice above.</span><br><span style="color: hsl(120, 100%, 40%);">+ */</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+int ast_utf8_is_valid(const char *src)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ uint32_t state = UTF8_ACCEPT;</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ while (*src) {</span><br><span style="color: hsl(120, 100%, 40%);">+ decode(&state, (uint8_t) *src++);</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ return state == UTF8_ACCEPT;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+int ast_utf8_is_validn(const char *src, size_t size)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ uint32_t state = UTF8_ACCEPT;</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ while (size && *src) {</span><br><span style="color: hsl(120, 100%, 40%);">+ decode(&state, (uint8_t) *src++);</span><br><span style="color: hsl(120, 100%, 40%);">+ size--;</span><br><span style="color: hsl(120, 100%, 40%);">+ 
}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ return state == UTF8_ACCEPT;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+void ast_utf8_copy_string(char *dst, const char *src, size_t size)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ uint32_t state = UTF8_ACCEPT;</span><br><span style="color: hsl(120, 100%, 40%);">+ char *last_good = dst;</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_assert(size > 0);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ while (size && *src) {</span><br><span style="color: hsl(120, 100%, 40%);">+ if (decode(&state, (uint8_t) *src) == UTF8_REJECT) {</span><br><span style="color: hsl(120, 100%, 40%);">+ /* We _could_ replace with U+FFFD and try to recover, but for now</span><br><span style="color: hsl(120, 100%, 40%);">+ * we treat this the same as if we had run out of space */</span><br><span style="color: hsl(120, 100%, 40%);">+ break;</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ *dst++ = *src++;</span><br><span style="color: hsl(120, 100%, 40%);">+ size--;</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ if (size && state == UTF8_ACCEPT) {</span><br><span style="color: hsl(120, 100%, 40%);">+ /* last_good is where we will ultimately write the 0 byte */</span><br><span style="color: hsl(120, 100%, 40%);">+ last_good = dst;</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span 
style="color: hsl(120, 100%, 40%);">+ *last_good = '\0';</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+struct ast_utf8_validator {</span><br><span style="color: hsl(120, 100%, 40%);">+ uint32_t state;</span><br><span style="color: hsl(120, 100%, 40%);">+};</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+int ast_utf8_validator_new(struct ast_utf8_validator **validator)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ struct ast_utf8_validator *tmp = ast_malloc(sizeof(*tmp));</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ if (!tmp) {</span><br><span style="color: hsl(120, 100%, 40%);">+ return -1;</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ tmp->state = UTF8_ACCEPT;</span><br><span style="color: hsl(120, 100%, 40%);">+ *validator = tmp;</span><br><span style="color: hsl(120, 100%, 40%);">+ return 0;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+enum ast_utf8_validation_result ast_utf8_validator_state(</span><br><span style="color: hsl(120, 100%, 40%);">+ struct ast_utf8_validator *validator)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ switch (validator->state) {</span><br><span style="color: hsl(120, 100%, 40%);">+ case UTF8_ACCEPT:</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_UTF8_VALID;</span><br><span style="color: hsl(120, 100%, 40%);">+ case UTF8_REJECT:</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_UTF8_INVALID;</span><br><span 
style="color: hsl(120, 100%, 40%);">+ default:</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_UTF8_UNKNOWN;</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+enum ast_utf8_validation_result ast_utf8_validator_feed(</span><br><span style="color: hsl(120, 100%, 40%);">+ struct ast_utf8_validator *validator, const char *data)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ while (*data) {</span><br><span style="color: hsl(120, 100%, 40%);">+ decode(&validator->state, (uint8_t) *data++);</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ return ast_utf8_validator_state(validator);</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+enum ast_utf8_validation_result ast_utf8_validator_feedn(</span><br><span style="color: hsl(120, 100%, 40%);">+ struct ast_utf8_validator *validator, const char *data, size_t size)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ while (size && *data) {</span><br><span style="color: hsl(120, 100%, 40%);">+ decode(&validator->state, (uint8_t) *data++);</span><br><span style="color: hsl(120, 100%, 40%);">+ size--;</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ return ast_utf8_validator_state(validator);</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+void ast_utf8_validator_reset(struct 
ast_utf8_validator *validator)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ validator->state = UTF8_ACCEPT;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+void ast_utf8_validator_destroy(struct ast_utf8_validator *validator)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_free(validator);</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+#ifdef TEST_FRAMEWORK</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+AST_TEST_DEFINE(test_utf8_is_valid)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ switch (cmd) {</span><br><span style="color: hsl(120, 100%, 40%);">+ case TEST_INIT:</span><br><span style="color: hsl(120, 100%, 40%);">+ info->name = "is_valid";</span><br><span style="color: hsl(120, 100%, 40%);">+ info->category = "/main/utf8/";</span><br><span style="color: hsl(120, 100%, 40%);">+ info->summary = "Test ast_utf8_is_valid and ast_utf8_is_validn";</span><br><span style="color: hsl(120, 100%, 40%);">+ info->description =</span><br><span style="color: hsl(120, 100%, 40%);">+ "Tests UTF-8 string validation code.";</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_TEST_NOT_RUN;</span><br><span style="color: hsl(120, 100%, 40%);">+ case TEST_EXECUTE:</span><br><span style="color: hsl(120, 100%, 40%);">+ break;</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ /* Valid UTF-8 */</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, 
ast_utf8_is_valid("Asterisk"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("\xce\xbb"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("\xe2\x8a\x9b"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("\xf0\x9f\x93\x9e"));</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ /* Valid with leading */</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("aaa Asterisk"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("aaa \xce\xbb"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("aaa \xe2\x8a\x9b"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("aaa \xf0\x9f\x93\x9e"));</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ /* Valid with trailing */</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("Asterisk aaa"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("\xce\xbb aaa"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("\xe2\x8a\x9b aaa"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("\xf0\x9f\x93\x9e aaa"));</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ /* Valid with leading and trailing */</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("aaa Asterisk aaa"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("aaa \xce\xbb aaa"));</span><br><span style="color: hsl(120, 
100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("aaa \xe2\x8a\x9b aaa"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_valid("aaa \xf0\x9f\x93\x9e aaa"));</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ /* Valid if limited by number of bytes */</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_validn("Asterisk" "\xff", strlen("Asterisk")));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_validn("\xce\xbb" "\xff", strlen("\xce\xbb")));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_validn("\xe2\x8a\x9b" "\xff", strlen("\xe2\x8a\x9b")));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_is_validn("\xf0\x9f\x93\x9e" "\xff", strlen("\xf0\x9f\x93\x9e")));</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ /* Invalid */</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, !ast_utf8_is_valid("\xc0\x00")); /* 0xC0 is never a valid lead byte (overlong); the \x00 ends the literal early */</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, !ast_utf8_is_valid("98.6\xa7")); /* 'High ASCII' */</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, !ast_utf8_is_valid("\xc3\x28"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, !ast_utf8_is_valid("\xa0\xa1"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, !ast_utf8_is_valid("\xe2\x28\xa1"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, !ast_utf8_is_valid("\xe2\x82\x28"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, !ast_utf8_is_valid("\xf0\x28\x8c\xbc"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, 
!ast_utf8_is_valid("\xf0\x90\x28\xbc"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, !ast_utf8_is_valid("\xf0\x28\x8c\x28"));</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_TEST_PASS;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+static int test_copy_and_compare(const char *src, size_t dst_len, const char *cmp)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ char dst[dst_len];</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_utf8_copy_string(dst, src, dst_len);</span><br><span style="color: hsl(120, 100%, 40%);">+ return strcmp(dst, cmp) == 0;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+AST_TEST_DEFINE(test_utf8_copy_string)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ switch (cmd) {</span><br><span style="color: hsl(120, 100%, 40%);">+ case TEST_INIT:</span><br><span style="color: hsl(120, 100%, 40%);">+ info->name = "copy_string";</span><br><span style="color: hsl(120, 100%, 40%);">+ info->category = "/main/utf8/";</span><br><span style="color: hsl(120, 100%, 40%);">+ info->summary = "Test ast_utf8_copy_string";</span><br><span style="color: hsl(120, 100%, 40%);">+ info->description =</span><br><span style="color: hsl(120, 100%, 40%);">+ "Tests UTF-8 string copying code.";</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_TEST_NOT_RUN;</span><br><span style="color: hsl(120, 100%, 40%);">+ case TEST_EXECUTE:</span><br><span style="color: hsl(120, 100%, 40%);">+ break;</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 
40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("Asterisk", 6, "Aster"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("Asterisk \xc2\xae", 11, "Asterisk "));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("Asterisk \xc2\xae", 12, "Asterisk \xc2\xae"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("Asterisk \xc0\x00", 12, "Asterisk "));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("\xce\xbb xyz", 1, ""));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("\xce\xbb xyz", 2, ""));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("\xce\xbb xyz", 3, "\xce\xbb"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("\xce\xbb xyz", 4, "\xce\xbb "));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("\xce\xbb xyz", 5, "\xce\xbb x"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("\xce\xbb xyz", 6, "\xce\xbb xy"));</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, test_copy_and_compare("\xce\xbb xyz", 7, "\xce\xbb xyz"));</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_TEST_PASS;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+AST_TEST_DEFINE(test_utf8_validator)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ struct ast_utf8_validator *validator;</span><br><span style="color: 
hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ switch (cmd) {</span><br><span style="color: hsl(120, 100%, 40%);">+ case TEST_INIT:</span><br><span style="color: hsl(120, 100%, 40%);">+ info->name = "utf8_validator";</span><br><span style="color: hsl(120, 100%, 40%);">+ info->category = "/main/utf8/";</span><br><span style="color: hsl(120, 100%, 40%);">+ info->summary = "Test ast_utf8_validator";</span><br><span style="color: hsl(120, 100%, 40%);">+ info->description =</span><br><span style="color: hsl(120, 100%, 40%);">+ "Tests UTF-8 progressive validator code.";</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_TEST_NOT_RUN;</span><br><span style="color: hsl(120, 100%, 40%);">+ case TEST_EXECUTE:</span><br><span style="color: hsl(120, 100%, 40%);">+ break;</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ if (ast_utf8_validator_new(&validator)) {</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_TEST_FAIL;</span><br><span style="color: hsl(120, 100%, 40%);">+ }</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "Asterisk") == AST_UTF8_VALID);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "\xc2") == AST_UTF8_UNKNOWN);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "\xae") == AST_UTF8_VALID);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "Private") == AST_UTF8_VALID);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "Branch") == AST_UTF8_VALID);</span><br><span style="color: hsl(120, 100%, 40%);">+ 
ast_test_validate(test, ast_utf8_validator_feed(validator, "Exchange") == AST_UTF8_VALID);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "\xe2") == AST_UTF8_UNKNOWN);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "\x84") == AST_UTF8_UNKNOWN);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "\xbb") == AST_UTF8_VALID);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "\xc0\x00") == AST_UTF8_INVALID);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "valid") == AST_UTF8_INVALID);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "valid") == AST_UTF8_INVALID);</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_test_validate(test, ast_utf8_validator_feed(validator, "valid") == AST_UTF8_INVALID);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_utf8_validator_destroy(validator);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ return AST_TEST_PASS;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+static void test_utf8_shutdown(void)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ AST_TEST_UNREGISTER(test_utf8_is_valid);</span><br><span style="color: hsl(120, 100%, 40%);">+ AST_TEST_UNREGISTER(test_utf8_copy_string);</span><br><span style="color: hsl(120, 100%, 40%);">+ AST_TEST_UNREGISTER(test_utf8_validator);</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span 
style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+int ast_utf8_init(void)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ AST_TEST_REGISTER(test_utf8_is_valid);</span><br><span style="color: hsl(120, 100%, 40%);">+ AST_TEST_REGISTER(test_utf8_copy_string);</span><br><span style="color: hsl(120, 100%, 40%);">+ AST_TEST_REGISTER(test_utf8_validator);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ ast_register_cleanup(test_utf8_shutdown);</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+ return 0;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+#else /* !TEST_FRAMEWORK */</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+int ast_utf8_init(void)</span><br><span style="color: hsl(120, 100%, 40%);">+{</span><br><span style="color: hsl(120, 100%, 40%);">+ return 0;</span><br><span style="color: hsl(120, 100%, 40%);">+}</span><br><span style="color: hsl(120, 100%, 40%);">+</span><br><span style="color: hsl(120, 100%, 40%);">+#endif</span><br><span></span><br></pre><p>To view, visit <a href="https://gerrit.asterisk.org/c/asterisk/+/14681">change 14681</a>.</p><div itemscope itemtype="http://schema.org/EmailMessage"><div itemscope itemprop="action" itemtype="http://schema.org/ViewAction"><link itemprop="url" href="https://gerrit.asterisk.org/c/asterisk/+/14681"/><meta itemprop="name" content="View Change"/></div></div>
<div style="display:none"> Gerrit-Project: asterisk </div>
<div style="display:none"> Gerrit-Branch: 16 </div>
<div style="display:none"> Gerrit-Change-Id: I3555d787a79e7c780a7800cd26e0b5056368abf9 </div>
<div style="display:none"> Gerrit-Change-Number: 14681 </div>
<div style="display:none"> Gerrit-PatchSet: 1 </div>
<div style="display:none"> Gerrit-Owner: Sean Bright &lt;sean.bright@gmail.com&gt; </div>
<div style="display:none"> Gerrit-MessageType: newchange </div>