[Asterisk-code-review] res corosync: Change thread stack size (asterisk[14])

Jan Friesse asteriskteam at digium.com
Fri Jun 16 07:52:45 CDT 2017


Jan Friesse has uploaded this change for review. ( https://gerrit.asterisk.org/5858


Change subject: res_corosync: Change thread stack size
......................................................................

res_corosync: Change thread stack size

In Corosync 2.x libraries were changed to use LibQB IPC.
Sadly LibQB IPC doesn't support copy-free access to received buffer, so
Corosync libraries were rewritten to use stack as buffer. Mostly the
needed stack size is quite small, but for all *_dispatch functions, 1MiB
is needed.

Asterisk function ast_pthread_create_background set stack size for new
thread to much smaller AST_BACKGROUND_STACKSIZE (~500KiB).

This results in Asterisk crash when running with Corosync 2.x.

Patch solves this issue by creating it's own version of
ast_pthread_create_background which sets stack size to much higher value
(actually it's AST_BACKGROUND_STACKSIZE + 3MiB).

Another problem may appear when "corosync show members" netconsole
command is executed. It is also executed in thread and also has only
500KiB stack size. Sadly it calls corosync_cfg_get_node_addrs which
again needs at least 1MiB stack.

Solution is to use HAVE_COROSYNC_CFG_STATE_TRACK as a discriminator
between Corosync 1.x and 2.x. If 1.x is found, nothing changes. If 2.x
is found, NodeID is displayed instead of IP address.

ASTERISK-25370 #close
Reported by: mdu113

Change-Id: Id95b0d21ab6e708e7d74ad8786c587211676fa08
---
M res/res_corosync.c
1 file changed, 24 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.asterisk.org:29418/asterisk refs/changes/58/5858/1

diff --git a/res/res_corosync.c b/res/res_corosync.c
index dbb8846..c865e20 100644
--- a/res/res_corosync.c
+++ b/res/res_corosync.c
@@ -79,6 +79,15 @@
 	struct ast_sockaddr addr;
 };
 
+/*! \brief Corosync ipc dispatch/request and reply size */
+#define COROSYNC_IPC_BUFFER_SIZE				(8192 * 128)
+
+/*! \brief Version of pthread_create to ensure stack is large enough */
+#define corosync_pthread_create_background(a, b, c, d)				\
+	ast_pthread_create_stack(a, b, c, d,					\
+		(AST_BACKGROUND_STACKSIZE + (3 * COROSYNC_IPC_BUFFER_SIZE)),	\
+		__FILE__, __FUNCTION__, __LINE__, #c)
+
 static struct corosync_node *corosync_node_alloc(struct ast_event *event)
 {
 	struct corosync_node *node;
@@ -810,19 +819,27 @@
 	for (i = 1, cs_err = cpg_iteration_next(cpg_iter, &cpg_desc);
 			cs_err == CS_OK;
 			cs_err = cpg_iteration_next(cpg_iter, &cpg_desc), i++) {
+#ifdef HAVE_COROSYNC_CFG_STATE_TRACK
 		corosync_cfg_node_address_t addrs[8];
 		int num_addrs = 0;
 		unsigned int j;
+#endif
 
+		ast_cli(a->fd, "=== Node %u\n", i);
+		ast_cli(a->fd, "=== --> Group: %s\n", cpg_desc.group.value);
+
+#ifdef HAVE_COROSYNC_CFG_STATE_TRACK
+		/*
+		 * Corosync 2.x cfg lib needs to allocate 1M on stack after calling
+		 * corosync_cfg_get_node_addrs. netconsole thread has allocated only 0.5M
+		 * resulting in crash.
+		 */
 		cs_err = corosync_cfg_get_node_addrs(cfg_handle, cpg_desc.nodeid,
 				ARRAY_LEN(addrs), &num_addrs, addrs);
 		if (cs_err != CS_OK) {
 			ast_log(LOG_WARNING, "Failed to get node addresses\n");
 			continue;
 		}
-
-		ast_cli(a->fd, "=== Node %u\n", i);
-		ast_cli(a->fd, "=== --> Group: %s\n", cpg_desc.group.value);
 
 		for (j = 0; j < num_addrs; j++) {
 			struct sockaddr *sa = (struct sockaddr *) addrs[j].address;
@@ -833,7 +850,9 @@
 
 			ast_cli(a->fd, "=== --> Address %u: %s\n", j + 1, buf);
 		}
-
+#else
+		ast_cli(a->fd, "=== --> Nodeid: %"PRIu32"\n", cpg_desc.nodeid);
+#endif
 	}
 
 	ast_cli(a->fd, "===\n"
@@ -1159,7 +1178,7 @@
 		goto failed;
 	}
 
-	if (ast_pthread_create_background(&dispatch_thread.id, NULL,
+	if (corosync_pthread_create_background(&dispatch_thread.id, NULL,
 			dispatch_thread_handler, NULL)) {
 		ast_log(LOG_ERROR, "Error starting CPG dispatch thread.\n");
 		goto failed;

-- 
To view, visit https://gerrit.asterisk.org/5858
To unsubscribe, visit https://gerrit.asterisk.org/settings

Gerrit-Project: asterisk
Gerrit-Branch: 14
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id95b0d21ab6e708e7d74ad8786c587211676fa08
Gerrit-Change-Number: 5858
Gerrit-PatchSet: 1
Gerrit-Owner: Jan Friesse <jfriesse at redhat.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-code-review/attachments/20170616/bc8c36a1/attachment.html>


More information about the asterisk-code-review mailing list