[asterisk-bugs] [JIRA] (ASTERISK-26903) Listening TCP/TLS sockets stop when temporarily out of open files
Rusty Newton (JIRA)
noreply at issues.asterisk.org
Fri Mar 31 15:41:10 CDT 2017
[ https://issues.asterisk.org/jira/browse/ASTERISK-26903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rusty Newton updated ASTERISK-26903:
------------------------------------
Component/s: Core/General
> Listening TCP/TLS sockets stop when temporarily out of open files
> -----------------------------------------------------------------
>
> Key: ASTERISK-26903
> URL: https://issues.asterisk.org/jira/browse/ASTERISK-26903
> Project: Asterisk
> Issue Type: Bug
> Security Level: None
> Components: Core/General
> Reporter: Walter Doekes
>
> Just now a misconfigured Asterisk lost its AMI socket because it ran out of open files.
> Asterisk uses ast_tcptls_server_root for several TCP/TLS listening sockets. If accept() fails there with any errno other than EAGAIN, EWOULDBLOCK, EINTR or ECONNABORTED, the entire listening thread stops with nothing more than a WARNING. (Okay, ERROR on Asterisk 13+.)
> Example:
> {noformat}
> [2017-03-29 06:34:50] ERROR[6513] cel_custom.c: Unable to re-open master file /var/log/asterisk/cel-custom/full.csv : Too many open files
> [2017-03-29 06:34:50] WARNING[6519] tcptls.c: Accept failed: Too many open files
> [2017-03-29 06:34:50] ERROR[6513] cel_custom.c: Unable to re-open master file /var/log/asterisk/cel-custom/full.csv : Too many open files
> {noformat}
> At 06:34:50 AMI was trying to accept() an incoming connection. It failed here:
> {code}
> void *ast_tcptls_server_root(void *data)
> {
>     // ...
>     for (;;) {
>         // ...
>         fd = ast_accept(desc->accept_fd, &addr);
>         if (fd < 0) {
>             if ((errno != EAGAIN) && (errno != EWOULDBLOCK) && (errno != EINTR) && (errno != ECONNABORTED)) {
>                 ast_log(LOG_WARNING, "Accept failed: %s\n", strerror(errno));
>                 break;
>     // ...
>     return NULL;
> }
> {code}
> That is, with just that WARNING (ERROR), the listening thread dies.
> There is no cleanup when the thread ends, so the socket stays in the listening state, but nothing accept()s connections on it any more. Because the OS completes the TCP handshake on its own, it looks as though Asterisk hangs before it can do its first write. (You can connect to the port, but nothing happens.) The actual problem is earlier: the accept thread is gone.
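The symptom described above can be reproduced outside Asterisk. The following is a minimal sketch, not Asterisk code: the function name connect_without_accept is hypothetical, and it assumes a POSIX system with a working loopback interface. It opens a listening socket, never calls accept() on it, and checks that a client connect() still succeeds.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Asterisk).
 * Returns 1 if connect() succeeded although accept() was never called,
 * 0 if the connection attempt failed, and -1 on a setup error.
 */
static int connect_without_accept(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int result = -1;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0 || cfd < 0) {
        goto out;
    }

    /* Bind to 127.0.0.1 on an ephemeral port chosen by the kernel. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0
        || listen(lfd, 5) < 0
        || getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        goto out;
    }

    /* Nobody ever calls accept() on lfd; the kernel still answers the SYN. */
    result = (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == 0) ? 1 : 0;

out:
    if (cfd >= 0) {
        close(cfd);
    }
    if (lfd >= 0) {
        close(lfd);
    }
    return result;
}
```

A return value of 1 corresponds to what an AMI client sees here: the connection is established, but no greeting ever arrives because no thread picks the connection up.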
> ast_tcptls_server_root is used here:
> {noformat}
> asterisk-rw-13.git$ wgrep . accept_fn.*ast_tcptls_server_root
> ./main/manager.c: .accept_fn = ast_tcptls_server_root, /* thread doing the accept() */
> ./main/manager.c: .accept_fn = ast_tcptls_server_root, /* thread doing the accept() */
> ./main/http.c: .accept_fn = ast_tcptls_server_root,
> ./main/http.c: .accept_fn = ast_tcptls_server_root,
> ./channels/chan_sip.c: .accept_fn = ast_tcptls_server_root,
> ./channels/chan_sip.c: .accept_fn = ast_tcptls_server_root,
> {noformat}
> (Each file lists it twice: once for TCP, once for TLS.)
> A 2014 commit that made it into Asterisk 13+ changed the message from WARNING to ERROR.
> {noformat}
> commit 7c276f9fef945b644566533ddbcb72a2ec8ff821
> Author: Olle Johansson <oej at edvina.net>
> Date: Sun Apr 27 19:29:27 2014 +0000
> tcptls.c : Log errors as ERROR, not warning or something else.
> {noformat}
> But there is still no indication that the thread has ended.
> Suggestions for improvement:
> - add another ERROR before {{return NULL}} that says the thread has ended (prematurely)
> - or, don't end the thread just because a single accept() failed and stay in the for-loop instead; that would make Asterisk more resilient against temporary problems
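Both suggestions can be combined in one loop. The following is a rough sketch under stated assumptions, not a patch against tcptls.c: the names accept_action, classify_accept_errno and accept_loop are hypothetical, plain accept() and fprintf() stand in for ast_accept() and ast_log(), and the 100 ms back-off and the choice of which errno values count as temporary are illustrative only.

```c
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical classification of accept() failures (not Asterisk code). */
enum accept_action { ACCEPT_RETRY_NOW, ACCEPT_RETRY_LATER, ACCEPT_GIVE_UP };

static enum accept_action classify_accept_errno(int err)
{
    switch (err) {
    case EAGAIN:
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
    case EINTR:
    case ECONNABORTED:
        /* The same set the existing loop already tolerates. */
        return ACCEPT_RETRY_NOW;
    case EMFILE:
    case ENFILE:
    case ENOBUFS:
    case ENOMEM:
        /* Resource exhaustion that may clear up on its own (assumption). */
        return ACCEPT_RETRY_LATER;
    default:
        /* For example EBADF or ENOTSOCK: retrying cannot help. */
        return ACCEPT_GIVE_UP;
    }
}

/* Accept connections on listen_fd and pass each new fd to handle(). */
static void *accept_loop(int listen_fd, void (*handle)(int fd))
{
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        int err;

        if (fd >= 0) {
            handle(fd);
            continue;
        }

        err = errno;
        switch (classify_accept_errno(err)) {
        case ACCEPT_RETRY_NOW:
            continue;
        case ACCEPT_RETRY_LATER:
            fprintf(stderr, "Accept failed: %s; retrying\n", strerror(err));
            poll(NULL, 0, 100); /* sleep 100 ms so the loop does not spin */
            continue;
        case ACCEPT_GIVE_UP:
            /* Say explicitly that the listener is gone. */
            fprintf(stderr, "Accept failed: %s; accept thread is exiting\n",
                strerror(err));
            return NULL;
        }
    }
}
```

With this shape, "Too many open files" (EMFILE) produces a repeated log line and a short pause instead of a dead listener, while an unrecoverable error still ends the thread but states so in the log.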
--
This message was sent by Atlassian JIRA
(v6.2#6252)