[Asterisk-code-review] taskprocessor.c: Prevent crash on graceful shutdown (asterisk[16])
Kevin Harwell
asteriskteam at digium.com
Mon Feb 7 17:52:13 CST 2022
Attention is currently required from: Michael Bradeen, Joshua Colp, Benjamin Keith Ford.
Kevin Harwell has posted comments on this change. ( https://gerrit.asterisk.org/c/asterisk/+/17955 )
Change subject: taskprocessor.c: Prevent crash on graceful shutdown
......................................................................
Patch Set 5: Code-Review-1
(6 comments)
File include/asterisk/taskprocessor.h:
https://gerrit.asterisk.org/c/asterisk/+/17955/comment/6818e6b8_6713749b
PS5, Line 67: #define AST_TASKPROCESSOR_SHUTDOWN_MAX_WAIT 10
Move this to taskprocessor.c since it's not referenced from anywhere else.
File main/taskprocessor.c:
https://gerrit.asterisk.org/c/asterisk/+/17955/comment/1b0fcc0a_29e95b67
PS5, Line 298: /* During shutdown there may still be taskprocessor threads running and those
: * tasprocessors reference tps_singletons. When those taskprocessors finish
: * they will call ast_taskprocessor_unreference, creating a race condition which
: * can result in tps_singletons being referenced after being deleted. To try and
: * avoid this we check the container count and if greater than zero, give the
: * running taskprocessors a chance to finish */
Not sure this is the right fix. It kinda feels like a "hack": it does not eliminate the race, it just makes it a bit less likely.
Maybe we should instead decouple the unreferencing of a task processor from its removal from the global container?
Even then, I guess we might still need a timed wait for the task processors to complete, so maybe this is fine. Or we find a way for tasks to be safely interrupted or "killed".
As a note on the last point: even after the task processor is removed from the container, its listener is only shut down afterwards, so there is still a bit of a shutdown race on top of the tps_singletons race condition.
https://gerrit.asterisk.org/c/asterisk/+/17955/comment/f8093285_1eb96f3c
PS5, Line 306: "taskprocessor shutdown with %d tps object(s) still allocated.\n", objcount);
There is a "tab" after the comma here. Should be a space.
https://gerrit.asterisk.org/c/asterisk/+/17955/comment/3330fa7e_2ed8d9d6
PS5, Line 311: while(nanosleep(&delay, &delay));
Put a space between "while" and "(".
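For reference, a minimal standalone sketch of the retry idiom that line relies on, with the spacing fixed. The helper name sleep_full is hypothetical and not part of Asterisk. Unlike the one-liner in the patch, this version retries only on EINTR, so an invalid timespec cannot spin forever; that is a suggestion here, not something the patch does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper (not an Asterisk API): sleep for the whole interval,
 * resuming with the unslept remainder whenever a signal interrupts the call.
 * Returns 0 once the full interval has elapsed, -1 on any other error. */
static int sleep_full(time_t sec, long nsec)
{
	struct timespec delay = { .tv_sec = sec, .tv_nsec = nsec };

	assert(sec >= 0); /* a negative interval would be a caller bug */

	while (nanosleep(&delay, &delay) != 0) {
		if (errno != EINTR) {
			return -1; /* e.g. EINVAL: retrying would never succeed */
		}
		/* On EINTR, nanosleep stored the remaining time back into delay. */
	}

	return 0;
}
```

Passing the same struct as both arguments is what makes the loop resume rather than restart the sleep.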
https://gerrit.asterisk.org/c/asterisk/+/17955/comment/82cb0bbd_97b0e717
PS5, Line 317: delay.tv_sec = 1;
: delay.tv_sec = 0;
: }
Log a notice or something on each loop so the user is aware of what's going on. Otherwise Asterisk may appear "hung".
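A rough standalone sketch of what that could look like, assuming a bounded polling loop like the one in the patch. Everything named here (wait_for_drain, count_fn, countdown) is hypothetical; in taskprocessor.c the count would presumably come from ao2_container_count() on tps_singletons, and the message would go through ast_log(LOG_NOTICE, ...) instead of stderr.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical callback type: returns how many objects are still in the container. */
typedef int (*count_fn)(void *ctx);

/* Poll until the count reaches zero or max_tries polls have been made.
 * Returns the final count, so 0 means everything finished in time. */
static int wait_for_drain(count_fn count, void *ctx, int max_tries, long interval_ms)
{
	int tries;

	assert(count != NULL && max_tries >= 0 && interval_ms >= 0);

	for (tries = 1; tries <= max_tries; tries++) {
		struct timespec delay;
		int remaining = count(ctx);

		if (remaining == 0) {
			return 0;
		}

		/* Progress message on every pass so the process does not look hung. */
		fprintf(stderr, "NOTICE: waiting for %d taskprocessor(s) to finish (%d of %d)\n",
			remaining, tries, max_tries);

		delay.tv_sec = interval_ms / 1000;
		delay.tv_nsec = (interval_ms % 1000) * 1000000L;
		while (nanosleep(&delay, &delay) != 0 && errno == EINTR) {
			/* Interrupted: delay now holds the remainder, so sleep again. */
		}
	}

	return count(ctx);
}

/* Demo stand-in for a real container: reports *ctx, then pretends one object finished. */
static int countdown(void *ctx)
{
	int *left = ctx;
	int now = *left;

	if (*left > 0) {
		(*left)--;
	}
	return now;
}
```

With the values from the patch this would be called with max_tries set to AST_TASKPROCESSOR_SHUTDOWN_MAX_WAIT and a 1000 ms interval; a nonzero return is the case where the existing LOG_ERROR message applies.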
https://gerrit.asterisk.org/c/asterisk/+/17955/comment/d4197e6a_1822e695
PS5, Line 325: ast_log(LOG_ERROR,
: "taskprocessor shutdown while tasks still runing, assertion may occur!\n");
Think it would make sense here to also log the names of any taskprocessors still in the container. It might make it easier for someone to investigate later?
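A small standalone sketch of the formatting half of that idea, assuming the names have already been collected. The helper join_names is hypothetical. In Asterisk the names would presumably be gathered by iterating tps_singletons (ao2_iterator_init / ao2_iterator_next) and calling ast_taskprocessor_name() on each entry; that part is not shown because it depends on the surrounding code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join names into buf as "a, b, c" without overflowing.
 * Stops at the first name that does not fit and leaves buf holding only
 * complete names. Returns how many names were written. */
static size_t join_names(const char *const *names, size_t count, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	assert(names != NULL && buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < count; i++) {
		int n = snprintf(buf + used, len - used, "%s%s", i ? ", " : "", names[i]);

		if (n < 0 || (size_t) n >= len - used) {
			buf[used] = '\0'; /* drop the partially written name */
			break;
		}
		used += (size_t) n;
	}

	return i;
}
```

The resulting string could then be appended to the existing error message, so the log shows which taskprocessors were still alive at shutdown.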
--
To view, visit https://gerrit.asterisk.org/c/asterisk/+/17955
Gerrit-Project: asterisk
Gerrit-Branch: 16
Gerrit-Change-Id: Ia932fc003d316389b9c4fd15ad6594458c9727f1
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Bradeen <mbradeen at sangoma.com>
Gerrit-Reviewer: Benjamin Keith Ford <bford at digium.com>
Gerrit-Reviewer: Friendly Automation
Gerrit-Reviewer: Joshua Colp <jcolp at sangoma.com>
Gerrit-Reviewer: Kevin Harwell <kharwell at digium.com>
Gerrit-Attention: Michael Bradeen <mbradeen at sangoma.com>
Gerrit-Attention: Joshua Colp <jcolp at sangoma.com>
Gerrit-Attention: Benjamin Keith Ford <bford at digium.com>
Gerrit-Comment-Date: Mon, 07 Feb 2022 23:52:13 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment