[Asterisk-code-review] chan sip: Don't unload module while TCP/TLS threads are stil... (asterisk[master])
Corey Farrell
asteriskteam at digium.com
Fri Dec 2 19:06:04 CST 2016
Corey Farrell has posted comments on this change. ( https://gerrit.asterisk.org/4536 )
Change subject: chan_sip: Don't unload module while TCP/TLS threads are still running
......................................................................
Patch Set 2:
> I think I'm giving up on this. I needed to touch almost every line
> in unload_module and still couldn't get it right so that module
> unload chan_sip could be re-run after all threads had become
> unblocked and had terminated -- every time I fixed one thing, two
> others seemed to pop up. After all, it's not the most important bug
> to fix and these race conditions should be very rare (e.g. if a
> thread is blocked but doesn't respond to SIGURG, then magically
> becomes unstuck after the unload).
>
> One thing I would like to keep is the second attempt at
> pthread_kill(thread, SIGURG) in the current patch set. That gives
> threads that hadn't reached their blocking syscall yet at the point
> the signal was first sent the opportunity to terminate properly. If
> I add a time limit of, say, 10 seconds to the while loop in line
> 35527, would that be acceptable to merge?
>
> The problems that will remain are the leaking of deadlocked threads
> (unfixable at this level, but it's mainly a memory leak, therefore
> it doesn't matter during shutdown) and the crashing of threads that
> become unblocked after being blocked on a call that cannot return
> EINTR (messy to fix, see above, but probably very unlikely to
> occur; could crash Asterisk during shutdown if the thread becomes
> unstuck before shutdown is completed).
I'm not sure if additional calls to pthread_kill help. If it didn't work the first time why would it work the second time? I'm not a fan of expanding the time limit either, in fact I think the goal should be to shorten the time limit. Module unload should not be taking long periods of time to run - it should just succeed or fail.
Please give me a chance to try patching to allow unload_module to fail gracefully (return -1). I think I should be able to reorder it to safely protect against threads that become unstuck. I will need your assistance testing it as the only place I've ever used TCP based SIP is in the testsuite.
--
To view, visit https://gerrit.asterisk.org/4536
To unsubscribe, visit https://gerrit.asterisk.org/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a8abf9e8f73b57ff7ada61984b8506aea6cbde
Gerrit-PatchSet: 2
Gerrit-Project: asterisk
Gerrit-Branch: master
Gerrit-Owner: Michael Kuron <m.kuron at gmx.de>
Gerrit-Reviewer: Anonymous Coward #1000019
Gerrit-Reviewer: Corey Farrell <git at cfware.com>
Gerrit-Reviewer: Michael Kuron <m.kuron at gmx.de>
Gerrit-HasComments: No
More information about the asterisk-code-review
mailing list