[asterisk-dev] [Code Review] 2970: Don't call dlclose in a while loop
David Lee
reviewboard at asterisk.org
Tue Oct 29 14:31:17 CDT 2013
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/2970/#review10028
-----------------------------------------------------------
Ship it!
I've done basically the same thing in the 12 branch, but with less error checking.
Feel free to replace my code with yours when you merge into 12 and trunk.
- David Lee
On Oct. 29, 2013, 12:42 p.m., Matt Jordan wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/2970/
> -----------------------------------------------------------
>
> (Updated Oct. 29, 2013, 12:42 p.m.)
>
>
> Review request for Asterisk Developers and David Lee.
>
>
> Repository: Asterisk
>
>
> Description
> -------
>
> For a while now, we've noticed continuous integration builds hanging on CentOS 6 64-bit build agents. After resolving a number of problems with symbols, strange locks, and other shenanigans, the problem has persisted. In all cases, gdb shows the Asterisk process stuck in loader.c in one of the unbounded while loops that call dlclose repeatedly until it succeeds:
>
> Thread 1 (Thread 0xb77ae730 (LWP 20263)):
> #0 0x009b6be1 in _dl_catch_error () from /lib/ld-linux.so.2
> #1 0x00b6803c in _dlerror_run () from /lib/libdl.so.2
> #2 0x00b67d0a in dlclose () from /lib/libdl.so.2
> #3 0x082305fa in load_dynamic_module (resource_in=0xa306e40 "res_snmp.so", global_symbols_only=0, resource_heap=0xa2bd1d0) at loader.c:474
> #4 0x082332ba in load_resource (resource_name=0xa306e40 "res_snmp.so", global_symbols_only=0, resource_heap=0xa2bd1d0, required=0) at loader.c:899
> #5 0x0823402a in load_resource_list (load_order=0xbfb41f20, global_symbols=0, mod_count=0xbfb41f18) at loader.c:1022
> #6 0x082351fc in load_modules (preload_only=0) at loader.c:1200
> #7 0x080b4a69 in main (argc=8, argv=0xbfb43454) at asterisk.c:4239
>
> The documentation of dlclose states that it returns 0 on success and a nonzero value on error. It does not state that repeatedly calling it will eventually clear those errors. Most likely, the repeated calls to dlclose were meant to force a close by exhausting the references on the library; however, that will never succeed if:
> (a) There is some fundamental error at work in the loaded library that precludes unloading it
> (b) Some other loaded module is referencing a symbol in the currently loaded module
>
> This results in Asterisk sitting forever. Waiting for Godot, as it were.
>
> Since we have matching pairs of dlopen/dlclose, this patch opts to call dlclose only once and to log an ERROR if dlclose does not return success. If nothing else, this might help determine why things are not closing successfully on the CentOS 6 64-bit build agent.
>
>
> Diffs
> -----
>
> /branches/1.8/main/loader.c 402149
>
> Diff: https://reviewboard.asterisk.org/r/2970/diff/
>
>
> Testing
> -------
>
>
> File Attachments
> ----------------
>
>
> https://reviewboard.asterisk.org/media/uploaded/files/2013/10/29/dlclose_loop_error_1.txt
>
>
> Thanks,
>
> Matt Jordan
>
>
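For reference, the single-call behavior described in the request above could look roughly like the sketch below. This is not the code from the diff: close_module_once is a hypothetical helper, and fprintf to stderr stands in for whatever logging the real loader.c uses. It only illustrates calling dlclose once, checking for a nonzero return, and reporting dlerror() instead of retrying. On older glibc versions it may need to be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from loader.c: close a dlopen() handle
 * exactly once and report a failure instead of retrying.
 * Returns 0 on success (or if there is nothing to close), -1 on failure.
 */
int close_module_once(void *handle, const char *name)
{
	const char *err;

	assert(name != NULL);

	if (!handle) {
		return 0; /* nothing to close */
	}

	dlerror(); /* discard any stale error message */
	if (dlclose(handle) == 0) {
		return 0;
	}

	/* dlclose() failed: log once and give up rather than looping. */
	err = dlerror();
	fprintf(stderr, "ERROR: failed to unload '%s': %s\n",
		name, err ? err : "unknown error");
	return -1;
}
```

A caller would pair each successful dlopen() with one call to this helper and treat a -1 return as something to report, not as a reason to call dlclose again.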