[asterisk-dev] [Code Review] 4450: locatime.c file descriptor leak on kqueue(2) systems
Ed Hynan
reviewboard at asterisk.org
Tue Mar 3 09:22:20 CST 2015
> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 395-403
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line395>
> >
> > Using a global static struct without locking is not thread safe. More on this below.
This should be guarded by the zonelist lock, like the list itsel and all struct state. It is only touched in kqueue_daemon(), where in the next patch all handling of sp and psx_sp is with the lock, and in add_notify() to which all code paths should be entered only with the lock held. More below.
> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 287-302
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line287>
> >
> > Having the various blocks of SP_INIT_EX/SP_FREE_EX defined around the code is actually a bit confusing, particularly since the structure they initialize/destroy already has various #defines for the various system options it has to support. How about having this defined immediately after struct state, and handling the "flavors" there:
> >
> > struct state {
> > ...
> > };
> >
> > #if defined(HAVE_INOTIFY)
> > #define SP_INIT_EX ...
> > #define SP_FREE_EX ...
> > #elif defined(HAVE_KQUEUE)
> > #define SP_INIT_EX ...
> > #define SP_FREE_EX ...
> > #else
> > #define SP_INIT_EX ...
> > #define SP_FREE_EX ...
> > #endif
> >
> > That way it's clear why this is getting defined multiple times, and the declaration of the macro immediately following the definition of the structure makes it easy to see why the various variants exist.
Done. Also, names changed to SP_HEAP_{INIT,FREE}, and an additional macro SP_STACK_INIT for one instance of a temporary struct state as an automatic -- a bug I hadn't spotted before.
> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, line 408
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line408>
> >
> > If you move kqueue_daemon_freestate(struct state *sp) above kqueue_daemon, then you won't need this forward declaration.
I've removed the prototype; also, references to the proc are now only in the SP_HEAP_FREE macro which is only expanded below the definition.
> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 482-484
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line482>
> >
> > I'm not sure why we are printing out to stderr here, but generally Asterisk does not do that unless (a) the logger couldn't be created and (b) things have gone horribly wrong. That doesn't seem like that would necessarily be the case here.
>
> Ed Hynan wrote:
> I agree, no need for a message there. Re. stderr, it's used several other places in original code, so it seemed preferred in this context.
BTW, don't the ast log functions use time formatting? Are they suitable within localtime.c?
> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 765-768
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line765>
> >
> > Generally, we tend to prefer allocations to be written as:
> >
> > p = ast_calloc(1, sizeof(*p));
> > if (!p) {
> > /* Handle memory error */
> > }
> >
> > It generally prevents paranthesis errors, and newlines are relatively inexpensive :-)
Done.
> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 775-778
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line775>
> >
> > Since ast_free is NULL tolerant, I would go ahead and make SP_FREE_EX NULL tolerant as well. That makes this:
> >
> > static void sstate_free(struct state *p)
> > {
> > SP_FREE_EX(p);
> > ast_free(p);
> > }
Done.
> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 1609-1610
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line1609>
> >
> > This shouldn't be necessary. Called in a loop, AST_LIST_REMOVE_HEAD will:
> >
> > (1) Remove items until (cur = head->first) == NULL. That implies head->first is NULL when complete.
> > (2) Will set head->last = NULL when the (cur == head->last), which should happen on the last loop iteration.
> >
> > Since AST_LIST_HEAD_INIT_NOLOCK simply sets both of those to NULL, it is redundant with the final loop state.
Removed. Had been added with a poor picture of the cause of corruption, and seemed to do the trick, but only due to coincidence with other changes. Works without it.
> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 566-587
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line566>
> >
> > This is not actually thread safe. You cannot guarantee that two threads, at the same time, won't attempt to test and allocate the structure and assign it to psx_sp.
> >
> > What's more, the initialization of localtime happens very early in the Asterisk startup, and has some very interesting race conditions with forking/threading. I would definitely protect initialization of the psx_sp pointer with a lock.
>
> Ed Hynan wrote:
> Good eyes. I proceeded thinking calls to tzload() (which calls aadd_notify()) were guarded by AST_LIST_LOCK(&zonelist) but another look now shows a place where tzload()(and tzparse() and gmtload()) are called between an unlock and another lock (ast_tzset() line 1451 unpatched). Yet other places make the calls while the lock is held.
>
> This is an extra complication.
There was one 'hole' where code paths to add_notify() were entered without the zonelist lock: ast_tzset() was unlocking before tzparse/tzload/gmtload and re-locking after. That is very likely the source of corruption that had been causing error returns leading to repeated tzload("posixrules").
In the revised patch, the lock is held throughout ast_tzset(). I've studied the code, and checked with GNU cflow, to be sure paths to add_notify() are protected. Also, if you try the revised test program (against 11.7.0 source), have a look at the logs it makes. The test really hammers localtime.c with threads. Build with "make hack", run ./run_test.sh, and after a long run, try grep -E '(BAD|TZSET)' /tmp/testlocaltimed.log (or view logs from menu) -- there should be no "BAD" but *many* "TZSET" showing how much the lock is needed.
In addition to ensuring the relevant code paths have the lock, the use of psx_sp within add_notify() is now guarded with AST_LIST_TRYLOCK(&zonelist); which should be redundant, but better safe than sorry. (It was only yesterday I noticed AST mutexes are set RECURSIVE, and that misunderstanding made me check the code even more, which is not a bad thing.)
- Ed
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/4450/#review14553
-----------------------------------------------------------
On March 3, 2015, 2:47 p.m., Ed Hynan wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/4450/
> -----------------------------------------------------------
>
> (Updated March 3, 2015, 2:47 p.m.)
>
>
> Review request for Asterisk Developers.
>
>
> Bugs: ASTERISK-24739
> https://issues.asterisk.org/jira/browse/ASTERISK-24739
>
>
> Repository: Asterisk
>
>
> Description
> -------
>
> File descriptor leak on kqueue(2) systems (found on OpenBSD 5.5) on main/stdtime/localtime.c -- TZ data files change watch code within the time functions. (Issue at https://issues.asterisk.org/jira/browse/ASTERISK-24739.)
>
>
> Diffs
> -----
>
> tags/11.7.0/main/stdtime/localtime.c 432443
>
> Diff: https://reviewboard.asterisk.org/r/4450/diff/
>
>
> Testing
> -------
>
> Patch initially developed against OpenBSD 5.5 ports package sources, and Asterisk in use. Subsequently developed test program that can run original or patched 11.7.0 code.
>
>
> File Attachments
> ----------------
>
> localtime.c (11.7.0) test program
> https://reviewboard.asterisk.org/media/uploaded/files/2015/02/25/f7a19f2a-4a8a-4aff-b7d6-44a4146c358c__patchtest.tgz
>
>
> Thanks,
>
> Ed Hynan
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20150303/70077cac/attachment-0001.html>
More information about the asterisk-dev
mailing list