[Asterisk-Dev] moh issues

Luigi Rizzo rizzo at icir.org
Wed Jun 15 03:57:36 MST 2005


[background: I noticed strange behaviour of asterisk on FreeBSD 4.11
with the userland pthreads library when using res_musiconhold.]

On Tue, Jun 14, 2005 at 09:47:34AM +0100, Chris Stenton wrote:
> Luigi,
> 
> Try linking with the linux threads port. You are better moving to 
> FreeBSD 5.4 if possible.

Thanks. However, I found where the problem is, and it also affects
FreeBSD 5.x and 6.x with the userland pthread library.
No idea what the status is with other BSD systems.

A description of the problem follows (just posted to current at freebsd.org),
I hope it is useful reading... I certainly learned a lot in debugging
this issue.

In terms of asterisk, the same problem may affect all modules
that do a fork.

	cheers
	luigi

------------------------------------------------

Probably a known issue, but I thought it worthwhile reporting it,
if nothing else for archival purposes.

I think the FreeBSD userland thread library (libc_r) has some bugs in
handling descriptors.  I can reproduce the behaviour on -current
and 4.x, and I believe it applies to 5.x too.  

Following is a description of the problem and some code to replicate it.
The code includes a workaround but it is not particularly nice.

Any better ideas? I am not sure what to do, but perhaps the
only sensible thing to do is to add a note with this workaround
(or better ones, if available) to our pthreads manpage.

--- PROBLEM DESCRIPTION ---

Basically, our libc_r keeps two views of I/O descriptors: the
"external" view is for threads and reflects the modes requested by the
threads (blocking or not, etc.); the "internal" view instead is how
descriptors are actually set in the kernel -- and there they should
always be set as O_NONBLOCK to avoid blocking on a syscall.
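
As a minimal sketch of what "blocking or not" means at the fcntl()
level, the two helpers below query and change O_NONBLOCK on a
descriptor. The names fd_is_nonblocking() and fd_set_nonblocking()
are hypothetical, chosen only for illustration. In a program linked
against the regular C library they act directly on the kernel flags;
under libc_r the wrapped fcntl() would presumably show the per-thread
view instead, which is an assumption and not something verified here.

```c
#include <fcntl.h>

/* Hypothetical helper: returns 1 if fd has O_NONBLOCK set,
 * 0 if it is in blocking mode, -1 if fd is not an open descriptor. */
int fd_is_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL);

    if (fl == -1)
        return -1;
    return (fl & O_NONBLOCK) ? 1 : 0;
}

/* Hypothetical helper: sets (on != 0) or clears (on == 0) O_NONBLOCK
 * while preserving the other status flags.
 * Returns 0 on success, -1 on error. */
int fd_set_nonblocking(int fd, int on)
{
    int fl = fcntl(fd, F_GETFL);

    if (fl == -1)
        return -1;
    fl = on ? (fl | O_NONBLOCK) : (fl & ~O_NONBLOCK);
    return (fcntl(fd, F_SETFL, fl) == -1) ? -1 : 0;
}
```

Calling fd_is_nonblocking() on descriptors 0, 1 and 2 from a program
started by the child is one way to observe which mode was inherited.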

The bug occurs when a process does a fork(), and then either
a close() or an exec() -- a similar thing also occurs with popen().
The relevant source code is in

    /usr/src/lib/libc_r/uthread/uthread_execve.c
    /usr/src/lib/libc_r/uthread/uthread_close.c

Right before the exec(), the internal descriptors are put into
blocking mode if the external ones are blocking, and they are only
reset to O_NONBLOCK after termination of the child (upon SIGCHLD).
The same occurs for close().

Note that close() has hacks to leave pipes alone, but the same
code is not present in the execve() case, where I believe
it would also be necessary. Another thing to note is that there is
some kind of 'fate sharing' among the stdio descriptors (0, 1, 2)
which is not totally clear to me, but seems to require setting
O_NONBLOCK on all 3 to make sure that they are not changed to
blocking mode.

Because descriptors are shared between parent and child, for the
lifetime of the child the descriptors in the parent will be blocking
and the scheduling of threads will be completely broken.
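
The sharing itself can be seen without any threads library. The
sketch below (plain libc, no libc_r; the function name
child_clears_nonblock_in_parent() is made up for illustration) sets
O_NONBLOCK on a pipe in the parent, lets a forked child clear it, and
then checks the flag again in the parent. It only illustrates that
status flags live on the shared open file, not the libc_r bug itself.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the child's clearing of O_NONBLOCK
 * is visible in the parent, 0 if it is not, -1 on error. */
int child_clears_nonblock_in_parent(void)
{
    int fd[2], fl, status, result = -1;
    pid_t p;

    if (pipe(fd) == -1)
        return -1;

    /* Parent: put the read end into non-blocking mode. */
    fl = fcntl(fd[0], F_GETFL);
    if (fl == -1 || fcntl(fd[0], F_SETFL, fl | O_NONBLOCK) == -1)
        goto out;

    p = fork();
    if (p == -1)
        goto out;
    if (p == 0) {
        /* Child: clear O_NONBLOCK on the inherited descriptor. */
        fl = fcntl(fd[0], F_GETFL);
        if (fl == -1 || fcntl(fd[0], F_SETFL, fl & ~O_NONBLOCK) == -1)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the child, then look at the flags again. */
    if (waitpid(p, &status, 0) == p &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        fl = fcntl(fd[0], F_GETFL);
        if (fl != -1)
            result = (fl & O_NONBLOCK) ? 0 : 1;
    }
out:
    close(fd[0]);
    close(fd[1]);
    return result;
}
```

On a POSIX system the function is expected to return 1, because
fork() duplicates the descriptor but not the underlying open file.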

The only fix I have found is to act as follows:

        pipe(fd);       /* create a pipe with the child */
        p = fork();
        if (p == 0) { /* child */
            /* call fcntl() _before_ close() to avoid resetting
             * O_NONBLOCK on the internal descriptors. After that,
             * close the descriptors not needed in the child.
             */  
            for (i=0; i < getdtablesize(); i++) {
                int fl = fcntl(i, F_GETFL); /* -1 if i is not open */
                if (fl != -1 && i != fd[0]) {
                    /* open and must be closed in the child */
                    fcntl(i, F_SETFL, O_NONBLOCK | fl);
                    close(i);
                }
            }
            /* standard stuff (dup2(), exec*(), ...) */
            dup2(fd[0], STDIN_FILENO); /* as an example; fd[0] is the read end */
            execl(....);
        } else { /* parent */
            close(fd[0]);       /* close child end. */
            ...
        }

but of course this is rather unintuitive. On the other hand,
I do not know a better way to address the problem; I am fairly
new to threads programming, so maybe others know better.

I am attaching two minimal programs to demonstrate the bug.

simple.c is a simple program (linked against the regular C library)
that only plays with blocking mode on the descriptors.
	cc -o simple simple.c

thre.c is meant to be linked with libc_r.
	cc -o thre thre.c -lc_r

It does a fork and exec of the other program.
If you call it without arguments, it does not implement the
above workaround, and you see how the 'internal' descriptors
change to blocking mode. If you call it with an argument, it
implements the workaround.

	enjoy
	luigi



