[Asterisk-Dev] Deadlock/mutex conflicts in chan_agent.c
steve at daviesfam.org
steve at daviesfam.org
Sat Jan 1 05:09:33 MST 2005
Hi,
I've started trying to use Agents with CVS-HEAD, and I'm having problems
with mutex deadlocks between the "AgentLogin" side and the
"Queue" side.
All the mutex twiddling in chan_agent.c is complicated, so its hard for me
to tell what is supposed to happen and how best to fix it.
Anyway - here's an explanation of the deadlock:
In __login_exec, there's a lot of locking and unlocking of &agentlock and
the &p->lock for the particular agent_pvt structure.
This code locks &agentlock first and then the agent_pvt structure.
In the middle is the following:
ast_mutex_lock(&agentlock);
ast_mutex_lock(&p->lock);
if (p->lastdisc.tv_sec) {
gettimeofday(&tv, NULL);
if ((tv.tv_sec - p->lastdisc.tv_sec) * 1000 +
(tv.tv_usec - p->lastdisc.tv_usec) / 1000 > p->wrapuptime) {
ast_log(LOG_DEBUG, "Wrapup time expired!\n");
memset(&p->lastdisc, 0, sizeof(p->lastdisc));
if (p->ackcall > 1)
check_beep(p, 0);
else
check_availability(p, 0);
}
}
ast_mutex_unlock(&p->lock);
ast_mutex_unlock(&agentlock);
Meanwhile, on the "other side", in agent_call, the code holds a lock
on the agent_pvt structure from start to finish without first locking the
agentlock:
static int agent_call(struct ast_channel *ast, char *dest, int timeout)
{
struct agent_pvt *p = ast->pvt->pvt;
struct ast_channel *chan;
int res = -1;
ast_mutex_lock(&p->lock);
...
... Code omitted
...
ast_verbose( VERBOSE_PREFIX_3 "agent_call, call to agent '%s' call on '%s'\n", p->agent, p->chan->name);
ast_log( LOG_DEBUG, "Playing beep, lang '%s'\n", p->chan->language);
res = ast_streamfile(p->chan, beep, p->chan->language);
ast_log( LOG_DEBUG, "Played beep, result '%d'\n", res);
if (!res) {
res = ast_waitstream(p->chan, "");
ast_log( LOG_DEBUG, "Waited for stream, result '%d'\n", res);
}
if (!res) {
res = ast_set_read_format(p->chan, ast_best_codec(p->chan->nativeformats));
ast_log( LOG_DEBUG, "Set read format, result '%d'\n", res);
if (res)
ast_log(LOG_WARNING, "Unable to set read format to %s\n", ast_getformatname(ast_best_codec(p->chan->nativeformats)));
} else {
/* Agent hung-up */
p->chan = NULL;
}
if (!res) {
ast_set_write_format(p->chan, ast_best_codec(p->chan->nativeformats));
ast_log( LOG_DEBUG, "Set write format, result '%d'\n", res);
if (res)
ast_log(LOG_WARNING, "Unable to set write format to %s\n", ast_getformatname(ast_best_codec(p->chan->nativeformats)));
}
if( !res )
{
/* Call is immediately up, or might need ack */
if (p->ackcall > 1)
ast_setstate(ast, AST_STATE_RINGING);
else {
ast_setstate(ast, AST_STATE_UP);
if (recordagentcalls)
agent_start_monitoring(ast,0);
p->acknowledged = 1;
}
res = 0;
}
CLEANUP(ast,p);
ast_mutex_unlock(&p->lock);
return res;
}
Now during the execution of agent_call, something calls agent_devicestate
- I suppose during execution of ast_waitstream.
agent_devicestate then tries to lock agentlock....
End result is that the agent_call thread has the agent_pvt struct locked
and is trying to lock agentlock; the __login_exec thread has agentlock
locked and is trying to lock the agent_pvt. Result: deadlock.
Now I've got two possible fixes:
Option 1:
agent_call releases its &p->lock lock before doing the ast_waitstream
etc. Perhaps some code will be necessary to be sure the agent_pvt
doesn't disappear underneath us.
Option 2:
agent_call should lock agentlock up front before &p->lock and hold that
from start to finish just like &p->lock.
I tried both and they both fixed the problem. In which case I
guess the option 2 is safer.
Comments? Otherwise I'll send the patch.
Steve
More information about the asterisk-dev
mailing list