[asterisk-dev] Problems with reregister schedule
torbjorn.abrahamsson at gmail.com
Wed Sep 19 02:46:25 CDT 2018
We have encountered a problem concerning the scheduling of reregisters in
chan_sip. We are using version 13.15.0.
Our problem is that the scheduler sometimes seems to contain more objects
than it should, resulting in more registers being sent than expected. The
problem seems to occur when doing reloads, but not always.
If I do a "sip show registry", I see the expected number of registers, and
if I do a "sip show sched", I see that there are more reregister schedules
than the previously shown number of registers. On a fresh machine these
values are the same, but after a number of reloads they begin to differ.
The registry_list seems to contain the correct number of objects. These rogue
reregisters seem to live a life of their own. This is not a really big
problem in itself, because sending 10 registers instead of 1 only consumes
more network traffic, and the registrar does not really care. But if we
remove a register from Asterisk, the rogue ones will still be there and keep
on registering until the end of the world. The same goes for changing the
extension that a register maps to, which results in two registrations with
different contacts being sent. These cases are problematic. The only way to
stop them seems to be to restart Asterisk.
After looking at the code, we see that on a reload the schedule is canceled
and rebuilt. The problem is that this cancel/rebuild is based on the
registry_list, which does not contain the rogue registers. So we started to
look at possibilities to clear the whole schedule. After a little
investigation we found the ast_sched_clean_by_callback function. So we
implemented this new callback function:
static int my_clean_task(const void *data)
And then we modified cleanup_all_regs, and added the following function call
before calling the ao2_t_callback:
ast_sched_clean_by_callback(sched, sip_reregister, my_clean_task);
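To make the difference between the two cancel strategies concrete, here is a minimal, self-contained toy model in C. It is not Asterisk code: every type and function in it (toy_sched, toy_registry, toy_add, toy_del, toy_clean_by_callback and so on) is hypothetical, and the way the orphaned entry is created (overwriting a stored id without cancelling the old task) is only an assumption about how such entries might arise. It shows that an id-based cancel misses entries nobody holds an id for, while a callback-based clean removes them.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these types or functions are Asterisk APIs. */
#define MAX_TASKS 16

typedef int (*task_cb)(const void *data);

struct task { int id; task_cb cb; const void *data; int active; };
struct toy_sched { struct task tasks[MAX_TASKS]; int next_id; };

/* Stand-in for a registry entry: remembers the id of its pending task. */
struct toy_registry { int expire; /* -1 means nothing scheduled */ };

static int toy_reregister(const void *data) { (void)data; return 0; }
static int toy_other(const void *data) { (void)data; return 0; }

/* Schedule a task in the first free slot; returns its id, or -1 if full. */
static int toy_add(struct toy_sched *s, task_cb cb, const void *data)
{
    assert(cb != NULL);
    for (int i = 0; i < MAX_TASKS; i++) {
        if (!s->tasks[i].active) {
            s->tasks[i] = (struct task){ s->next_id++, cb, data, 1 };
            return s->tasks[i].id;
        }
    }
    return -1;
}

/* Cancel by id: only works if the caller still knows the id. */
static void toy_del(struct toy_sched *s, int id)
{
    for (int i = 0; i < MAX_TASKS; i++)
        if (s->tasks[i].active && s->tasks[i].id == id)
            s->tasks[i].active = 0;
}

/* Count the pending tasks that use a given callback. */
static int toy_count(const struct toy_sched *s, task_cb cb)
{
    int n = 0;
    for (int i = 0; i < MAX_TASKS; i++)
        if (s->tasks[i].active && s->tasks[i].cb == cb)
            n++;
    return n;
}

/* Cancel by callback: removes every matching task, known id or not. */
static int toy_clean_by_callback(struct toy_sched *s, task_cb match)
{
    int removed = 0;
    for (int i = 0; i < MAX_TASKS; i++) {
        if (s->tasks[i].active && s->tasks[i].cb == match) {
            s->tasks[i].active = 0;
            removed++;
        }
    }
    return removed;
}

/* Returns how many reregister tasks survive an id-based "reload" cancel. */
static int toy_leak_demo(struct toy_sched *s, struct toy_registry *reg)
{
    toy_add(s, toy_other, NULL);                   /* unrelated task */
    reg->expire = toy_add(s, toy_reregister, reg);
    /* Assumed bug: the stored id is overwritten without a cancel. */
    reg->expire = toy_add(s, toy_reregister, reg);
    toy_del(s, reg->expire);                       /* cancels only the known id */
    reg->expire = -1;
    return toy_count(s, toy_reregister);
}
```

In this model, calling toy_leak_demo on an empty scheduler leaves one orphaned reregister task behind, and toy_clean_by_callback(&s, toy_reregister) then removes it without touching the unrelated task. The model has no threads and no reference counting, so it says nothing about the locking or object lifetime questions that a real scheduler raises.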
This seemed to solve the problem: "sip show registry" and "sip show
sched" now always showed the same value. The problem now was that Asterisk
segfaulted (signal 11) when doing multiple reloads. So my guess is that we
need to lock something before doing this, but unfortunately I do not see
what lock to use.
So, any pointers on what to do? Is our solution on the right track? Should
this be solved in another way?
Thanks in advance, and best regards,