[asterisk-ss7] chan_ss7 1.4.3 bug with overlapping CIC numbers on combined linksets
Gregory Massel
greg at csurf.co.za
Thu Jan 6 09:11:03 CST 2011
Hello
I seem to have picked up a bug in chan_ss7 (version 1.4.3) and I was
wondering if anyone else can confirm the same experience or assist in
developing a fix.
The problem arises when there are multiple combined linksets with
overlapping CIC numbers. This was supported from chan_ss7 version 1.4 (and
described as "Dutch ISUP" in the NEWS file) through the inclusion of a patch
developed by Robert Verspuy.
Example config:
--------
[linkset-alpha]
loadshare => combined
combined => alpha-bravo
[linkset-bravo]
loadshare => combined
combined => alpha-bravo
[link-l1]
linkset => alpha
channels => 1-15,17-31
schannel => 16
firstcic => 33
[link-l2]
linkset => alpha
channels => 1-31
firstcic => 65
[link-l3]
linkset => bravo
channels => 1-15,17-31
schannel => 16
firstcic => 33
[link-l4]
linkset => bravo
channels => 1-31
firstcic => 65
--------
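To make the overlap concrete, here is a minimal Python sketch (not part of chan_ss7) that expands each link's "channels" and "firstcic" into CIC numbers and reports which CICs appear in more than one linkset. It assumes CIC = firstcic + timeslot - 1, which is inferred from the block commands later in this mail rather than taken from the chan_ss7 source. The names LINKS, link_cics and overlapping_cics are made up for illustration, and the fourth link is called l4 as in the text further down.

```python
from itertools import combinations

# (linkset, firstcic, channels) per link, copied from the example config.
LINKS = {
    "l1": ("alpha", 33, "1-15,17-31"),
    "l2": ("alpha", 65, "1-31"),
    "l3": ("bravo", 33, "1-15,17-31"),
    "l4": ("bravo", 65, "1-31"),
}


def parse_channels(spec):
    """Expand a spec such as '1-15,17-31' into a list of timeslot numbers."""
    slots = []
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        slots.extend(range(int(lo), int(hi or lo) + 1))
    return slots


def link_cics(first_cic, channels):
    """CICs carried by one link. Assumption: CIC = firstcic + timeslot - 1."""
    return {first_cic + ts - 1 for ts in parse_channels(channels)}


def overlapping_cics(links):
    """Map each pair of linksets to the sorted CICs they have in common."""
    per_linkset = {}
    for linkset, first_cic, channels in links.values():
        per_linkset.setdefault(linkset, set()).update(link_cics(first_cic, channels))
    result = {}
    for a, b in combinations(sorted(per_linkset), 2):
        shared = per_linkset[a] & per_linkset[b]
        if shared:
            result[(a, b)] = sorted(shared)
    return result


if __name__ == "__main__":
    for pair, cics in overlapping_cics(LINKS).items():
        print(pair, len(cics), "shared CICs, first:", cics[0], "last:", cics[-1])
```

With the config above, every CIC on linkset alpha also exists on linkset bravo, so anything that looks a circuit up by CIC number alone cannot tell the two apart.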
Now, at first things work well:
"ss7 link status" shows that both link l1/16 and l3/16 are INSERVICE.
"ss7 linestat" shows that all CICs go into Idle status.
Calls flow in both directions perfectly, usually for around four to six
hours, sometimes less, sometimes more, depending on the call volume.
The only unusual messages that I pick up are the following:
NOTICE[1467] l4isup.c: Trying to remove CIC=68 from idle list, but not
found?!?.
So it seems that the patch is effective at matching the CICs in one section,
but perhaps something else in the code also needs to be patched to allow
clean-up of the CICs once used.
Eventually, something exhausts itself and everything falls apart with the
following message flooding repeatedly:
NOTICE[8513]: mtp.c:413 mtp_put: Full MTP receivebuf, event lost, type=15.
At that point, the only solution is to shut down the Asterisk process and
completely restart it. Trying to unload the chan_ss7 module is unsuccessful.
It would appear that the full MTP receivebuf is not actually the problem
itself, but that a memory leak or lack of clean-up is causing chan_ss7 to
become so unresponsive that this message starts logging.
Interestingly, one way of mitigating the problem was to block one set of the
overlapping CICs.
For example, if you block all the CICs on l1 and l2 (linkset alpha) as follows:
ss7 block 33 15 alpha
ss7 block 49 15 alpha
ss7 block 65 31 alpha
then it generally did not log the errors and did not crash. Of course,
this meant running at 50% capacity. It was also essential to block the
correct linkset.
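For anyone wanting to reproduce the mitigation on a different layout, the block commands can be derived from the link definitions. The Python sketch below is only an illustration: it assumes the command form "ss7 block <first CIC> <count> <linkset>" as used above, and the same CIC = firstcic + timeslot - 1 mapping. The helper names are invented for this example.

```python
def parse_channels(spec):
    """Expand a spec such as '1-15,17-31' into a list of timeslot numbers."""
    slots = []
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        slots.extend(range(int(lo), int(hi or lo) + 1))
    return slots


def block_commands(linkset, first_cic, channels):
    """Build one 'ss7 block' command per contiguous run of CICs on a link."""
    # Assumption: CIC = firstcic + timeslot - 1, so the signalling
    # timeslot (not listed in 'channels') leaves a gap in the CIC range.
    cics = sorted(first_cic + ts - 1 for ts in parse_channels(channels))
    runs = []  # each entry is [start_cic, count]
    for cic in cics:
        if runs and cic == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1
        else:
            runs.append([cic, 1])
    return [f"ss7 block {start} {count} {linkset}" for start, count in runs]


if __name__ == "__main__":
    # Links l1 and l2 of linkset alpha, as in the example config.
    for first_cic, channels in ((33, "1-15,17-31"), (65, "1-31")):
        for cmd in block_commands("alpha", first_cic, channels):
            print(cmd)
```

For l1 and l2 this yields the same three commands listed above.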
Eventually, I arranged with the other end to change the CIC numbers for
links l3 and l4 to start at 97 and 129 respectively, so that there was no
overlap with l1 and l2. Once I did this, chan_ss7 was 100% stable again,
even under heavy load.
So it seems that the patch contributed for "Dutch ISUP" (also used in South
Africa) perhaps needs to be extended to avoid causing "NOTICE[1467]
l4isup.c: Trying to remove CIC=68 from idle list, but not found?!?." and the
associated eventual crash.
Unfortunately I'm a bit out of my depth in terms of locating and modifying
the part of the code where the problem is occurring. I hope that this is
sufficient information to identify the bug and fix it.
Kind Regards
Gregory Massel