[asterisk-ss7] chan_ss7 1.4.3 bug with overlapping CIC numbers on combined linksets

Gregory Massel greg at csurf.co.za
Thu Jan 6 12:19:45 CST 2011


Hi Robert

I think the write buffer notice you're getting is normal audio loss due to the jitter buffer and jitter associated with the SIP channel.

It seems that remove_from_idlelist() in l4isup.c may be the issue:

static void remove_from_idlelist(struct ss7_chan *pvt) {
  struct linkset* linkset = pvt->link->linkset;
  struct ss7_chan *prev, *cur;

  cur = linkset->group_linkset->idle_list;
  prev = NULL;
  while(cur != NULL) {
    if(pvt->cic == cur->cic) {
      if(prev == NULL) {
        linkset->group_linkset->idle_list = pvt->next_idle;
      } else {
        prev->next_idle = pvt->next_idle;
      }
      pvt->next_idle = NULL;
      return;
    }
    prev = cur;
    cur = cur->next_idle;
  }
  ast_log(LOG_NOTICE, "Trying to remove CIC=%d from idle list, but not found?!?.\n", pvt->cic);
}


What concerns me is the following line:

if(pvt->cic == cur->cic) {

I get the impression that it should read something like:

if(pvt->cic == cur->cic && linkset->dpc == cur->linkset->dpc) {

In my mind, this should, in theory, ensure that the CIC is matched not just on the CIC number but also on the DPC.

My thinking is that perhaps the current code is causing the wrong channel to be removed from the idle list.
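
To illustrate the concern, here is a minimal standalone sketch of the same unlink logic. It does not use the real chan_ss7 structures: struct demo_chan and the demo_* functions are hypothetical stand-ins, and the DPC is stored directly on the channel purely for brevity. It only shows what a CIC-only match could do when two entries on a shared idle list carry the same CIC, and how an additional DPC comparison would change the result.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in; the real chan_ss7 structs differ. */
struct demo_chan {
  int cic;                      /* circuit identification code */
  int dpc;                      /* point code of the owning linkset (assumed field) */
  struct demo_chan *next_idle;  /* next entry in the shared idle list */
};

/* Unlink pvt from the list at *head. Returns 1 if a match was found, else 0.
   With check_dpc == 0 the match is on CIC only, as in the quoted code. */
static int demo_remove(struct demo_chan **head, struct demo_chan *pvt, int check_dpc) {
  struct demo_chan *prev = NULL, *cur = *head;

  while (cur != NULL) {
    if (pvt->cic == cur->cic && (!check_dpc || pvt->dpc == cur->dpc)) {
      if (prev == NULL)
        *head = pvt->next_idle;
      else
        prev->next_idle = pvt->next_idle;
      pvt->next_idle = NULL;
      return 1;
    }
    prev = cur;
    cur = cur->next_idle;
  }
  return 0;
}

/* Count the entries still reachable from head. */
static int demo_length(const struct demo_chan *head) {
  int n = 0;
  for (; head != NULL; head = head->next_idle)
    n++;
  return n;
}

/* Build a -> b -> c, where a and b share CIC 68 but differ in DPC,
   remove b, and return how many entries remain on the list. */
static int demo_run(int check_dpc) {
  struct demo_chan c = {69, 100, NULL};
  struct demo_chan b = {68, 200, &c};
  struct demo_chan a = {68, 100, &b};
  struct demo_chan *head = &a;

  demo_remove(&head, &b, check_dpc);
  return demo_length(head);
}
```

In this sketch, demo_run(0) leaves only one entry on the list: the loop stops at a (same CIC), but the head is rewritten from b's next pointer, so a silently drops off the list together with b. A later attempt to remove a would then find nothing, which would be consistent with the "not found" notice. demo_run(1) leaves two entries (a and c), which is the intended outcome. Whether the real code fails in exactly this way is an assumption on my part.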

Does this sound logical to you?

Unfortunately this doesn't explain why I'm experiencing the problem but you aren't, although I suspect that the crash may only occur once you try to dial out using a channel that wasn't correctly removed from the idle list. Depending on the hunting policy in use, it may be that you're receiving calls on even CICs and making them on odd CICs as long as there are free CICs, in which case you may not experience the problem if you don't use more than 50% of your capacity. Just a theory...

--Greg

  ----- Original Message ----- 
  From: Robert Verspuy 
  To: Gregory Massel 
  Cc: asterisk-ss7 at lists.digium.com 
  Sent: Thursday, January 06, 2011 5:57 PM
  Subject: Re: chan_ss7 1.4.3 bug with overlapping CIC numbers on combined linksets


  Hi Gregory,

  On 06-01-11 16:11, Gregory Massel wrote:
    Hello 

    I seem to have picked up a bug in chan_ss7 (version 1.4.3) and I was wondering if anyone else can confirm the same experience or assist in developing a fix. 

    The problem arises when there are multiple combined linksets with overlapping CIC numbers. This was supported from chan_ss7 version 1.4 (and described as "Dutch ISUP" in the NEWS file) through the inclusion of a patch developed by Robert Verspuy. 


  I'm running 2 asterisk servers (although still with chan_ss7 1.2.1 with my own patches backported).


    The only unusual messages that I pick up are the following: 
    NOTICE[1467] l4isup.c: Trying to remove CIC=68 from idle list, but not found?!?. 

  I don't see that kind of message in my logfiles,
  But I do see the following:

  [Jan  6 09:52:08] NOTICE[29584] l4isup.c: Got call progress, but call setup not active, CIC=95, state=5?!?

  I don't know if this is related somehow.



    So it seems that the patch is effective at matching the CICs in one section, but perhaps something else in the code also needs to be patched to allow clean-up of the CICs once used.

  This could be.
  I don't know the code very well, and this was my first patch to get everything working on our systems.


    Eventually, something exhausts itself and everything falls apart with the following message flooding repeatedly: 
    NOTICE[8513]: mtp.c:413 mtp_put: Full MTP receivebuf, event lost, type=15. 

  I don't see any log messages like that,
  But I do see:
  [Jan  6 12:53:18] NOTICE[14044] l4isup.c: Write buffer full on CIC=99 (wrote only 160 of 240), audio lost (suppress 13).

  This could also be related.
  We have mainly inbound calls, so it could be a bit of the same issue, but with the call setup the other way around.

  I did look into the buffer problems before, but did not find any cause on the server (not very busy).
  So I assumed this was likely caused by transferring the call from SS7 to SIP.

  With a bit of extra latency through SIP, and a bit too much jitter, I think it's possible that chan_ss7 cannot fill the write buffer,
  or that when the RTP packets from SIP suddenly arrive, the write buffer is filled too much?

  But all my issues seem only temporary.
  After these log messages, the CICs are still usable, and a few minutes later I see calls on the CICs that had problems.
  And we have had no crash of a running system either.

  Both servers have now been running for 6 weeks and 5 hours (since the last maintenance).
  Together they have handled a bit more than 600,000 inbound SS7 and 10,000 outbound SS7 calls.
  In total they have processed around 900,000 calls.

  From last February until September (8 months) the servers had their longest run without any problems.
  The system then had to be restarted for maintenance and to add a few extra SS7 links.

  With kind regards,
  Robert Verspuy


  -- 
  Exa-Omicron
  Patroonsweg 10
  3892 DB Zeewolde
  Tel.: 088-OMICRON (66 427 66)
  http://www.exa-omicron.nl