[asterisk-ss7] KNK SS7-27 - first experiences - part 1
Kaloyan Kovachev
kkovachev at varna.net
Wed Jun 26 14:01:52 CDT 2013
The problem with stock libss7 is that one will never complete the tests
required by telcos in Europe, because it is missing functionality that
the ITU describes in its test procedures.
Without the ISUP timers (the main functionality added by the patches)
it is just not possible, and in some cases the link may not even come up.
It probably works fine in the ANSI world, but not in the ITU world.
One of the difficulties was to keep the code working as before when the
timers are not defined in chan_dahdi.conf. Another is that a freely
available copy of the ANSI standard and its requirements is hard to
find. On top of that, the code base and functionality have changed quite
a lot since 1.6, with sig_ss7 separated from chan_dahdi etc.
I am sure there are bugs and room for improvement in that branch, but
the original version from Domjan has been used by me and many others for
a few years already (that's why I said the bugs in that branch are from
me :) ), and we are stuck with 1.6 because of that. I have tried to get
the changes into Asterisk 11, but my (below average) C skills and
available time did not allow that, and the more time passes, the more
difficult it will be to keep it up to date with the rest of the Asterisk
code.
I hope that with some help from others (testing and patches) this code
will finally find its way into Asterisk, and then we may look at adding
the cluster/routing/STP functionality.
On 2013-06-26 16:33, Marcelo Pacheco wrote:
> Thanks Kaloyan.
> Before this thread, there were no mentions at all of a KNK tree, so I
> thought this was stock libss7.
> I'm using my own patched libss7.
> I processed over one million call setups with ten servers, with many
> difficult setups (third party STPs, STPs 1000 miles away with
> transmission lines that do fail from time to time, connections with
> almost one dozen types of ISUP switches, sharing two links with an STP
> to half a dozen switches), and the issues reported don't happen at all.
> So this looks like a bug in your patch or in Attila's code.
> My patch is for paying customers only (they get the source and could
> release it if they wanted to, but have chosen not to).
> I have made very small changes to the ISUP side of things, but some
> fairly major changes to MTP2, MTP3 and DAHDI mtp2 mode.
> I even implemented very basic STP functionality, and MTP2 over UDP
> signalling (between Asterisk instances only).
> It might be worth trying to look at the diffs from stock to your
> branch.
>
> On 06/26/13 09:05, Kaloyan Kovachev wrote:
> Almost forgot. Please do not post patches (if any) to this list, but
> attach them to the SS7-27 issue instead, with a proper license
> agreement, so they can be included in the Asterisk codebase.
>
> On 2013-06-26 14:57, Kaloyan Kovachev wrote:
> Hi all,
> sorry for joining so late, but I am on holiday (until the end of the
> week) and rarely check my mailbox. Thanks to bad weather I did that
> today :)
>
> To the OP:
> while reading the first posts I thought it was an old problem with the
> REL/RSC loop (persistent on start with ANSI signaling), which was fixed
> in libss7 instead of sig_ss7, but I am not sure whether this is the same
> issue or a similar yet different one. There really is a (remaining)
> problem if we receive the RLC for a previous REL after we have already
> sent RSC. I was thinking of clearing the old status bits after we
> receive RLC, but that will not fix the problem of receiving a double
> RLC, and we can't ignore the first one (or just clear the SENT_REL
> flag), because we may never get a second one. So it would probably be
> better to skip sending a second RSC inside isup_handle_unexpected() if
> the previous one was sent less than T17 (the timer's value in seconds)
> ago. Because the timer is stopped on RLC, this will need another timer,
> or some flag to ignore its expiration and not reset again ... I will
> work on this next week when I am back.
>
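The RSC suppression idea above could be sketched roughly as follows. This
is an illustration under assumptions, not code from the branch: struct
demo_cic, its fields and both function names are hypothetical, timestamps
are plain seconds, and "T17 ago" is read as "less than T17 seconds ago".

```c
#include <stdbool.h>

/* Hypothetical per-CIC state; not the libss7 call structure. */
struct demo_cic {
    bool rsc_sent;     /* true once at least one RSC has gone out */
    long last_rsc_s;   /* time of the last RSC, in seconds */
};

/* True if a new RSC may be sent, i.e. no RSC was sent on this CIC
 * during the last t17_seconds. */
static bool demo_should_send_rsc(const struct demo_cic *c, long now_s,
                                 long t17_seconds)
{
    if (c->rsc_sent && (now_s - c->last_rsc_s) < t17_seconds)
        return false;
    return true;
}

/* Stand-in for the reset path of an "unexpected message" handler.
 * Returns true if an RSC would be sent, false if it was suppressed. */
static bool demo_handle_unexpected(struct demo_cic *c, long now_s,
                                   long t17_seconds)
{
    if (!demo_should_send_rsc(c, now_s, t17_seconds))
        return false;          /* a recent RSC is still outstanding */
    c->rsc_sent = true;
    c->last_rsc_s = now_s;     /* remember when this RSC went out */
    return true;
}
```

Keeping a timestamp separate from the T17 timer itself means the check
still works after the timer has been stopped by an RLC, which is the case
described above.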
> The code in my branch is actually Domjan Attila's version (the patches
> attached to the SS7-27 issue) ported to later Asterisk versions with
> very few additions/modifications, so the muffins are for him, while
> the bugs are from me :)
>
> P.S.
> apologies for top posting - the connection is unstable and I had to
> write the post offline and just copy/paste it
>
> On 2013-06-26 06:42, Pavel Troller wrote:
> Hi!
> So, I'm replying to my own original post, to keep the question and a
> possible answer together without any excessive or unrelated
> information.
> I hope I've found the cause of the problem and that I have solved it. A
> modified libss7 is now online and I'm waiting for the busy hours to see
> whether it will help.
> The problem is that in the isup_rel() function all the important
> got_sent_msg flags are cleared, so the stack "forgets" the preceding
> call state:
> ... isup_rel():
> c->got_sent_msg |= ISUP_SENT_REL;
> c->got_sent_msg &= ~(ISUP_SENT_IAM | ISUP_PENDING_IAM |
> ISUP_CALL_CONNECTED | ISUP_GOT_IAM | ISUP_GOT_CCR | ISUP_SENT_INR);
> ...
> So an incoming MSU that was perfectly legitimate before sending REL
> is now handled as unexpected.
> My solution adds the following code to the isup_receive() function for
> every message that can confuse the stack in this way
> (an example for the ACM message):
> case ISUP_ACM:
> +    if (c->got_sent_msg & ISUP_SENT_REL) {
> +        ss7_message(ss7, "Got unexpected ACM after sending REL on CIC %d PC %d, ignoring ", c->cic, opc);
> +        return 0;
> +    }
>
>      if (!(c->got_sent_msg & ISUP_SENT_IAM)) {
>          ss7_message(ss7, "Got ACM but we didn't send IAM on CIC %d PC %d ", c->cic, opc);
>          return isup_handle_unexpected(ss7, c, opc);
>      }
>
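The effect described above can be reproduced in isolation. The following
is only a minimal sketch, not libss7 code: the DEMO_* flags, struct
demo_call and both functions are hypothetical stand-ins for got_sent_msg,
isup_rel() and the ACM case of isup_receive(), reduced to two flags.

```c
/* Hypothetical flag values; the real ISUP_* flags are defined by libss7. */
enum {
    DEMO_SENT_IAM = 1 << 0,
    DEMO_SENT_REL = 1 << 1
};

/* What the sketch decides to do with an incoming ACM. */
enum demo_action { DEMO_PROCESS, DEMO_IGNORE, DEMO_RESET_CIC };

struct demo_call {
    unsigned int got_sent_msg;
};

/* Mirrors the quoted isup_rel() fragment: record that REL was sent
 * and forget that IAM was sent. */
static void demo_send_rel(struct demo_call *c)
{
    c->got_sent_msg |= DEMO_SENT_REL;
    c->got_sent_msg &= ~(unsigned int)DEMO_SENT_IAM;
}

/* ACM handling with and without the proposed SENT_REL guard. */
static enum demo_action demo_on_acm(const struct demo_call *c, int guard_enabled)
{
    if (guard_enabled && (c->got_sent_msg & DEMO_SENT_REL))
        return DEMO_IGNORE;      /* late ACM after our REL: drop it */
    if (!(c->got_sent_msg & DEMO_SENT_IAM))
        return DEMO_RESET_CIC;   /* looks unexpected: reset the CIC */
    return DEMO_PROCESS;         /* normal answer to our IAM */
}
```

Without the guard, an ACM that arrives after demo_send_rel() falls into
the reset path, because the SENT_IAM bit is already gone. With the guard
it is dropped and the release can complete normally.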
> If my change proves good, I plan to remove the ss7_message() calls to
> limit the stack verbosity, as these situations are relatively frequent
> under heavy load and I think they are more or less logical and normal.
>
> I would be glad for a few words from the KNK branch maintainer(s) on
> whether to create a JIRA issue and put my patch there, or how to
> proceed in general.
>
> With regards,
> Pavel
>
>
>
> Hi!
> I would like to share my experience with the deployment of this
> experimental SS7 branch.
> The first impressions are good; the timers especially seem to work
> well, saving many calls from being frozen.
> However, there are still some strange things, which I would like to
> discuss here, one by one.
> The first one is that the channel sometimes doesn't recognize a message
> (mostly RLC), even when it comes in response to an action initiated by
> the channel itself.
> Typically, the following appears often:
>
> [Jun 24 13:33:41] ERROR[3975]: chan_dahdi.c:14406 dahdi_ss7_error:
> [1] ISUP timer t17 expired on CIC 27 DPC 4097
> [1] Got RLC but we didn't send REL/RSC on CIC 27 PC 4097 reseting the cic
>
> As I understand it, there were some timeouts and the channel now tries
> to recover by sending RSC and starting T17. However, it seems that it
> immediately rejects the RLC that comes back in response to the RSC just
> sent upon expiry of T17. This repeats again and again in the rhythm of
> T17, and the channel is not operational.
> "ss7 show calls" shows the following line for the misbehaving CIC:
> 27 4097 11 IAM IAM
>
> Or, a very similar situation:
> [2] Got SUS but no call on CIC 48 PC 4096 reseting the CIC
> [2] Got RLC but we didn't send REL/RSC on CIC 48 PC 4096 reseting the CIC
>
> The first question is why there was no call when SUS was received. My
> idea is that both parties hung up their phones at the same time and
> that the call was being torn down on the Asterisk side (REL just sent,
> or something like this) when SUS arrived. Maybe the call was marked as
> cleared even before RLC came back? OK, I can understand this. But if
> the CIC was reset as the first message says (i.e. RSC was sent), why is
> the RLC coming back not recognized then?
>
> Or, just now the following appeared:
>
> [1] Got ACM but we didn't send IAM on CIC 10 PC 4097 reseting the cic
> [1] Got RLC but we didn't send REL/RSC on CIC 10 PC 4097 reseting the cic
>
> Again, it's questionable why this happened, but the second line seems
> to indicate that something is broken again.
>
> To explain: the channel is operating on a gateway equipped with 16 E1s,
> current traffic is about 10 CAPS, and there are two linksets to two
> cooperating exchanges. They are EWSDs, which have a very mature and
> stable SS7 implementation, so I'm almost sure that they are not making
> signalling errors.
>
> With regards,
> Pavel
>
> --
> _____________________________________________________________________
> -- Bandwidth and Colocation Provided by http://www.api-digital.com --
>
> asterisk-ss7 mailing list
> To UNSUBSCRIBE or update options visit:
> http://lists.digium.com/mailman/listinfo/asterisk-ss7
>
>
> --
> Atenciosamente,
>
> Marcelo Pacheco
> M2J Comunicações e Informática
> Fixo: (27)2222-8118 / (27)2233-2296
> Vivo: (27)9964-5440
> Claro: (27)9312-5319
> MSN: marcelo at macp.eti.br
> E-mail: marcelo at m2j.com.br
>