[asterisk-ss7] KNK SS7-27 - first experiences - part 1
Pavel Troller
patrol at sinus.cz
Tue Jun 25 04:56:39 CDT 2013
Hello Marcelo,
> Another possibility is you're mixing the whole thing in a single linkset
> where you must use two linksets in the way you explained.
I hope I'm not doing this.
>
> Can you see those errors with just a few test calls ?
>
No. These errors occur only during high-traffic periods.
>
> I found about 20 bugs / structural design flaws in stock libss7 / dahdi
> mtp2 support. With my changes the mtp2/mtp3 layers are far more robust
> than stock libss7.
> Fixed all but a single one, related to knowing when the linkset is up or
> down, and not trying to send isup messages, especially IAM, through a down
> linkset - all sigchans down.
I believe you have fixed most of the bugs in stock libss7, but right now I'm
trying the patched branch, which adds ISUP timers (it was a pain to live
without them), improved diagnostics, more dump commands, etc.
That is why I want to use it.
>
> If there's a bug, use ss7 set debug on linkset X to trace ss7 messages
> and track isup message flow.
The problem is that the dump format is good for manual viewing, but not
for machine processing (grepping for patterns etc.). And running a minute
of ss7 debug on a single linkset creates a really huge file (10+ MB),
which is very hard to read through by hand.
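The one-line error messages, unlike the full debug dump, are at least easy to tally by machine. A minimal sketch in Python, assuming the lines look like the "Got RLC but ..." errors quoted further down in this thread; the function name count_unexpected and the regular expression are illustrative only and are not part of libss7 or Asterisk:

```python
import re
from collections import Counter

# Pattern modelled on the error lines quoted in this thread (assumption:
# other libss7 versions may word these messages differently).
UNEXPECTED = re.compile(
    r"\[(?P<linkset>\d+)\] Got (?P<msg>[A-Z]+) but .*? "
    r"on CIC (?P<cic>\d+) PC (?P<pc>\d+)"
)


def count_unexpected(lines):
    """Count unexpected-message errors per (linkset, message, CIC, PC)."""
    counts = Counter()
    for line in lines:
        m = UNEXPECTED.search(line)
        if m:
            key = (int(m["linkset"]), m["msg"], int(m["cic"]), int(m["pc"]))
            counts[key] += 1
    return counts


# Sample input copied from the messages in this thread.
SAMPLE_LOG = [
    "[Jun 24 13:33:41] ERROR[3975]: chan_dahdi.c:14406 dahdi_ss7_error: "
    "[1] ISUP timer t17 expired on CIC 27 DPC 4097",
    "[1] Got RLC but we didn't send REL/RSC on CIC 27 PC 4097 reseting the cic",
    "[2] Got SUS but no call on CIC 48 PC 4096 reseting the CIC",
    "[2] Got RLC but we didn't send REL/RSC on CIC 48 PC 4096 reseting the CIC",
]

if __name__ == "__main__":
    # Most frequent offenders first.
    for key, n in count_unexpected(SAMPLE_LOG).most_common():
        print(key, n)
```

Grouping by (linkset, message, CIC, PC) makes it visible whether the errors cluster on a few circuits or are spread evenly, which would be one hint for or against the mixed-linkset theory. The timer-expiry line is deliberately not matched by this pattern.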
>
> I used libss7 successfully with telcobridges tmedia, digitro switches,
> ericsson AXE, huawei NGN, Nortel DMS, several STPs, EWSS, Nec NEAX, and
> I'm probably missing a couple switch types.
Your experience is certainly extensive.
> I never ran into SS7 / ISUP bugs of other switches, always libss7, but,
> the nature of the bugs found are nothing like what you're reporting.
> I started testing libss7 with those kinds of switches 5 years ago, so I
> have some mileage to make those statements, especially from reading and
> understanding a large portion of the libss7 / sig_ss7 / chan_dahdi code.
>
> The issue you're describing is caused by Asterisk getting ss7 messages
> that belong to another linkset or sending ss7 messages on the wrong ss7
> link.
> Check for UCIC or CFN ISUP responses.
>
There will be no UCIC messages, because both linksets have identical
CICs, so even if messages were mixed up between the linksets, the CIC
would always exist.
Sometimes I can see CFN, but it is always easily explained by the regular
call context (invalid parameters being sent etc.).
>
>
> you need to define chan_dahdi.conf basically like this:
Here is my config:
signalling=ss7
ss7type=itu
ss7_called_nai=dynamic
ss7_calling_nai=dynamic
ss7_internationalprefix=00
ss7_nationalprefix=
ss7_subscriberprefix=
ss7_unknownprefix=
ss7_explictacm=yes
; ============== ALI 01 ===============
; All settings apply to linkset 1
linkset=1
slc=0
pointcode=8
adjpointcode=4097
defaultdpc=4097
networkindicator=national
; First signalling channel
sigchan=1
mtp3_timer.t21=1
isup_timer.t1 = 15000 ; Wait for RLC
isup_timer.t2 = 180000 ; User SUS received
;isup_timer.t3 = 120000 ; Overload
;isup_timer.t4 = 300000 ; MTP Inaccessible Remote User Timer
isup_timer.t5 = 300000 ; Wait for RLC after initial REL
isup_timer.t6 = 30000 ; Network SUS received
isup_timer.t7 = 30000 ; Last Address Message, waiting for ACM/CON
;isup_timer.t11 = 15000 ; Automatic ACM timer
isup_timer.t12 = 15000 ; BLO -> BLA timer
isup_timer.t13 = 300000 ; Initial BLO -> BLA timer
isup_timer.t14 = 15000 ; UBL -> UBA timer
isup_timer.t15 = 300000 ; Initial UBL -> UBA timer
isup_timer.t16 = 15000 ; RSC timer due to T5 expiry
isup_timer.t17 = 300000 ; Initial RSC -''-
isup_timer.t18 = 15000 ; CGB -> CGBA timer
isup_timer.t19 = 300000 ; Initial CGB -> CGBA timer
isup_timer.t20 = 15000 ; CGU -> CGUA timer
isup_timer.t21 = 300000 ; Initial CGU -> CGUA timer
isup_timer.t22 = 15000 ; CGR -> CGRA timer
isup_timer.t23 = 30000 ; Initial CGR -> CGRA timer
isup_timer.t27 = 240000 ; COT failure
isup_timer.t33 = 15000 ; INR -> INF timer
isup_timer.t35 = 15000 ; Overlap dialling timer
group=1
context=from_ss7
faxdetect=no
; Begin CIC (Circuit Identification Code) count with this number
cicbeginswith=2
; Channels to associate with CICs on this linkset
channel=2-31
cicbeginswith=33
channel=32-62
... for all other spans
; ============== ALI 02 ===============
linkset=2
pointcode=8
adjpointcode=4096
defaultdpc=4096
networkindicator=national
; First signalling channel
sigchan=125
mtp3_timer.t21=1
... the rest the same as for Linkset 1
... of course channel numbers differ in the channel= definitions
So it differs from your suggestion below in the following ways:
- The own pointcode is stated in the linkset sections, but it's the same
in all the linksets.
- Both adjpointcode and defaultdpc are specified in every linkset
definition, and both are set to the same value, to be sure.
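The two points above can be checked mechanically. A rough sketch in Python, assuming a simplified reading of chan_dahdi.conf (plain key=value lines, ';' comments, and every setting after a linkset= line belonging to that linkset). This is not how chan_dahdi itself parses the file, and linkset_summary / check_linksets are hypothetical helper names:

```python
def linkset_summary(conf_text):
    """Collect point codes and signalling channels per linkset."""
    summary = {}
    current = None
    for raw in conf_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # strip ';' comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "linkset":
            current = summary.setdefault(value, {"sigchan": []})
        elif current is None:
            continue  # settings before the first linkset are ignored here
        elif key in ("pointcode", "adjpointcode", "defaultdpc"):
            current[key] = value
        elif key == "sigchan":
            current["sigchan"].append(value)
    return summary


def check_linksets(summary):
    """Return a list of findings; empty if both points hold."""
    findings = []
    own = {ls.get("pointcode") for ls in summary.values()}
    if len(own) > 1:
        findings.append("own pointcode differs between linksets")
    for name, ls in summary.items():
        adj, dpc = ls.get("adjpointcode"), ls.get("defaultdpc")
        if adj != dpc:
            findings.append(f"linkset {name}: adjpointcode {adj} != defaultdpc {dpc}")
    return findings


# Reduced version of the config above (timers and channels left out).
SAMPLE = """
linkset=1
pointcode=8
adjpointcode=4097
defaultdpc=4097
sigchan=1
linkset=2
pointcode=8
adjpointcode=4096
defaultdpc=4096
sigchan=125
"""

if __name__ == "__main__":
    summary = linkset_summary(SAMPLE)
    for name, ls in summary.items():
        print(name, ls)
    print(check_linksets(summary) or "no findings")
```

Note that a differing adjpointcode and defaultdpc is perfectly legitimate when signalling goes through an STP; the check only mirrors the direct-link setup described here.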
Thank you again for your help with my problems!
With regards,
Pavel
>
> ; basic ss7 / isup parameters, usually the same for the whole libss7 setup
> signalling=ss7
> ss7type=itu/ansi
> ss7_called_nai=subscriber/national/international/unknown
> ss7_calling_nai=subscriber/national/international/unknown
> networkindicator=national/international/...
>
> ; Your local pointcode
> pointcode = X
>
> ; Start definition for linkset N
> linkset = N
>
> adjpointcode = STP point code otherwise switch point code
> ; Instantiate a signalling link on channel 16 belonging to linkset N,
> ; with adjacency to adjpointcode
> sigchan = 16
> ; Define more signalling links if needed, with adjpointcode and sigchan
>
> defaultdpc = pointcode for ISUP messages
> cicbeginswith= CIC of the next voice channel defined
> ; Instantiate voice channel on linkset N, talking to PC defaultdpc, CIC
> ; numbering incremented automatically
> channel => dahdi channel range
>
> cicbeginswith= next CIC range, if non contiguous
> channel => dahdi channel range
>
> defaultdpc = another point code belonging to the same linkset (if links
> share signalling to multiple switches, typically links through an STP)
> ;repeat cicbeginswith, channel
>
> ; Starts definition of another linkset
> linkset = M
> ; repeat same sequence as above
>
>
> On 06/25/13 05:13, Pavel Troller wrote:
> > Hello Marcelo,
> >
> >> Per usual, read the fine manual. Wait, there's no manual !
> > You're right :-).
> >
> >> Since you seem to have done your part and actually know some ss7 and
> >> isup, here comes a hint.
> >>
> >> You created two or more linksets where you must have a single one.
> >> libss7 doesn't have the ss7 routing feature.
> > It seems strange to me. Let me try to explain this in more detail.
> > There is 1 (one) Asterisk box.
> > It has 2 (two) "linksets" configured, with 1 (one) signalling link per linkset.
> > Linkset 1 is configured for one DPC and with CICs 1 - 496.
> > Linkset 2 is configured for another (different) DPC and also with CICs 1 - 496.
> > Both the systems connected to this Asterisk box are configured to respond
> > directly to the linkset between them and the Asterisk, so it's sure that
> > a MSU from DPC1 cannot come over LS2 and vice versa.
> > I hope that this extremely simple setup is in the scope of current libss7
> > functionality. Or am I wrong ?
> >
> >> In libss7 the linkset concept is different from the official ss7 linkset.
> >>
> >> All signalling links that carry ISUP traffic for a given set of channels
> >> must be kept on a single linkset, as well as all ISUP channels that go
> >> through those links.
> > I hope that my setup is conformant with this limitation.
> >
> >> It looks like you're getting incoming signalling for ISUP channels that
> >> are on another linkset.
> > It really looks like this, but I still hope it's not the case. Please note that
> > the traffic on the box is rather high, and such an error occurs in roughly one of, say,
> > 10000 call attempts. I think that with a fatal routing problem of the kind
> > you are talking about, it wouldn't be possible to use the system
> > regularly.
> >
> >> I'm sure you didn't find any libss7 bug.
> > Really strong words! I wouldn't say it for any of my programs :-).
> >
> >> I have a highly customized version of libss7/dahdi/asterisk, fixing lots
> >> of issues, but this isn't one of them.
> > Possibly your setup/usage scenario is a bit different ?
> >
> >
> >> Processed over one million call setups, with a very complex setup (6
> >> linksets, 7 links, 6E1 on a single switch, plus another 6E1 on remote
> >> switches using my simple STP solution, sharing the local links over SS7
> >> over UDP - my simpler proprietary alternative to sigtran).
> > These switches (I have two of them, but the second one is still on a regular
> > unpatched SS7 stack) make approx. 3 million call setups per week. My
> > record (without restarting/crashing Asterisk) is about 3 weeks with more than
> > 10 million calls.
> >
> >> If you need commercial support, contact me off list.
> > Thanks for your offer.
> >
> > With regards, Pavel
> >
> >> On 06/24/13 09:02, Pavel Troller wrote:
> >>> Hi!
> >>> I would like to share my experience with deployment of this experimental SS7
> >>> branch.
> >>> The first impressions are good, especially the timers seem to work well,
> >>> saving many calls from being frozen.
> >>> However, there are still some strange things, which I would like to discuss
> >>> here, one by one.
> >>> The first one is that the channel sometimes doesn't recognize a message
> >>> (mostly RLC), even though it comes in response to an action initiated by the channel itself.
> >>> Typically, the following is appearing often:
> >>>
> >>> [Jun 24 13:33:41] ERROR[3975]: chan_dahdi.c:14406 dahdi_ss7_error: [1] ISUP timer t17 expired on CIC 27 DPC 4097
> >>> [1] Got RLC but we didn't send REL/RSC on CIC 27 PC 4097 reseting the cic
> >>>
> >>> As I understand, there were some timeouts and now the channel tries to
> >>> recover by sending RSC and firing T17. However, it seems that it immediately
> >>> rejects RLC, which comes back as a response to the RSC which was just sent
> >>> upon expiry of T17. And this appears again and again in the rhythm of T17,
> >>> and the channel is not operational.
> >>> ss7 show calls shows the following line for the misbehaving CIC:
> >>> 27 4097 11 IAM IAM
> >>>
> >>> Or, a very similar situation:
> >>> [2] Got SUS but no call on CIC 48 PC 4096 reseting the CIC
> >>> [2] Got RLC but we didn't send REL/RSC on CIC 48 PC 4096 reseting the CIC
> >>>
> >>> The first question is why there was no call when SUS was received. My
> >>> idea is that both parties hung up their phones at the same time and
> >>> that the call was being torn down on the Asterisk side (REL just sent
> >>> or something like this) when SUS arrived. Maybe the call was marked as
> >>> cleared even before RLC came back ? OK, I can understand this. But
> >>> if the CIC was reset as the first message says (i.e. RSC was sent), why is the
> >>> RLC coming back not recognized then ?
> >>>
> >>> Or, just now the following appeared:
> >>>
> >>> [1] Got ACM but we didn't send IAM on CIC 10 PC 4097 reseting the cic
> >>> [1] Got RLC but we didn't send REL/RSC on CIC 10 PC 4097 reseting the cic
> >>>
> >>> Again, it's questionable why this happened, but the second line seems
> >>> to indicate some brokenness again.
> >>>
> >>> To explain: The channel is operating on a gateway equipped with 16 E1s
> >>> and current traffic is about 10 CAPS, there are two linksets to two
> >>> cooperating exchanges. They are EWSDs, which have very mature and stable
> >>> SS7, so I'm almost sure that they are not making signalling errors.
> >>>
> >>> With regards,
> >>> Pavel
> >>>
> --
> Atenciosamente,
>
> Marcelo Pacheco
> M2J Comunicações e Informática
> Fixo: (27)2222-8118 / (27)2233-2296
> Vivo: (27)9964-5440
> Claro: (27)9312-5319
> MSN: marcelo at macp.eti.br
> E-mail: marcelo at m2j.com.br