<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#333333">
Kai Militzer wrote:
<blockquote cite="mid44AE2522.5020608@westend.com" type="cite">Hello
everyone,
<br>
<br>
I have a very strange interrupt-related issue with a TE205P, zaptel and
chan_ss7. As I think it originates somewhere in the zaptel driver, I am
cross-posting this to -dev; I hope that's OK.
<br>
<br>
As chan_ss7 does not need a d-channel, the configuration in
/etc/zaptel.conf is as follows:
<br>
<br>
bchan=1-31
<br>
bchan=32-62
<br>
<br>
That's the first difference to a config with zaptel, where dchannels
need to be configured.
<br>
<br>
I came across the issue when I started getting a lot of CRC16 errors in
the MTP2 part of chan_ss7, which eventually caused the complete SS7
signaling to flap every few minutes. Together with these CRC16
errors I got messages that chan_ss7 ran into an "Excessive poll delay"
and that the Zaptel input buffer was full, directly followed by an
empty zaptel output buffer. What is strange is that I have two machines
configured the same, differing only in the DPC and OPC codes in
chan_ss7 and in their CPUs. The machine working without problems runs
on an AMD Duron at 1300 MHz, the one with the CRC errors on a P4 at
3 GHz.
<br>
<br>
My first step to find the source of the problem was to put the card
into a different system, also running a P4, but this time only at
1.8 GHz. That resulted in the same errors; the SS7 part wouldn't even
start on that one.
<br>
<br>
So I started to dig deeper. I made a crosslink cable and connected two
E1 ports with it, started two instances of asterisk with chan_ss7 and
experienced the problem again (which finally proved that there was no
problem with the switch from the telco). So my next step was to start
the two instances with chan_zap instead of chan_ss7, and everything was
fine. No errors of any kind.
<br>
<br>
As I knew that the card worked in my other system with an AMD, I now
changed hardware again, this time putting the card into an AMD Athlon XP
3000+, crosslinked the two E1s again and started two instances of
asterisk running chan_ss7. And voila, no problem. At least it looked
that way at first. I let it run overnight (only asterisk was running,
with no traffic or anything else that might distract the system), and
when I came back this morning, I had CRC errors plus the other error
messages on the screen. So now I suspected the card (which I still do a
bit). To test whether the TE205P works as it should, I made a crossover
plug (pins 1-4 and 2-5) and ran a patlooptest, a loop test with zttool,
a zttest, and also uncommented #define CONFIG_ZAPTEL_WATCHDOG in
zconfig.h. Everything looked fine, with no errors whatsoever. The card
is assigned an interrupt of its own, with nothing else using it.
<br>
<br>
Then finally I came across the behavior that puzzles me. Asterisk was
running with two instances over the crosslink and the console screen
was blanked. So I wanted to press Shift to unblank it, but accidentally
pressed the CapsLock key. When the screen was unblanked and I started
to type, I realized that CapsLock was on. I pressed it again and at
that moment I got a CRC16 error. I thought that was strange and pressed
it twice more, and there it was again: packets from the zaptel driver
to chan_ss7 got lost. The same thing happens when I press ScrollLock or
NumLock. The keyboard runs on IRQ 1 and the TE205P on IRQ 11, so there
shouldn't be any impact on the card when the keyboard uses its
interrupt.
<br>
<br>
As you can see, I am stuck now. Why does this happen, and why only with
chan_ss7? I cannot say whether I can reproduce the errors on my "running
system" (the one without the errors), as it is located elsewhere without
a keyboard connected. Any ideas on how to solve this are greatly
appreciated, as I need to get the system back to work. <br>
</blockquote>
Here is the bottom-line problem in chan_ss7 right now: <br>
<br>
Snippet from man 2 write:<br>
<br>
ERRORS<br>
EAGAIN Non-blocking I/O has been selected using O_NONBLOCK and
the write would block.<br>
EINTR The call was interrupted by a signal before any data was
written.<br>
<br>
As best I can tell, it basically comes down to these write errors not
being handled properly. Make a patch that tests for these errors and
handles them appropriately. When I have more time I'll poke around
in how ast_frame is being passed, as the asterisk-devs think there may
also be a problem there. I can only confirm differences in
implementation among the different channel stacks I've seen.<br>
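<br>
To illustrate what I mean by handling those two errors, here is a rough sketch. It is not taken from chan_ss7; the helper name write_fully and the timeout_ms parameter are made up for the example. The idea is simply to retry on EINTR and to wait for the descriptor to become writable on EAGAIN. Whether waiting is acceptable in the MTP2 path, or whether the frame should be dropped or requeued instead, is something I have not checked.<br>

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of chan_ss7 or zaptel): write exactly
 * len bytes to fd. Retries when the call is interrupted by a signal
 * (EINTR) and waits up to timeout_ms for writability when a
 * non-blocking descriptor would block (EAGAIN/EWOULDBLOCK).
 * Returns 0 on success, -1 on failure with errno set.
 */
static int write_fully(int fd, const unsigned char *buf, size_t len,
                       int timeout_ms)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n > 0) {
            done += (size_t)n;      /* partial write: keep going */
            continue;
        }
        if (n == 0) {
            errno = EIO;            /* no progress and no error code */
            return -1;
        }
        if (errno == EINTR)
            continue;               /* interrupted before any data: retry */

        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            struct pollfd p = { .fd = fd, .events = POLLOUT };
            int r = poll(&p, 1, timeout_ms);

            if (r > 0 || (r < 0 && errno == EINTR))
                continue;           /* writable again, or poll interrupted */
            if (r == 0)
                errno = EAGAIN;     /* timed out while still blocked */
            return -1;
        }
        return -1;                  /* any other error is fatal here */
    }
    return 0;
}
```

A caller would check the return value and log or count the failure rather than silently assuming the whole buffer went out.<br>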
<br>
Cheers,<br>
<br>
C.<br>
</body>
</html>