[asterisk-dev] Link reliability issues on DAHDI

Wed Nov 12 11:52:48 CST 2008

Hi,

We are having what seems to be a weird issue with DAHDI. We are using Digium
boards and DAHDI to monitor and generate signalling of some protocols (ISUP
and ISDN) on E1 lines, without the higher level Asterisk application.

Using a D-channel configured as HDLCFCS for sparse ISDN signalling works
without problems.

When we tried to use a D-channel (configured as either HDLCFCS or HARDHDLC)
for MTP2 (but without DAHDI's mtp2 mode -- in our application we need to
count FISU and LSSU message repetitions), we got a huge flow of EVENT_ABORT
and EVENT_BADFCS, not to mention several ELAST errors on read() and write()
calls -- even though we try to drain the event buffer with a loop of
DAHDI_GETEVENT before each read and write. ISUP signalling suffers badly in
these conditions, and HARDHDLC gives many more errors than HDLCFCS.

To check whether the link was reliable, I wrote a simple application that
sends a repeating pattern of 160 bytes (00 .. 9F) over a clear (audio)
channel and checks it on the receiving side. Link to the source:

http://pastebin.com/f40704fc1
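
For readers who cannot reach the pastebin, here is a minimal sketch of the
kind of check such a test performs. It is not the pastebin code itself: the
names pattern_fill, pattern_check and struct pattern_checker are hypothetical,
and the DAHDI channel I/O is left out, so the functions only operate on
buffers that the caller has already read or is about to write.

```c
#include <stddef.h>

#define PATTERN_LEN 160 /* pattern bytes run 0x00 .. 0x9F, then wrap */

/* Receiver-side state (hypothetical helper, not part of DAHDI). */
struct pattern_checker {
    int synced;             /* 0 until the first byte has been seen */
    unsigned char expected; /* value the next received byte should have */
    unsigned long errors;   /* number of discontinuities detected */
    unsigned long bytes;    /* total bytes examined */
};

/* Fill buf with the next len pattern bytes; *next carries the position
 * in the pattern from one call to the following one. */
static void pattern_fill(unsigned char *buf, size_t len, unsigned char *next)
{
    for (size_t i = 0; i < len; i++) {
        buf[i] = *next;
        *next = (unsigned char)((*next + 1) % PATTERN_LEN);
    }
}

/* Check len received bytes. Every byte that is not the successor of the
 * previous byte counts as one error, and the checker then resynchronizes
 * on the byte actually received. A dropped run of bytes therefore counts
 * once, while a single corrupted byte usually counts twice (once on the
 * bad byte, once on the good byte that follows it). */
static void pattern_check(struct pattern_checker *pc,
                          const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char b = buf[i];

        if (pc->synced && b != pc->expected)
            pc->errors++;

        pc->expected = (unsigned char)((b + 1) % PATTERN_LEN);
        pc->synced = 1;
        pc->bytes++;
    }
}
```

In use, the sender would call pattern_fill() before each write() and the
receiver would call pattern_check() on whatever each read() returned, then
divide the error count by the elapsed time to get errors/second. Note that
a loss of an exact multiple of 160 bytes goes undetected by this scheme.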

The setup is a TE205P board on an x86_64 machine (Core 2 Duo, lots of RAM)
running CentOS 5.2 and DAHDI 2.0.0rc4 (the latest non-SVN release at the
time we started testing). The two E1 ports of the TE205P are connected to
each other with a 1-foot crossover cable. We already tried switching cables
-- it's not the cable.

When the system is idle, we occasionally lose one or a handful of bytes (in
the application above, an error rate of 0.3 to 1.0 errors/second).
Whenever there is any disk activity (run "/bin/yes > /tmp/file.txt" and hit
Ctrl+C after a few seconds), the received pattern is full of errors (> 20
errors/second).

Can anyone point out the reason for this behavior? Is there any obvious
mistake in the pattern testing code? Isn't the board supposed to receive and
buffer all E1 frames regardless of interrupt activity on the PC bus?

Also related: I have tried setting both noburst=1 and noburst=0 in the
wct4xxp module options (/etc/modprobe.d/dahdi), but dmesg always shows the
driver coming up in burst mode:

Nov 12 11:40:44 labcom52 kernel: Found TE2XXP at base address fdeff000,
remapped to ffffc20000064000
Nov 12 11:40:44 labcom52 kernel: TE2XXP version c01a016a, burst ON
Nov 12 11:40:44 labcom52 kernel: Octasic optimized!
(...)
Nov 12 11:40:44 labcom52 kernel: Found a Wildcard: Wildcard TE205P (4th Gen)

Is there any reason for the driver to refuse non-burst mode on this board?

About ELAST errors:

What are the intended semantics of the ELAST error on reads and writes?
On a write, if the buffers are full or the link is busy, isn't the driver
supposed to block the write instead of returning an error immediately?

Thanks in advance,

-- Felipe Bergo
LABCOM Sistemas, Campinas, SP, Brazil