<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000066">
<small><font face="Verdana">Hi team,<br>
<br>
Not long ago a bunch of us were posting reports of a strange phenomenon
where voice quality would pack up completely from time to time,
typically resulting in loud crackling on the line and/or the voice
channel breaking up completely. With our installation it would occur
from time to time, typically when the * server was at its busiest.<br>
<br>
Most of the time this problem would result in all users having to
terminate their calls and re-establish them.<br>
<br>
After a lot of (very frustrating) troubleshooting we have now gone
two weeks without a recurrence of the problem and we are hoping that
we may have finally resolved it altogether. I wanted to post a quick
summary of the steps that we have taken to resolve this issue and what
we think the problem turned out to be. Judging by the number of
responses to my last posts about this issue, a few other people have
been experiencing it, so hopefully our experiences will help.<br>
<br>
The * server in question is based on a single-processor IBM xSeries 205
with a gig of RAM, SCSI 320 HDDs (RAID 1) and Red Hat ES 3. It uses
ISDN (via CAPI and a four port Eicon Diva Pro Server card) and a
mixture of SIP and analogue extensions.<br>
<br>
A TDM400P with four FXS ports supports the four analogue extensions
(all Uniden cordless phones) and the SIP handsets consist of a mixture
of BT102s and SNOM190s.<br>
<br>
Our turning point with this issue came when we bit the bullet and
purchased a support incident from Digium. By this stage we had spent
dozens and dozens of hours trying unsuccessfully to research and
diagnose the problem and still had no accurate idea of what was causing
it. Several people replied to our posts to this list saying that they
were having a very similar issue as well, but no one had a clue what
was causing it.<br>
<br>
Digium support zeroed in on the issue fairly quickly and we got the
*distinct* impression that they have seen this problem many times
before. They instantly got us to look at the output of zttest and we
found that this was (in their words) 'extremely low', with 'best' and
'worst' readings of 99.975586% and 99.963379% respectively. They told
us that we needed to be getting at least 99.98% and recommended that we:<br>
</font></small>
<ol>
<li><small><font face="Verdana">Check that the TDMP is on its own
IRQ (much to our embarrassment our card wasn't at the time, so we had
to play with it a bit to get it to occupy a unique IRQ).</font></small></li>
<li><small><font face="Verdana">Disable hyper threading on the Xeon
CPU.</font></small></li>
<li><small><font face="Verdana">Uninstall our SCSI hardware and
replace it with IDE hardware.</font></small></li>
<li><small><font face="Verdana">Upgrade to the latest stable releases
of Asterisk, Zaptel and Libpri.</font></small></li>
</ol>
<small><font face="Verdana">We made changes 1 and 2 in the above list
and are prepared to make changes 3 and 4 if we find the problem hasn't
gone away. It hasn't happened in over two weeks now (after occurring
many times per day for a while), so we hopefully won't have to throw
out our SCSI hardware. After we made each change (1 and 2 were made
about two weeks apart from each other) we found that the quality
improved, with the incidence of the issue halving after '1' and
disappearing (hopefully for good) after '2'. Incidentally the results
of zttest *did not* noticeably improve after making these changes (it
is still below 99.98%).<br>
<br>
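For anyone wanting to compare their own zttest readings against the
figure Digium gave us, here is a minimal Python sketch. The readings and
the 99.98% threshold are the ones quoted above; the helper name is
invented purely for illustration and is not part of Zaptel.<br>
</font></small>
<pre>
```python
# Threshold quoted by Digium support in this mail (percent).
REQUIRED_ACCURACY = 99.98


def meets_threshold(accuracy_percent, required=REQUIRED_ACCURACY):
    """Return True if a zttest accuracy reading is at or above the required figure.

    Hypothetical helper for illustration; not part of Zaptel or zttest.
    """
    return accuracy_percent >= required


# 'Best' and 'worst' readings reported for our server.
readings = {"best": 99.975586, "worst": 99.963379}

for label, value in readings.items():
    # How far the reading falls below the threshold, in percentage points.
    shortfall = max(0.0, REQUIRED_ACCURACY - value)
    status = "ok" if meets_threshold(value) else "too low"
    print(f"{label}: {value:.6f}% ({status}, short by {shortfall:.6f} points)")
```
</pre>
<small><font face="Verdana">Both of our readings come out as too low
by this check, which matches what Digium told us.<br>
<br>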
Apparently our problem is related to the fact that the TDMP generates
a massive number of interrupt requests and that it becomes extremely
upset if too many of those requests go unserviced. Despite the
fact that a PCI device has to be able to share an IRQ in order to meet
the PCI specification, it appears that having a TDMP sharing an IRQ
with *anything* is a really, really bad idea.<br>
<br>
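One way to check for IRQ sharing is to look at /proc/interrupts, where
devices on the same IRQ appear comma-separated on one line. The sketch
below is a rough Python illustration only. It assumes the older layout
(IRQ number, per-CPU counts, controller type, device names), it assumes
the card shows up under the driver name wctdm, and the sample text is
made up rather than taken from our server.<br>
</font></small>
<pre>
```python
def devices_on_irq_lines(interrupts_text):
    """Map each numbered IRQ to the list of device names on its line."""
    table = {}
    for line in interrupts_text.splitlines():
        irq, sep, rest = line.partition(":")
        irq = irq.strip()
        if not sep or not irq.isdigit():
            continue  # header row, NMI/ERR rows, blank lines
        fields = rest.split()
        while fields and fields[0].isdigit():
            fields.pop(0)  # drop the per-CPU interrupt counts
        # fields[0] is assumed to be the controller type (e.g. XT-PIC);
        # everything after it is the comma-separated device list.
        names = " ".join(fields[1:])
        table[irq] = [n.strip() for n in names.split(",") if n.strip()]
    return table


def sharing_with(interrupts_text, driver="wctdm"):
    """Return the other devices on the driver's IRQ, or None if it is not listed."""
    for names in devices_on_irq_lines(interrupts_text).values():
        if driver in names:
            return [n for n in names if n != driver]
    return None


# Hypothetical sample, not real output from the server described here.
SAMPLE = """\
           CPU0
  0:    1234567          XT-PIC  timer
 10:     987654          XT-PIC  wctdm, eth0
 14:      54321          XT-PIC  ide0
NMI:          0
"""

print(sharing_with(SAMPLE))
```
</pre>
<small><font face="Verdana">On a real server you would pass in the
contents of /proc/interrupts instead of the sample. An empty list means
the card has the IRQ to itself.<br>
<br>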
I haven't been able to get an explanation about why hyper threading is
a bad thing, but apparently high-performance devices such as SCSI
adapters can cause resource contention issues with the TDMP, resource
issues that the TDMP becomes very upset about.<br>
<br>
So hopefully we have seen the back of this problem. I have to say
that I have been pretty disappointed to find out that this issue
appears to be relatively well known by Digium, but seemingly not
publicised in the slightest. We searched for days to find anything
relating to our issue but to no avail. Hopefully the next time someone
has this issue they might find this mail and save themselves some of
the frustration that we had.<br>
<br>
When we challenged Digium's advice about retarding the CPU (i.e.
disabling hyper threading) and slowing I/O (by throwing out our SCSI
RAID controller and replacing it with IDE) they fell strangely silent.
Having given prompt and meaningful responses to our earlier requests,
they suddenly stopped responding at all.<br>
<br>
I think that this issue constitutes a pretty major flaw in the design
of the TDMP and we will strongly avoid putting these cards into any *
servers from now on. This is a real shame, as we as a company really
want to reward Digium for all of their good work by actually buying
their products, but we no longer have any faith in the design and
suitability for production use of this product.<br>
<br>
Maybe it's time for Digium to think about following the lead of Red
Hat, Compiere and other successful Open Source software vendors and
release a commercial version of * - one that they can charge licensing and
support fees for. Surely this would give them the financial resources
required to finally take * (and their hardware products) to the level
where they collectively provide levels of performance, reliability and
support provided by traditional PABX products and vendors.<br>
<br>
I for one would be more than happy to pay for such outstanding
software... just as long as it remains Open Source, of course.<br>
<br>
</font></small>
<pre class="moz-signature" cols="72">FFF Managed Technology Ltd
60 Cook St
P.O. 6368 Wellesley St
Auckland
t +64 9 356 2911
f +64 9 358 9070
m +64 21 415 297
w <a class="moz-txt-link-abbreviated" href="http://www.fff.co.nz">www.fff.co.nz</a></pre>
</body>
</html>