[Asterisk-Users] PCI Problems

Rich Adamson radamson at routers.com
Fri May 26 08:15:59 MST 2006


Andrew Kohlsmith wrote:
> On Thursday 25 May 2006 16:11, Sean Cook wrote:
>> What could be the other causes?  I have exhausted everything I know how
>> to do.  PCI sharing explains it (whether or not it is in fact the
>> problem).  This card shares the BIOS assigned interrupt with the network
>> card...
> 
> Audio problems can arise for a variety of reasons.  Causes include (but are 
> not limited to):
> - IRQ sharing with another device with a shitty driver or poor hardware
> - Poor/inconsistent PCI bus behaviour and timing
> - overloaded CPU or poor kernel parameters which cause timing problems
> - shitty hardware or drivers which can lock out IRQs for a long time
> - buggy drivers for the TDM or ethernet hardware
> - bad PCI tuning with setpci or kernel parameters, latency timers especially
> - other hardware (PCI bus controller, north or south bridge) issues
> - faulty hardware
> - poor cabling (either TDM side or ethernet side)
> 
> IRQ sharing is often blamed for audio problems but the fact of the matter is 
> that IRQ sharing is *NOT* an issue if the hardware that is sharing the IRQ 
> (and the drivers for that hardware) plays nicely and reacts to the IRQ 
> quickly.  PCI is DESIGNED to share IRQs.  The trouble comes when vendors take 
> old ISA hardware and port it to PCI, and/or fail to ensure both that the 
> hardware shares IRQs properly and that its driver checks whether its hardware 
> caused the IRQ and reacts to IRQs quickly.
> 
> There is NOTHING inherently wrong with sharing IRQs.  The IRQ handler needs to 
> check its hardware to see whether that hardware generated the IRQ, and 
> get the hell out if not.  A lot of (poor) drivers do NOT do this.  The driver 
> either assumes that the IRQ MUST have been generated by the hardware (which 
> can cause a host of weird problems), or the check takes so long that it 
> causes trouble for the card that DID generate the IRQ.
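
The "check and get out" rule in the paragraph above can be sketched as a
small simulation.  This is an illustrative model only, not real driver code:
the Device class, the two handlers and dispatch() are hypothetical names, and
the return values are merely modelled on the Linux IRQ_NONE / IRQ_HANDLED
convention.

```python
from dataclasses import dataclass

# Return codes modelled on the Linux irqreturn_t convention (assumption:
# this is a simulation, not kernel code).
IRQ_NONE, IRQ_HANDLED = 0, 1


@dataclass
class Device:
    """Hypothetical PCI device with a one-bit interrupt status register."""
    name: str
    irq_pending: bool = False
    serviced: int = 0


def well_behaved_handler(dev: Device) -> int:
    # Check the status register first; leave at once if the IRQ is not ours.
    if not dev.irq_pending:
        return IRQ_NONE
    dev.irq_pending = False  # acknowledge the interrupt
    dev.serviced += 1
    return IRQ_HANDLED


def careless_handler(dev: Device) -> int:
    # Assumes every interrupt on the shared line came from this device.
    dev.irq_pending = False
    dev.serviced += 1
    return IRQ_HANDLED


def dispatch(shared_line):
    """Call every handler registered on the shared line, in order."""
    return [handler(dev) for dev, handler in shared_line]


# Case 1: both drivers check their hardware.  Only the TDM card interrupts.
tdm, nic = Device("tdm"), Device("nic")
tdm.irq_pending = True
good = dispatch([(tdm, well_behaved_handler), (nic, well_behaved_handler)])

# Case 2: the NIC driver never checks, so it "services" an IRQ it never raised.
tdm2, nic2 = Device("tdm"), Device("nic")
tdm2.irq_pending = True
bad = dispatch([(tdm2, well_behaved_handler), (nic2, careless_handler)])
```

In the second case the careless driver does work for an interrupt its card
never raised, which is the "host of weird problems" scenario described above.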
> 
> Digium's hardware is more sensitive to IRQ sharing trouble than other hardware 
> for two very simple reasons.
> 
> The first is that the TDM cards have no real buffering.  If the data is not 
> taken from the register it will quickly be overwritten by the next block of 
> data.  This is analogous to the old 16450 UARTs of yore.  They had a receiver 
> shift register and a 1-byte receiver buffer.  If you didn't get the data out 
> of the buffer before the next byte had shifted in, the new byte would be 
> transferred to the buffer and you'd get an overrun error.  The 16550 replaced 
> the 1-byte receive buffer with a 16-byte FIFO (IIRC) -- you could trigger an 
> IRQ after the FIFO had filled 'x' bytes, and then service the IRQ, retrieving 
> all bytes received in one fell swoop.  And if your IRQ service routine got a 
> little delayed it was no big deal because there was room for another byte or 
> two before you started losing data.  This allowed the IRQ volume on busy 
> serial applications to be far lower (up to 16x lower) than before, which 
> allowed for better system utilization.
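
To put rough numbers on the 16450 vs 16550 comparison above, here is a toy
model.  It is a simplification under stated assumptions: one byte arrives per
tick, the service routine runs every service_every bytes (standing in for IRQ
latency), and the Receiver class and run() helper are hypothetical names, not
any real UART API.

```python
from collections import deque


class Receiver:
    """Toy receive buffer: depth=1 models a 16450, depth=16 a 16550 FIFO."""

    def __init__(self, depth: int):
        self.depth = depth
        self.buf = deque()
        self.overruns = 0

    def byte_arrives(self, value: int) -> None:
        if len(self.buf) >= self.depth:
            self.overruns += 1
            if self.depth == 1:
                # Single-byte buffer: the new byte replaces the unread one.
                self.buf[0] = value
            # Simplifying assumption for a deeper FIFO: stored bytes are
            # kept and the incoming byte is lost.
            return
        self.buf.append(value)

    def service(self) -> list:
        # The IRQ service routine drains everything that is waiting.
        data = list(self.buf)
        self.buf.clear()
        return data


def run(depth: int, n_bytes: int, service_every: int):
    """Return (bytes delivered, overrun count) for one simulated stream."""
    rx = Receiver(depth)
    delivered = []
    for i in range(n_bytes):
        rx.byte_arrives(i)
        if (i + 1) % service_every == 0:
            delivered.extend(rx.service())
    delivered.extend(rx.service())  # final drain at end of stream
    return len(delivered), rx.overruns


no_fifo = run(depth=1, n_bytes=64, service_every=4)
with_fifo = run(depth=16, n_bytes=64, service_every=4)
```

With the service routine four byte-times late, the single-byte buffer loses
three of every four bytes in this model, while the 16-byte FIFO loses none.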
> 
> Digium's hardware is like the old 16450.  There is no FIFO.  This was done 
> consciously, and is not necessarily a bad design -- TDM is VERY sensitive to 
> latencies.  The more delay you have, the worse things like echo become.  
> Bringing TDM data into the PC is already pretty laggy.  Adding more delay 
> with FIFOs isn't necessarily a good thing.  (I would argue that having a 16 
> byte FIFO and triggering the IRQ on the first position would not be a bad 
> thing nor would it introduce any latency, but that's me. I'd change a few 
> things about Digium's hardware, but there is no arguing with their success.)
> 
> So back to the problem at hand: if there is significant delay between the IRQ 
> and the IRQ service, you lose data.  This leads to chirping/clicking and in 
> the case of T1, HDLC/framing errors, dropped links and bouncing D channels 
> (for PRI).
> 
> The second reason is that Digium's drivers do a LOT of work in the IRQ 
> handler.  Essentially they are "poor" PCI neighbours.  In the past (I have 
> not checked this recently) all of the echo cancellation and "heavy lifting" 
> was done right inside the IRQ handler, with interrupts disabled.  This caused 
> their IRQ service time to be lengthy, and until interrupts are enabled again 
> you essentially lock out any other driver from servicing its hardware.   
> (Basically Digium's drivers do to other drivers what Digium's drivers can't 
> stand to have done to them.)  Contrast this with Sangoma's drivers, which get 
> the data into system RAM, set a flag (softIRQ?) and then get the hell out of 
> the IRQ context as quickly as possible.  Then whenever the CPU gets time to 
> do it,  the driver takes the data and processes it OUTSIDE of the IRQ 
> context.  Whether this is better or worse for performance is under debate, 
> but there is absolutely no question that doing it this way makes their 
> products better PCI neighbours.
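
The two styles contrasted above (do all the work in the IRQ handler vs copy
the data and defer the work) can be sketched as follows.  This is not Digium's
or Sangoma's actual code; the class names, the cost units and the process()
stand-in for echo cancellation are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical cost units for time spent inside the IRQ handler.
COPY_COST = 1       # copying one block into system RAM
PROCESS_COST = 20   # "heavy lifting" such as echo cancellation


def process(block: bytes) -> bytes:
    # Stand-in for the real signal processing.
    return bytes(b ^ 0xFF for b in block)


class InlineDriver:
    """Does all the work inside the IRQ handler."""

    def __init__(self):
        self.irq_time = 0
        self.processed = []

    def irq_handler(self, block: bytes) -> None:
        self.irq_time += COPY_COST + PROCESS_COST
        self.processed.append(process(block))


class DeferredDriver:
    """Copies the data, queues it, and leaves the IRQ context quickly."""

    def __init__(self):
        self.irq_time = 0
        self.pending = deque()
        self.processed = []

    def irq_handler(self, block: bytes) -> None:
        self.irq_time += COPY_COST
        self.pending.append(bytes(block))

    def bottom_half(self) -> None:
        # Runs later, outside IRQ context, whenever the CPU has time.
        while self.pending:
            self.processed.append(process(self.pending.popleft()))


blocks = [bytes([n] * 8) for n in range(8)]
inline, deferred = InlineDriver(), DeferredDriver()
for blk in blocks:
    inline.irq_handler(blk)
    deferred.irq_handler(blk)
deferred.bottom_half()
```

Both drivers produce the same output; the difference is how long each one
holds the IRQ context (168 vs 8 cost units here), which is what determines
how long other devices on the line are kept waiting.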
> 
> This is a rather lengthy post, and I am sure that others will post 
> contradictory or corrective responses, which I welcome.  The gist of the 
> post, however, is that there are far more things that can cause audio 
> problems than simple IRQ sharing.  I had a TDM400 (3 FXS) in a P3 system that 
> shared its IRQ with the LAN card *AND* the disk controller.  This computer 
> was also an NFS server for my media PC in the living room.  Every single call 
> came in over the network card (I have no phone line), and even while watching 
> movies (heavy network and disk use), I had absolutely NO issue with the 
> TDM400.  No chirping, no echo trouble, nothing.  That system had a DAMN good 
> PCI interface, and all the drivers coexisted peacefully.
> 
> By all anecdotal evidence and rules of thumb that system should have had 
> TERRIBLE audio problems.  I was not only sharing the IRQ between 3 devices, 
> but two of the three devices (TDM400 and network card) would ALWAYS be firing 
> IRQs at the same time during a call.  I did not, however, have any trouble 
> whatsoever.

Andrew,

Have you dug into the TDM400 far enough to know whether the common 
complaints are associated with a hardware design issue, TigerJet issue, 
or driver?  (eg, can any of the issues truly be addressed?)

Just curious...

R.



