[Asterisk-Users] FXO/FXS cpu spikes, data loss and ztclock.

Rich Adamson radamson at routers.com
Tue Jun 21 06:21:50 MST 2005


> I have not really tried any other values for N1, M1 or CGM.  I actually
> used the formulas from the Si3035 data sheet to calculate what they
> "should be" for 8 kHz.  There's a lot of math in there, but it looks like
> there may be several ways to arrive at the same output values.  Not sure
> if using a different calculation for the different dividers might give
> better results using the same crystal or not.  This was my first shot at
> it but your idea seems like a good one.

It might be possible to change the values slightly to judge their impact.
I've not done the math, so not sure if changing the values has any real
merit.

> I'm not sure what profiling tools might be useful, but would be delighted
> to hear any suggestions that anyone can contribute.  It really appears
> that things are choking up somewhere in the interrupt handling routines
> and I'm guessing somewhere in the zaptel driver.

I'm not a proficient programmer at all, but some experienced programmers
use various profiling tools to help understand which routines are consuming
cycles. It seems those tools could be used to help isolate the
repetitive cpu spikes.
 
> If the problem turns out to be a timing sync problem due to oversampling a
> sample or so per second, then the best solution may be a hardware one. 

It's my understanding (which could be incorrect) that the clock on the TDM
card is used for two purposes: first to drive the onboard chipset, and
second to generate an interrupt on a recurring basis. That same interrupt
is used to "time" or "sync" other functions within asterisk. At least that
has been the argument behind "do you have a zaptel timing device". Each
of the digium cards seems to use that same architecture, however it also
seems the TDM card is the only card that leaves something on the table.

So, is the missed data resulting from:
 1. pcm data arriving too fast/slow on the card for the pci controller to
    cause an interrupt and transfer the data across the bus reliably?
 2. too much time spent handling the interrupt within asterisk drivers
    causing an interrupt to be missed (or delayed service)?
 3. timing design conflicts between clocking the 3050 (pcm conversion)
    versus interrupt requirements?
 4. potential problems in the pci controller design?

I would have to believe the clock is driving the pcm encoding function
within the 3050 chip, and the design objective is to cause the chip to
encode exactly 8,000 samples per second. Therefore, changing that 
clocking mechanism is likely to generate 7,990 or 8,010 samples (or
some other non-standard rate) that is likely to negatively impact other
asterisk functions (due to the reliance on the interrupts as a timing
source). But, the flip side of that would suggest the existing design
is running at some rate other than 8,000 samples/sec now.
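
To put rough numbers on that, here is a small sketch of the arithmetic.
The 8-sample chunk size is only an assumption about how many samples are
handled per interrupt, and the function names are made up for
illustration; nothing here is taken from the driver source.

```python
NOMINAL_RATE = 8000  # samples per second the card is meant to produce


def drift_ppm(actual_rate, nominal_rate=NOMINAL_RATE):
    """Clock error in parts per million relative to the nominal rate."""
    return (actual_rate - nominal_rate) * 1_000_000 / nominal_rate


def seconds_per_slipped_chunk(actual_rate, chunk_samples=8,
                              nominal_rate=NOMINAL_RATE):
    """Seconds until the surplus (or deficit) adds up to one whole chunk.

    chunk_samples=8 is an assumed value (1 ms of audio at 8 kHz).
    """
    surplus = abs(actual_rate - nominal_rate)
    if surplus == 0:
        return float("inf")  # a perfect clock never slips
    return chunk_samples / surplus


if __name__ == "__main__":
    for rate in (7990, 8000, 8001, 8010):
        print(rate, drift_ppm(rate), seconds_per_slipped_chunk(rate))
```

Under those assumptions, a card running one sample per second fast would
accumulate a full chunk of surplus audio every 8 seconds, so even a very
small clock error shows up as a regular, repeating slip.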

For the TDM card, there is no such thing as syncing its clock to anything
since it's handling incoming analog audio that contains no such info.

> I'm still trying to get a handle on exactly how the overall system timing
> works with the zaptel driver.  It does not seem like even multiple
> (non-t1) cards of the same type in an asterisk system sync their clocks. 
> For example, each seems to bring data into the system according to the
> timing of its own internal oscillator.  

I believe that is correct and was very likely one of the driving forces
in the design of the TDM card (e.g., one interrupt handling four pstn
lines as opposed to multiple x100p cards each with their own interrupt
servicing requirements).

> That's my assessment of the wcfxo
> style cards at least.  The TDM400 seems to derive its clock a little
> differently.  Perhaps somebody could jump in and shed a little light on
> how the hardware clocking works for that card.  It seems that overall the
> basic theory of operation is quite similar - Tiger Jet 320 PCI controller,
> DAA (or SLIC for FXS) etc.  As far as I know, the problems of CPU spikes
> and data loss are not apparent on a properly configured T1 setup.

I don't believe anyone has confirmed the cpu spikes are actually 
responsible for missed frames. At least I won't assume that for now.

The T1 card is different since a properly configured card will sync its
onboard clock with an external source that is considered highly accurate.
When the clock is in sync, there is no such thing as missed pcm frames
on a T1 card. But, I'm sure you've read the various postings from folks
that did not properly define the card sync and those postings generally
relate to audio clicks (and other disturbances) that are essentially the
same apparent issues as a free-wheeling TDM clock.

> I think that any data that we can gain from others running vmstat 1
> (looking for cpu spikes) in combination with running ztclock would be
> useful.  Especially on differing hardware including the various T1 cards. 
> ztclock is looking pretty good to me on my hardware, but as with most
> polling type tests I would anticipate there must be some margin of error. 
> I don't have a handle on that yet.

It's my opinion (which also could be incorrect) that running vmstat and
ztclock simply points out a symptom, and they are probably not the right
tools to identify the root cause. Note the same symptom exists on a
600 MHz mobo as compared with a 3.0 GHz mobo, therefore the root cause
appears to be more related to something happening after xxxx frames.
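
One way to test the "after xxxx frames" idea would be to log a per-second
cpu figure on each machine (for example, 100 minus the idle column from
vmstat 1) and compare the spacing of the spikes. A rough sketch follows;
the threshold, the function names and the sample readings are all made up
for illustration and are not measurements from any real system.

```python
from collections import Counter


def spike_intervals(busy_percent, threshold=50.0):
    """Return the gaps (in samples) between the starts of successive spikes."""
    starts = []
    in_spike = False
    for i, value in enumerate(busy_percent):
        if value >= threshold and not in_spike:
            starts.append(i)  # first reading of a new spike
        in_spike = value >= threshold
    return [b - a for a, b in zip(starts, starts[1:])]


def most_common_interval(intervals):
    """Most frequent gap, or None if fewer than two spikes were seen."""
    if not intervals:
        return None
    return Counter(intervals).most_common(1)[0][0]


if __name__ == "__main__":
    # Invented readings, one per second, with a spike every 5 seconds.
    readings = [2, 3, 90, 2, 1, 3, 2, 85, 88, 2, 1, 2, 95, 3]
    gaps = spike_intervals(readings)
    print(gaps, most_common_interval(gaps))
```

If the most common gap came out the same on a slow and a fast machine,
multiplying it by 8,000 would give an estimate of the number of frames
between spikes, which would point at the card or driver rather than at
cpu speed.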

Steve Underwood has made the comment that spandsp "did" work at one time
on the TDM card, and spandsp is probably "the" most critical software
that is dependent on absolutely no missed frames. If that is correct,
that implies the problem is most likely associated with the zaptel
drivers as those same original TDM cards don't work now (and obviously
nothing has changed on those installed cards).

I'd have to guess that digium outsourced the design of the various
cards they are selling and that's one of the reasons why there hasn't
been anyone that has stepped up to the plate to help resolve the
issues. (I'm about 95% certain of that statement based on certain
comments that have been made off list.) Therefore, anyone trying to
debug or reverse engineer the drivers/code doesn't have access to
anyone with second-level technical knowledge of its operation or
design expectations.

Also, the TDM card has gone through several hardware revisions,
indicating the original design had multiple shortcomings.
Considering that card was one of the first designs to be marketed
by digium, and considering that T1 interface cards are much easier
to design (since there is no analog component involved), it's highly
probable the drivers do not exactly match the card's design objectives
(left-hand right-hand scenario).

> Also, I'd be interested to know if anyone is moving data successfully
> across FXO/FXS ports where (for example) an Adtran Channel bank is
> providing timing for the entire system.  Anybody heard of anything like
> that?

No knowledge on that one.

Rich




