[Asterisk-Users] FXO/FXS cpu spikes, data loss and ztclock.
qrss
qrss at keitz.org
Mon Jun 20 14:36:53 MST 2005
I have not really tried any other values for N1, M1 or CGM. I actually
used the formulas from the Si3035 data sheet to calculate what they
"should be" for 8 kHz. There's a lot of math in there, but it looks like
there may be several ways to arrive at the same output values. I'm not
sure whether a different combination of divider values might give
better results with the same crystal. This was my first shot at
it, but your idea seems like a good one.
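As a rough cross-check of the divider values quoted below (N1=25, M1=72,
CGM=1 with the 20.000 MHz crystal), here is a minimal sketch. It ASSUMES
the first PLL stage follows F = MCLK * M1 / N1, with an extra 16/25 factor
applied when CGM=1; that relation and the function name are illustrative
assumptions and should be checked against the Si3035 data sheet. The
remaining stages of the divider chain are not modelled here.

```python
from fractions import Fraction

def pll1_output_hz(mclk_hz, n1, m1, cgm):
    """First-stage PLL output under the ASSUMED relation
    F = MCLK * M1 / N1, scaled by 16/25 when CGM is set."""
    f = Fraction(mclk_hz) * m1 / n1
    if cgm:
        f *= Fraction(16, 25)  # assumed effect of the CGM bit
    return f

# Values from the post: 20.000 MHz crystal, N1=25, M1=72, CGM=1.
modified = pll1_output_hz(20_000_000, n1=25, m1=72, cgm=1)
print(float(modified))  # 36864000.0
```

Under that assumption the modified card lands on exactly 36.864 MHz, which
is also twice the stock 18.432 MHz crystal frequency. Using exact fractions
makes it easy to try other N1/M1 pairs and see whether they hit the same
target without rounding error.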
I'm not sure what profiling tools might be useful, but would be delighted
to hear any suggestions that anyone can contribute. It really appears
that things are choking up somewhere in the interrupt handling routines
and I'm guessing somewhere in the zaptel driver.
If the problem turns out to be a timing sync problem due to oversampling a
sample or so per second, then the best solution may be a hardware one.
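To put a number on "a sample or so per second", here is a minimal sketch
that works backwards from the ztclock figures quoted below. The formula is
inferred from that output (it reproduces the two "Estimate 8 frame slips"
lines); it is not taken from the ztclock source, and the function name is
hypothetical.

```python
NOMINAL_HZ = 8000   # nominal DAA sample rate
SLIP_SAMPLES = 8    # slip size from ztclock's "Estimate 8 frame slips" line

def slip_estimate(samples, sample_intervals):
    """Return (excess samples per second, seconds per 8-sample slip)."""
    excess = samples - sample_intervals
    nominal_seconds = samples / NOMINAL_HZ
    excess_per_second = excess / nominal_seconds
    seconds_per_slip = nominal_seconds * SLIP_SAMPLES / excess
    return excess_per_second, seconds_per_slip

before = slip_estimate(483328, 483288)  # stock 18.432 MHz crystal
after = slip_estimate(483328, 483296)   # 20.000 MHz crystal, new dividers

print(f"{before[1]:.6f}")  # 12.083200
print(f"{after[1]:.6f}")   # 15.104000
```

By this estimate the card delivers roughly 0.66 extra samples per second
before the modification and roughly 0.53 after it.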
I'm still trying to get a handle on exactly how the overall system timing
works with the zaptel driver. It does not seem like even multiple
(non-T1) cards of the same type in an Asterisk system sync their clocks.
For example, each seems to bring data into the system according to the
timing of its own internal oscillator. That's my assessment of the wcfxo
style cards at least. The TDM400 seems to derive its clock a little
differently. Perhaps somebody could jump in and shed a little light on
how the hardware clocking works for that card. It seems that overall the
basic theory of operation is quite similar - Tiger Jet 320 PCI controller,
DAA (or SLIC for FXS) etc. As far as I know, the problems of CPU spikes
and data loss are not apparent on a properly configured T1 setup.
I think that any data that we can gain from others running vmstat 1
(looking for cpu spikes) in combination with running ztclock would be
useful. Especially on differing hardware including the various T1 cards.
ztclock is looking pretty good to me on my hardware, but as with most
polling type tests I would anticipate there must be some margin of error.
I don't have a handle on that yet.
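For anyone collecting this data, here is a minimal sketch of one way to
pull the spike spacing out of captured vmstat 1 output so it can be set
beside the ztclock estimate. The function name and the threshold of 10 are
arbitrary assumptions; the sample rows are abbreviated from the vmstat
output quoted below.

```python
def spike_intervals(lines, sy_threshold=10):
    """Gaps, in rows (seconds for `vmstat 1`), between rows whose
    'sy' column is at or above sy_threshold."""
    sy_col = None
    spikes = []
    row = 0
    for line in lines:
        fields = line.split()
        if "sy" in fields:
            # Column header row: remember where 'sy' sits.
            sy_col = fields.index("sy")
            continue
        if sy_col is None or len(fields) <= sy_col or not fields[0].isdigit():
            continue  # group header, blank line, or data before any header
        if int(fields[sy_col]) >= sy_threshold:
            spikes.append(row)
        row += 1
    return [b - a for a, b in zip(spikes, spikes[1:])]

header = "r b w swpd free buff cache si so bi bo in cs us sy id"
spike = "0 0 0 0 65108 13224 35340 0 0 0 24 1115 185 0 17 83"
idle = "0 0 0 0 65108 13224 35340 0 0 0 0 1113 180 0 0 100"
sample = [header, spike] + [idle] * 14 + [spike]

print(spike_intervals(sample))  # [15]
```

Looking up the 'sy' column from the header row, rather than hard-coding
its position, keeps the sketch usable with vmstat versions that print
extra columns.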
Also, I'd be interested to know if anyone is moving data successfully
across FXO/FXS ports where (for example) an Adtran Channel bank is
providing timing for the entire system. Anybody heard of anything like
that?
-----Original Message-----
From: Rich Adamson
Sent: Mon, June 20, 2005 5:53 pm
Certainly sounds like you're getting closer to the problem. I thought
about doing something like that, but after spending soooo much time
with the TDM card, decided not to mess with it.
I assume you tried a few other values for N1, M1 and CGM as well?
Do you think any of the profiling tools would be useful to isolate
the asterisk routines causing the spikes?
------------------------
> Digging further into the FXO CPU spike vs. clock issue, I
> removed the 18.432 MHz crystal from an FXO card and replaced
> it with a 20.000 MHz crystal. This of course forced the zaptel
> timing way off, to ~93% accurate according to ztclock. I then
> modified the wcfxo.c driver source code to set the proper PLL divider
> values to return the DAA clock to 8 kHz. I came up with the
> values N1=25, M1=72 and CGM=1, and wrote the corresponding
> values out to registers 7, 8 & 10. This seemed to bring the clock
> back closer to the true 8 kHz spec and in fact seemed to provide
> a slightly better clock value than the original crystal.
>
> Before the mod, I was seeing CPU spikes once every 12 seconds
> while ztclock was predicting "Estimate 8 frame slips every
> 12.083200 seconds."
>
> After the mod, I was seeing them once every 15 seconds while
> ztclock was predicting "Estimate 8 frame slips every
> 15.104000 seconds."
>
> It certainly seems that there is a direct, predictable
> relationship here. I'd appreciate any thoughts that others
> may be able to contribute based upon these results or the
> results of their own testing with ztclock and vmstat 1 on
> any of the FXO/FXS hardware.
>
> Here are my thoughts on this:
>
> I suspect that frame slips are occurring somehow. I have not
> quite figured out how at this point, but it does appear (if ztclock
> is accurate) that the math is pointing in that direction. The
> predictability of the spikes seems too much to be just coincidence.
> Also, assuming ztclock is accurate, it appears that for most FXO/FXS
> hardware, the clock actually runs just a little faster than 8000 Hz.
> If the FXO card was moving data into a zaptel buffer at a rate slightly
> faster than it was being removed, then a buffer overrun condition, aka
> frame slip, would be the inevitable result. I'm thinking that meshing
> against precisely timed VoIP data (or T1) would be one example where
> we could expect something like this to occur. In any event a buffer
> overrun would most certainly result in lost data. I suspect this is
> causing the CPU spikes, and is also the reason why nobody seems to be
> able to reliably use data/fax applications across these types of cards.
>
> As best I can determine, certain channels of the Asterisk PBX seem to
> time independently from the primary clock source. My experience tells
> me that in order for data to pass across a telecom network, every node
> must be in precise timing sync in order to avoid data loss. I would not
> expect Asterisk to be any different. Interestingly, by adding a TDMoE
> connection to a second system and configuring it to time from the
> first, the exact same ztclock and vmstat 1 results were obtained on the
> second system.
>
> Here is some raw data on the results I obtained...
>
>
> *** ztclock results before modification ***
>
> ./ztclock
>
>
> ztclock - clock source accuracy test (3 passes)
>
> Flushing input buffer...
> Flush Complete.
>
> Test is approximately 3 minutes. Please wait...
>
> 483328 samples in 60.410900 sec. (483288 sample intervals) 99.991722%
> 483328 samples in 60.410901 sec. (483288 sample intervals) 99.991722%
> 483328 samples in 60.410899 sec. (483288 sample intervals) 99.991722%
>
> Estimate 8 frame slips every 12.083200 seconds.
>
>
> *** ztclock results after modification ***
>
>
> ./ztclock
>
> ztclock - clock source accuracy test (3 passes)
>
> Flushing input buffer...
> Flush Complete.
>
> Test is approximately 3 minutes. Please wait...
>
> 483328 samples in 60.411915 sec. (483296 sample intervals) 99.993378%
> 483328 samples in 60.411915 sec. (483296 sample intervals) 99.993378%
> 483328 samples in 60.411918 sec. (483296 sample intervals) 99.993378%
>
> Estimate 8 frame slips every 15.104000 seconds.
>
>
>
> Results of vmstat 1
>
>    procs                     memory    swap      io     system          cpu
>  r  b  w  swpd   free   buff  cache  si  so  bi  bo    in   cs  us  sy   id
>  0  0  0     0  65108  13224  35340   0   0   0  24  1115  185   0  17   83
>  1  0  0     0  65108  13224  35340   0   0   0   0  1113  178   0   0  100
>  0  0  0     0  65108  13224  35340   0   0   0   0  1113  180   0   0  100
>  0  0  0     0  65108  13224  35340   0   0   0   0  1113  178   0   0  100
>  0  0  0     0  65108  13224  35340   0   0   0   0  1114  178   0   0  100
>  1  0  0     0  65108  13224  35340   0   0   0  12  1115  192   0   0  100
>  0  0  0     0  65108  13224  35340   0   0   0   0  1113  180   0   0  100
>  0  0  0     0  65108  13224  35340   0   0   0   0  1113  180   0   0  100
>  0  0  0     0  65108  13224  35340   0   0   0   0  1113  180   0   0  100
>  1  0  0     0  65108  13224  35340   0   0   0   0  1113  178   0   0  100
>  0  0  0     0  65104  13228  35340   0   0   0  24  1115  190   0   0  100
>  0  0  0     0  65104  13228  35340   0   0   0   0  1113  183   0   0  100
>  0  0  0     0  65104  13228  35340   0   0   0   0  1113  180   0   0  100
>  1  0  0     0  65104  13228  35340   0   0   0   0  1113  179   0   0  100
>  0  0  0     0  65104  13228  35340   0   0   0   0  1113  180   0   0  100
>  0  0  0     0  65100  13232  35340   0   0   0  36  1118  190   0  16   84
>
---------------End of Original Message-----------------