Hi Steve,<br>

<br>

My replies are inline.<br><br><div><span class="gmail_quote">On 5/19/06, <b class="gmail_sendername">Steve Underwood</b> &lt;<a href="mailto:steveu@coppice.org">steveu@coppice.org</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<a href="mailto:ColinZuo@viatech.com.cn">ColinZuo@viatech.com.cn</a> wrote:<br><br>&gt; Hi,<br>&gt;<br>&gt; The theory is not based on the music, it's based on that given by the<br>&gt; ITU G.711 Appendix I (BTW: the music is converted to 8K/mono/16bit by

<br>&gt; CoolEdit).<br>&gt;<br>What works well for music is very different from what works well for<br>voice. </blockquote><div><br>

Yes, but I don't think the difference is that big, unless you can give me a voice file that proves me wrong.<br>

And again, the reason I prolong it is based on the theory given in G.711 Appendix I, which is said to be<br>

derived from experiments at Bell.<br>

</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">G.711 Appendix 1 and my code fade to silence over 50ms. For music<br>much greater sustain to fill in the gaps works much better. With speech,

<br>that badly affects intelligibility. </blockquote><div><br>

I didn't change this. BTW, G.711 Appendix I fades to silence over 60ms, because it doesn't<br>

attenuate during the first erasure, whereas your code does. Since you can't know whether the waveform is going to<br>

rise or fall, I think it is better to keep the same level for the first erasure.<br>

<br>

////////////////////////////////////////////////////////////////////////////////////////////////<br>

G.711 Appendix I<br>

I.2.4 Synthetic signal generation for first 10 ms<br>

For the first 10 ms of the erasure, the best results are obtained by generating the synthesized signal<br>

from the last pitch period with no attenuation.<br>

/////////////////////////////////////////////////////////////////////////////////////////////////////<br>
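<br>
To make the 50ms versus 60ms point concrete, here is a minimal Python sketch of the two gain schedules as I understand them from this thread: hold the level for the first 10ms and then fade linearly, versus fading linearly from the start. The function names and the 50ms fade length are my own assumptions for illustration. They are not taken from plc.c or from the Appendix I reference code.<br>

```python
def gain_hold_then_fade(t_ms, hold_ms=10.0, fade_ms=50.0):
    """Gain at t_ms into an erasure: no attenuation during the hold,
    then a linear fade that reaches silence at hold_ms + fade_ms."""
    if t_ms < hold_ms:
        return 1.0
    return max(0.0, 1.0 - (t_ms - hold_ms) / fade_ms)


def gain_immediate_fade(t_ms, fade_ms=50.0):
    """Gain at t_ms into an erasure when the linear fade starts at once."""
    return gain_hold_then_fade(t_ms, hold_ms=0.0, fade_ms=fade_ms)
```

With these defaults the first schedule reaches silence at 60ms and the second at 50ms, which is the difference discussed above.<br>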

 </div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I used the Appendix 1 approach<br>without experimenting. I suspect something other than linear attenuation

<br>would behave better.</blockquote><div><br>

From my experiments, I think that as long as the algorithm aims at generic linear concealment,<br>

you probably can't find one much better than this, unless you analyse some voice parameters from<br>

the previous samples.<br>

</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; And the current plc algorithm is similar to the G.711 Appendix I except:<br>

&gt; 1. The pitch detection algorithm : G.711 Appendix I uses cross<br>&gt; correlation, but Asterisk uses AMDF which is simpler and also performs<br>&gt; well<br>&gt;<br>Correct.<br><br>&gt; 2. The OLA window: G.711 update the OLA window length when burst loss

<br>&gt; occurs, but Asterisk didn't<br>&gt;<br>Wrong. They both use the same OLA strategy - 1/4 pitch period overlap.</blockquote><div><br>

G.711 Appendix I lengthens the OLA window by 4ms per 10ms of erasure until it reaches 10ms, but the Asterisk one doesn't, does it?<br>

<br>

////////////////////////////////////////////////////////////////////////////////////////////////<br>

G.711 Appendix I<br>

I.2.7 First good frame after an erasure<br>

At the first good frame after an erasure, a smooth transition is needed between the synthesized<br>

erasure speech and the real signal. To do this, the synthesized speech from the pitch buffer is<br>

continued beyond the end of the erasure, and then mixed with the real signal using an OLA. The<br>

length of the OLA depends on both the pitch period and the length of the erasure. For short, 10 ms<br>

erasures, a 1/4 wavelength window is used. For longer erasures the window is increased by 4 ms per<br>

10 ms of erasure, up to a maximum of the frame size, 10 ms.<br>

</div>////////////////////////////////////////////////////////////////////////////////////////////////<br>

<br>
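Read literally, the window rule in I.2.7 can be sketched in Python as below. I am assuming 8kHz sampling, 10ms frames, that the 4ms increase starts with the second erased frame, and a plain linear crossfade for the OLA itself. The names are made up for illustration and are not from plc.c or the reference code.<br>

```python
def ola_window_samples(pitch_samples, erased_frames, rate_hz=8000):
    """OLA length at the first good frame: 1/4 pitch period after one
    erased frame, plus 4 ms per further erased frame, capped at 10 ms."""
    frame = rate_hz // 100        # 10 ms frame in samples
    step = 4 * rate_hz // 1000    # 4 ms increment in samples
    window = pitch_samples // 4 + step * (erased_frames - 1)
    return min(window, frame)


def overlap_add(synth_tail, real_head):
    """Linear crossfade from the synthesized tail into the real signal."""
    n = len(real_head)
    return [((n - i) * s + i * r) / n
            for i, (s, r) in enumerate(zip(synth_tail, real_head))]
```

For a 10ms pitch period (80 samples) this gives windows of 20, 52 and 80 samples after one, two and three erased frames.<br>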

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; 3. The nearby field of the first erasure: G.711 delays the output for<br>&gt; 3.75 ms to compensate the probable loss, but Asterisk just use the

<br>&gt; symmetrical<br>&gt;<br>&gt; part before the lost to do the OLA. The one G.711 Appendix I utilized<br>&gt; should be better, but it's not very important as human being's ears<br>&gt; are really anti-jamming.<br>&gt;

<br>That 3.75ms delay is so the Appendix 1 algorithm can do a 1/4 pitch<br>period of OLA when erasure commences. However, it incurs lots of buffer<br>copying when there are no lost packets. What my code does is time<br>reverse the last 1/4 pitch period and OLA with that. It sounds nasty,

<br>but listening tests with speech showed it was very close to the sound of<br>the G.711 appendix 1 algorithm, and improves efficiency a lot in the<br>common case - no packets being lost.</blockquote><div><br>

Yes, the results are similar, but the only difference is the 3.75 ms delay. I didn't see<br>

more buffer copying than necessary; both algorithms save the same history (although G.711 keeps<br>

a longer one and delays the output by 3.75ms).<br>
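<br>
For reference, this is roughly how I picture the time-reversed OLA you describe at the start of an erasure. It is a Python sketch under my own assumptions (linear weights, one replayed pitch period), not a copy of plc.c, and the function name is hypothetical.<br>

```python
def start_of_erasure(history, pitch_samples):
    """First 1/4 pitch period of concealment output: crossfade from the
    time-reversed end of the real speech into the replayed pitch period."""
    overlap = pitch_samples // 4
    period = history[-pitch_samples:]      # last pitch period, replayed
    mirrored = history[::-1][:overlap]     # last quarter period, reversed
    out = []
    for i in range(overlap):
        w_new = (i + 1) / overlap          # weight of the replayed period
        out.append((1.0 - w_new) * mirrored[i] + w_new * period[i])
    return out
```

No output delay is needed here, because the crossfade only uses samples that have already been played.<br>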

</div>BTW: packet loss is very common, at least in China, and burst losses can last a long time.<br>

For example, as the bandwidth between the two major carriers is very low, two users, one on each carrier,<br>

will experience packet loss very often if they use the public internet rather than a softswitch network.<br>

<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; 4. whether prolong the pitch period during burst loss: G.711 Appendix<br>&gt; I prolong the pitch period to a maximum of 3 pitch period, but

<br>&gt; Asterisk only uses one which<br>&gt;<br>&gt; saves memory but behave bad at burst loss.<br>&gt;<br>For prolonged erasures G.711 Appendix 1 and my code act in exactly the<br>same way. They linearly attenuate to zero over the first 50ms. In that

<br>period they repeat the last 1.25 pitch periods of real speech, with a<br>quarter pitch period of overlap. When real speech restarts they both do<br>a 1/4 pitch period of OLA, based on the last known pitch. The algorithms

<br>are identical beyond the initial 1/4 pitch period of OLA. Why would<br>anyone want to save memory here? It only uses a small amount. The<br>algorithmic changes were to reduce the buffer manipulation in the common<br>case.

</blockquote><div><br>

They are not the same: Appendix I increases the number of pitch periods it uses as the erasure progresses.<br>

<br>

////////////////////////////////////////////////////////////////////////////////////////////////<br>

G.711 Appendix I<br>

I.2.5 Synthetic signal generation after 10 ms<br>

If the next frame is also erased, the erasure will be at least 20 ms long and further action is required.<br>

While repeating a single pitch period works well for short erasures (e.g. 10 ms), on long erasures it<br>

introduces unnatural harmonic artifacts (beeps). This is especially noticeable if the erasure lands in<br>

an unvoiced region of speech, or in a region of rapid transition such as a stop. It was discovered by<br>

experimentation that these artifacts are significantly reduced by increasing the number of pitch<br>

periods used to synthesize the signal as the erasure progresses. Playing more pitch periods increases<br>

the variation in the signal. Although the pitch periods are not played in the order they occurred in the<br>

original signal, the resulting output still sounds natural. At 10 ms into the erasure the number of pitch<br>

periods used to synthesize the speech is increased to two, and at 20 ms a third pitch period is added.<br>

For erasures longer than 20 ms no additional modifications to the pitch buffer are made.<br>

</div>////////////////////////////////////////////////////////////////////////////////////////////////<br>

<br>
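The schedule in I.2.5 can be written down as a small Python sketch. The function names and the plain list used for the history buffer are my own assumptions for illustration, not anything from the reference code.<br>

```python
def pitch_periods_used(t_ms):
    """Number of pitch periods used for synthesis at t_ms into an erasure."""
    if t_ms < 10:
        return 1
    if t_ms < 20:
        return 2
    return 3


def synth_source(history, pitch_samples, t_ms):
    """Most recent samples of real speech that the synthesis cycles through."""
    n = pitch_periods_used(t_ms) * pitch_samples
    return history[-n:]
```

An implementation that always replays a single pitch period corresponds to pitch_periods_used() returning 1 for every t_ms.<br>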

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I think the documentation for my PLC code is missing from the Asterisk</blockquote><div><br>

No, it's available in plc.h under asterisk/include. :)<br>

</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">source code, but you can find it at<br><a href="http://www.soft-switch.org/spandsp-doc/plc_page.html">

http://www.soft-switch.org/spandsp-doc/plc_page.html</a><br><br>Regards,<br>Steve<br></blockquote>

</div><br>