Hi Steve,<br>
<br>
inline<br><br><div><span class="gmail_quote">On 5/19/06, <b class="gmail_sendername">Steve Underwood</b> <<a href="mailto:steveu@coppice.org">steveu@coppice.org</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<a href="mailto:ColinZuo@viatech.com.cn">ColinZuo@viatech.com.cn</a> wrote:<br><br>> Hi,<br>><br>> The theory is not based on the music, it's based on that given by the<br>> ITU G.711 Appendix I (BTW: the music is converted to 8K/mono/16bit by
<br>> CoolEdit).<br>><br>What works well for music is very different from what works well for<br>voice. </blockquote><div><br>
Yes, but I don't think the difference is that big; a voice file that proves me wrong would convince me.<br>
Again, I prolong it because of the theory given in G.711 Appendix I, which is said to be<br>
derived from experiments at Bell.<br>
</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">G.711 Appendix 1 and my code fade to silence over 50ms. For music<br>much greater sustain to fill in the gaps works much better. With speech,
<br>that badly affects intelligibility. </blockquote><div><br>
I didn't change this. BTW, G.711 Appendix I fades to silence over 60 ms, because it doesn't<br>
attenuate during the first erased frame, whereas your code does. Since you can't know whether the waveform is going to<br>
rise or fall, I think it is better to keep the same level for the first erasure.<br>
<br>
////////////////////////////////////////////////////////////////////////////////////////////////<br>
G.711 Appendix I<br>
I.2.4 Synthetic signal generation for first 10 ms<br>
For the first 10 ms of the erasure, the best results are obtained by generating the synthesized signal<br>
from the last pitch period with no attenuation.<br>
/////////////////////////////////////////////////////////////////////////////////////////////////////<br>
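To make the two fade schedules concrete, here is a minimal sketch of both gain curves as described in this thread.<br>
The function names are made up for illustration, and the straight-line ramp shape is an assumption on my side; this is not code from Asterisk or from the ITU reference.<br>
<pre>
```python
def gain_appendix_i(t_ms):
    """Gain at t_ms into the erasure: flat for the first 10 ms,
    then a straight-line fade that reaches zero at 60 ms."""
    hold_ms, end_ms = 10.0, 60.0
    ramp = (end_ms - t_ms) / (end_ms - hold_ms)
    # Clamp so the gain stays at 1.0 during the hold and at 0.0 after 60 ms.
    return max(0.0, min(1.0, ramp))


def gain_immediate(t_ms):
    """Gain when the fade starts at the first lost sample
    and reaches zero at 50 ms."""
    return max(0.0, 1.0 - t_ms / 50.0)
```
</pre>
With these definitions the first curve is still at full level 10 ms into the erasure, while the second has already dropped to 0.8.<br>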
</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I used the Appendix 1 approach<br>without experimenting. I suspect something other than linear attenuation
<br>would behave better.</blockquote><div><br>
From my experiments, I think that as long as the algorithm aims at generic linear concealment,<br>
you probably can't find one much better than this, unless you analyse some voice parameters from<br>
the previous samples.<br>
</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> And the current plc algorithm is similar to the G.711 Appendix I except:<br>
> 1. The pitch detection algorithm : G.711 Appendix I uses cross<br>> correlation, but Asterisk uses AMDF which is simpler and also performs<br>> well<br>><br>Correct.<br><br>> 2. The OLA window: G.711 update the OLA window length when burst loss
<br>> occurs, but Asterisk didn't<br>><br>Wrong. They both use the same OLA strategy - 1/4 pitch period overlap.</blockquote><div><br>
G.711 lengthens the OLA window by 4 ms per 10 ms of erasure until it reaches 10 ms, but the Asterisk one doesn't?<br>
<br>
////////////////////////////////////////////////////////////////////////////////////////////////<br>
G.711 Appendix I<br>
I.2.7 First good frame after an erasure<br>
At the first good frame after an erasure, a smooth transition is needed between the synthesized<br>
erasure speech and the real signal. To do this, the synthesized speech from the pitch buffer is<br>
continued beyond the end of the erasure, and then mixed with the real signal using an OLA. The<br>
length of the OLA depends on both the pitch period and the length of the erasure. For short, 10 ms<br>
erasures, a 1/4 wavelength window is used. For longer erasures the window is increased by 4 ms per<br>
10 ms of erasure, up to a maximum of the frame size, 10 ms.<br>
</div>////////////////////////////////////////////////////////////////////////////////////////////////<br>
<br>
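As a rough illustration of the rule in I.2.7, the window length could be computed as below.<br>
The function name and the choice to count the erasure in whole 10 ms frames are my assumptions; it is only a sketch of how I read the quoted paragraph.<br>
<pre>
```python
def ola_window_ms(pitch_ms, erased_frames):
    """OLA length (ms) at the first good frame after `erased_frames`
    consecutive lost 10 ms frames."""
    quarter = pitch_ms / 4.0                   # 1/4 wavelength for a single lost frame
    extra = 4.0 * max(0, erased_frames - 1)    # +4 ms per further 10 ms of erasure
    return min(10.0, quarter + extra)          # never longer than one 10 ms frame
```
</pre>
For an 8 ms pitch period this gives 2 ms after one lost frame, 6 ms after two, and 10 ms from three lost frames onwards.<br>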
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> 3. The nearby field of the first erasure: G.711 delays the output for<br>> 3.75 ms to compensate the probable loss, but Asterisk just use the
<br>> symmetrical<br>><br>> part before the lost to do the OLA. The one G.711 Appendix I utilized<br>> should be better, but it's not very important as human being's ears<br>> are really anti-jamming.<br>>
<br>That 3.75ms delay is so the Appendix 1 algorithm can do a 1/4 pitch<br>period of OLA when erasure commences. However, it incurs lots of buffer<br>copying when there are no lost packets. What my code does is time<br>reverse the last 1/4 pitch period and OLA with that. It sounds nasty,
<br>but listening tests with speech showed it was very close to the sound of<br>the G.711 appendix 1 algorithm, and improves efficiency a lot in the<br>common case - no packets being lost.</blockquote><div><br>
Yes, the results are similar, but the only difference is the 3.75 ms delay. I don't see<br>
more buffer copying than necessary: both algorithms save the same history (although G.711 keeps<br>
a longer one and delays the output by 3.75 ms).<br>
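For reference, the time-reversal approach described above can be sketched roughly like this.<br>
It is a simplified illustration with hypothetical names and linear cross-fade weights that I assumed; it is not the actual spandsp or Asterisk code.<br>
<pre>
```python
def time_reversed_ola(history, synth, overlap):
    """Cross-fade the start of the synthesized signal with the last
    `overlap` real samples played backwards, so no output delay is needed."""
    out = list(synth)
    step = 1.0 / overlap
    for i in range(overlap):
        new_w = (i + 1) * step        # weight of the synthesized sample grows
        old_w = 1.0 - new_w           # weight of the reversed real sample shrinks
        out[i] = old_w * history[-1 - i] + new_w * synth[i]
    return out
```
</pre>
Here `overlap` would be a quarter of the detected pitch period in samples, and `history[-1 - i]` walks backwards from the last good sample.<br>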
</div>BTW: packet loss is very common, at least in China, and burst losses can last a long time.<br>
For example, because the bandwidth between the two major carriers is very low, two users, one on each carrier,<br>
will experience packet loss very often if they use the public Internet rather than a softswitch network.<br>
<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> 4. whether prolong the pitch period during burst loss: G.711 Appendix<br>> I prolong the pitch period to a maximum of 3 pitch period, but
<br>> Asterisk only uses one which<br>><br>> saves memory but behave bad at burst loss.<br>><br>For ptolonged erasures G.711 Appendix 1 and my code act in exactly the<br>same way. They linearly attenuate to zero over the first 50ms. In that
<br>period they repeat the last 1.25 pitch periods of real speech, with a<br>quarter pitch period of overlap. When real speech restarts they both do<br>a 1/4 pitch period of OLA, based on the last known pitch. The algorithms
<br>are identical beyond the initial 1/4 pitch period of OLA. Why would<br>anyone want to save memory here? It only uses a small amount. The<br>algorithmic changes were to reduce the buffer manipulation in the common<br>case.
</blockquote><div><br>
Not the same.<br>
<br>
////////////////////////////////////////////////////////////////////////////////////////////////<br>
G.711 Appendix I<br>
I.2.5 Synthetic signal generation after 10 ms<br>
If the next frame is also erased, the erasure will be at least 20 ms long and further action is required.<br>
While repeating a single pitch period works well for short erasures (e.g. 10 ms), on long erasures it<br>
introduces unnatural harmonic artifacts (beeps). This is especially noticeable if the erasure lands in<br>
an unvoiced region of speech, or in a region of rapid transition such as a stop. It was discovered by<br>
experimentation that these artifacts are significantly reduced by increasing the number of pitch<br>
periods used to synthesize the signal as the erasure progresses. Playing more pitch periods increases<br>
the variation in the signal. Although the pitch periods are not played in the order they occurred in the<br>
original signal, the resulting output still sounds natural. At 10 ms into the erasure the number of pitch<br>
periods used to synthesize the speech is increased to two, and at 20 ms a third pitch period is added.<br>
For erasures longer than 20 ms no additional modifications to the pitch buffer are made.<br>
</div>////////////////////////////////////////////////////////////////////////////////////////////////<br>
<br>
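The schedule in I.2.5 can be written down in a few lines.<br>
This is only a sketch with names I made up, and it assumes the history buffer holds at least three pitch periods of samples.<br>
<pre>
```python
def pitch_periods_in_use(t_ms):
    """Number of pitch periods used for synthesis at t_ms into the erasure:
    one at first, two from 10 ms, three from 20 ms onwards."""
    return min(3, 1 + int(t_ms // 10))


def synthesis_source(history, pitch, t_ms):
    """Tail of the history buffer that is cycled through at t_ms.
    `pitch` is the pitch period in samples."""
    n = pitch_periods_in_use(t_ms) * pitch
    return history[-n:]
```
</pre>
Using only one pitch period for the whole erasure corresponds to `pitch_periods_in_use` always returning 1, which is the difference I am pointing at.<br>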
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I think the documentation for my PLC code is missing from the Asterisk</blockquote><div><br>
No, it's available in plc.h under asterisk/include. :)<br>
</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">source code, but you can find it at<br><a href="http://www.soft-switch.org/spandsp-doc/plc_page.html">
http://www.soft-switch.org/spandsp-doc/plc_page.html</a><br><br>Regards,<br>Steve<br></blockquote>
</div><br>