[asterisk-dev] Suggestion on Packet Loss Concealment Algorithm

Steve Underwood steveu at coppice.org
Fri May 19 04:50:54 MST 2006


ColinZuo at viatech.com.cn wrote:

> Hi,
>
> The theory is not based on the music, it's based on that given by the
> ITU G.711 Appendix I (BTW: the music is converted to 8K/mono/16bit by
> CoolEdit).
>
What works well for music is very different from what works well for
voice. G.711 Appendix I and my code fade to silence over 50ms. For music,
a much longer sustain to fill in the gaps works much better. With speech,
that badly affects intelligibility. I used the Appendix I approach
without experimenting. I suspect something other than linear attenuation
would behave better.
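
As a rough illustration of the linear fade described above, here is a
minimal C sketch. It assumes 8kHz sampling, so 50ms is 400 samples. The
names fade_gain and apply_fade are made up for this example and are not
taken from the Asterisk or spandsp source.

```c
#include <stdint.h>

#define SAMPLE_RATE   8000
#define FADE_MS       50
#define FADE_SAMPLES  (SAMPLE_RATE * FADE_MS / 1000)   /* 400 samples */

/* Gain for the n-th synthesized sample of an erasure (n counts from 0).
   Falls linearly from 1.0 to 0.0 over FADE_SAMPLES, then stays at 0.0. */
static float fade_gain(int n)
{
    if (n >= FADE_SAMPLES)
        return 0.0f;
    return 1.0f - (float) n / (float) FADE_SAMPLES;
}

/* Attenuate a block of fill-in samples in place. 'start' is the number of
   fill-in samples already produced for the current erasure. */
static void apply_fade(int16_t *buf, int len, int start)
{
    for (int i = 0; i < len; i++)
        buf[i] = (int16_t) (buf[i] * fade_gain(start + i));
}
```

Trying a different attenuation curve would only mean replacing the body
of fade_gain.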

> And the current plc algorithm is similar to the G.711 Appendix I except:
> 1. The pitch detection algorithm : G.711 Appendix I uses cross
> correlation, but Asterisk uses AMDF which is simpler and also performs
> well
>
Correct.
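
For readers unfamiliar with AMDF (average magnitude difference function),
a minimal C sketch of the idea follows. The function name, argument order
and lag range are assumptions for illustration, not the actual Asterisk
code. Since every lag is summed over the same number of samples, the sum
is compared directly instead of dividing by the length.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Return the lag (in samples) in [min_lag, max_lag] with the smallest sum
   of absolute differences between the signal and a copy of itself shifted
   by that lag. 'amp' must hold at least max_lag + len samples. */
static int amdf_pitch_estimate(const int16_t *amp, int len,
                               int min_lag, int max_lag)
{
    long best_acc = LONG_MAX;
    int best_lag = min_lag;

    for (int lag = min_lag; lag <= max_lag; lag++) {
        long acc = 0;
        for (int j = 0; j < len; j++)
            acc += labs((long) amp[lag + j] - (long) amp[j]);
        if (acc < best_acc) {   /* strict '<' keeps the shortest lag on ties */
            best_acc = acc;
            best_lag = lag;
        }
    }
    return best_lag;
}
```

Compared with cross correlation, the inner loop needs only subtractions
and absolute values, with no multiplications.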

> 2. The OLA window: G.711 update the OLA window length when burst loss
> occurs, but Asterisk didn't
>
Wrong. They both use the same OLA strategy - 1/4 pitch period overlap.
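
A quarter pitch period overlap-add can be sketched in C roughly as below.
This assumes a linear (triangular) cross-fade. The names overlap_len and
overlap_add are chosen for the example only.

```c
#include <stdint.h>

/* Length of the overlap region: a quarter of the pitch period. */
static int overlap_len(int pitch)
{
    return pitch / 4;
}

/* Cross-fade from 'old_sig' to 'new_sig' over 'len' samples. The weight
   of the new signal rises linearly from 0 while the old one falls from 1. */
static void overlap_add(const int16_t *old_sig, const int16_t *new_sig,
                        int16_t *out, int len)
{
    for (int i = 0; i < len; i++) {
        float w = (float) i / (float) len;   /* weight of the new signal */
        out[i] = (int16_t) ((1.0f - w) * old_sig[i] + w * new_sig[i]);
    }
}
```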

> 3. The nearby field of the first erasure: G.711 delays the output for
> 3.75 ms to compensate the probable loss, but Asterisk just use the
> symmetrical part before the lost to do the OLA. The one G.711
> Appendix I utilized should be better, but it's not very important as
> human being's ears are really anti-jamming.
>
That 3.75ms delay is there so the Appendix I algorithm can do a 1/4 pitch
period of OLA when an erasure commences. However, it incurs a lot of
buffer copying when there are no lost packets. What my code does is
time-reverse the last 1/4 pitch period and OLA with that. That sounds
nasty, but listening tests with speech showed it was very close to the
sound of the G.711 Appendix I algorithm, and it improves efficiency a lot
in the common case: no packets being lost.
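
The time-reversal trick might look roughly like this in C. It is a sketch
of the idea as described above, not the actual source: the function name,
the buffer layout and the linear weights are all assumptions.

```c
#include <stdint.h>

/* Smooth the first 'overlap' samples of synthetic fill-in into the real
   signal without delaying the output. The samples already played out
   cannot be changed, so their last 'overlap' samples are read backwards
   and cross-faded with the start of the synthetic data 'synth'. */
static void smooth_erasure_start(const int16_t *history, int hist_len,
                                 const int16_t *synth, int16_t *out,
                                 int overlap)
{
    for (int i = 0; i < overlap; i++) {
        float new_w = (float) (i + 1) / (float) overlap;
        float old_w = 1.0f - new_w;
        /* history[hist_len - 1 - i] walks backwards from the newest sample */
        out[i] = (int16_t) (old_w * history[hist_len - 1 - i]
                            + new_w * synth[i]);
    }
}
```

Nothing here runs while packets are arriving normally. Only the history
buffer has to be kept up to date, which is where the saving in the common
case comes from.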

> 4. whether prolong the pitch period during burst loss: G.711 Appendix
> I prolong the pitch period to a maximum of 3 pitch period, but
> Asterisk only uses one which saves memory but behave bad at burst
> loss.
>
For prolonged erasures, G.711 Appendix I and my code act in exactly the
same way. Both linearly attenuate to zero over the first 50ms. During
that period they repeat the last 1.25 pitch periods of real speech, with
a quarter pitch period of overlap. When real speech restarts, both do a
1/4 pitch period of OLA, based on the last known pitch. The algorithms
are identical beyond the initial 1/4 pitch period of OLA. Why would
anyone want to save memory here? It only uses a small amount. The
algorithmic changes were made to reduce buffer manipulation in the
common case.
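
To make the "last 1.25 pitch periods with a quarter period of overlap"
step concrete, here is a hedged C sketch. One cycle is built from the
last pitch period, with its final quarter cross-faded into the quarter
period that preceded it, so the cycle can be repeated without a jump at
the seam. The names build_pitch_cycle and repeat_cycle are hypothetical,
and the fade to zero is left out for brevity.

```c
#include <stdint.h>

/* Build one repeatable cycle of 'pitch' samples from the newest
   1.25 pitch periods of 'history' (which holds hist_len >= 2*pitch
   samples, newest last). */
static void build_pitch_cycle(const int16_t *history, int hist_len,
                              int pitch, int16_t *cycle)
{
    int overlap = pitch / 4;
    const int16_t *last = history + hist_len - pitch;      /* last period */
    const int16_t *prev = history + hist_len - 2 * pitch;  /* one before  */
    int i;

    /* The first 3/4 of the cycle is a straight copy. */
    for (i = 0; i < pitch - overlap; i++)
        cycle[i] = last[i];
    /* The final 1/4 fades into the samples just before the last period,
       so the end of the cycle lines up with its own beginning. */
    for (; i < pitch; i++) {
        float w = (float) (i - (pitch - overlap) + 1) / (float) overlap;
        cycle[i] = (int16_t) ((1.0f - w) * last[i] + w * prev[i]);
    }
}

/* Fill 'out' with 'len' samples by repeating the cycle. */
static void repeat_cycle(const int16_t *cycle, int pitch,
                         int16_t *out, int len)
{
    for (int n = 0; n < len; n++)
        out[n] = cycle[n % pitch];
}
```

The working storage is one cycle buffer plus the history, which fits the
point that memory use is small either way.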

I think the documentation for my PLC code is missing from the Asterisk
source code, but you can find it at
http://www.soft-switch.org/spandsp-doc/plc_page.html

Regards,
Steve



