[asterisk-bugs] [JIRA] (ASTERISK-22951) Wrong RTP timestamps (resulting in audio issues) when transcoding to SILK from G711
Joshua Colp (JIRA)
noreply at issues.asterisk.org
Mon Dec 18 11:18:07 CST 2017
[ https://issues.asterisk.org/jira/browse/ASTERISK-22951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joshua Colp updated ASTERISK-22951:
-----------------------------------
Affects Version/s: 13.18.4
> Wrong RTP timestamps (resulting in audio issues) when transcoding to SILK from G711
> -----------------------------------------------------------------------------------
>
> Key: ASTERISK-22951
> URL: https://issues.asterisk.org/jira/browse/ASTERISK-22951
> Project: Asterisk
> Issue Type: Bug
> Security Level: None
> Components: Codecs/General, Resources/res_rtp_asterisk
> Affects Versions: 11.6.0, 13.18.4
> Environment: Debian 3.2.51-1 x86_64
> Reporter: Peter Sokolov
> Severity: Minor
>
> I have noticed voice quality problems when Asterisk transcodes to SILK. Because both ends reported zero packet loss, I started looking for the cause. It seems that when Asterisk receives an RTP stream that does not arrive in 20 ms chunks (e.g. G711 at 30 ms), it generates wrong RTP timestamps for the SILK packets it sends out.
> Enabling "rtp set debug on" and verbose output in the console shows the following (my comments are inserted inline, after the semicolons):
> {noformat}
> Got RTP packet from x.x.x.x:7082 (type 08, seq 000001, ts 000240, len 000240)
> {noformat}
> ; Asterisk receives a 30 ms RTP G711 payload.
> {noformat}
> > SILK encoder set: sample rate:8000 dtx:0 bitrate:20000 fec:0 packetlosspercentage:0 packetSize:160
> > lintosilk_frameout: encoding 240 samples
> {noformat}
> ; The 30 ms payload (240 samples) is sent to the SILK encoder, which encodes the first 20 ms (160 samples) into a SILK packet. The remaining 10 ms (80 samples) is kept in a buffer (Asterisk's?) for later use.
> {noformat}
> Sent RTP packet to x.x.x.x:4004 (type 96, seq 000076, ts 000240, len 000033)
> {noformat}
> ; A SILK RTP packet with 160 samples is forwarded. It contains voice data from timestamp 240 to timestamp 240+160=400.
> {noformat}
> Got RTP packet from x.x.x.x:7082 (type 08, seq 000002, ts 000480, len 000240)
> {noformat}
> ; Asterisk receives the next 30 ms of RTP G711 payload.
> {noformat}
> > lintosilk_frameout: encoding 320 samples
> {noformat}
> ; The 10 ms left in the buffer from the previous SILK encoding plus the 30 ms payload just received (40 ms/320 samples in total) are sent to the SILK encoder, which encodes all 40 ms (320 samples) into one SILK packet. No data remains in the buffer (Asterisk's?).
> {noformat}
> Sent RTP packet to x.x.x.x:4004 (type 96, seq 000077, ts 000480, len 000095)
> {noformat}
> ; Asterisk forwards a SILK RTP packet with 320 samples. The problem is that Asterisk uses the RTP timestamp of the start of the last 30 ms of encoded payload, although the packet also contains 10 ms of older data. In my opinion, Asterisk should have decreased the packet's RTP timestamp by the amount of buffered data it passed to the SILK encoder. This RTP packet would then have timestamp 400 instead of 480: 10 ms (80 samples) earlier than the G711 packet's timestamp, because 10 ms of additional data from the end of the previously received G711 packet was taken from the buffer.
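The arithmetic behind the proposed timestamp can be written out as a small Python sketch. The function names `samples_to_ms` and `corrected_timestamp` are hypothetical helpers for illustration, not Asterisk functions, and the 8000 Hz clock is taken from the "sample rate:8000" value in the log above.

```python
SAMPLE_RATE = 8000  # RTP clock rate in Hz, as reported by the SILK encoder log line


def samples_to_ms(samples, rate=SAMPLE_RATE):
    """Convert a number of RTP timestamp ticks (samples) to milliseconds."""
    return samples * 1000 / rate


def corrected_timestamp(incoming_ts, buffered_samples):
    """Timestamp of the oldest sample in the outgoing packet.

    Assumption: the buffered samples immediately precede the incoming
    packet in time, so the outgoing packet starts that many ticks earlier.
    """
    return incoming_ts - buffered_samples


# First G711 packet: 240 samples received, 160 encoded, so 80 remain buffered.
buffered = 240 - 160
expected = corrected_timestamp(480, buffered)
print(expected)                        # 400
print(samples_to_ms(480 - expected))   # 10.0
```

The 80-tick difference between 480 and 400 corresponds to the 10 ms of buffered audio described above.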
> The result is that the RTP endpoint alternately receives a SILK packet with a 20 ms payload and a correct timestamp, and a SILK packet with a 40 ms payload and a timestamp 10 ms later than it should be. The endpoint therefore receives no data for the 10 ms before each 40 ms packet, followed by 10 ms of overlapping data.
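The gap and overlap pattern described here can be reproduced with a toy model in Python. This is only a sketch of the behaviour inferred from the log, not the actual codec_silk or translator code: it assumes the encoder consumes the largest multiple of 160 samples available, keeps the remainder buffered, and stamps each outgoing packet with the timestamp of the most recently received packet. The `simulate` and `discontinuities` names and the `fix` flag are hypothetical, and the third and fourth incoming packets are extrapolated from the two shown in the log.

```python
FRAME = 160  # samples per 20 ms SILK frame at 8000 Hz ("packetSize:160" in the log)


def simulate(incoming, fix=False):
    """Model the buffering described in the report.

    incoming: list of (rtp_timestamp, sample_count) for received G711 packets.
    Returns a list of (rtp_timestamp, sample_count) for the SILK packets sent.
    With fix=True the outgoing timestamp is moved back by the buffered samples.
    """
    leftover = 0
    sent = []
    for ts, count in incoming:
        available = leftover + count
        encoded = (available // FRAME) * FRAME  # whole 20 ms frames only
        if encoded:
            out_ts = ts - leftover if fix else ts
            sent.append((out_ts, encoded))
        leftover = available - encoded
    return sent


def discontinuities(sent):
    """Ticks between the end of one packet and the start of the next.

    Positive values are gaps, negative values are overlaps, 0 is contiguous.
    """
    return [nxt_ts - (ts + n) for (ts, n), (nxt_ts, _) in zip(sent, sent[1:])]


# Four 30 ms G711 packets; the first two match the log, the rest are extrapolated.
incoming = [(240 * i, 240) for i in range(1, 5)]

print(simulate(incoming))                             # timestamps as in the log
print(discontinuities(simulate(incoming)))            # alternating gap and overlap
print(discontinuities(simulate(incoming, fix=True)))  # contiguous stream
```

Under these assumptions the unmodified model yields alternating 80-tick (10 ms) gaps and overlaps, while subtracting the buffered sample count from the timestamp yields a contiguous stream.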
> *To reproduce the problem:*
> In sip.conf, configure
> {noformat}
> directmedia=no
> allow=alaw:30,ulaw:30
> {noformat}
> for one peer and
> {noformat}
> directmedia=no
> allow=silk8,silk12,silk16,silk24
> {noformat}
> for the other peer. The peer using alaw/ulaw has to be configured to send 30 ms chunks. With rtp set debug on and verbose output enabled, a call between the two peers should produce the log above in the console, and the voice problems should be audible.
> If the peer using alaw/ulaw is under your control, you can configure it to use 20 ms chunks, which removes the problem. However, some peers are not under your control, and others cannot be configured for 20 ms chunks. In those cases the audio is distorted as soon as the other side uses SILK and Asterisk transcodes.
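When reproducing this, captured console output can be checked mechanically. The Python sketch below parses lines in the format quoted in this report; the regular expression is an assumption based only on those quoted lines (other Asterisk versions may format them differently), and `parse_rtp_debug` is a hypothetical helper name.

```python
import re

# Pattern derived from the "Got/Sent RTP packet" lines quoted in this report.
RTP_LINE = re.compile(
    r"(?P<direction>Got|Sent)\s+RTP packet (?:from|to)\s+(?P<peer>\S+)\s+"
    r"\(type (?P<type>\d+), seq (?P<seq>\d+), ts (?P<ts>\d+), len (?P<len>\d+)\)"
)


def parse_rtp_debug(text):
    """Return one dict per RTP debug line found in the console output."""
    packets = []
    for m in RTP_LINE.finditer(text):
        packets.append({
            "direction": m["direction"],
            "peer": m["peer"],
            "type": int(m["type"]),
            "seq": int(m["seq"]),
            "ts": int(m["ts"]),
            "len": int(m["len"]),  # payload length in bytes, not samples
        })
    return packets


SAMPLE = """\
Got RTP packet from x.x.x.x:7082 (type 08, seq 000001, ts 000240, len 000240)
Sent RTP packet to x.x.x.x:4004 (type 96, seq 000076, ts 000240, len 000033)
Got RTP packet from x.x.x.x:7082 (type 08, seq 000002, ts 000480, len 000240)
Sent RTP packet to x.x.x.x:4004 (type 96, seq 000077, ts 000480, len 000095)
"""

packets = parse_rtp_debug(SAMPLE)
sent_ts = [p["ts"] for p in packets if p["direction"] == "Sent"]
print(sent_ts)
```

For G711 the payload length equals the sample count (one byte per sample), so the received lines give the incoming chunk size directly. For the SILK lines the length is the compressed size, so the sample count has to come from the lintosilk_frameout lines instead.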
--
This message was sent by Atlassian JIRA
(v6.2#6252)