[Asterisk-Dev] 16 KHz audio ?

Fri Dec 17 14:57:56 MST 2004

At 9:49 AM -0500 on 12/17/04, Andrew Lindh wrote:
>G.722 (wideband speech coding standard) would be a good step.
>It's an accepted standard and some phones already support it.
>It should not be too hard to write a new codec/format module
>for asterisk. Looks like it only needs about 10MIPS of power to
>run. I'm sure there is already some reference code out there.
>I don't know the copyright/patent status of the code of G.722
>
>One of the benefits from this would be to improve communication
>between people speaking English where English is NOT their first
>language. It's hard enough to speak someone else's language
>but when both parties are using non-native speech you need all
>the help you can get....if you have the bandwidth for 64K....
>
>FYI: G.722 is a 14 bit 16Khz sampled with SB-ADPCM compression
>down to 64K, 56K, or 48K.
>
>http://www.itu.int/rec/recommendation.asp?type=folders&lang=e&parent=T-REC-G.722

I agree with Andrew.  The G.722 spec already exists; let's use 
something that gives us what we want, yet is compatible with existing 
hardware and other platforms.  Most notably, the Grandstream phones 
support G.722, so that certainly is a widely used platform at this 
time.  While G.722 may be an "outdated" codec (and the "new" G.722.1 is 
patented) I still think it's probably worthwhile to investigate it. 
Perhaps there is some alternate royalty-free standards-based codec 
which might also fit the bill?

I don't know if this addresses the issue about the low-bandwidth, 
high-quality codec.  It seems that G.722 is 48, 56, or 64kbps.   If 
the primary motivation is "high-quality", then G.722 seems like a 
reasonable solution.  If the primary motivation is "economy of bits", 
then it's not quite as appealing because it's probably not going to 
live well on a modem-based connection of any type.
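
To put rough numbers on the modem point, here is a small sketch of the 
per-direction bandwidth once packet headers are added.  The 20 ms 
packet size and plain IPv4/UDP/RTP headers (40 bytes, no header 
compression, no link-layer overhead) are my assumptions, and 
wire_kbps is just an illustrative helper name:

```python
# Rough on-the-wire bandwidth for a constant-bitrate codec over RTP/UDP/IPv4.
# Assumptions: 20 ms packets, no header compression, no link-layer overhead.

IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no options


def wire_kbps(codec_kbps, packet_ms=20):
    """Return the approximate IP-level bitrate in kbps for one direction."""
    payload_bytes = codec_kbps * 1000 * packet_ms / 1000 / 8
    packets_per_second = 1000 / packet_ms
    total_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES
    return total_bytes * 8 * packets_per_second / 1000


for rate in (48, 56, 64):
    print(f"G.722 at {rate} kbps -> about {wire_kbps(rate):.0f} kbps on the wire")
```

Under those assumptions even the 48kbps mode needs about 64kbps per 
direction, which is already more than a dial-up modem can carry.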

I think that just as critical would be to implement the packet-loss 
concealment and variable bitrate methods available in codecs like 
iLBC and Speex.  Let's clean up what we already have before running 
to implement new codecs.

Now that I've said we should wait and finish what's already on our 
plate, here are some things we should consider if a "better-sounding" 
codec with a different sampling rate is implemented. I agree that 
this affects some features (voicemail, conferencing), but there may be 
solutions:

   1) Use "least common denominator" when creating channel 
connections, and use a re-INVITE on IAX and SIP devices to 
renegotiate to the 8khz level when a less-capable user joins the 
call/conference/etc.

   2) Treat the 16khz streams as a separate "codec", and transcode 
down to 8khz for any channels that can only understand 8khz during 
audio sessions.  I don't know how to deal with things like voicemail 
recording or playback files.  It would seem to make sense to have 
them recorded in "high-quality" mode, and then transcoded on the fly 
for 8khz streams, but I'm sure this would require major re-writes of 
the code that handles these playback/recording methods.
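
The two options above can be sketched roughly as follows.  This is only 
an illustration of the idea, not Asterisk code: the function names are 
hypothetical, audio is assumed to be a list of signed linear PCM 
samples, and the pair-averaging downsampler is a deliberately crude 
stand-in for a proper anti-aliasing filter.

```python
# Sketch of both options: pick the lowest common sample rate for a bridge
# (option 1), or downsample 16 kHz audio for 8 kHz-only listeners (option 2).
# All names here are hypothetical; none of this is Asterisk code.


def common_rate(channel_rates):
    """Option 1: the bridge runs at the lowest rate any participant supports."""
    return min(channel_rates)


def downsample_2to1(samples):
    """Option 2: convert 16 kHz linear PCM to 8 kHz.

    Averages each pair of adjacent samples, which is a crude low-pass
    filter followed by decimation. A real transcoder would use a proper
    anti-aliasing filter.
    """
    usable = len(samples) - (len(samples) % 2)  # drop a trailing odd sample
    return [(samples[i] + samples[i + 1]) // 2 for i in range(0, usable, 2)]


def frame_for_listener(frame_16k, listener_rate):
    """Hand each listener audio at the rate it can accept."""
    if listener_rate >= 16000:
        return frame_16k
    return downsample_2to1(frame_16k)


# Option 1: one 8 kHz participant drags the whole bridge down to 8 kHz.
print(common_rate([16000, 16000, 8000]))

# Option 2: only the 8 kHz listener gets the downsampled frame.
frame = [0, 100, 200, 300, 400, 500]
print(frame_for_listener(frame, 16000))
print(frame_for_listener(frame, 8000))
```

With option 2 the wideband participants keep hearing each other at 
16khz, at the cost of one extra conversion per narrowband listener.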

Side note: I typically hear people complain about Skype when 
connecting to the PSTN, but there are no complaints about bandwidth 
in the peer-to-peer calls, which probably use the GIPS codecs 
natively end-to-end.  As for the quality on PSTN calls, I assume this 
is because they have to do a double-jump for audio, since I imagine 
they use custom codecs that will not directly talk with gateway 
devices like Sonus or Cisco.  They (probably) have to convert their 
custom encoding methods on a device which then relays their calls via 
G.711 to the final destination gateway.  One could even speculate 
that they might use a modified Asterisk system for this, as it would 
seem like the path of least resistance, but I have no actual 
knowledge of their architecture.

JT