[asterisk-dev] Wish: adding intelligent codec negotiation to asterisk / pjsip

Michael Maier m1278468 at allmail.net
Tue Jan 31 02:18:38 CST 2017


Hello Matt,

thanks for your kind response, and for those of all the other
developers! I'm happy that you didn't think "please, not this old and
boring discussion again". Thanks to all of you for taking part regardless!

On 01/30/2017 at 10:22 PM Matt Fredrickson wrote:

> Hey Michael,
> 
> First off, thanks for taking the time to express some of your thoughts
> and concerns to the asterisk-dev list.  I'll keep my reply to your
> email inline below.
> 
> On Mon, Jan 30, 2017 at 4:13 AM, Michael Maier <m1278468 at allmail.net> wrote:
> > Dear developers,
> >
> > I've been redirected to this mailing list by Joshua Colp during the fix of a
> > one-way audio bug[1], to discuss a solution other than the one provided in the fix.
> >
> > Background:
> > - A lot of people complain about bad VoIP call quality compared to the
> > old POTS / ISDN devices. What they mean from a technical point of
> > view: high latencies (resulting in echo), digital sound because of "bad"
> > codecs, general quality loss during transcoding, and many other causes.
> > - In Europe, HD audio is being adopted slowly. This means, more and more
> > UAs can natively handle HD codecs like g722. But they must be downward
> > compatible at the same time for older UAs, which just speak alaw (like
> > the old POTS devices e.g. or UAs which are not yet HD capable).
> > Therefore, they advertise at least two codecs: g722 and alaw (mostly
> > plus some more like ulaw or some other codecs).  
> 
> So there are multiple reasons why you could be seeing reportedly bad
> call quality that come to my mind:
> 
> 1. Transcoding - changes the audio, but typically doesn't make things
> sound *too* bad.  Obviously it is codec dependent as to how bad it
> sounds afterwards, but most modern codecs aren't terrible for speech
> replication and encoding.  Usually this is not where call quality
> problems are noticed.
> 
> 2. Packet loss and jitter related problems.  In an ISDN network, there
> is a guaranteed real time audio channel for transporting media.  As
> long as the data pumps on the transmit and receive side are working
> properly, you should hear almost no audio quality issues.  VoIP tries
> to transport real time audio over a non-guaranteed transport channel.
> This sometimes causes bad audio quality issues due to packet loss,
> packet reordering, or extreme packet delays.  Enabling Asterisk's
> jitter buffers typically improves many problems that arise due to
> this.  They are typically *not* enabled by default and so must be
> explicitly enabled.

My primary concern is latency. Each transcoding step adds
latency. Latency, if it's high enough, is what makes echo audible, which
reduces call quality massively. I know that more conditions have to be
met before you can hear echo, but my conviction is: everybody should do
the best they can at the points they are responsible for.

"core show translation" lists transcoding between alaw and g722 as one of
the most expensive transcoding jobs. Therefore I think it shouldn't be
done if both UAs have a common codec.

> I'm hoping you already have dived into your problem to look at both
> the above elements, and have confirmed that you are not dealing with
> the second problem instead of voice mutation due to the first problem.
> Usually you can track the second problem by doing packet captures of
> the voice conversations in question as well as look at RTCP
> statistics.

I know of these things and you are certainly right. But as I already
said, that's not the primary point for me. I don't have any means to
optimize devices I'm not responsible for. But if each device
involved does the best it can, things will get better or at least won't
break.
The problem is: there are people in the wild who have e.g. headsets
which are not crosstalk safe. Before VoIP, this mostly wasn't a problem,
because there wasn't enough latency. Now VoIP adds more or less
latency, partly unavoidably (e.g. AD and DA converters), partly
avoidably (e.g. transcoding that isn't strictly necessary).

Generally speaking: people often do not have VoIP-optimized devices
and they refuse to buy (expensive) new devices just to be "VoIP safe".
Most of them don't even know the reasons for the problems that suddenly
arise after being switched to VoIP. They aren't even affected
themselves, as echo, for example, is heard only by the peer!

> 
> > What does this mean to Asterisk?
> > My conviction is, that Asterisk shouldn't make things even worse when
> > handling calls / codecs by forcing unnecessary transcoding, which
> > unnecessarily harms call quality. Next point of unnecessary transcoding:
> > it unnecessarily steals system resources from the machine asterisk is
> > running on.
> > Asterisk should harm each call it handles and the underlying machine as
> > little as possible.
> >
> > Therefore I would like to see a (switchable) feature, that asterisk /
> > pjsip always tries to primarily advertise codecs, which are supported by
> > both UAs and remove those codecs, which are not supported by one of the
> > UAs. This prevents unnecessary transcoding.  
> 
> This actually would be a really neat thing for Asterisk to be able to
> do.  Last time I looked at it, there are quite a few challenges in
> making it happen.  Asterisk is designed to be a back to back user
> agent, and it inherently is designed to terminate media and codecs
> individually with each leg in question, but not necessarily together.
> It "makes things work" on each leg separately, based on the allowed
> codecs for each endpoint.
> 
> This is a necessary behavior since many times an Answer() has already
> occurred and negotiated the codec capabilities for a call and most
> dial plan applications assume a call needs to have media fully
> negotiated in order to interact on the channel.
> 
> For the simple case where your dial plan doesn't do any intense media
> interaction with a channel and simply Dial()'s out, a significant
> portion that doesn't work right now is that the codec information from
> the 200 OK received from the outbound channel is not passed back
> through to the inbound channel - I'm assuming that's what you're
> referring to.  Hopefully Josh or Mark will correct me if my memory is
> off.
> 
> > Example:
> >
> > Configuration of asterisk:
> >
> > extension: g722,alaw,ulaw
> > trunk: g722,alaw,ulaw
> >
> > Today's behavior w/ pjsip:
> >
> > *Incoming* call from provider to asterisk
> > INVITE from provider contains alaw.
> > INVITE from asterisk to extension contains g722,alaw,ulaw
> > OK 200 SDP from extension to asterisk contains g722
> > OK 200 SDP to provider contains alaw
> >
> > Result: asterisk has to transcode between extension and provider,
> > because it has to use alaw to provider and extension uses g722
> > (extension chooses the primary codec of the list in initial INVITE).
> >
> >
> > *Outgoing* call from extension to provider
> > INVITE from extension to asterisk contains g722,alaw,ulaw.
> > INVITE from asterisk to provider contains g722,alaw,ulaw.
> > OK 200 SDP from provider to asterisk contains alaw.
> > OK 200 SDP from asterisk to extension contains g722,alaw,ulaw.
> >
> > Result: asterisk has to transcode between extension and provider,
> > because it has to use alaw to provider and extension uses g722 (the
> > primary codec of the list in 200 OK SDP).
> >
> >
> > Both transcode actions above are completely unnecessary, because both
> > UAs would be able to use a common codec!
> >
> >
> > Preferred behavior:
> >
> > *Incoming* call from provider to asterisk
> > INVITE from provider contains alaw.
> > INVITE from asterisk to extension contains alaw
> > OK 200 SDP from extension to asterisk contains alaw
> > OK 200 SDP to provider contains alaw
> >
> > Result: no transcoding is necessary. Quality of call isn't harmed
> > unnecessarily! No unnecessary CPU load.
> > BTW: That's the way it already works with chan_sip!
> >
> >
> > *Outgoing* call from extension to provider
> > INVITE from extension to asterisk contains g722,alaw,ulaw.
> > INVITE from asterisk to provider contains g722,alaw,ulaw.
> > OK 200 SDP from provider to asterisk contains alaw.
> > OK 200 SDP from asterisk to extension contains alaw.
> >
> >
> > Result: no transcoding is necessary. Quality of call isn't harmed
> > unnecessarily! No unnecessary CPU load.
> >
> >
> > I would be really glad to have this intelligent codec handling w/
> > asterisk / pjsip!  
> 
> I think we would love to see some work in this area as well.  I'm not
> aware of anybody working on it right now, but if you'd like to help
> out with adding this feature, I know that there are a number of other
> people beside yourself that would be glad to see it.

I could offer to do some testing here.
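
To make the preferred behavior from the quoted example above a bit more
concrete, here is a minimal Python sketch of the list handling only. It
is not Asterisk or pjsip code: the function names (shared_codecs,
offer_for_callee, answer_for_caller) are made up for illustration, and
it assumes each leg's codecs are just ordered lists of names, with the
endpoint allow lists being the ones from the example configuration.

```python
def shared_codecs(preferred, other):
    """Codecs present in both lists, kept in the order of `preferred`."""
    return [codec for codec in preferred if codec in other]


def offer_for_callee(caller_offer, callee_allow):
    """Codecs for the outbound INVITE (hypothetical helper).

    Only codecs the caller offered are advertised. If nothing is shared,
    fall back to the callee's full allow list; transcoding is then
    unavoidable.
    """
    return shared_codecs(callee_allow, caller_offer) or list(callee_allow)


def answer_for_caller(caller_offer, callee_answer):
    """Codecs for the 200 OK sent back to the caller (hypothetical helper).

    Only codecs the callee actually answered with are returned. If nothing
    is shared, fall back to the caller's unmodified offer, which is what
    the example describes as today's behavior.
    """
    return shared_codecs(callee_answer, caller_offer) or list(caller_offer)


ALLOW = ["g722", "alaw", "ulaw"]  # extension and trunk, as in the example

# Incoming call: provider offers alaw only.
print(offer_for_callee(["alaw"], ALLOW))

# Outgoing call: extension offers all three, provider answers with alaw.
print(answer_for_caller(["g722", "alaw", "ulaw"], ["alaw"]))
```

In both cases the list shrinks to alaw, so both legs end up on the same
codec and no transcoding is needed. The fallbacks keep the call working
when the two sides share no codec at all.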


Thanks,
Michael


