[asterisk-dev] Potential change to outgoing codec offers (new topic)

Tue Oct 24 02:20:49 MST 2006

On Mon, Oct 23, 2006 at 02:37:39PM +0100, Brian Candler wrote:
> Here's one I knocked together in Perl:
> http://pobox.com/~b.candler/software/testsiperror
> 
> The fake codec it offers is "WIBBLE/8000".
> 
> And the results I get when pointing it at various SIP UAs:
> 
>   Audiocodes Tulip ATA (2.2.0_build_6)          403 Forbidden
>   Ekiga 2.0.1                                   180 Ringing (!!)
>   Asterisk trunk r45305 with Zaptel FXS         100 Trying, then rings (!!)

And a couple more data points; this time not using the Perl script, but
configuring the Tulip ATA to use only G726-16, and then pointing it at two
upstream providers.

Quintum/1.0.0 (via Cicero softswitch I think)     415 Unsupported Media Type

sipgate.co.uk (derived from Asterisk?)            100 Trying, followed by
                                                  488 Not acceptable here

I'm surprised how inconsistent and even broken this behaviour is.

So I've been trying to think of some other approaches. Here's what I've
considered:

1. Drive the media exchange from the other end. That is, send an empty
INVITE (one with no SDP body) for the second call leg; the initial SDP offer
then comes from the far end in the 200 OK, and we complete the media setup
by sending our answer in the ACK.

However this doesn't help us in the case where the *incoming* leg already
has an empty INVITE; we learn nothing about the capabilities of that side
until we complete the call, so we're back to square one.

[In any case, this method of working may not be implemented properly by UAs,
even though it's allowed by the RFCs. You can also get audio clipping at the
start of the call, as described in RFC 3960]

2. For calls placed to dynamic terminals which have registered, you could
probe their capabilities with OPTIONS at registration time. This might be
more efficient than sending an OPTIONS request at call setup time.

This is caching of information, so the cache can become stale (i.e. the UA
may have changed what codecs it permits between registration and the call
coming in). I don't think that will be a problem in practice.

For calls going out to upstream providers, in principle you could probe them
in the same way. However if they run a VoIP switch like Asterisk, I expect
they will give a very general response (i.e. "I have all these codecs")
which doesn't help you much.
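
To make the caching idea in (2) a bit more concrete, here is a rough Python
sketch. It is not Asterisk code: the class and method names are made up for
illustration, and it assumes the OPTIONS response carried an SDP body that
has already been parsed into a list of codec names (a UA is not obliged to
include one). Cache entries simply expire along with the registration.

```python
class CapabilityCache:
    """Hypothetical cache of codec lists learned from OPTIONS probes."""

    def __init__(self):
        # peer name -> (codec list, time after which the entry is stale)
        self._entries = {}

    def store(self, peer, codecs, registered_at, expires):
        # Assumption: tie the entry's lifetime to the registration interval.
        self._entries[peer] = (list(codecs), registered_at + expires)

    def lookup(self, peer, now):
        """Return the cached codec list, or None if unknown or stale."""
        entry = self._entries.get(peer)
        if entry is None:
            return None
        codecs, valid_until = entry
        if now >= valid_until:
            # Registration has lapsed; forget what we learned.
            del self._entries[peer]
            return None
        return codecs
```

A lookup that returns None would mean falling back to whatever the default
offer behaviour is, e.g. probing again at call setup time.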

3. Revert to 1.2 behaviour (i.e. offer all codecs with media proxying, and
re-INVITE to set up native bridging afterwards). However, re-order the codec
list so that the codecs included in the incoming SIP INVITE appear before
the others. This might make it a bit more likely that the far end will
choose a codec which doesn't need transcoding.

This will work, of course. However, the codec decision is made by the far
end based on its own preferences, so you may still end up with unnecessary
transcoding; and even if you don't, there may still be an audio glitch while
the re-INVITE takes place.
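
The re-ordering in (3) is simple enough to sketch in a few lines of Python.
This is only an illustration, not how chan_sip does it; the function name
and the use of plain strings for codecs are my own assumptions.

```python
def reorder_offer(incoming, supported):
    """Return our supported codecs with the incoming leg's codecs first.

    incoming:  codecs from the incoming INVITE, in the caller's preference order
    supported: every codec we are prepared to offer on the outgoing leg
    """
    # Codecs the caller offered and we also support, in the caller's order.
    preferred = [c for c in incoming if c in supported]
    # Everything else we support, in our own configured order.
    rest = [c for c in supported if c not in preferred]
    return preferred + rest
```

For example, reorder_offer(["A", "C", "E"], ["A", "B", "C", "D", "E", "F"])
gives ["A", "C", "E", "B", "D", "F"]. Codecs the caller offered but we don't
support are dropped from the outgoing offer.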

4. Like (3) but turn it on its head: that is, set up the call with direct
media first, offering all codecs; and quickly re-INVITE to fix up the media
stream where transcoding turns out to be necessary.

An example in more detail:

- first phone sends offer: codecs A,C,E from IP address X

- Asterisk sends offer to second phone: codecs A,C,E,B,D,F from IP addr X

- second phone sends response to Asterisk

  - if it picked codec A,C or E then Asterisk completes the direct
    media path by sending an OK to the first phone

  - if it picked codec B,D or F, then Asterisk immediately re-INVITEs
    to the second phone, redirecting the media stream to Asterisk,
    then sends OK to the first phone also directing media at Asterisk.(*)

In the second case, there will be a brief period of time when the first
phone receives media packets for a codec it doesn't grok. Hopefully this
won't crash it; if it does, then you set canreinvite=no on that phone, and
so Asterisk falls back to the behaviour of unconditionally proxying media.

I'd say this is a reasonable solution, if you assume that *not* transcoding
is going to be the most common scenario. It gives a clean glitch-free direct
audio setup in that case. It's also pretty close to the current (but broken)
1.4 behaviour.

  (*) Asterisk could choose whichever of A,C or E is cheapest to transcode;
      or it could honour the preference of the originator by picking the
      first one that it supports (i.e. A, unless Asterisk doesn't have
      an SLIN translator for A)
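
The decision in (4), including the codec choice in the footnote, could be
sketched in Python roughly as follows. Again this is not Asterisk code:
plan_media, the translators set (codecs for which we have an SLIN
translator) and the optional cost table are all hypothetical names invented
for the example.

```python
def plan_media(first_offer, answer_codec, translators, cost=None):
    """Decide how to finish call setup once the second phone has answered.

    first_offer:  codecs offered by the first phone, in its preference order
    answer_codec: the codec the second phone picked from our combined offer
    translators:  set of codecs we can transcode (assumed SLIN translator)
    cost:         optional dict of codec -> relative transcoding cost

    Returns (mode, codec_for_first_phone), where mode is "direct", "proxy"
    or "fail".
    """
    if answer_codec in first_offer:
        # Both ends share the codec: complete the direct media path.
        return ("direct", answer_codec)

    # Transcoding is needed: re-INVITE the second phone towards us, then
    # pick a codec for the first leg from what the first phone offered.
    candidates = [c for c in first_offer if c in translators]
    if not candidates or answer_codec not in translators:
        return ("fail", None)
    if cost is None:
        # Honour the originator's preference: first codec we can handle.
        return ("proxy", candidates[0])
    # Otherwise pick whichever is cheapest to transcode.
    return ("proxy", min(candidates, key=lambda c: cost[c]))
```

With first_offer ["A", "C", "E"] and an answer of "C" this returns
("direct", "C"). With an answer of "D" and no translator for A it returns
("proxy", "C"), i.e. the originator's next preference.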

This is all starting to feel like a circle-squaring exercise though...
people want immediate end-to-end media setup, but people want RTP proxying
and transcoding to work immediately too :-{

Regards,

Brian.