[Asterisk-Dev] video in iax2 spec

Ben Lear benlear at benlear.com
Thu Apr 28 16:28:36 MST 2005


Steve wrote:
> Derek Smithies wrote:
> 
>> Steve,
>> my reading of the paragraph (you wrote)
>> 
>> 
>>> The packetization for everything except theora seems pretty trivial
>>> to do; just take the output of the encoder, and drop it into frames.

Garbage. Read the relevant RFCs and become enlightened.

>>> Theora, however, is like vorbis, in that it has codebooks and stuff
>>> that need to be sent reliably between endpoints. I might need to use
>>> an IE for this kind of thing, although they could be sent in
>>> reliably transmitted frames. I plan to follow the i-d for RTP
>>> packetization for this, but they're changing that around a bit.
>>> (you can see my comments about this in the xiph-rtp mailing list).
 
>> suggests that you are intending different call handling for each
>> different codec. So that Theora needs no additional packetization
>> information added to the mini video frame. Other codecs need
>> additional packetization information added to the mini video frames.
>> 
>> What happens when you are going from H.323 to IAX2 (with video)? My
>> view is that the video packetization format chosen should make VoIP
>> protocol conversion easy. If the IAX2 video packetization is codec
>> dependent, well, VoIP protocol conversion becomes hard.
>> 
>> 
> every codec already defines its own format for how it
> must be packetized.

So you agree with Derek that IAX video formats should use packetization
schemes (payload formats) as described in the relevant RTP RFCs.
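To be concrete about what "packetization per the RFCs" means: H.263 over RTP
(RFC 2190), for example, prefixes every payload with a mode header that the
receiver has to parse before it ever sees a codec bit -- it is not just
encoder output dropped into a frame. A rough sketch of pulling apart the
4-byte Mode A header (field layout per my reading of RFC 2190; the helper
name and the subset of fields shown are mine):

```python
# Sketch: parse the RFC 2190 "Mode A" payload header that precedes the
# H.263 bitstream in each RTP packet. Field positions follow my reading
# of RFC 2190; only a representative subset of fields is decoded here.
import struct

def parse_h263_mode_a(payload: bytes) -> dict:
    """Return selected Mode A header fields from the first 4 bytes."""
    if len(payload) < 4:
        raise ValueError("payload too short for a Mode A header")
    (word,) = struct.unpack("!I", payload[:4])
    return {
        "F":    (word >> 31) & 0x1,   # 0 = Mode A
        "P":    (word >> 30) & 0x1,
        "SBIT": (word >> 27) & 0x7,   # leading bits to ignore in first byte
        "EBIT": (word >> 24) & 0x7,   # trailing bits to ignore in last byte
        "SRC":  (word >> 21) & 0x7,   # source picture format (resolution)
        "I":    (word >> 20) & 0x1,   # 1 = intra-coded picture
        "TR":   word & 0xFF,          # temporal reference
    }
```

The SBIT/EBIT fields alone show why arbitrary byte-splitting of the encoder
output doesn't work: fragments are bit-aligned on GOB/MB boundaries.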

> For theora, I've already written the
> packetization such that it matches the (present) i-d
> describing the RTP packetization. The problem is that that
> specification depends on using SDP as well, as a reliable
> channel to pass along the codebooks, and IAX2 doesn't use SDP.

Theora, that's excellent (if you just want to talk to your own
implementation).
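Since IAX2 has no SDP channel to carry the Theora setup headers, one option
is to bundle them into a single blob carried in a reliably-transmitted full
frame or IE. A hypothetical framing (length-prefixed; this is my own sketch,
not anything from the IAX2 draft or the xiph-rtp i-d):

```python
# Sketch (hypothetical framing): bundle the three Theora setup headers
# (info, comment, codebooks) into one blob with 16-bit length prefixes,
# so the receiver can re-split them. New codebooks could then be pushed
# in a reliable frame whenever the video source changes.
import struct

def pack_headers(headers: list) -> bytes:
    blob = struct.pack("!B", len(headers))          # header count
    for h in headers:
        blob += struct.pack("!H", len(h)) + h       # length-prefixed header
    return blob

def unpack_headers(blob: bytes) -> list:
    count = blob[0]
    headers, off = [], 1
    for _ in range(count):
        (n,) = struct.unpack_from("!H", blob, off)
        off += 2
        headers.append(blob[off:off + n])
        off += n
    return headers
```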

> Also, you need to send new codebooks whenever the source of
> the video stream changes; This can happen quite a lot in
> asterisk (i.e. you can be in a conference, and the
> video-source switches a lot, or you can be "watching" several
> video-voicemails, etc). In SIP you'd need to do a "re-invite"
> or something in order to send new SDP information each time.
> 
> For the other codecs, I haven't really looked into how their
> RTP packetization works, but I would imagine it's simpler to
> implement,

You imagine wrong.

> and may just be dumping the output from the
> encoder into the data portion of the packets (with some
> additional work required if it's more than some maximum size
> for RTP packets, and you need to span them).

Wrong again.

>> I think the RTP header used in H323 for video works very well, and
>> should be used in the video packets. Yes, I know RTP headers take up
>> bytes. However, given that we are sending compressed video, an extra
>> couple of bytes are not going to "break the bank".
>> 
>> 
> I'm not sure that encapsulating RTP inside of IAX is
> necessarily going to be a good idea; you basically will then
> end up with multiple timestamps, and stuff like that. I'd
> just get the data portion (payload) to be the same, and then
> it should be as easy to translate the headers as it is for audio.

Don't forget the format-specific payload headers (used to indicate how to
depacketize), which I believe is what Derek was actually referring to. In any
case, if we use packetization as per the relevant RTP payload specification,
then conversion from H.323/SIP H.263 to IAX would involve no
transcoding/re-packetization, just replacing the root RTP header with an IAX
equivalent. Required out-of-band data is another issue, though at least
agreeing on the payload format of video frames is a step in the right
direction. In the meantime that can be worked around with a specific
configuration.
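As a sketch of that "replace the root header" translation: parse the fixed
RTP header (RFC 3550), keep the payload bytes untouched, and rebuild only
the header. The IAX2 side below is purely hypothetical -- a mini-video-style
4-byte header carrying the call number plus the low 15 bits of the RTP
timestamp with a marker bit -- not the actual wire format from the draft:

```python
# Sketch: translate an RTP video packet to a HYPOTHETICAL IAX-style
# mini video frame. The RTP parsing follows RFC 3550; the 4-byte IAX
# header layout here is illustrative only, not the draft's format.
import struct

def rtp_to_iax_video(rtp: bytes, call_no: int) -> bytes:
    v_p_x_cc, m_pt, seq, ts, ssrc = struct.unpack("!BBHII", rtp[:12])
    if (v_p_x_cc >> 6) != 2:
        raise ValueError("not RTP version 2")
    csrc_count = v_p_x_cc & 0x0F
    payload = rtp[12 + 4 * csrc_count:]      # skip any CSRC list, keep payload
    marker = (m_pt >> 7) & 0x1               # RTP marker bit (end of picture)
    # Hypothetical header: call number with high bit set, then marker bit
    # plus the low 15 bits of the RTP timestamp. Payload copied verbatim.
    iax_hdr = struct.pack("!HH", 0x8000 | call_no,
                          (marker << 15) | (ts & 0x7FFF))
    return iax_hdr + payload
```

The point being: if the payload format matches the RTP payload RFCs, this
header swap is all a gateway has to do, same as for audio.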

>> I think also that the video packetization should be as consistent as
>> possible for the major video codecs.
>> 
>> 
> Hmm, I was aiming for inconsistency. I figured that the
> payload format would be some random number of bytes from one
> frame, then some random data, and then some random number of
> bytes from another frame. This should change with each frame.
> Sometimes, you should XOR the frame's data with some other
> random data too, just for fun.

At this point in time, without a clear definition of how video is to be
handled end to end, you might as well send a random number of bytes.

Cheers,

Ben.


More information about the asterisk-dev mailing list