[Asterisk-Dev] video in iax2 spec

Thu Apr 28 14:09:47 MST 2005

Derek Smithies wrote:

>Steve,
>my reading of the paragraph (you wrote)
>
>>The packetization for everything except theora seems pretty trivial to 
>>do; just take the output of the encoder, and drop it into frames. 
>>Theora, however, is like vorbis, in that it has codebooks and stuff that 
>>need to be sent reliably between endpoints. I might need to use an IE 
>>for this kind of thing, although they could be sent in reliably 
>>transmitted frames. I plan to follow the i-d for RTP packetization for 
>>this, but they're changing that around a bit. (you can see my comments 
>>about this in the xiph-rtp mailing list).
>
>suggests that you are intending different call handling for each different 
>codec. So that Theora needs no additional packetization information added 
>to the mini video frame. Other codecs need additional packetization 
>information added to the mini video frames.
>
>What happens when you are going from h323 to iax2 (with video)? My view 
>is that the video packetization format chosen should make voip protocol
>conversion easy. If the iax2 video packetization is codec dependent, 
>well, voip protocol conversion becomes hard.
>
Every codec already defines its own format for how it must be 
packetized. For theora, I've already written the packetization such that 
it matches the (present) i-d describing the RTP packetization. The 
problem is that that specification also depends on SDP as a reliable 
channel to pass along the codebooks, and IAX2 doesn't use SDP.

Also, you need to send new codebooks whenever the source of the video 
stream changes. This can happen quite a lot in Asterisk (e.g. you can be 
in a conference where the video source switches a lot, or you can be 
"watching" several video-voicemails, etc). In SIP you'd need to do a 
"re-invite" or something in order to send new SDP information each time.

For the other codecs, I haven't really looked into how their RTP 
packetization works, but I would imagine it's simpler to implement, and 
may just amount to dumping the output from the encoder into the data 
portion of the packets (with some additional work required when a frame 
is larger than the maximum RTP packet size and has to span several 
packets).
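
A minimal sketch of that spanning step, assuming each piece carries a 
fragment index and a fragment count so the receiver can put the frame 
back together. The names `split_frame` and `join_frame` and the 
(index, total, chunk) layout are made up for illustration; they are not 
taken from any RTP payload format or from IAX2.

```python
def split_frame(encoded, max_payload):
    """Split one encoded video frame into (index, total, chunk) pieces.

    Hypothetical layout: index and total are carried next to each chunk
    so the receiver knows when it has the whole frame.
    """
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [encoded[i:i + max_payload]
              for i in range(0, len(encoded), max_payload)]
    if not chunks:
        # An empty frame still yields one (empty) piece.
        chunks = [b""]
    total = len(chunks)
    return [(index, total, chunk) for index, chunk in enumerate(chunks)]


def join_frame(pieces):
    """Reassemble a frame from its pieces, in whatever order they arrived."""
    ordered = sorted(pieces, key=lambda piece: piece[0])
    if not ordered:
        raise ValueError("no fragments")
    total = ordered[0][1]
    if [piece[0] for piece in ordered] != list(range(total)):
        raise ValueError("missing or duplicate fragment")
    return b"".join(piece[2] for piece in ordered)
```

A frame that fits in one packet comes back as a single piece, so the 
simple "dump the encoder output into the packet" case falls out of the 
same code path.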

>I think the RTP header used in H323 for video works very well, and should 
>be used in the video packets. Yes, I know RTP headers take up bytes.
>However, given that we are sending compressed video, an extra couple of 
>bytes are not going to "break the bank".
>
I'm not sure that encapsulating RTP inside of IAX is going to be a good 
idea; you'd basically end up with duplicate timestamps, and stuff like 
that. I'd just get the data portion (payload) to be the same, and then 
it should be as easy to translate the headers as it is for audio.
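
Roughly, the RTP side of such a translation could look like the sketch 
below: strip the RTP header, keep the payload untouched, and carry the 
timestamp over in milliseconds. The 12-byte fixed header is the standard 
RTP layout; the 90 kHz clock is the usual rate for RTP video but is an 
assumption here, and the names `parse_rtp` and `rtp_ts_to_ms` are 
hypothetical. The IAX2 side is left out on purpose, since no video frame 
layout has been settled in this thread.

```python
import struct

RTP_VIDEO_CLOCK_HZ = 90000  # assumed clock rate for the video payload


def parse_rtp(packet):
    """Split an RTP packet into the header fields we need and the payload."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, timestamp, _ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    offset = 12 + 4 * (b0 & 0x0F)          # skip the CSRC list
    if b0 & 0x10:                          # header extension present
        if len(packet) < offset + 4:
            raise ValueError("truncated header extension")
        _profile, ext_words = struct.unpack("!HH", packet[offset:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(packet)
    if b0 & 0x20:                          # padding present
        if end <= offset:
            raise ValueError("padding flag set on empty payload")
        pad_len = packet[-1]
        if pad_len == 0 or pad_len > end - offset:
            raise ValueError("bad padding length")
        end -= pad_len
    if offset > end:
        raise ValueError("header runs past end of packet")
    return {
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "payload": packet[offset:end],     # copied as-is into the new frame
    }


def rtp_ts_to_ms(timestamp, clock_hz=RTP_VIDEO_CLOCK_HZ):
    """Convert an RTP timestamp to milliseconds."""
    return (timestamp * 1000) // clock_hz
```

The point is that only the header gets rewritten; the payload bytes are 
passed through unchanged, the same way it works for audio.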

>I think also that the video packetization should be as consistent as 
>possible for the major video codecs.
>
Hmm, I was aiming for inconsistency. I figured that the payload format 
would be some random number of bytes from one frame, then some random 
data, and then some random number of bytes from another frame. This 
should change with each frame. Sometimes, you should XOR the frame's 
data with some other random data too, just for fun.