[Asterisk-video] theora

Fri Jan 28 10:44:07 CST 2011

On Jan 28, 2011, at 10:37 AM, Maris wrote:

> Around 2007, an item in the tracker system (11087) presented an incomplete patch for implementing Theora passthrough.
> The item has never been continued.
> 
> Meanwhile, things have changed; in particular, Theora has improved considerably over the past two years.
> 
> Readers of this list are probably familiar with the pros and cons, but to summarize:
> Why Theora? Clearly better than h.263(p), and open source with no licensing issues.
> Why not h.264? Not typically supported by SIP clients, and moreover a CPU resource hog.
> And mp4v (new as of Asterisk 1.8?) isn't typically supported by SIP clients either.
> 
> Theora/h264 objective comparison with tables:
> http://keyj.s2000.ws/?p=356
> 
> Any opinions?

Hi,

I am the author of the 11087 patch. I have also worked extensively on developing a video chat/conference solution based on Theora and Asterisk/IAX2. The code can be found on SourceForge; look for the iaxclient and appconference projects.

In my opinion, Theora is a decent codec for video storage and playback, but sub-par when it comes to realtime applications such as video conferencing. Here's a list of issues we had with it and the approaches we took to overcome or sidestep these issues. This is a verbatim copy of a comment I made in the iaxclient source code:

/*
 * Some comments about Theora streaming
 * Theora video codec has two problems when it comes to streaming
 * and broadcasting video:
 *
 * - Large headers that need to be passed from the encoder to the decoder
 *   to initialize it. The conventional wisdom says we should transfer the
 *   headers out of band, but that complicates things with IAX, which does
 *   not have a separate signalling channel. Also, it makes things really
 *   difficult in a video conference scenario, where video gets switched
 *   between participants regularly. To solve this issue, we initialize
 *   the encoder and the decoder at the same time, using the headers from
 *   the local encoder to initialize the decoder. This works if the
 *   endpoints use the exact same version of Theora and the exact same
 *   parameters for initialization.
 *
 * - No support for splitting the frame into multiple slices.  Frames can
 *   be relatively large. For a 320x240 video stream, you can see key
 *   frames larger than 9KB, which is the maximum UDP packet size on Mac
 *   OS X. To work around this limitation, we use the slice API to fragment
 *   encoded frames to a reasonable size that UDP can safely transport
 * 
 * Other miscellaneous comments:
 *
 * - For quality reasons, when we detect a video stream switch, we reject all
 *   incoming frames until we receive a key frame.
 *
 * - Theora only accepts video whose dimensions are multiples of 16. If we
 *   combine this with a 4:3 aspect ratio requirement, we get a very limited number
 *   of available resolutions. To work around this limitation, we pad the video
 *   on encoding, up to the closest multiple of 16. On the decoding side, we
 *   remove the padding. This way, video resolution can be any multiple of 2
 *
 * We should probably look more into this (how to deal with missing and
 * out of order slices)
 */
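
To make the padding and fragmentation workarounds above a bit more concrete, here is a minimal Python sketch of the arithmetic involved. It is not the iaxclient code: the function names, the (index, total, payload) slice layout and the 1024-byte fragment size are all assumptions chosen for illustration. Dropping a frame when a slice is missing is also just one possible policy, since the comment leaves that question open.

```python
from typing import List, Optional, Tuple

BLOCK = 16           # Theora frame dimensions must be multiples of 16
MAX_PAYLOAD = 1024   # assumed fragment size in bytes (not taken from iaxclient)

# Hypothetical slice layout: (slice index, total slices, payload bytes)
Slice = Tuple[int, int, bytes]


def padded_size(width: int, height: int) -> Tuple[int, int]:
    """Round each dimension up to the closest multiple of 16."""
    def pad(n: int) -> int:
        return (n + BLOCK - 1) // BLOCK * BLOCK
    return pad(width), pad(height)


def fragment(frame: bytes, max_payload: int = MAX_PAYLOAD) -> List[Slice]:
    """Split one encoded frame into slices small enough for a UDP packet."""
    total = max(1, -(-len(frame) // max_payload))  # ceiling division
    return [
        (i, total, frame[i * max_payload:(i + 1) * max_payload])
        for i in range(total)
    ]


def reassemble(slices: List[Slice]) -> Optional[bytes]:
    """Rebuild a frame from slices received in any order.

    Returns None if a slice is missing (assumed policy: drop the frame).
    """
    if not slices:
        return None
    total = slices[0][1]
    by_index = {index: payload for index, _, payload in slices}
    if set(by_index) != set(range(total)):
        return None
    return b"".join(by_index[i] for i in range(total))
```

With these definitions, a 322x242 capture would be padded to 336x256 before encoding and cropped back after decoding, and a key frame a little over 9KB would travel as ten slices instead of one oversized packet.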

As you can see, we had to go through some pains to get things running. Unfortunately, at the time, Theora was the only license-free video codec available, so we had to go with it. Nowadays, you also have WebM, which looks like it doesn't have many of these limitations. I am aware that Theora has improved over the past few years, but I think the improvements were mostly performance-related rather than fundamental. I might be wrong, though; it's been a while since I really looked at it.

Mihai