[asterisk-dev] Strategies for handling RTCP feedback in codec modules

Fri Nov 11 08:09:51 CST 2016

2016-11-07 16:31 GMT+01:00 Lorenzo Miniero <lminiero at gmail.com>:

> 2016-11-07 16:17 GMT+01:00 Joshua Colp <jcolp at digium.com>:
>
>> Lorenzo Miniero wrote:
>>
>>> Hi all,
>>>
>>> apologies if this has been discussed before, but I couldn't find
>>> anything in the recent months on this group so I thought I'd write
>>> anyway.
>>>
>>> As a few others, I believe, I have been trying to find a way to make
>>> codec modules more aware of what's happening on the wire. In particular,
>>> the motivation for that comes from an attempt to make the open source
>>> Opus codec module more responsive and adaptive to changes in the
>>> network, taking advantage of the functionality the library provides in
>>> that respect (e.g., dynamic bitrate adaptation). The best approach to do
>>> that would obviously be providing codec modules with info on the RTCP
>>> feedback loop, e.g., in terms of losses the recipient has experienced,
>>> so that you can, for instance, change the bitrate in the encoder.
>>> Unfortunately, as of now there doesn't seem to be any way to make this
>>> possible, at least not in an easy way, in Asterisk out of the box.
>>> I've been investigating a few ways to do that, and have come up with
>>> some possible approaches, that I first wanted to discuss with you guys
>>> though, first of all to make sure I'm actually on the right path, and
>>> then to evaluate whether or not any of those can actually be integrated
>>> within the Asterisk code base (as I do believe such a feedback loop
>>> would be beneficial to other codec modules as well, and not only Opus).
>>> If you're interested in some more motivation behind this, you can read
>>> my discussion with Alexander Traud in a comment to his fork to the
>>> asterisk-opus repo here: https://github.com/traud/asterisk-opus/issues/3
>>>
>>>
>>> For the sake of completeness, Alexander himself thought of a possible
>>> approach for integrating this feedback in a comment to another post:
>>> https://github.com/seanbright/asterisk-opus/issues/25#issuec
>>> omment-249420010
>>> where the idea is to pass a reference of the ast_rtp_instance into the
>>> codec module itself. While this could possibly do the trick, I don't
>>> believe this would be a viable option, as it would break the
>>> architecture and module relationships, but I thought I'd mention it
>>> anyway.
>>>
>>> One possible option that I had thought about was extending ast_frame to
>>> convey RTCP feedback to modules, along with media to translate. This
>>> would allow such feedback to take the same "path" as media packets,
>>> meaning codec modules wouldn't need to be aware of any additional
>>> core-related feature, but only that sometimes they might receive control
>>> data instead of media to translate. Anyway, this could probably be
>>> problematic to integrate with the translator's architecture, and would
>>> probably need "cooperation" from channel modules as well, so may be a
>>> bit overkill and bug-prone.
>>>
>>
>> I don't think the ast_frame needs to be extended, but merely a new frame
>> type added that has a payload defined for this purpose. Right now reading
>> RTCP data returns a null frame. It can be changed to return this new frame
>> with the data. The ast_read function can recognize this frame type and
>> invoke what ever logic may be required. This can include giving this data
>> to the translation path (implementations can be changed to explicitly
>> define that they support it). This also does not alter the threading model
>> and gives a guarantee that the codec won't have to protect itself, like
>> would need to be done for a stasis message.
>>
>>
>
> Yes, a new ast_frame type is what I assumed would be a way as well, but as
> you point out it would indeed require some refactoring here and there to
> make the parties, and the translator core in particular, aware of that.
> Anyway, I guess this is fair and definitely makes sense and fits the
> architecture.
>
> I'll try to experiment with this approach as well and come back with some
> proof of concept code that shows this working.
>
>

Hi all,

I just completed a tentative patch that implements what we discussed. You
can find it attached (assuming mailman doesn't strip it), and I used the
Asterisk 14.1.2 as a basis if you want to try it. Can you give a look to
see if this is in line with what we discussed, and if as an approach it is
in line with the architecture philosophy of Asterisk itself?

To summarize, this is what I did:


   1. I added a new ast_frame frametype called AST_FRAME_RTCP (I guess we
   could re-use AST_FRAME_CONTROL with something like an AST_CONTROL_RTCP
   subclass, but that's semantics);
   2. I also added a new callback that translators can implement, called
   "feedback", that takes a pvt and a frame as parameters just as "framein";
   as a proof of concept, I implemented a placeholder method for codec_speex
   that prints the incoming feedback;
   3. when the RTP stack receives a SR/RR, ast_rtcp_read returns a frame
   containing a copy of the ast_rtp_rtcp_report object instead of a null frame;
   4. when __ast_read gets an AST_FRAME_RTCP frame, it checks if a "write"
   translator exists (as we want to notify the encoder, not the decoder) via
   ast_channel_writetrans(chan) and calls ast_translate over it to pass over
   the frame;
   5. when ast_translate gets an AST_FRAME_RTCP frame, it checks if the
   "feedback" callback exists for the specified translator, and if it does it
   sends the frame there;
   6. the feedback callback in the translator can be used by the codec to
   parse the ast_rtp_rtcp_report object and handle it accordingly.


Not sure if this is exactly what you meant or envisaged but, while probably
ugly, it seems to work fine, and does not affect codec modules not
interested in or aware of the feature.

That said, there are probably a few things to think about. For one, the
feedback is "routed" by __ast_read automatically, without involving channel
modules themselves. Not sure if this is good or not, but I thought it made
sense to have something that worked automatically no matter the channel,
and without requiring any update on their code either. Nevertheless, there
may be reasons I'm not aware of for having this go through channel modules
anyway, so let me know if that's indeed an issue.

Another potential issue in the patch in its current form is that I'm not
sure if codec translator chaining can happen in calls and if we should
care. If, for instance, there's a slin->slin16->codecX because codecX only
has "native" slin16->codecX support but we're bridging to slin, then this
approach may need be extended, namely to make sure the feedback gets to all
the pieces, or at least to the one that really matters (the last one?) In
fact, I guess ast_channel_writetrans(chan) would pass the feedback to
slin->slin16, and not to slin16->codecX which is where the real deal should
happen instead. Not exactly sure on how we could handle this: at the
moment, the feedback callback doesn't expect the module to return anything,
something we might change to see if we can have this feedback crawl up.
Alternatively this could be done automatically within translate.c itself,
who would be aware of such a chain. Again, not even sure this is an issue
but I thought I'd mention it.

To conclude, the patch doesn't currently do anything with the feedback. I
guess that at this stage that's beyond the point, as the main result I
wanted to achieve was putting up some sort of feedback mechanism in the
first place. Should you believe I'm on the right way I'll work a bit on the
codec stuff itself (and speex here is a good candidate as it does have many
knobs one can play with).

As a side note, I anticipated the code is ugly in its current form, and I
haven't checked if there can be leaks as it is. I've been away from
Asterisk coding for a while, so fresh eyes can probably help spot those, if
any! :-)

Hope this helps, looking forward to your thoughts!
Lorenzo



>
>>
>>> Another possible approach, and possibly the way to go, is to make use of
>>> the Stasis message bus, something I was not aware of until I watched
>>> Matt's excellent presentation at a conference recently. I saw how the
>>> RTP engine in Asterisk does publish RTCP feecback on the bus, and that
>>> you can subscribe to that as other events (as the HEP integration does,
>>> for instance) from other parts of the code. I tried doing the same
>>> within the Opus codec implementation, and apart from some quirks (e.g.,
>>> weird fields in report blocks, like negative source_ssrc) it seemed to
>>> do exactly what I was looking for. The only problem, though, is that
>>> while a Stasis event contains a whole lot of info, codec modules are
>>> pretty much clueless and have no way of matching a specific event
>>> related to a specific call to a translator context they're handling. In
>>> fact, AFAIK codec modules have no visibility at all of the channel a
>>> translator is associated with, or of other identifiers it could rely on.
>>> Thinking about this I did find a way to implement some kind of loose
>>> mapping by extending the ast_frame structure with two new properties,
>>> "ssrc" and "themssrc": basically, anytime an RTP packet is received, the
>>> RTP engine copies the local and remote SSRC to the frame before passing
>>> it to the core. When the first packet gets to the codec module, it can
>>> keep track of them and save them locally to its own internal struct. So,
>>> when a new event comes later from Stasis (e.g., an RTCP RR), it can look
>>> at the SSRC it relates to and match it with the SSRCs it is aware of, so
>>> that it knows it's related to a specific translator context and react
>>> accordingly. While this seems effective, it has a few issues, though.
>>> For instance, there's no way to assume remote SSRCs will be unique,
>>> which means this could result in either missing or misleading feedback
>>> in some cases, or even that they'll stay the same for the whole call.
>>> Another aspect I haven't considered is the possible overhead, although I
>>> don't think crawling a JSON object should take much resources. Besides,
>>> I still haven't understood how Asterisk hashtables work, so this part is
>>> still just theory :-)
>>>
>>
>> Personally I'm not a huge fan of using the stasis messages for exactly
>> the reason you've given in your follow-up email. There's no obvious
>> reliable data to key off of and it also requires synchronization between
>> the thread handling the message and the thread doing the transcoding. This
>> changes the threading model for transcoding enough to be of a concern to me
>> and also introduces another lock.
>>
>>
>
> I had thought about synchronization and locking may be easily avoided if
> you just keep track of what needs to be changed when receiving the message,
> and only enforce it when you get the next packet you need to transcode. But
> yes, it does make things more complicated threading-wise, especially
> considering that even just keeping track of things may require atomic
> operations to be safe.
>
> Thanks,
> Lorenzo
>
>
>
>>
>>> That's basically it. Do you have any feeling/feedback on this? Is this
>>> actually worth investigating, and do you believe I'm on the right track?
>>> Any suggestion on how to make the mapping more effective in the codec
>>> module? I'm of course willing to contribute to such an effort, if it's
>>> deemed worthwhile.
>>>
>>
>> Cheers,
>>
>> --
>> Joshua Colp
>> Digium, Inc. | Senior Software Developer
>> 445 Jan Davis Drive NW - Huntsville, AL 35806 - US
>> Check us out at: www.digium.com & www.asterisk.org
>>
>> --
>> _____________________________________________________________________
>> -- Bandwidth and Colocation Provided by http://www.api-digital.com --
>>
>> asterisk-dev mailing list
>> To UNSUBSCRIBE or update options visit:
>>   http://lists.digium.com/mailman/listinfo/asterisk-dev
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20161111/70693be1/attachment-0001.html>
-------------- next part --------------
diff -urN asterisk-14.1.2/codecs/codec_speex.c asterisk-14.1.2-PATCHED/codecs/codec_speex.c

--- asterisk-14.1.2/codecs/codec_speex.c	2016-11-10 20:43:02.000000000 +0100
+++ asterisk-14.1.2-PATCHED/codecs/codec_speex.c	2016-11-11 14:38:04.408658123 +0100
@@ -57,6 +57,9 @@
 #include "asterisk/frame.h"
 #include "asterisk/linkedlists.h"
 
+/* For struct ast_rtp_rtcp_report and struct ast_rtp_rtcp_report_block */
+#include "asterisk/rtp_engine.h"
+
 /* codec variables */
 static int quality = 3;
 static int complexity = 2;
@@ -344,6 +347,21 @@
 	return result;
 }
 
+/*! \brief handle incoming RTCP feedback and possibly edit encoder settings */
+static void lintospeex_feedback(struct ast_trans_pvt *pvt, struct ast_frame *feedback)
+{
+	/* TODO */
+	ast_verb(3, "Got RTCP!\n");
+	struct ast_rtp_rtcp_report *rtcp_report = (struct ast_rtp_rtcp_report *)feedback->data.ptr;
+	ast_verb(3, "  -- type=%d\n", rtcp_report->type);
+	ast_verb(3, "  -- rc=%d\n", rtcp_report->reception_report_count);
+	if(rtcp_report->reception_report_count > 0) {
+		struct ast_rtp_rtcp_report_block *report_block = rtcp_report->report_block[0];
+		ast_verb(3, "  -- -- ssrc=%u\n", report_block->source_ssrc);
+		ast_verb(3, "  -- -- fraction=%u\n", report_block->lost_count.fraction);
+	}
+}
+
 static void speextolin_destroy(struct ast_trans_pvt *arg)
 {
 	struct speex_coder_pvt *pvt = arg->pvt;
@@ -402,6 +420,7 @@
 	.newpvt = lintospeex_new,
 	.framein = lintospeex_framein,
 	.frameout = lintospeex_frameout,
+	.feedback = lintospeex_feedback,
 	.destroy = lintospeex_destroy,
 	.sample = slin8_sample,
 	.desc_size = sizeof(struct speex_coder_pvt),
@@ -448,6 +467,7 @@
 	.newpvt = lin16tospeexwb_new,
 	.framein = lintospeex_framein,
 	.frameout = lintospeex_frameout,
+	.feedback = lintospeex_feedback,
 	.destroy = lintospeex_destroy,
 	.sample = slin16_sample,
 	.desc_size = sizeof(struct speex_coder_pvt),
@@ -493,6 +513,7 @@
 	.newpvt = lin32tospeexuwb_new,
 	.framein = lintospeex_framein,
 	.frameout = lintospeex_frameout,
+	.feedback = lintospeex_feedback,
 	.destroy = lintospeex_destroy,
 	.desc_size = sizeof(struct speex_coder_pvt),
 	.buffer_samples = BUFFER_SAMPLES,
diff -urN asterisk-14.1.2/include/asterisk/frame.h asterisk-14.1.2-PATCHED/include/asterisk/frame.h
--- asterisk-14.1.2/include/asterisk/frame.h	2016-11-10 20:43:02.000000000 +0100
+++ asterisk-14.1.2-PATCHED/include/asterisk/frame.h	2016-11-11 14:37:08.986415407 +0100
@@ -127,6 +127,8 @@
 	 * directly into bridges.
 	 */
 	AST_FRAME_BRIDGE_ACTION_SYNC,
+	/*! RTCP feedback */
+	AST_FRAME_RTCP,
 };
 #define AST_FRAME_DTMF AST_FRAME_DTMF_END
 
diff -urN asterisk-14.1.2/include/asterisk/translate.h asterisk-14.1.2-PATCHED/include/asterisk/translate.h
--- asterisk-14.1.2/include/asterisk/translate.h	2016-11-10 20:43:02.000000000 +0100
+++ asterisk-14.1.2-PATCHED/include/asterisk/translate.h	2016-11-11 14:37:08.987415413 +0100
@@ -121,7 +121,7 @@
  *
  * As a minimum, a translator should supply name, srcfmt and dstfmt,
  * the required buf_size (in bytes) and buffer_samples (in samples),
- * and a few callbacks (framein, frameout, sample).
+ * and a few callbacks (framein, frameout, feedback, sample).
  * The outbuf is automatically prepended by AST_FRIENDLY_OFFSET
  * spare bytes so generic routines can place data in there.
  *
@@ -159,6 +159,10 @@
 	                                       /*!< Output frame callback. Generate a frame 
 	                                        *   with outbuf content. */
 
+	void (*feedback)(struct ast_trans_pvt *pvt, struct ast_frame *feedback);
+	                                       /*!< Feedback frame callback. Handle 
+	                                        *   input frame. */
+
 	void (*destroy)(struct ast_trans_pvt *pvt);
 	                                       /*!< cleanup private data, if needed 
 	                                        *   (often unnecessary). */
diff -urN asterisk-14.1.2/main/channel.c asterisk-14.1.2-PATCHED/main/channel.c
--- asterisk-14.1.2/main/channel.c	2016-11-10 20:43:02.000000000 +0100
+++ asterisk-14.1.2-PATCHED/main/channel.c	2016-11-11 14:36:25.932189549 +0100
@@ -1476,6 +1476,7 @@
 	case AST_FRAME_IAX:
 	case AST_FRAME_CNG:
 	case AST_FRAME_MODEM:
+	case AST_FRAME_RTCP:
 		return 0;
 	}
 	return 0;
@@ -2832,6 +2833,7 @@
 				case AST_FRAME_IMAGE:
 				case AST_FRAME_HTML:
 				case AST_FRAME_MODEM:
+				case AST_FRAME_RTCP:
 					done = 1;
 					break;
 				case AST_FRAME_CONTROL:
@@ -4284,6 +4286,13 @@
 			 */
 			ast_read_generator_actions(chan, f);
 			break;
+		case AST_FRAME_RTCP:
+			/* Incoming RTCP feedback needs to get to the translator for
+			 * outgoing media, which means we treat it as an ast_write */
+			if(ast_channel_writetrans(chan))
+				ast_translate(ast_channel_writetrans(chan), f, 0);
+			ast_frfree(f);
+			f = &ast_null_frame;
 		default:
 			/* Just pass it on! */
 			break;
diff -urN asterisk-14.1.2/main/frame.c asterisk-14.1.2-PATCHED/main/frame.c
--- asterisk-14.1.2/main/frame.c	2016-11-10 20:43:02.000000000 +0100
+++ asterisk-14.1.2-PATCHED/main/frame.c	2016-11-11 14:36:25.930189538 +0100
@@ -535,6 +535,8 @@
 			break;
 		}
 		break;
+	case AST_FRAME_RTCP:
+		ast_copy_string(subclass, "RTCP", slen);
 	default:
 		ast_copy_string(subclass, "Unknown Subclass", slen);
 		break;
@@ -586,6 +588,9 @@
 	case AST_FRAME_VIDEO:
 		ast_copy_string(ftype, "Video", len);
 		break;
+	case AST_FRAME_RTCP:
+		ast_copy_string(ftype, "RTCP", len);
+		break;
 	default:
 		snprintf(ftype, len, "Unknown Frametype '%u'", frame_type);
 		break;
@@ -623,6 +628,9 @@
 	if (f->frametype == AST_FRAME_VIDEO) {
 		return;
 	}
+	if (f->frametype == AST_FRAME_RTCP) {
+		return;
+	}
 
 	ast_frame_type2str(f->frametype, ftype, sizeof(ftype));
 	ast_frame_subclass2str(f, subclass, sizeof(subclass), moreinfo, sizeof(moreinfo));
diff -urN asterisk-14.1.2/main/translate.c asterisk-14.1.2-PATCHED/main/translate.c
--- asterisk-14.1.2/main/translate.c	2016-11-10 20:43:02.000000000 +0100
+++ asterisk-14.1.2-PATCHED/main/translate.c	2016-11-11 14:36:25.931189544 +0100
@@ -532,6 +532,14 @@
 	long len;
 	int seqno;
 
+	if(f->frametype == AST_FRAME_RTCP) {
+		/* Just pass the feedback to the right callback, if it exists.
+		 * This "translation" does nothing so return a null frame. */
+		if(p->t->feedback)		
+			p->t->feedback(p, f);
+		return &ast_null_frame;		
+	}
+
 	has_timing_info = ast_test_flag(f, AST_FRFLAG_HAS_TIMING_INFO);
 	ts = f->ts;
 	len = f->len;
diff -urN asterisk-14.1.2/res/res_rtp_asterisk.c asterisk-14.1.2-PATCHED/res/res_rtp_asterisk.c
--- asterisk-14.1.2/res/res_rtp_asterisk.c	2016-11-10 20:43:02.000000000 +0100
+++ asterisk-14.1.2-PATCHED/res/res_rtp_asterisk.c	2016-11-11 14:37:29.235508267 +0100
@@ -4236,6 +4236,28 @@
 					rtcp_report,
 					message_blob);
 			ast_json_unref(message_blob);
+
+			/* Return an AST_FRAME_RTCP frame with the ast_rtp_rtcp_report
+			 * object as a its data */
+			rtp->f.frametype = AST_FRAME_RTCP;
+			rtp->f.data.ptr = rtp->rawdata + AST_FRIENDLY_OFFSET;
+			memcpy(rtp->f.data.ptr, rtcp_report, sizeof(struct ast_rtp_rtcp_report));
+			rtp->f.datalen = sizeof(struct ast_rtp_rtcp_report);
+			if(rc > 0) {
+				/* There's always a single report block stored, here */
+				report_block = rtp->f.data.ptr + rtp->f.datalen + sizeof(struct ast_rtp_rtcp_report_block *);
+				memcpy(report_block, rtcp_report->report_block[0], sizeof(struct ast_rtp_rtcp_report_block));
+				struct ast_rtp_rtcp_report *rtcp_report2 = (struct ast_rtp_rtcp_report *)rtp->f.data.ptr;
+				rtcp_report2->report_block[0] = report_block;
+				rtp->f.datalen += sizeof(struct ast_rtp_rtcp_report_block);
+			}
+			rtp->f.offset = AST_FRIENDLY_OFFSET;
+			rtp->f.samples = 0;
+			rtp->f.mallocd = 0;
+			rtp->f.delivery.tv_sec = 0;
+			rtp->f.delivery.tv_usec = 0;
+			rtp->f.src = "RTP";
+			f = &rtp->f;
 			break;
 		case RTCP_PT_FUR:
 		/* Handle RTCP FIR as FUR */