<html>
<head>
<base href="https://wiki.asterisk.org/wiki">
<link rel="stylesheet" href="/wiki/s/en/2166/4/9/_/styles/combined.css?spaceKey=TOP&forWysiwyg=true" type="text/css">
</head>
<body style="background: white;" bgcolor="white" class="email-body">
<div id="pageContent">
<div id="notificationFormat">
<div class="wiki-content">
<div class="email">
<h2><a href="https://wiki.asterisk.org/wiki/display/TOP/Media+Operations">Media Operations</a></h2>
<h4>Page <b>edited</b> by <a href="https://wiki.asterisk.org/wiki/display/~kpfleming">Kevin P. Fleming</a>
</h4>
<div id="versionComment">
<b>Comment:</b>
reduced size of Gliffy diagrams<br />
</div>
<br/>
<h4>Changes (3)</h4>
<div id="page-diffs">
<table class="diff" cellpadding="0" cellspacing="0">
<tr><td class="diff-snipped" >...<br></td></tr>
<tr><td class="diff-unchanged" >The session gateway, upon setting up the G.711 u-Law stream source for the session, will inspect the configured media operations to employ. First, the pitch shifter is looked up. The pitch shifter is found using the service locator, and its {{MediaOperationFactory}} proxy is returned. From it, a {{MediaOperation}} is instantiated. <br> <br></td></tr>
<tr><td class="diff-changed-lines" ><span class="diff-changed-words">{gliffy:name=media_flow_mismatch|align=left|size=<span class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;">L</span><span class="diff-added-chars"style="background-color: #dfd;">M</span>|version=3}</span> <br></td></tr>
<tr><td class="diff-unchanged" > <br>However, this on its own cannot work. The 8 kHz signed linear sink of the pitch shifter cannot accept G.711 u-Law media frames from the media stream source. The session gateway calls into the service locator to find a {{MediaOperation}} that can translate between G.711 u-Law and 8 kHz signed linear. The service locator attempts to find a media operation with {{MediaOperationAttributes}} that satisfy this need. In this case, the Media Transformation service's wizardry is not required because there is a single {{MediaOperation}} that can translate between the two formats, and the service locator knows about its {{MediaOperationFactory}}. <br> <br></td></tr>
<tr><td class="diff-changed-lines" ><span class="diff-changed-words">{gliffy:name=media_flow|align=left|size=<span class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;">L</span><span class="diff-added-chars"style="background-color: #dfd;">M</span>|version=3}</span> <br></td></tr>
<tr><td class="diff-unchanged" > <br>h1. Example 2 <br></td></tr>
<tr><td class="diff-snipped" >...<br></td></tr>
<tr><td class="diff-unchanged" >Let's consider a case dealing with a bridge instead. In this scenario, two sessions wish to communicate. One session's source audio is G.722 and the sink provided by the other session is G.711 u-Law. The bridge, upon seeing that the two formats are incompatible, will request that the service locator find a {{MediaOperation}} with {{MediaOperationAttributes}} that satisfy this transformation. In this case, the Media Transformation service has registered such a {{MediaOperationFactory}} with the service locator. In reality, the Media Transformation service has derived this {{MediaOperation}} from three separate operations. The picture below illustrates how the Media Transformation service chains the operations together in order to create what appears to be a single {{MediaOperation}}. <br> <br></td></tr>
<tr><td class="diff-changed-lines" ><span class="diff-changed-words">{gliffy:name=bridge_media_flow|align=left|size=<span class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;">L</span><span class="diff-added-chars"style="background-color: #dfd;">M</span>|version=4}</span> <br></td></tr>
</table>
</div> <h4>Full Content</h4>
<div class="notificationGreySide">
<div class='panelMacro'><table class='warningMacro'><colgroup><col width='24'><col></colgroup><tr><td valign='top'><img src="/wiki/images/icons/emoticons/forbidden.gif" width="16" height="16" align="absmiddle" alt="" border="0"></td><td>This is a work in progress.</td></tr></table></div>
<h1><a name="MediaOperations-Introduction"></a>Introduction</h1>
<p>In Asterisk SCF, a "media operation" refers to a service which manipulates media. Commonly used types of media operations include transcoding (codec/format conversion), transrating (sample rate conversion), jitter removal, volume adjustment, pitch shifting, tone detection and generation, and more. What follows is a discussion of how media operations fit into the architecture of Asterisk SCF, and how they can be used.</p>
<p>Without such media operations, Asterisk SCF's media handling would be straightforward, but not very effective. Each media stream on a session provides a <tt>source</tt> and a <tt>sink</tt>. A stream source provides a stream of media frames, and a stream sink accepts a stream of media frames. The <tt>source</tt> from one media stream can be connected to the <tt>sink</tt> of another media stream (assuming their formats are compatible) in order to connect the media flow between those streams; the bridging service does this for simple two-session bridges, for example.</p>
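<p>The direct source-to-sink connection described above can be sketched as follows. This is a minimal illustration in Python, not the Asterisk SCF API: the <tt>StreamSource</tt> and <tt>StreamSink</tt> classes, their <tt>link</tt>, <tt>emit</tt> and <tt>write</tt> methods, the <tt>bridge_two_sessions</tt> helper, and the format labels are all assumptions made for the example.</p>

```python
from dataclasses import dataclass, field


@dataclass
class StreamSink:
    """Hypothetical sink: accepts media frames of exactly one format."""
    fmt: str
    received: list = field(default_factory=list)

    def write(self, frame: bytes) -> None:
        self.received.append(frame)


@dataclass
class StreamSource:
    """Hypothetical source: pushes media frames to every linked sink."""
    fmt: str
    sinks: list = field(default_factory=list)

    def link(self, sink: StreamSink) -> None:
        # A direct connection is only possible when the formats match.
        if sink.fmt != self.fmt:
            raise ValueError(f"cannot link {self.fmt} source to {sink.fmt} sink")
        self.sinks.append(sink)

    def emit(self, frame: bytes) -> None:
        for sink in self.sinks:
            sink.write(frame)


def bridge_two_sessions(a_source, a_sink, b_source, b_sink):
    """Connect each session's source to the other session's sink."""
    a_source.link(b_sink)
    b_source.link(a_sink)
```

<p>In this sketch, linking a G.722 source to a G.711 u-Law sink raises an error, which is the gap that media operations are meant to fill.</p>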
<h1><a name="MediaOperations-Categoriesofmediaoperations"></a>Categories of media operations</h1>
<p>Media operations can be categorized as follows:</p>
<ul>
        <li>Transforming: Transforming operations convert incoming media frames into a different format before sending them onwards. The change to the media may be a format change or a change to the parameters of the format. Examples of transforming media operations are transcoding and transrating. Transforming operations are not usually put into a media stream path by choice (as a result of configuration, for example); rather, they are put there to allow two media streams of incompatible formats to exchange media frames.</li>
</ul>
<ul>
        <li>Adjusting: Adjusting operations are those that make a change to the media without changing its compatibility in any way. Examples of adjusting media operations are jitter buffering and volume adjusting. Adjusting media operations will frequently be put into a media stream path by choice (as a result of configuration, for example) in order to enhance the user experience, but may also be required in some situations to complete an otherwise <b>transforming</b> media stream path.</li>
</ul>
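<p>To make the distinction concrete, the sketch below contrasts one operation of each category. It is written in Python purely for illustration; the <tt>Frame</tt> type, the function names, and the <tt>"ulaw"</tt> and <tt>"slin8"</tt> format labels are assumptions, not Asterisk SCF definitions. The decoder uses the standard G.711 u-Law expansion.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    fmt: str        # hypothetical format label, e.g. "ulaw" or "slin8"
    samples: tuple  # one integer per sample


def ulaw_byte_to_linear(byte):
    """Decode one G.711 u-Law byte to a 16-bit signed linear sample."""
    byte = ~byte & 0xFF  # u-Law bytes are stored bit-inverted
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if byte & 0x80 else magnitude


def ulaw_to_slin(frame):
    """Transforming: the output frame carries a different format."""
    if frame.fmt != "ulaw":
        raise ValueError("expected a ulaw frame")
    return Frame("slin8", tuple(ulaw_byte_to_linear(b) for b in frame.samples))


def adjust_volume(frame, gain):
    """Adjusting: the samples change, the format does not."""
    if frame.fmt != "slin8":
        raise ValueError("expected a slin8 frame")
    scaled = (int(round(s * gain)) for s in frame.samples)
    # Clamp to the 16-bit signed range so loud input cannot overflow.
    return Frame(frame.fmt, tuple(max(-32768, min(32767, s)) for s in scaled))
```

<p>A consumer of <tt>adjust_volume</tt> output needs no changes, because the format label is untouched; a consumer of <tt>ulaw_to_slin</tt> output must accept the new format.</p>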
<h1><a name="MediaOperations-Useofmediaoperations"></a>Use of media operations</h1>
<p>Media operations are valid to use anywhere that stream-oriented media exists. In practice, this means that there are two levels at which media operations may be used:</p>
<ul>
        <li>Session: Session-level media operations affect media to or from a specific session. This is useful for operations that should not affect all parties in a bridged call. For example, if a conference participant has difficulty hearing, a volume adjustment media operation can be placed on the media going to that participant's session.</li>
</ul>
<ul>
        <li>Bridge: Bridge-level media operations affect media to and/or from all sessions involved in a particular bridge. This level is expected to be used less often.</li>
</ul>
<p>How does a session or bridge know to use media operations? There are two ways to use them:</p>
<ul>
        <li>Configuration. Endpoints and bridges may be configured to have specific media operations used by default.</li>
        <li>Hooks. Session creation hooks and bridge creation hooks are ways to dynamically insert media operations for a session or bridge.</li>
</ul>
<h1><a name="MediaOperations-MediaOperationSlicedefinitions"></a>Media Operation Slice definitions</h1>
<div class="code panel" style="border-width: 1px;"><div class="codeContent panelContent">
<pre class="theme: Confluence; brush: java; gutter: false">interface MediaOperation
{
    /** Source from which the operation's output frames are received. */
    AsteriskSCF::Media::V1::StreamSource* getSource();

    /** Sink to which the frames to be processed are sent. */
    AsteriskSCF::Media::V1::StreamSink* getSink();

    /** Release any resources held by this operation. */
    void destroy();
};

interface MediaOperationFactory
{
    /**
     * Create an instance of a MediaOperation.
     * @param source if provided, will be used only to determine the media format that will
     * be supplied to the desired MediaOperation
     * @param sink if provided, will be used only to determine the media format that the
     * desired MediaOperation must produce
     */
    MediaOperation* createMediaOperation(
        AsteriskSCF::Media::V1::StreamSource* source,
        AsteriskSCF::Media::V1::StreamSink* sink);
};</pre>
</div></div>
<p>When a media operation is needed to transform a media stream, an Asterisk SCF component can locate a suitable <tt>MediaOperationFactory</tt> using the Service Locator, and specifying the source and sink formats that the desired <tt>MediaOperation</tt> must support. If a suitable <tt>MediaOperationFactory</tt> is located, it can be used to construct <tt>MediaOperation</tt> instances, each of which can be used to handle a single media stream.</p>
<p>Some <tt>MediaOperationFactory</tt> services will only be able to instantiate a single type of <tt>MediaOperation</tt>, one that supports a single source format and a single sink format. Others will be able to construct a variety of <tt>MediaOperation</tt> instances. </p>
<p>For <tt>MediaOperationFactory</tt> services that can provide a variety of <tt>MediaOperation</tt> instances, in order to describe to the <tt>MediaOperationFactory</tt> the <b>specific</b> source and sink formats that are needed, the <tt>createMediaOperation</tt> operation accepts a <tt>source</tt> parameter and a <tt>sink</tt> parameter. The <tt>MediaOperationFactory</tt> will learn from the source and sink what formats they support, and will then attempt to instantiate a <tt>MediaOperation</tt> that can transform the media from the source's format to the sink's format.</p>
<p>When the <tt>MediaOperation</tt> object is no longer needed, its <tt>destroy</tt> operation should be invoked so that it can release any resources it may have been using.</p>
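<p>The locate, create, and destroy sequence described above can be sketched as follows. This is a simplified Python model, not the Slice-generated API: the factory takes plain format labels rather than source and sink proxies, and <tt>locate_factory</tt>, <tt>media_operation</tt>, and the <tt>supported</tt> set are hypothetical names introduced for the example.</p>

```python
from contextlib import contextmanager


class MediaOperation:
    """Hypothetical stand-in for one instantiated operation."""

    def __init__(self, input_fmt, output_fmt):
        self.input_fmt = input_fmt
        self.output_fmt = output_fmt
        self.destroyed = False

    def destroy(self):
        # A real operation would release codec state, buffers, and so on.
        self.destroyed = True


class MediaOperationFactory:
    """Hypothetical factory advertising the (input, output) pairs it can build."""

    def __init__(self, supported):
        self.supported = set(supported)

    def create_media_operation(self, source_fmt, sink_fmt):
        if (source_fmt, sink_fmt) not in self.supported:
            return None
        return MediaOperation(source_fmt, sink_fmt)


def locate_factory(registry, source_fmt, sink_fmt):
    """Stand-in for a Service Locator query: first factory that fits, else None."""
    for factory in registry:
        if (source_fmt, sink_fmt) in factory.supported:
            return factory
    return None


@contextmanager
def media_operation(factory, source_fmt, sink_fmt):
    """Create an operation and guarantee destroy() runs when the caller is done."""
    op = factory.create_media_operation(source_fmt, sink_fmt)
    if op is None:
        raise LookupError(f"factory cannot convert {source_fmt} to {sink_fmt}")
    try:
        yield op
    finally:
        op.destroy()
```

<p>Wrapping the operation in a context manager is one way to make sure <tt>destroy</tt> is called even if stream setup fails partway through; each stream gets its own instance from the shared factory.</p>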
<h1><a name="MediaOperations-CombiningMediaOperations"></a>Combining Media Operations</h1>
<p>In many situations it will be necessary to combine two or more transforming <tt>MediaOperation</tt> instances in order to achieve the desired transformations. Rather than requiring every Asterisk SCF service that needs to connect media streams to know how to connect <tt>MediaOperation</tt> instances together, and more importantly, to know how to choose an optimal sequence of <tt>MediaOperation</tt> instances, Asterisk SCF will include a Media Transformation service.</p>
<p>This service will maintain a graph of available source/sink formats and their ideal translation paths. Given any two formats, the Media Transformation service will be able to construct a path of <tt>MediaOperation</tt> instances to achieve the required transformations.</p>
<p>In order to accomplish this, the Media Transformation service will listen for events emitted by the Service Locator; when an event indicates that a service has been registered, the Media Transformation service will request the details of that service's registration. If the newly registered service is a transforming <tt>MediaOperationFactory</tt>, the Media Transformation service will extract the attributes that the factory supports, and add them to its transformation graph.</p>
<div class="code panel" style="border-width: 1px;"><div class="codeContent panelContent">
<pre class="theme: Confluence; brush: java; gutter: false">struct MediaOperationAttributes
{
    /**
     * The input format for a specific operation
     */
    AsteriskSCF::Media::V1::Format inputFormat;
    /**
     * The output format for a specific operation
     */
    AsteriskSCF::Media::V1::Format outputFormat;
    /**
     * The cost of the operation.
     * Lower cost indicates an "easier" translation,
     * either because it is faster or uses fewer resources.
     */
    int cost;
};

sequence&lt;MediaOperationAttributes&gt; MediaOperationAttributesSeq;

unsliceable class MediaOperationServiceLocatorParams extends AsteriskSCF::Core::Discovery::V1::ServiceLocatorParams
{
    MediaOperationAttributesSeq attributes;
};</pre>
</div></div>
<p>The attributes provided for the <tt>MediaOperationFactory</tt> describe pairs of source/sink formats, and the approximate 'cost' to transform a frame of media between the formats in a pair. The Media Transformation service can treat these pairs as new weighted directed edges to add to its graph of transformations. As new edges are added, the Media Transformation service can determine new paths that are available and update its own registration with the Service Locator. As a result, the Service Locator will now be able to successfully answer location requests for a <tt>MediaOperationFactory</tt> instance that can accomplish these multi-step transformations, even though no single <tt>MediaOperation</tt> in the system can provide it.</p>
<p>Instead, the Media Transformation service <b>itself</b> will act as a <tt>MediaOperationFactory</tt>, which is capable of instantiating a <tt>MediaOperation</tt> instance for any source/sink format transformation that is available in the transformation graph. When the Media Transformation service's own <tt>createMediaOperation</tt> operation is invoked, the resulting <tt>MediaOperation</tt> will internally hold two or more lower-level <tt>MediaOperation</tt> instances that, when chained together, produce the desired transformation. As a result, the consumer of the top-level <tt>MediaOperation</tt> receives what it asked for, even though a path of <tt>MediaOperation</tt> instances had to be constructed in order to provide it.</p>
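<p>One way to picture the weighted graph and the chained operation is the Python sketch below. It assumes a Dijkstra shortest-path search over formats, which the page does not prescribe; the <tt>Step</tt> and <tt>ChainedOperation</tt> classes and the <tt>cheapest_path</tt> function are hypothetical names for illustration only.</p>

```python
import heapq


class Step:
    """Hypothetical single-hop operation: one weighted edge in the graph."""

    def __init__(self, input_fmt, output_fmt, cost, convert):
        self.input_fmt = input_fmt
        self.output_fmt = output_fmt
        self.cost = cost
        self.convert = convert  # callable applied to each frame


def cheapest_path(steps, start, goal):
    """Dijkstra over formats; returns the list of Steps to apply, or None."""
    outgoing = {}
    for step in steps:
        outgoing.setdefault(step.input_fmt, []).append(step)
    # Queue entries: (total cost, tie-breaker, current format, steps taken).
    queue = [(0, 0, start, [])]
    counter = 1
    settled = set()
    while queue:
        total, _, fmt, taken = heapq.heappop(queue)
        if fmt == goal:
            return taken
        if fmt in settled:
            continue
        settled.add(fmt)
        for step in outgoing.get(fmt, []):
            if step.output_fmt not in settled:
                entry = (total + step.cost, counter, step.output_fmt, taken + [step])
                heapq.heappush(queue, entry)
                counter += 1
    return None


class ChainedOperation:
    """Presents a path of Steps as one operation with one input and one output format."""

    def __init__(self, path):
        if not path:
            raise ValueError("a chain needs at least one step")
        self.path = list(path)
        self.input_fmt = path[0].input_fmt
        self.output_fmt = path[-1].output_fmt

    def process(self, frame):
        # Feed each step's output into the next step's input.
        for step in self.path:
            frame = step.convert(frame)
        return frame
```

<p>The caller only sees <tt>input_fmt</tt>, <tt>output_fmt</tt>, and <tt>process</tt> on the chained object, which mirrors how the consumer of the top-level <tt>MediaOperation</tt> never learns how many lower-level operations sit behind it.</p>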
<h1><a name="MediaOperations-Example1"></a>Example 1</h1>
<div class='panelMacro'><table class='warningMacro'><colgroup><col width='24'><col></colgroup><tr><td valign='top'><img src="/wiki/images/icons/emoticons/forbidden.gif" width="16" height="16" align="absmiddle" alt="" border="0"></td><td>While the examples below help to illustrate how <tt>MediaOperations</tt> are established, they do not go into detail about how to properly chain <tt>MediaOperations</tt> together. That is for a future discussion.</td></tr></table></div>
<p>Let's say that a session gateway creates a session with an endpoint. We will be receiving G.711 u-Law audio from this endpoint. This endpoint has been configured to have a pitch shift media operation on its incoming media path. The pitch shifter requires 8 kHz signed linear audio as input and outputs the same (since it is an adjusting media operation).</p>
<p>The session gateway, upon setting up the G.711 u-Law stream source for the session, will inspect the configured media operations to employ. First, the pitch shifter is looked up. The pitch shifter is found using the service locator, and its <tt>MediaOperationFactory</tt> proxy is returned. From it, a <tt>MediaOperation</tt> is instantiated.</p>
<table width="100%">
<tr>
<td align="left">
<table>
<caption align="bottom">
</caption>
<tr>
<td>
<img style="border: none; width: 807px; height: 302px;"
usemap="#GLIFFY_MAP_17203407_media_flow_mismatch"
src="/wiki/download/attachments/17203407/media_flow_mismatch.png?version=3&modificationDate=1312580451441"
alt="A&#32;Gliffy&#32;Diagram&#32;named&#58;&#32;media&#95;flow&#95;mismatch"/>
</td>
</tr>
</table>
</td>
</tr>
</table>
<p>However, this on its own cannot work. The 8 kHz signed linear sink of the pitch shifter cannot accept G.711 u-Law media frames from the media stream source. The session gateway calls into the service locator to find a <tt>MediaOperation</tt> that can translate between G.711 u-Law and 8 kHz signed linear. The service locator attempts to find a media operation with <tt>MediaOperationAttributes</tt> that satisfy this need. In this case, the Media Transformation service's wizardry is not required because there is a single <tt>MediaOperation</tt> that can translate between the two formats, and the service locator knows about its <tt>MediaOperationFactory</tt>.</p>
<table width="100%">
<tr>
<td align="left">
<table>
<caption align="bottom">
</caption>
<tr>
<td>
<img style="border: none; width: 807px; height: 292px;"
usemap="#GLIFFY_MAP_17203407_media_flow"
src="/wiki/download/attachments/17203407/media_flow.png?version=3&modificationDate=1312580495332"
alt="A&#32;Gliffy&#32;Diagram&#32;named&#58;&#32;media&#95;flow"/>
</td>
</tr>
</table>
</td>
</tr>
</table>
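<p>The planning step in this example can be sketched in Python as follows. The <tt>plan_media_path</tt> function, the tuple layout of the configured operations, and the <tt>translators</tt> lookup table are assumptions standing in for the session gateway's logic and the service locator query; they are not part of Asterisk SCF.</p>

```python
def plan_media_path(source_fmt, configured_ops, translators):
    """Return (ordered operation names, final format) for one media path.

    configured_ops: list of (name, input_fmt, output_fmt) in configured order.
    translators: dict mapping (input_fmt, output_fmt) to a translator name,
                 standing in for a service locator lookup.
    """
    path = []
    current = source_fmt
    for name, input_fmt, output_fmt in configured_ops:
        if current != input_fmt:
            # Formats do not match: a translating operation must be inserted.
            key = (current, input_fmt)
            if key not in translators:
                raise LookupError(f"no translator from {current} to {input_fmt}")
            path.append(translators[key])
        path.append(name)
        current = output_fmt
    return path, current
```

<p>With a <tt>"ulaw"</tt> source, a pitch shifter configured as <tt>("pitch_shift", "slin8", "slin8")</tt>, and a single registered translator for <tt>("ulaw", "slin8")</tt>, the plan places the translator ahead of the pitch shifter, matching the second diagram.</p>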
<h1><a name="MediaOperations-Example2"></a>Example 2</h1>
<p>Let's consider a case dealing with a bridge instead. In this scenario, two sessions wish to communicate. One session's source audio is G.722 and the sink provided by the other session is G.711 u-Law. The bridge, upon seeing that the two formats are incompatible, will request that the service locator find a <tt>MediaOperation</tt> with <tt>MediaOperationAttributes</tt> that satisfy this transformation. In this case, the Media Transformation service has registered such a <tt>MediaOperationFactory</tt> with the service locator. In reality, the Media Transformation service has derived this <tt>MediaOperation</tt> from three separate operations. The picture below illustrates how the Media Transformation service chains the operations together in order to create what appears to be a single <tt>MediaOperation</tt>.</p>
<table width="100%">
<tr>
<td align="left">
<table>
<caption align="bottom">
</caption>
<tr>
<td>
<img style="border: none; width: 889px; height: 366px;"
usemap="#GLIFFY_MAP_17203407_bridge_media_flow"
src="/wiki/download/attachments/17203407/bridge_media_flow.png?version=4&modificationDate=1311371701128"
alt="A&#32;Gliffy&#32;Diagram&#32;named&#58;&#32;bridge&#95;media&#95;flow"/>
</td>
</tr>
</table>
</td>
</tr>
</table>
</div>
</div>
</div>
</div>
</div>
</body>
</html>