[asterisk-dev] Asterisk 14 - Remote URI Playback
BJ Weschke
bweschke at btwtech.com
Wed Nov 5 01:45:15 CST 2014
On 11/4/14, 3:40 PM, Matthew Jordan wrote:
> On Tue, Nov 4, 2014 at 12:57 PM, BJ Weschke <bweschke at btwtech.com> wrote:
>> Matt -
>>
>> This is a pretty neat idea, indeed, but I've got some questions/thoughts on
implementation. :-) Apologies if all of this was already
considered/accounted for..
>>
>> 1) Does the entire file need to be downloaded and in place on the HTTP
>> Media Cache before you can call an ast_openstream on it? This could cause
>> some problems with larger files not sitting on a fat pipe local to the
>> Asterisk instance.
> It does need to be completely on the local file system, which would be
> a problem for extremely large files and/or slow network connections.
>
> The ability to do an 'asynchronous' version of this is not really
> present. The filestream code in the core of Asterisk doesn't have
> anything present that would allow it to buffer the file partially
> before playing back with some expected max size. If we went down that
> road, it'd almost be a completely separate filestream concept from
> what we have today, which is pretty non-trivial.
>
> I don't think I have a good solution for really large files just yet.
> There's some ways to do this using cURL (where we get back a chunk of
> binary data, buffer it, and immediately start turning it into frames
> for a channel) - but that feels like it would need a lot of work,
> since we'd be essentially creating a new remote filestream type.
I know there's going to be a large population of Asterisk users who
will want the simplicity of just specifying a URI for playback and
expecting "sorcery" to happen. A decent number of them may even be OK
with a sub-second awkward silence on the line while the servicing
thread synchronously pulls the URI resource into the local HTTP media
cache before playback. That's probably going to be an acceptable
experience for a decent number of functional use cases. However, I
think one somewhat common use case where this wouldn't go so well is a
list of URI resources that aren't already in the HTTP media cache:
they'd be fetched serially, in-line, at the moment playback really
should be starting, blocking the channel with silence until each
resource lands in the media cache.
eg -
Playback(http://myserver.com/monkeys.wav&http://myserver.com/can.wav&http://myserver.com/act.wav&http://myserver.com/like.wav&http://myserver.com/weasels.wav)
<--- On an empty HTTP Media cache, the previous app invocation would
probably sound pretty bad to the first caller going through this
workflow. :-)
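To put a rough number on that cost, here's a small Python sketch
(hypothetical filenames, simulated latency, nothing Asterisk-specific)
showing that serial in-line fetches block for the sum of the per-URI
latencies, while prefetching the same list in parallel ahead of
playback would bound the wait to roughly the slowest single fetch:

```python
import time
from concurrent.futures import ThreadPoolExecutor

URIS = ["monkeys.wav", "can.wav", "act.wav", "like.wav", "weasels.wav"]

def fake_fetch(uri, latency=0.05):
    """Stand-in for pulling one URI into the cache (simulated latency)."""
    time.sleep(latency)
    return uri

# Serial, as the in-line synchronous fetch would behave:
start = time.monotonic()
for uri in URIS:
    fake_fetch(uri)
serial = time.monotonic() - start          # ~ 5 x 0.05s

# Prefetching in parallel amortizes the latency:
start = time.monotonic()
with ThreadPoolExecutor() as pool:
    list(pool.map(fake_fetch, URIS))
parallel = time.monotonic() - start        # ~ 0.05s
```
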
Also, I think the inability to use & in a URI for playback really
limits the usefulness of this change. I totally understand why the
typical URI decode doesn't work, but perhaps a combination of a
URI-encoded & with an HTML entity representation is a suitable
alternative? eg - treat %26amp; as a literal & within a URI passed to
Playback, and do that pattern replacement before any other URI
decoding/encoding operations. Yeah, I know, it's a hack, but not
allowing multiple parameters in a loaded query-string URL is way too
restrictive IMHO.
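Sketching that hack in Python (the %26amp; marker and the
split-then-replace ordering are the proposal here, not anything
Asterisk does today):

```python
from urllib.parse import unquote

# Proposed convention: a bare '&' separates files in Playback(),
# while '%26amp;' stands for a literal '&' inside a single URI.
LITERAL_AMP = "%26amp;"

def split_playback_arg(arg):
    """Split a Playback() argument on '&', then restore escaped
    ampersands inside each URI before the usual percent-decoding."""
    return [unquote(piece.replace(LITERAL_AMP, "%26"))
            for piece in arg.split("&")]
```

eg - split_playback_arg("http://myserver.com/a.wav?x=1%26amp;y=2&http://myserver.com/b.wav")
yields the two URIs, with the first one's query string intact as
"x=1&y=2".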
>> 2) What kind of locking is in place on the design to prevent HTTP Media
>> Cache from trying to update an expired resource that's already in the middle
>> of being streamed to a channel?
> Items in the cache are reference counted, so if something is using an
> item in the cache while the cache is being purged, that is safely
> handled. The buckets API (which is based on sorcery) assumes a 'if
> you're using it, you can hold it safely while something else swaps it
> out' model of management - so it is safe to update the entry in the
> cache with something new while something else uses the old cached
> entry. The 'local file name' associated with the URI would be created
> with mkstemp, so the risk of collision with local file names is low.
>
> In the same fashion, a local file that is currently open and being
> streamed has a reference associated with it in the OS. Calling unlink
> on it will not cause the file to be disposed of until it is released.
I had to do a little bit of reading up on the Bucket File API, but
yes, that definitely resolves the concern I had, and that's pretty cool.
:-)
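For anyone following along, that unlink-while-open behavior is plain
POSIX semantics and easy to demonstrate (Python sketch, nothing
cache-specific):

```python
import os
import tempfile

# Create a file the way the cache might (mkstemp), keep the
# descriptor open, unlink the name, then read through the
# descriptor anyway: the inode lives until the last close.
fd, path = tempfile.mkstemp()
os.write(fd, b"cached media bytes")
os.unlink(path)                  # name removed from the filesystem
assert not os.path.exists(path)
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 64)           # still readable through the open fd
os.close(fd)                     # storage reclaimed only now
```
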
>> 3) I think you need to also introduce a PUT method on HTTP Media Cache
>> because I can think of a bunch of scenarios where having a write operation
>> on func_curl may be lacking in the file needing to be retrieved (eg - trying
>> to pull ACL'd media from an S3 volume where you need custom HTTP request
>> headers, etc). We shouldn't try to architect/design for all of these
>> scenarios in Asterisk via a write operation on func_curl and a PUT to HTTP
>> Media Cache seems like a reasonable approach to handle that.
>>
> I had thought about this, but didn't have a strong use case for it -
> thanks for providing one!
>
> How about something like:
>
> GET /media_cache - retrieve a List of [Sound] in the cache
> PUT /media_cache (note: would need to have parameters passed in the body)
> uri=URI to retrieve the media from
> headers=JSON list of key/value pairs to pass with the uri
> DELETE /media_cache?uri
> uri=URI to remove from the cache
>
> Sounds data model would be updated with something like the following:
> "uri": {
> "required": false,
> "description": "If retrieved from a remote source, the
> originating URI of the sound",
> "type": "string"
> },
> "local_timestamp": {
> "required": false,
> "description": "Creation timestamp of the sound on the local system",
> "type": "datetime"
> },
> "remote_timestamp": {
> "required": false,
> "description": "Creation timestamp of the sound as known by
> the remote system (if remote)",
> "type": "datetime"
> }
>
>
Well, kind of. I think you're still envisioning using CURL behind the
scenes using the input provided in the JSON body of the PUT to
/media_cache to go and grab the resource from the remote server. If you
go that way, I think not only should we handle custom headers, but it's
probably also not unreasonable to provide a way to do basic/digest
authentication for the GET call as well. However, instead of that, I had
envisioned being able to do a PUT to /media_cache as a multipart MIME
request where one part is the JSON descriptor and the second part is the
binary resource itself you're looking to place into HTTP Media cache.
The advantage of doing things this way is that if you're running call
control via some sort of API, that API will know for certain when
files/resources are ready to be played back and you don't run the risk
of the awkward blocking silence scenario that you have above. However,
when you do it this way, the URI description/parameter itself doesn't
make much sense, because it's not really where the resource came from.
I guess there's also a question of whether we follow true REST
practice and use POST for a brand-new resource and PUT for updates to
existing ones.
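To make the multipart idea concrete, here's a sketch of building such
a request body with the Python standard library (the /media_cache
endpoint, the part names, and the descriptor fields are all
hypothetical - this is the shape of the idea, not an existing API):

```python
import json
import uuid

def build_media_cache_put(sound_id, descriptor, media_bytes):
    """Build a multipart/form-data body for the proposed
    PUT /media_cache: part one is a JSON descriptor, part two
    is the raw media payload itself."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="descriptor"\r\n'
         "Content-Type: application/json\r\n\r\n"
         f"{json.dumps(descriptor)}\r\n").encode(),
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="media"; filename="{sound_id}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode()
        + media_bytes + b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, body
```

The API driving call control would PUT this and know, on a 2xx, that
the resource is ready for playback - no fetch-time silence.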
As for the timestamps for deciding whether the local cache is dirty, I
don't think we should try to reinvent the wheel here. We should stick
with what's already well established for this sort of thing: store the
entity tag (ETag) response header, then revalidate with the
"If-None-Match" request header. Google does a much better job of
explaining it than I can here:
https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching
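The revalidation logic itself is tiny - sketched here in Python, where
the fetch callable stands in for whatever cURL request the media cache
would issue (its (status, headers, body) return shape is an assumption
for illustration):

```python
def revalidate(cache_entry, fetch):
    """Revalidate a cached sound with ETag / If-None-Match.

    cache_entry: dict holding the 'etag' and local 'data' from the
    previous fetch. fetch: callable(headers) -> (status, headers, body),
    a stand-in for the HTTP request the cache would make.
    """
    req_headers = {}
    if cache_entry.get("etag"):
        req_headers["If-None-Match"] = cache_entry["etag"]
    status, resp_headers, body = fetch(req_headers)
    if status == 304:          # Not Modified: local copy is still good
        return cache_entry
    return {"etag": resp_headers.get("ETag"), "data": body}
```

The nice part for media is that a 304 costs one round trip but no body
transfer, which is exactly the win you want for large sound files.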