[asterisk-dev] adjusting the playback speed of voicemail messages.

Mon Apr 14 19:59:20 CDT 2008

>> <snip>
>>
>>     
>>> I don't know 
>>> enough about how JACK works to say what a filter would look like to 
>>> slow down or speed up audio without distorting the pitch.  Let us 
>>> know if this is the right solution, and the methods and details of 
>>> any successful tests!
>>>       
>
> Jack just passes blocks of floating point samples around, what DSP is
> done is entirely up to the jack applications.
>
> There are two open source libraries doing the constant pitch at variable
> speed thing that I know of, Rubberband (written in C++) and
> libsoundtouch which is in C. 
>
> Both are fundamentally phase vocoders but they have different tradeoffs,
> they both work and may be worth looking into. 
>
>   
Hi all,

Thank you all for your feedback and ideas on this.

The JACK API in 1.6/trunk sounds interesting, but we are still on 1.4 
and will likely be slow to transition. It was also not clear to me how 
I could increase or decrease the playback speed in real time from the 
phone keypad (the way skipping ahead / back is currently done with the 
# and * keys, which call the seek() function of the ast_format 
interface).

libsoundtouch looked nice as well, but we decided that chipmunk-sounding 
voices were OK in exchange for the reduced CPU effort: there is no need 
to FFT the samples for a vocoder algorithm, only to transform the 
sample rate within a small window buffer. That approach would also have 
meant generating separate temporary files offline, and I could not 
think of a sensible way to do that and still give a real-time 
experience.
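
To illustrate why plain sample-rate conversion changes the pitch along 
with the speed, here is a minimal sketch. It uses naive linear 
interpolation as a stand-in, not libsamplerate's actual algorithm, and 
resample_linear is a hypothetical name made up for this example. The 
ratio follows the libsamplerate convention (output rate / input rate), 
so a ratio below 1.0 produces fewer samples, which play back faster and 
higher pitched at the original rate.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of Asterisk or libsamplerate.
 * Resamples in_len mono samples by linear interpolation.
 * ratio = output rate / input rate:
 *   ratio < 1.0 -> fewer samples out (faster, higher pitch)
 *   ratio > 1.0 -> more samples out  (slower, lower pitch)
 * Returns the number of samples written to out (at most out_cap).
 */
size_t resample_linear(const float *in, size_t in_len,
                       float *out, size_t out_cap, double ratio)
{
    size_t n = 0;

    assert(ratio > 0.0);
    if (in_len == 0)
        return 0;

    while (n < out_cap) {
        double pos = (double)n / ratio;   /* read position in the input */
        size_t i = (size_t)pos;           /* whole-sample index */
        double frac;
        float a, b;

        if (i >= in_len)
            break;                        /* ran past the end of this window */

        frac = pos - (double)i;
        a = in[i];
        b = (i + 1 < in_len) ? in[i + 1] : in[i];
        out[n++] = (float)(a + (b - a) * frac);
    }
    return n;
}
```

With 8 input samples, a ratio of 0.5 yields 4 output samples and a ratio 
of 2.0 yields 16. This sketch treats each window independently; a real 
converter has to carry its position across window boundaries, which is 
what the converter state in libsamplerate is for.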

What I ended up using for now is libsamplerate, together with 
libsndfile. libsndfile is not strictly necessary, but its API reads the 
sample values as floats, which is more convenient than the simple .wav 
file implementation when feeding small buffered windows to 
libsamplerate. libsamplerate even comes with an example that varies the 
playback speed of an audio file. It does make voices sound like 
chipmunks when faster and lower pitched when slower, but that is 
acceptable for our current needs: our clients' old voicemail system 
apparently did the same, and apparently they are used to it and prefer 
to hear it that way. It was also important to get something of usable 
quality working sooner than it would take me to brush the dust off my 
calculus books to understand and develop a custom vocoder, and 
libsamplerate's API seems to do resampling well enough. As for the 
concerns about patents, hopefully the existing and commonly available 
GNU-licensed projects behind libsamplerate were written in a way that 
is mindful of any patents.

To make this concept work in Asterisk, I created a new version of 
ast_control_streamfile (I called it vm_control_streamfile and stuffed it 
into apps/app_voicemail.c for now) with additional parameters for the 
speed up / slow down input keys. I avoided having to extend the 
ast_format structure with speedup() and slowdown() functions (and thus 
touch every file format implementation) by adding a variable to 
ast_filestream: a float holding the ratio at which the file should be 
played back. It is set from the keypress in vm_control_streamfile and 
read by the libsamplerate code that now lives inside wav_read() in 
formats/format_wav.c. Existing formats that do not support or know 
about playback at a different ratio simply ignore the src_ratio 
variable in the ast_filestream structure.
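
As a rough sketch of the idea (not the actual patch): the struct below 
is a hypothetical stand-in, not the real ast_filestream, and the helper 
name, step size, limits and key parameters are all assumptions made for 
illustration. It also assumes src_ratio follows the libsamplerate 
convention (output rate / input rate), so the "faster" key lowers the 
ratio.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the extra field; NOT the real ast_filestream. */
struct demo_filestream {
    float src_ratio;            /* 1.0 = normal speed */
};

/* Assumed limits and step, chosen only for this example. */
#define DEMO_RATIO_MIN  0.5f    /* fastest playback allowed */
#define DEMO_RATIO_MAX  2.0f    /* slowest playback allowed */
#define DEMO_RATIO_STEP 0.25f

/*
 * Adjust the playback ratio for a DTMF key.
 * faster_key / slower_key are passed in by the caller, much like the
 * extra parameters described for vm_control_streamfile.
 * Returns 1 if the key was a speed key, 0 otherwise.
 */
int demo_handle_speed_key(struct demo_filestream *fs, char key,
                          char faster_key, char slower_key)
{
    assert(fs != NULL);

    if (key == faster_key) {
        fs->src_ratio -= DEMO_RATIO_STEP;   /* fewer output samples */
        if (fs->src_ratio < DEMO_RATIO_MIN)
            fs->src_ratio = DEMO_RATIO_MIN;
        return 1;
    }
    if (key == slower_key) {
        fs->src_ratio += DEMO_RATIO_STEP;   /* more output samples */
        if (fs->src_ratio > DEMO_RATIO_MAX)
            fs->src_ratio = DEMO_RATIO_MAX;
        return 1;
    }
    return 0;                               /* not a speed key */
}
```

A format's read function that knows about the field would consult 
src_ratio on every read; formats that do not know about it never look 
at it, so they keep playing at normal speed.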

To fuel the discussion that will immediately follow on whether it is a 
good idea to include libsamplerate and libsndfile as optional 
dependencies of Asterisk, I was thinking of:
 - using #ifdef HAVE_LIBSAMPLERATE around the new alternate code that 
makes this work,
 - a new configure script option, --with-libsamplerate=..... If 
libsamplerate is not detected / enabled by the configure script, the 
feature is simply not available, and without these (GPL licensed) 
third-party libraries everything compiles and works as it already does 
now.
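
The compile-time guard could look roughly like the sketch below. The 
demo_* helpers are hypothetical names made up for this example; only 
the HAVE_LIBSAMPLERATE macro comes from the proposal above, and 
src_is_valid_ratio() is the libsamplerate call for checking a ratio. 
Without the macro the code never touches the library and always falls 
back to normal speed.

```c
#include <assert.h>

#ifdef HAVE_LIBSAMPLERATE
#include <samplerate.h>         /* only pulled in when configure found it */
#endif

/* Hypothetical helper: was variable-speed playback compiled in? */
int demo_speed_control_available(void)
{
#ifdef HAVE_LIBSAMPLERATE
    return 1;
#else
    return 0;
#endif
}

/*
 * Hypothetical helper: the ratio the reader should actually apply.
 * Without libsamplerate the request is ignored and playback stays at
 * normal speed (1.0), so behaviour is the same as it is today.
 */
float demo_effective_ratio(float requested)
{
    assert(requested > 0.0f);
#ifdef HAVE_LIBSAMPLERATE
    /* Reject ratios that libsamplerate itself would refuse. */
    return src_is_valid_ratio((double)requested) ? requested : 1.0f;
#else
    (void)requested;
    return 1.0f;
#endif
}
```

The voicemail application could then check 
demo_speed_control_available() once and only offer the speed keys when 
it returns 1.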

This is just the basic low-level (wav) file playback manipulation 
support. app_voicemail would still have a voicemail option, speedup = 
on/off, so the current means of skipping ahead can be used instead if 
desired, though currently both the classic skip ahead and the speed 
increase should be possible.