[Asterisk-Dev] Voice energy detection: coder wanted

Sat Nov 8 11:31:20 MST 2003

>On Fri, 7 Nov 2003, John Todd wrote:
>
>>  I have a requirement from one of my customers (in the emergency
>>  services arena, I am told) to develop a voice energy detection system
>>  for Asterisk.  This would be to detect the difference between an
>>  answering machine, and a human.  This detection need only be very
>>  basic, and probably will hook into the existing routines in dsp.c
>>  (unless you have a cadence and tonal module already built.)
>
>So I'm curious as to the algorithms used.  All I can think of is that
>an answering machine talks for longer than a real human caller.
>
>dsp.c can already detect voice as opposed to various tones.  So
>wouldn't answering machine detection go something like:
>
>if <start detecting voice for the first time>
>   note that the call is answered
>   if you don't hear, say, 1 sec of silence within 5 secs then
>     note that it was probably an answering machine
>
>I'd allow the possibility that people talk "differently" when
>recording an announcement - i.e. in a "posh telephone voice" which
>perhaps has a different spectrum to their usual voice - but seeing as
>you don't know their usual voice I'm not sure how you could use that.
>
>Steve
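
The timing heuristic quoted above can be sketched as a small state
machine. This is only an illustration under stated assumptions: the
names (amd_state, amd_feed, AMD_SILENCE_MS, AMD_WINDOW_MS) are
hypothetical and are not part of dsp.c, the 1 second and 5 second
figures are the guesses from the message, and the per-frame
voice/silence flag is assumed to come from whatever voice detector is
already in place.

```c
#include <stdbool.h>

/* Hypothetical thresholds, taken from the figures quoted above. */
#define AMD_SILENCE_MS 1000 /* a pause this long suggests a human waiting for a reply */
#define AMD_WINDOW_MS  5000 /* how long after the first voice we keep listening */

enum amd_result { AMD_UNDECIDED, AMD_HUMAN, AMD_MACHINE };

struct amd_state {
    bool answered;          /* set once voice has been heard for the first time */
    int elapsed_ms;         /* audio time counted since the first voice frame */
    int silence_ms;         /* length of the current run of silence */
    enum amd_result result; /* sticky once a decision has been made */
};

static void amd_init(struct amd_state *s)
{
    s->answered = false;
    s->elapsed_ms = 0;
    s->silence_ms = 0;
    s->result = AMD_UNDECIDED;
}

/*
 * Feed one frame of frame_ms milliseconds. is_voice is assumed to be the
 * verdict of an existing voice detector for that frame.
 */
static enum amd_result amd_feed(struct amd_state *s, bool is_voice, int frame_ms)
{
    if (s->result != AMD_UNDECIDED)
        return s->result;

    if (!s->answered) {
        if (!is_voice)
            return AMD_UNDECIDED; /* still waiting for the first voice */
        s->answered = true;       /* note that the call is answered */
    }

    s->elapsed_ms += frame_ms;
    if (is_voice)
        s->silence_ms = 0;
    else
        s->silence_ms += frame_ms;

    if (s->silence_ms >= AMD_SILENCE_MS)
        s->result = AMD_HUMAN;    /* short greeting followed by a pause */
    else if (s->elapsed_ms >= AMD_WINDOW_MS)
        s->result = AMD_MACHINE;  /* kept talking for the whole window */

    return s->result;
}
```

With 20 ms frames, a short greeting followed by a full second of
silence comes out as AMD_HUMAN, while five seconds of continuous
speech comes out as AMD_MACHINE. Both thresholds would need tuning
against real calls.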

The word "hello" has specific characteristics that might be matched
in a single-filter voice recognition system.  Or, your time-based
filter might be sufficient; I am open to options as to how the system
works.  Another method is to start playing the recording upon answer,
but listen for a tone indicating that an answering machine is now
recording, and fail out if that is the case.

JT
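
For the "listen for a tone" option, one common way to watch for a
single frequency is the Goertzel algorithm. The sketch below is an
illustration only, not a method proposed in this thread. It assumes
8000 Hz audio and a 1000 Hz beep (real machines vary); the names
BEEP_COEFF, BEEP_RATIO, goertzel_power and beep_present are
hypothetical. The coefficient is 2*cos(2*pi*f/fs), precomputed here
for the assumed frequency.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed target: a 1000 Hz beep sampled at 8000 Hz. Real machines vary. */
#define BEEP_COEFF 1.4142135623730951 /* 2*cos(2*pi*1000/8000) */
#define BEEP_RATIO 0.8                /* hypothetical share of energy at the target */

/* Goertzel: squared magnitude of a single frequency bin over n samples. */
static double goertzel_power(const short *x, size_t n, double coeff)
{
    double s1 = 0.0, s2 = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        double s0 = (double)x[i] + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

/*
 * Returns true when most of the block's energy sits at the target
 * frequency. For a pure on-frequency tone, 2*power/(n*energy) is close
 * to 1; speech and other tones give a much smaller value.
 */
static bool beep_present(const short *x, size_t n, double coeff)
{
    double energy = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        energy += (double)x[i] * (double)x[i];
    if (energy <= 0.0)
        return false; /* pure silence: nothing to detect */

    return 2.0 * goertzel_power(x, n, coeff) / ((double)n * energy) >= BEEP_RATIO;
}
```

A block of 200 samples (25 ms at 8000 Hz) puts 1000 Hz exactly on a
Goertzel bin. In practice one would probably want the tone to persist
across several consecutive blocks before failing the call out, and
would check more than one candidate frequency.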