[Asterisk-Dev] Voice energy detection: coder wanted
John Todd
jtodd at loligo.com
Sat Nov 8 21:19:54 MST 2003
Freddi -
Thanks for the data. However, your experiences are with detecting
two-way conversations, where you are attempting to determine if the
two legs of a call have humans attached to them. My customer's
problem is only one half of that issue: the system needs to detect
whether a human (rather than an answering machine) is at one end of
the call, which is a bit more difficult. Do your pattern-matching
archives cover any such instances for reliable detection?
JT
>Hi,
>I hope that you can use a couple of hints from this. I have been
>working for a global carrier for more than 10 years, and part of the
>job was to do 'soft-answer supervision' and call progress detection.
>We had equipment in more than 90 countries, so I have seen quite a
>few different ways of doing things.
>It was my experience that an answering machine would always trip a
>simple 'voice energy/cadence detection', since what it plays is
>actually recorded voice. The answer detection we used was simply based
>on the fact that a 'conversation' would normally be 'bi-directional'.
>So our 'answer-detector' was actually two VADs monitored by a
>'speech-direction' detector. To say that the speech direction was from
>A to B, the VAD 'from A' should say 'speech present' while the VAD
>'from B' should say 'no speech'.
>Our criterion was typically set to '3 direction shifts within 30
>seconds' for installations in the US and Europe.
>I still have pattern-matching call progress info for most of the
>countries we worked in, if this is still of interest to anyone.
>b.r.
>Freddi
>
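The two-VAD scheme Freddi describes above can be sketched roughly as follows. This is a minimal illustration, not the carrier's actual detector: it assumes the two VADs already deliver one boolean per audio frame for each leg, that frames are 20 ms long, and that "3 direction shifts within 30 seconds" means any sliding 30-second window. The names `direction` and `is_conversation` and all parameter names are hypothetical.

```python
from collections import deque
from typing import Iterable, Optional, Tuple


def direction(a_speech: bool, b_speech: bool) -> Optional[str]:
    """Return the speech direction, or None when it is ambiguous."""
    if a_speech and not b_speech:
        return "A->B"
    if b_speech and not a_speech:
        return "B->A"
    return None  # both legs silent, or both talking at once


def is_conversation(frames: Iterable[Tuple[bool, bool]],
                    frame_ms: int = 20,          # assumed frame length
                    required_shifts: int = 3,    # from the quoted criterion
                    window_ms: int = 30_000) -> bool:
    """True once enough direction shifts fall inside one sliding window."""
    shifts = deque()   # timestamps (ms) of recent direction shifts
    last = None        # last unambiguous direction seen
    now = 0
    for a_speech, b_speech in frames:
        current = direction(a_speech, b_speech)
        if current is not None:
            if last is not None and current != last:
                shifts.append(now)
                # Forget shifts that have fallen out of the window.
                while now - shifts[0] >= window_ms:
                    shifts.popleft()
                if len(shifts) >= required_shifts:
                    return True
            last = current
        now += frame_ms
    return False
```

Fed with a one-sided recording (only one leg ever reports speech), this never sees a direction shift, which is why the approach separates a live two-way conversation from an answering machine only when the calling side is also talking.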
>>Message: 8
>>Date: Sat, 8 Nov 2003 06:09:39 +0000 (GMT)
>>From: Stephen Davies <steve at daviesfam.org>
>>To: asterisk-dev at lists.digium.com
>>Subject: Re: [Asterisk-Dev] Voice energy detection: coder wanted
>>Reply-To: asterisk-dev at lists.digium.com
>>
>>On Fri, 7 Nov 2003, John Todd wrote:
>>
>>
>>>I have a requirement from one of my customers (in the emergency
>>>services arena, I am told) to develop a voice energy detection
>>>system for Asterisk. This would be to detect the difference
>>>between an answering machine, and a human. This detection need
>>>only be very basic, and probably will hook into the existing
>>>routines in dsp.c (unless you have a cadence and tonal module
>>>already built.)
>>>
>>>
>>
>>So I'm curious as to the algorithms used. All I can think of is that
>>an answering machine talks for longer than a real human caller.
>>
>>dsp.c can already detect voice as opposed to various tones. So
>>wouldn't answering machine detection go something like:
>>
>>if <start detecting voice for the first time>
>>    note that the call is answered
>>    if you don't hear say 1 sec silence within 5 secs then
>>        note that it was probably an answering machine
>>
>>I'd allow the possibility that people talk "differently" when
>>recording an announcement - i.e. in a "posh telephone voice" which
>>perhaps has a different spectrum to their usual voice - but since you
>>don't know their usual voice I'm not sure how you could use that.
>>
>>Steve
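Steve's pseudocode above can be turned into a small runnable sketch. It assumes some voice detector (dsp.c is the one suggested in the thread, but no Asterisk API is used here) already yields one boolean per 20 ms frame; the function name `classify_answer`, the parameter names, and the returned labels are all hypothetical, and the 1 s / 5 s values are simply the ones proposed in the message.

```python
from typing import Iterable


def classify_answer(voice_frames: Iterable[bool],
                    frame_ms: int = 20,        # assumed frame length
                    silence_ms: int = 1000,    # "1 sec silence"
                    window_ms: int = 5000) -> str:  # "within 5 secs"
    """Guess who answered, based on the first pause after voice starts."""
    answered = False
    elapsed = 0   # ms since voice was first detected
    silence = 0   # length of the current run of silence, in ms
    for voiced in voice_frames:
        if not answered:
            if not voiced:
                continue        # still ringing or silent, keep waiting
            answered = True     # first voice: note the call is answered
        if voiced:
            silence = 0
        else:
            silence += frame_ms
            if silence >= silence_ms:
                return "human"  # speaker paused, as after a short "hello"
        elapsed += frame_ms
        if elapsed >= window_ms:
            return "machine"    # talked through the window without pausing
    return "undecided" if answered else "unanswered"
```

A short greeting followed by a pause comes back as "human", while continuous speech for the full window comes back as "machine". As the thread itself notes, this is only a "probably": a talkative human or a very terse recorded greeting would be misclassified.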