[asterisk-users] Opensource Speech recognition for Asterisk

Sun Aug 22 17:20:02 CDT 2010

Jeff-

> On Sun, 22 Aug 2010, David Backeberg wrote:
> 
> > On Sat, Aug 21, 2010 at 10:49 PM, Duncan Turnbull <duncan at e-simple.co.nz> wrote:
> >> Voice recognition is a pain for people with accents and poor lines and when
> >
> > Everybody has an accent. Some people live in a place where the people
> > they talk to sound like themselves, so they forget that fact.
> >
> > Of course, this is a huge problem if you, for example, want to have an
> > English language voice recognition system that works across the
> > continental United States. Even for people who speak 'correct' or
> > 'common' English for their region, these systems aren't that great in
> > my experience. The bigger of a vocabulary you have, the worse trouble
> > you'll have, because these systems, again, in my experience, only know
> > synonyms or alternate regional words for the same thing if they were
> > programmed by somebody who thought of the synonyms / alternate words /
> > alternate legitimate pronunciations.
> >
> > Anybody with an imagination can think of plenty examples, for example,
> > from the United States:
> > * soda / pop / soft drink / beverage / drink / Coke / other trademarked names
> 
> Comes down to the designer - most of the systems I am used to using (like
> American Airlines system, which is quite good IMO) are focused on the
> basics - digits 0-9, yes/no, "agent", etc.  I don't think it is overly
> difficult to make this work even with varying accents, though UK folks
> used to saying "double naught" might have issues :)

In my opinion the AA system does not work well.  It fails if you:

  -speak with an accent; try southern US, German (your best
   Arnold impersonation), etc.

  -speak too fast, hesitate, have other people talking
   in the background

  -induce false positives. For example if you say
   "Mississippi" for a flight number, it will give you
   flight info for some flight
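
To illustrate the false-positive point, here is a rough Python sketch of the kind of check that would reject "Mississippi" instead of mapping it to some flight. It is not how the AA system or any particular recognizer works. The function name, the 0.80 threshold, the 1-4 digit length limit, and the idea that the recognizer hands back a text string plus a confidence score are all assumptions made for illustration.

```python
import re

# Hypothetical threshold; a real recognizer's confidence scale may differ.
MIN_CONFIDENCE = 0.80

# Spoken forms accepted as digits (assumed vocabulary, not exhaustive).
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}

def accept_flight_number(text, confidence, min_confidence=MIN_CONFIDENCE):
    """Return the flight number as a digit string, or None to re-prompt."""
    if confidence < min_confidence:
        return None
    digits = []
    for token in re.findall(r"[a-z]+|[0-9]", text.lower()):
        if token.isdigit():
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
        else:
            return None  # any non-digit word rejects the whole utterance
    if not 1 <= len(digits) <= 4:  # assumed length limit for a flight number
        return None
    return "".join(digits)
```

With this check, an utterance is only accepted when every word in it is a digit, so the caller gets re-prompted rather than handed information for the wrong flight.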

I would suggest that any system dependent on speech recognition allow DTMF entry
as a backup.  The AA system doesn't do this, and that probably contributes to user
frustration.  You can say "agent", "help", etc. many times before the system
understands you (or gives up trying to understand you) and actually transfers you to
an agent.  At that point, if you complain about the automated system, the first thing
they ask is whether you're on a mobile phone, and if so they tell you to call back
from a quiet place (i.e. not a car).
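
As a rough sketch of the DTMF-backup idea, written in Python rather than Asterisk dialplan: keypresses are always accepted, speech is accepted only when it is confident and numeric, and the caller is handed to an agent after a bounded number of failures instead of being left to repeat "agent". The get_input callable is a hypothetical stand-in for the telephony layer, and the attempt limit, confidence threshold, and keyword list are assumptions.

```python
def collect_number(get_input, max_attempts=3, min_confidence=0.8):
    """Return ("value", digits) or ("agent", None).

    get_input is a hypothetical callable standing in for the telephony
    layer. Each call returns (kind, value, confidence), where kind is
    "dtmf", "speech", or "timeout".
    """
    for _ in range(max_attempts):
        kind, value, confidence = get_input()
        if kind == "dtmf" and value.isdigit():
            return ("value", value)  # keypresses are always accepted
        if kind == "speech":
            word = value.strip().lower()
            if word in ("agent", "help", "operator"):
                return ("agent", None)  # honor the request the first time
            if confidence >= min_confidence and word.isdigit():
                return ("value", word)
        # Anything else (timeout, low confidence, non-digits): re-prompt.
    # Bounded retries: hand off rather than loop forever.
    return ("agent", None)
```

For example, a caller whose speech is misheard once and who then keys in the number still gets through:

```python
inputs = iter([("speech", "mississippi", 0.9), ("dtmf", "1234", 1.0)])
result = collect_number(lambda: next(inputs))  # ("value", "1234")
```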

In the late 1980s AA was sued over DFW Airport signs that caused drivers to take
their eyes off the road in order to figure out gates.  They lost and had to pay
millions, so I can understand if disabling DTMF results from a desire to reduce legal
liability for people who would rather take their eyes off the road to tap keys.  But
I don't understand their inability to field a more robust speech recognition system.

In my opinion, the state of the art in speech recognition hasn't advanced much
since the early 1990s.

-Jeff