[asterisk-speech-rec] Pizza demo problems and LVSRE7.5
Andrew Willerding
awillerding at callistacti.com
Tue Apr 17 07:57:32 MST 2007
I had a few problems getting the Pizza demo to work, specifically getting
the grammars to load. The grammars loaded once I changed the grammar files
to include the UTF-8 parameter on the first line and placed an extra CR/LF
between each line, so that the files matched the ABNFDigits.gram example
that is installed with the LVSRE 7.5 files.
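The two changes described above can be sketched as a small script. This is a minimal sketch and not a LumenVox tool: the function name `normalize_grammar` and the sample grammar are hypothetical, the header `#ABNF 1.0 UTF-8;` is the self-identifying ABNF header form from the W3C SRGS specification, and "an extra CR/LF between each line" is assumed to mean one blank line after every line.

```python
# SRGS ABNF self-identifying header that names the character encoding.
ABNF_HEADER = "#ABNF 1.0 UTF-8;"


def normalize_grammar(text: str) -> str:
    """Return grammar text with a UTF-8 ABNF header and a blank line after each line."""
    # Split on any newline convention and drop blank lines already present.
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]

    # Replace an existing "#ABNF ..." header, or prepend one if it is missing.
    # Assumption: the file really is UTF-8 (or plain ASCII) encoded.
    if lines and lines[0].startswith("#ABNF"):
        lines[0] = ABNF_HEADER
    else:
        lines.insert(0, ABNF_HEADER)

    # CR/LF line ending plus one extra CR/LF after every line.
    return "".join(line + "\r\n\r\n" for line in lines)


if __name__ == "__main__":
    # Hypothetical grammar, not the actual Pizza demo file.
    sample = "#ABNF 1.0;\nlanguage en-US;\nmode voice;\nroot $order;\n$order = takeout | delivery;\n"
    print(normalize_grammar(sample), end="")
```

Whether the engine needs both changes or only one of them is not established here; the sketch simply reproduces the layout that was reported to work.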
Now that the grammars load, I can't get the engine to actually recognize
"Takeout or deliver." Barging in over the prompt works fine: the prompt
stops playing immediately, so incoming speech is being detected, but the
result from the SPEECH(results) parameter is always 0. I've increased the
rxgain on the zap channel to 8.0 in case the speech was too quiet, but so
far no recognition result reaches the dialplan. The Decode log file shows
that some sort of recognition is happening, but it doesn't appear to be
passed back up to the Asterisk dialplan. The log file appears below.
Andrew Willerding
04/17/2007 22:50:42,333,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Begin
Decode Using Context-Free Grammar
04/17/2007 22:50:42,334,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Loading
Context Free Gramamr
04/17/2007 22:50:42,335,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
Retrieving words
04/17/2007 22:50:42,335,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Word
retrieval time: 0
04/17/2007 22:50:42,335,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Load
grammar time: 0ms
04/17/2007 22:50:42,335,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
Context-Free Grammar Activated: 2 ms
04/17/2007 22:50:42,335,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
LumenVox(161): LM("NO_OOV") deleted
04/17/2007 22:50:42,344,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] LM=
"NO_OOV"
04/17/2007 22:50:42,344,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] nbest
will not be used on this decode run
04/17/2007 22:50:42,344,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Begin
Core Decode Port
04/17/2007 22:50:42,344,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] starting
search
04/17/2007 22:50:42,347,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
uttproc_end_utt
04/17/2007 22:50:42,348,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
uttproc_result
04/17/2007 22:50:42,391,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] winding
up utterance...
04/17/2007 22:50:42,391,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
processing final frames...
04/17/2007 22:50:42,391,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] ...last
frames processed
04/17/2007 22:50:42,391,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
Deactivating models...
04/17/2007 22:50:42,392,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating root models
04/17/2007 22:50:42,392,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating non-root models
04/17/2007 22:50:42,392,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating word models
04/17/2007 22:50:42,392,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating singleton models
04/17/2007 22:50:42,392,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] computing
lattice density
04/17/2007 22:50:42,392,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] computing
phoneme perplexity
04/17/2007 22:50:42,393,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] post
processing the forward search
04/17/2007 22:50:42,393,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] preparing
acoustic confidence scores
04/17/2007 22:50:42,393,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] forward
pass search finished
04/17/2007 22:50:42,394,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] begin
creating lattice confidence metrics
04/17/2007 22:50:42,394,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] end
lattice confidence metric calculation. elapsed time: 1ms
04/17/2007 22:50:42,395,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] word 1:
<s>(SIL ) sf: 4 ef: 6 score: 0.100723 ascr:-622592 tscr:0 edge_score:
1.000000
04/17/2007 22:50:42,395,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] word 2:
T EY K AW T(T EY K AW T ) sf: 7 ef: 61 score: 0.879903 ascr:-8622080
tscr:-455488 edge_score: 1.000000
04/17/2007 22:50:42,395,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] word 3:
</s>(SIL ) sf: 62 ef: 69 score: 0.123286 ascr:-1563648 tscr:-6462
edge_score: 1.000000
04/17/2007 22:50:42,395,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] ...all
wound up
04/17/2007 22:50:42,395,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] End Core
Decode Port
04/17/2007 22:50:42,396,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] populate
answers from hypothesis
04/17/2007 22:50:42,396,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] create
answers
04/17/2007 22:50:42,399,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] decode
time, Total: 70 ms
04/17/2007 22:50:51,171,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Begin
Decode Using Context-Free Grammar
04/17/2007 22:50:51,171,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Loading
Context Free Gramamr
04/17/2007 22:50:51,172,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
Retrieving words
04/17/2007 22:50:51,173,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Word
retrieval time: 0
04/17/2007 22:50:51,173,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Load
grammar time: 0ms
04/17/2007 22:50:51,173,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
Context-Free Grammar Activated: 2 ms
04/17/2007 22:50:51,173,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
LumenVox(161): LM("NO_OOV") deleted
04/17/2007 22:50:51,181,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] LM=
"NO_OOV"
04/17/2007 22:50:51,181,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] nbest
will not be used on this decode run
04/17/2007 22:50:51,182,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Begin
Core Decode Port
04/17/2007 22:50:51,182,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] starting
search
04/17/2007 22:50:51,185,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
uttproc_end_utt
04/17/2007 22:50:51,186,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
uttproc_result
04/17/2007 22:50:51,225,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] winding
up utterance...
04/17/2007 22:50:51,225,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
processing final frames...
04/17/2007 22:50:51,226,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] ...last
frames processed
04/17/2007 22:50:51,226,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
Deactivating models...
04/17/2007 22:50:51,226,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating root models
04/17/2007 22:50:51,226,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating non-root models
04/17/2007 22:50:51,226,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating word models
04/17/2007 22:50:51,226,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating singleton models
04/17/2007 22:50:51,226,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] computing
lattice density
04/17/2007 22:50:51,226,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] computing
phoneme perplexity
04/17/2007 22:50:51,227,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] post
processing the forward search
04/17/2007 22:50:51,227,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] preparing
acoustic confidence scores
04/17/2007 22:50:51,228,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] forward
pass search finished
04/17/2007 22:50:51,228,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] begin
creating lattice confidence metrics
04/17/2007 22:50:51,229,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] end
lattice confidence metric calculation. elapsed time: 0ms
04/17/2007 22:50:51,229,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] word 1:
<s>(SIL ) sf: 4 ef: 6 score: 0.044328 ascr:-748544 tscr:0 edge_score:
1.000000
04/17/2007 22:50:51,229,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] word 2:
D AX L IH V AXR IY(D AX L IH V AXR IY ) sf: 7 ef: 60 score: 0.751461
ascr:-9648128 tscr:-455488 edge_score: 1.000000
04/17/2007 22:50:51,229,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] word 3:
</s>(SIL ) sf: 61 ef: 63 score: 0.233312 ascr:-740352 tscr:-6462
edge_score: 1.000000
04/17/2007 22:50:51,230,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] ...all
wound up
04/17/2007 22:50:51,230,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] End Core
Decode Port
04/17/2007 22:50:51,230,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] populate
answers from hypothesis
04/17/2007 22:50:51,230,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] create
answers
04/17/2007 22:50:51,233,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] decode
time, Total: 59 ms
04/17/2007 22:51:01,428,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Begin
Decode Using Context-Free Grammar
04/17/2007 22:51:01,429,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Loading
Context Free Gramamr
04/17/2007 22:51:01,430,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
Retrieving words
04/17/2007 22:51:01,430,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Word
retrieval time: 0
04/17/2007 22:51:01,430,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Load
grammar time: 0ms
04/17/2007 22:51:01,430,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
Context-Free Grammar Activated: 2 ms
04/17/2007 22:51:01,430,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
LumenVox(161): LM("NO_OOV") deleted
04/17/2007 22:51:01,439,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] LM=
"NO_OOV"
04/17/2007 22:51:01,439,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] nbest
will not be used on this decode run
04/17/2007 22:51:01,439,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] Begin
Core Decode Port
04/17/2007 22:51:01,439,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] starting
search
04/17/2007 22:51:01,443,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
uttproc_end_utt
04/17/2007 22:51:01,443,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
uttproc_result
04/17/2007 22:51:01,477,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] winding
up utterance...
04/17/2007 22:51:01,477,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
processing final frames...
04/17/2007 22:51:01,477,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] ...last
frames processed
04/17/2007 22:51:01,477,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
Deactivating models...
04/17/2007 22:51:01,477,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating root models
04/17/2007 22:51:01,478,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating non-root models
04/17/2007 22:51:01,478,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating word models
04/17/2007 22:51:01,478,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0]
deactivating singleton models
04/17/2007 22:51:01,478,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] computing
lattice density
04/17/2007 22:51:01,478,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] computing
phoneme perplexity
04/17/2007 22:51:01,478,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] post
processing the forward search
04/17/2007 22:51:01,479,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] preparing
acoustic confidence scores
04/17/2007 22:51:01,479,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] forward
pass search finished
04/17/2007 22:51:01,479,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] begin
creating lattice confidence metrics
04/17/2007 22:51:01,480,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] end
lattice confidence metric calculation. elapsed time: 1ms
04/17/2007 22:51:01,480,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] word 1:
<s>(SIL ) sf: 4 ef: 6 score: 0.046426 ascr:-794624 tscr:0 edge_score:
1.000000
04/17/2007 22:51:01,481,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] word 2:
T EY K AW T(T EY K AW T ) sf: 7 ef: 44 score: 0.539766 ascr:-6518784
tscr:-455488 edge_score: 1.000000
04/17/2007 22:51:01,481,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] word 3:
</s>(SIL ) sf: 45 ef: 54 score: 0.142245 ascr:-1923072 tscr:-6462
edge_score: 1.000000
04/17/2007 22:51:01,481,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] ...all
wound up
04/17/2007 22:51:01,481,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] End Core
Decode Port
04/17/2007 22:51:01,482,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] populate
answers from hypothesis
04/17/2007 22:51:01,482,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] create
answers
04/17/2007 22:51:01,485,0,[AmericanEnglish][MODEL_LOW:ThreadNdx:0] decode
time, Total: 50 ms
04/17/2007 22:51:09,607,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] Begin
Decode Using Context-Free Grammar
04/17/2007 22:51:09,607,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] Loading
Context Free Gramamr
04/17/2007 22:51:09,608,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
Retrieving words
04/17/2007 22:51:09,609,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] Word
retrieval time: 0
04/17/2007 22:51:09,609,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] Load
grammar time: 0ms
04/17/2007 22:51:09,609,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
Context-Free Grammar Activated: 2 ms
04/17/2007 22:51:09,617,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
LumenVox(750): max nonroot chan increased to 136
04/17/2007 22:51:09,617,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] LM=
"NO_OOV"
04/17/2007 22:51:09,617,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] nbest
will not be used on this decode run
04/17/2007 22:51:09,617,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] Begin
Core Decode Port
04/17/2007 22:51:09,617,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
starting search
04/17/2007 22:51:09,620,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
uttproc_end_utt
04/17/2007 22:51:09,621,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
uttproc_result
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] winding
up utterance...
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
processing final frames...
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] ...last
frames processed
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
Deactivating models...
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
deactivating root models
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
deactivating non-root models
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
deactivating word models
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
deactivating singleton models
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
computing lattice density
04/17/2007 22:51:09,652,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
computing phoneme perplexity
04/17/2007 22:51:09,653,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] post
processing the forward search
04/17/2007 22:51:09,653,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
preparing acoustic confidence scores
04/17/2007 22:51:09,654,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
00000001: search_pscr_path() didn't end in final state
04/17/2007 22:51:09,654,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] forward
pass search finished
04/17/2007 22:51:09,654,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] begin
creating lattice confidence metrics
04/17/2007 22:51:09,654,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] end
lattice confidence metric calculation. elapsed time: 1ms
04/17/2007 22:51:09,655,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] word
1: <s>(SIL ) sf: 4 ef: 6 score: 0.122586 ascr:-721920 tscr:0
edge_score: 1.000100
04/17/2007 22:51:09,655,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] word
2: D AX L IH V AXR IY() sf: 7 ef: 36 score: 0.235422 ascr:-6464512
tscr:-455488 edge_score: 0.389671
04/17/2007 22:51:09,655,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] word
3: </s>() sf: 37 ef: 47 score: 0.195623 ascr:-1897472 tscr:-6462
edge_score: 1.000000
04/17/2007 22:51:09,655,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] ...all
wound up
04/17/2007 22:51:09,655,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] End
Core Decode Port
04/17/2007 22:51:09,656,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0]
populate answers from hypothesis
04/17/2007 22:51:09,656,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] create
answers
04/17/2007 22:51:09,659,0,[AmericanEnglish][MODEL_HIGHM:ThreadNdx:0] decode
time, Total: 50 ms
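
The word hypotheses in a Decode log like the one above can be pulled out with a short script. This is a minimal sketch that assumes the log format is exactly as shown (each entry starts with a `MM/DD/YYYY HH:MM:SS,mmm,` timestamp, and mail wrapping may split an entry across lines). The names `join_entries` and `word_hypotheses` are made up for illustration and are not part of LumenVox or Asterisk.

```python
import re

# An entry starts with a timestamp such as "04/17/2007 22:50:42,395,".
ENTRY_START = re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2},\d+,")

# Matches e.g. "word 2: T EY K AW T(T EY K AW T ) sf: 7 ef: 61 score: 0.879903".
WORD = re.compile(
    r"\[(?P<model>MODEL_\w+):ThreadNdx:\d+\] word\s+(?P<index>\d+):\s+"
    r"(?P<phones>[^()]+)\([^)]*\)\s+sf:\s*(?P<sf>\d+)\s+ef:\s*(?P<ef>\d+)\s+"
    r"score:\s*(?P<score>[\d.]+)"
)


def join_entries(lines):
    """Re-join log entries that were wrapped onto several lines."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)          # a new timestamped entry
        else:
            entries[-1] += " " + line     # continuation of the previous entry
    return entries


def word_hypotheses(lines):
    """Return (timestamp, model, phones, score) for each non-silence word."""
    results = []
    for entry in join_entries(lines):
        m = WORD.search(entry)
        # Skip the <s> and </s> silence markers around each utterance.
        if m and m.group("phones").strip() not in ("<s>", "</s>"):
            timestamp = entry.split(",", 1)[0]
            results.append((timestamp, m.group("model"),
                            m.group("phones").strip(), float(m.group("score"))))
    return results
```

Called as `word_hypotheses(open(path))` on a saved copy of the log (the path is whatever the Decode log is named locally), this lists one phoneme string and score per decode, which makes it easier to compare what the engine hypothesized with the 0 that SPEECH(results) returns in the dialplan.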