VoxForge
Hello, I want to write a grammar for phone digits, where the recognition unit is the word, not the phoneme:
one, two, ..., nine
I wrote this grammar:
/*
* Task grammar
*/
$digit0 = zero;
$digit2 = deux;
$digit6 = six;
$digit3 = trois;
$digit7 = sept;
$digit9 = neuf;
$digit1 = un;
$digit5 = sinq;
$digit8 = huit;
$digit4 = quatre;
({
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
    $digit0 $digit2 $digit6 $digit3 $digit7 $digit9 $digit1 $digit5 $digit8 $digit4
})
I wrote it this way because, in the training signals, the digits are spoken in the following order:
0 2 6 3 7 9 1 5 8 4
This sequence is repeated 10 times in each .sig file of the training signals.
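As a minimal sketch (not part of the original post), the same grammar text can be generated from the spoken order instead of being typed out by hand. The word labels, the digit order and the repetition count are copied from the post; the function name build_grammar and the exact layout of the output are assumptions made for illustration.

```python
# Word labels copied from the grammar in the post, indexed by digit value.
# "sinq" is kept exactly as written there, since it must match the dictionary.
WORDS = ["zero", "un", "deux", "trois", "quatre",
         "sinq", "six", "sept", "huit", "neuf"]

# Order in which the digits are spoken in each training file (from the post).
ORDER = [0, 2, 6, 3, 7, 9, 1, 5, 8, 4]

# Number of times the sequence is repeated in each file (from the post).
REPEATS = 10


def build_grammar(words, order, repeats):
    """Return the grammar text: variable definitions, then the word network.

    Hypothetical helper: it only assembles a string and does not call HTK.
    """
    definitions = [f"$digit{d} = {words[d]};" for d in order]
    row = " ".join(f"$digit{d}" for d in order)
    network = "\n".join([row] * repeats)
    return "\n".join(definitions) + "\n({\n" + network + "\n})\n"


grammar = build_grammar(WORDS, ORDER, REPEATS)
print(grammar)
```

Writing the result to a file and comparing it with the hand-written grammar is a cheap way to rule out a typing slip in the 100-word sequence.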
Normally, if we use the same files for training and testing, we should get a recognition rate of 100%, but my problem is that the recognition rate is 60%.
Why? I cannot find the cause. Can you please help me? It is very urgent.
* Command line: HResults -I ref.mlf hmmlist.txt rec.mlf
====================== HTK Results Analysis =======================
Date: Mon Apr 25 11:46:28 2011
Ref : ref.mlf
Rec : rec.mlf
------------------------ Overall Results --------------------------
SENT: %Correct=60.00 [H=6, S=4, N=10]
WORD: %Corr=99.60, Acc=99.60 [H=996, D=0, S=4, I=0, N=1000]
===================================================================
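A quick check of the figures above, as a minimal sketch: the 60% is the sentence-level score (6 of the 10 files were recognized without a single error), while the word-level score is 99.60%. The formulas are the usual HTK definitions, %Corr = H/N and Acc = (H - I)/N, both times 100; the function names are hypothetical and chosen only for illustration.

```python
def percent_correct(hits, total):
    """%Correct / %Corr as reported by HResults: 100 * H / N."""
    return 100.0 * hits / total


def percent_accuracy(hits, insertions, total):
    """Acc as reported by HResults: 100 * (H - I) / N."""
    return 100.0 * (hits - insertions) / total


# Values copied from the HResults output in the post.
sent_correct = percent_correct(hits=6, total=10)
word_correct = percent_correct(hits=996, total=1000)
word_accuracy = percent_accuracy(hits=996, insertions=0, total=1000)

print(f"SENT: %Correct={sent_correct:.2f}")
print(f"WORD: %Corr={word_correct:.2f}, Acc={word_accuracy:.2f}")
```

Since only 4 words were substituted and 4 sentences are wrong, each failed file contains exactly one misrecognized word out of 100.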
> if we use the same files for training and testing, we should get a recognition rate of 100%
Why would you do that?
> my problem is that the recognition rate is 60%. Why? I cannot find the cause.
Perhaps you need more speech in your training set?
Random digits are difficult to recognize because they don't provide much context (which is what your grammar or statistical language model is for).