VoxForge
hi all,
I am using the nightly build of the acoustic model provided on the site, and I created the language model from the master prompts as follows:
HLStats -o -s 'SENT-START' 'SENT-END' -b master_bg ../HTK_AcousticModel/wlist ../HTK_AcousticModel/words.mlf
HBuild -b -s 'SENT-START' 'SENT-END' -n master_bg ../HTK_AcousticModel/wlist master_wdnet
I tested them using the master test prompts and the recognition is very poor.
Am I doing something wrong here?
thanks in advance.
Regards,
Sagar.
--- (Edited on 2/19/2010 5:30 am [GMT-0600] by sagarvenkata) ---
>created the language model using the master prompts
If you are using Julius 3.x you need a forward 2-gram and a reverse word 3-gram.
Julius 4 can do recognition with either a forward N-gram or a backward N-gram.
--- (Edited on 2/21/2010 9:41 pm [GMT-0500] by kmaclean) ---
hi,
thanks for the reply.
I am now using a back-off 2-gram model built from the master prompts with LBuild, and HDecode for decoding. The acoustic model is a cross-phone tied mixup state model.
For decoding I am using:
an 's' value of 5.0
a 'p' value of 0.0
I am getting a word accuracy of 70% on the test prompts. How can I improve it?
I need an accuracy of at least 90%.
Thanks&Regards,
Sagar
--- (Edited on 2/21/2010 10:56 pm [GMT-0600] by sagarvenkata) ---
>i am getting word accuracy of 70% on test prompts. how can i
>improve it?
From the HTK Book:
3.4.1 Step 11 - Recognising the Test Data
[...]
The options -p and -s set the word insertion penalty and the grammar scale factor, respectively. The word insertion penalty is a fixed value added to each token when it transits from the end of one word to the start of the next. The grammar scale factor is the amount by which the language model probability is scaled before being added to each token as it transits from the end of one word to the start of the next. These parameters can have a significant effect on recognition performance and hence, some tuning on development test data is well worthwhile.
I am not very familiar with HDecode, but in order to test the VoxForge acoustic model with HVite, I just used trial and error and came up with: -p 0.0 -s 5.0.
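For anyone who wants to make that trial and error a bit more systematic, here is a minimal Python sketch that generates one HVite command line per (-p, -s) pair. It is modelled on the HVite call in Step 11 of the HTK Book; the file names (hmm15, test.scp, wdnet, dict, tiedlist) and the grid values are placeholders, not the actual VoxForge paths or recommended settings, and the helper names are made up for illustration. Each run would still need to be scored separately (for example with HResults) on development data.

```python
import shlex
from itertools import product


def hvite_command(p, s, hmm_dir="hmm15", scp="test.scp", out_mlf="recout.mlf",
                  wdnet="wdnet", dictionary="dict", hmm_list="tiedlist"):
    """Build an HVite argument list modelled on HTK Book Step 11.

    All paths are placeholders (assumptions); substitute your own files.
    """
    return [
        "HVite",
        "-H", f"{hmm_dir}/macros",
        "-H", f"{hmm_dir}/hmmdefs",
        "-S", scp,            # script file listing the test feature files
        "-l", "*",
        "-i", out_mlf,        # recognition output (master label file)
        "-w", wdnet,          # word network built from the language model
        "-p", str(p),         # word insertion penalty
        "-s", str(s),         # grammar scale factor
        dictionary, hmm_list,
    ]


def sweep(p_values, s_values):
    """Yield ((p, s), command) for every combination of the two settings."""
    for p, s in product(p_values, s_values):
        # Give each run its own output file so results are not overwritten.
        out = f"recout_p{p}_s{s}.mlf"
        yield (p, s), hvite_command(p, s, out_mlf=out)


if __name__ == "__main__":
    # Example grid only; sensible ranges depend on the models and data.
    for (p, s), cmd in sweep([-10.0, 0.0, 10.0], [5.0, 10.0, 15.0]):
        print(shlex.join(cmd))
```

The commands are only printed here; they could be passed to subprocess.run once the paths are filled in.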
--- (Edited on 2/26/2010 2:14 pm [GMT-0500] by kmaclean) ---
hi,
I have tried different values of s and p; s=0, p=12 seems to give the best result,
with an increase from 66% to 70%, nothing more.
So what else can I do to improve recognition?
Thanks & Regards,
Sagar
--- (Edited on 2/27/2010 11:11 am [GMT-0600] by Visitor) ---
>So to increase the recognition what else can i do?
Use the release build rather than the nightly build (which still needs some cleanup).
If you still want to use the nightly build, review the forced alignment logs (the HVite output in Step 8) and remove any audio that has a high number of "No tokens survived to final node of network at beam" entries. If this improves recognition, please let us know what you removed.
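A rough Python sketch of that log review follows. It assumes the HVite log prints a line such as "Aligning File: <path>" before any messages for that file; that marker is an assumption about the log layout, so check it against your own log and adjust the pattern if it differs. The function name and the pattern are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout: each utterance starts with "Aligning File: <path>".
# Adjust this pattern if your HVite log marks files differently.
FILE_MARKER = re.compile(r"^\s*Aligning File:\s*(\S+)")
FAILURE_TEXT = "No tokens survived to final node of network"


def count_alignment_failures(log_lines, file_marker=FILE_MARKER):
    """Count 'No tokens survived' messages per audio file in an HVite log."""
    failures = Counter()
    current = None  # file named by the most recent marker line
    for line in log_lines:
        match = file_marker.match(line)
        if match:
            current = match.group(1)
        elif FAILURE_TEXT in line and current is not None:
            failures[current] += 1
    return failures


if __name__ == "__main__":
    import sys
    # Usage: python count_failures.py hvite_align.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for path, n in count_alignment_failures(log).most_common():
            print(n, path)
```

Sorting by count puts the worst files first, which gives a starting list of audio to consider removing.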
thanks,
Ken
--- (Edited on 3/9/2010 12:37 am [GMT-0500] by kmaclean) ---