VoxForge
Hi, I am using a new set of features for monophone recognition in continuous speech. I created monophone models and ran forced alignment on the training data, and the alignments looked pretty good when I checked them. The problem is that I cannot get phone recognition to work (for now I am testing on the same training data). When I use HVite for phone recognition, I get only one phone per speech file, even though each file clearly contains a whole sentence. I don't understand this: the alignments are good, but recognition gives me almost nothing. The HVite command I used for recognition is as follows:
HVite -A -C config2 -w wdnet -H models/hmm10/hmmdefs -S train_test.scp -i recog.out -p 1.0 -t 250.0 150.0 1000.0 phone_dictionary_v2_pau.txt phone_list_v2.txt
I have tried various values for the -p and -t options. Each time I get only one phone per wav file.
My dictionary is as follows:
aa aa
ae ae
ah ah
ao ao
aw aw
ay ay
b b
ch ch
d d
dh dh
eh eh
er er
ey ey
f f
g g
hh hh
ih ih
iy iy
jh jh
k k
l l
m m
n n
ng ng
ow ow
oy oy
p p
pau pau
r r
s s
sh sh
t t
th th
uh uh
uw uw
v v
w w
y y
z z
zh zh
SENT-START
SENT-END
--- (Edited on 2/1/2015 9:18 pm [GMT-0600] by Terminator) ---
Easy!
You use:
( $digit )
Which is just one digit. TFM (e.g. http://www.ee.columbia.edu/ln/LabROSA/doc/HTKBook21/node131.html) says that "{} denotes zero or more repetitions" so you want:
( { $digit } )
Let us know if that doesn't work for you.
Tony
--
Dr Tony Robinson
Founder Cantab Research Ltd
--- (Edited on 2-February-2015 1:32 pm [GMT+0000] by TonyR) ---
Hi,
Thanks, Tony, for the pointer. This worked: ( <digit> )
The angle brackets seem to tell the task grammar that there can be multiple phones in a single file (in the HTK grammar notation, <> denotes one or more repetitions).
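As a rough illustration of the fix, here is a minimal Python sketch that builds a looping phone grammar from a phone list. The function name build_phone_loop_grammar and the variable name $phone are made up for this example, and the short phone list is only a sample; the real list would come from the phone list file.

```python
def build_phone_loop_grammar(phones, variable="phone"):
    """Return HParse-style grammar text allowing one or more phones per utterance.

    `variable` is a hypothetical grammar variable name chosen for this sketch.
    """
    if not phones:
        raise ValueError("phone list is empty")
    # Alternatives are separated by '|'; '< ... >' means one or more repetitions.
    alternatives = " | ".join(phones)
    return f"${variable} = {alternatives};\n( < ${variable} > )\n"


# Sample phones only; a real run would read every entry from the phone list.
phones = ["aa", "ae", "ah", "pau"]
grammar = build_phone_loop_grammar(phones)
print(grammar)
```

The printed text could be saved to a file (say gram, a hypothetical name) and compiled into the word network with HParse gram wdnet before running HVite again.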
--- (Edited on 2/4/2015 6:36 am [GMT-0600] by Terminator) ---