VoxForge
Hi,
When I try to use HCopy to convert the file sample1.wav (which came with the VoxForge example) to .mfc, I get this message:
______________________________________________________
./HCopy -A -D -T 1 -C config sample1.wav 1.mfc
HTK Configuration Parameters[11]
Module/Tool Parameter Value
# ENORMALISE FALSE
# NUMCEPS 12
# CEPLIFTER 22
# NUMCHANS 26
# PREEMCOEF 0.970000
# USEHAMMING TRUE
# WINDOWSIZE 250000.000000
# SAVEWITHCRC TRUE
# SAVECOMPRESSED TRUE
# TARGETRATE 100000.000000
# TARGETKIND MFCC_0
ERROR [+6313] OpenParmChannel: cannot read HTK Header in File sample1.wav
ERROR [+6313] OpenAsChannel: OpenParmChannel failed
ERROR [+6316] OpenBuffer: OpenAsChannel failed
ERROR [+1050] OpenParmFile: Config parameters invalid
FATAL ERROR - Terminating program ./HCopy
__________________________________________________
What might be the reason? I do not foresee any problems with the wav file, because it is used in training the models and its .mfc file is already included with the example.
Cheers
--- (Edited on 6/8/2009 7:49 am [GMT-0500] by dsubbu) ---
> i do not foresee any problems with the wav
Probably you need to look closer. Your config is missing the source format of the input file:
SOURCEFORMAT = WAV
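For reference, a minimal sketch of what the fixed setup might look like, assuming the remaining parameters are the ones printed in the listing above. The file name config_wav is made up for this example, and HCopy is assumed to be on the PATH (the original call used ./HCopy).

```shell
# Write an HCopy config that declares the input as WAV.
# "config_wav" is a hypothetical file name; the values mirror the listing above.
cat > config_wav <<'EOF'
SOURCEFORMAT = WAV
TARGETKIND = MFCC_0
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
ENORMALISE = F
EOF

# Run the conversion only when HCopy and the sample file are actually present.
if command -v HCopy >/dev/null 2>&1 && [ -f sample1.wav ]; then
  HCopy -A -D -T 1 -C config_wav sample1.wav 1.mfc
else
  echo "HCopy or sample1.wav not found; skipping conversion"
fi
```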
--- (Edited on 6/8/2009 2:38 pm [GMT-0500] by nsh) ---
Thanks a ton! That helped.
But when I try to recognise my own test data using HVite, I get the following error:
ERROR [+3231] ProcessFile: Incompatible sample kind MFCC_0 vs MFCC_D_N_Z_0
So I tried changing my config file for HCopy (to TARGETKIND= MFCC_0_D_Z instead of the previous one), but it does not seem to work.
Is there any way to make my test data compatible with the model (i.e. convert it to MFCC_0_D_Z) other than retraining the model?
Cheers!
--- (Edited on 6/8/2009 11:53 pm [GMT-0500] by dsubbu) ---
> TARGETKIND= MFCC_0_D_Z
It should be MFCC_D_N_Z_0, just be more careful and everything will work.
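A side note on reading these names, as I understand HTK: a parameter kind is a base kind (MFCC) plus a set of qualifiers (_0, _D, _N, _Z), and the order in which the qualifiers are written does not matter. So MFCC_0_D_N_Z and MFCC_D_N_Z_0 name the same kind, while MFCC_0_D_Z lacks _N. The helper below is a hypothetical shell function (norm_kind is not part of HTK) that sorts the qualifiers so two spellings can be compared:

```shell
# norm_kind: print an HTK parameter kind with its qualifiers in sorted order.
# Hypothetical helper for comparing spellings; it does not validate the kind.
norm_kind() {
  base=${1%%_*}                      # text before the first underscore, e.g. MFCC
  case $1 in
    *_*)
      # Split the qualifiers on "_", sort them, and join them again.
      quals=$(printf '%s\n' "${1#*_}" | tr '_' '\n' | LC_ALL=C sort | tr '\n' '_')
      printf '%s_%s\n' "$base" "${quals%_}"
      ;;
    *)
      printf '%s\n' "$base"          # no qualifiers at all
      ;;
  esac
}

norm_kind MFCC_D_N_Z_0   # prints MFCC_0_D_N_Z
norm_kind MFCC_0_D_Z     # prints MFCC_0_D_Z, so _N is missing
```

If the two normalised strings differ, the kinds really are different and HVite will complain.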
--- (Edited on 6/9/2009 12:20 am [GMT-0500] by nsh) ---
I'm sorry, I think that was a typo. I had actually used MFCC_D_N_Z_0.
HTK Configuration Parameters[12]
Module/Tool Parameter Value
# SOURCEFORMAT WAV
# ENORMALISE FALSE
# NUMCEPS 12
# CEPLIFTER 22
# NUMCHANS 26
# PREEMCOEF 0.970000
# USEHAMMING TRUE
# WINDOWSIZE 250000.000000
# SAVEWITHCRC TRUE
# SAVECOMPRESSED TRUE
# TARGETRATE 100000.000000
# TARGETKIND MFCC_D_N_Z_0
ERROR [+1019] SetConfParms: incompatible TARGETKIND=MFCC_D_N_Z_0 for coding
FATAL ERROR - Terminating program ./HCopy
Can HCopy convert wav files to MFCC_D_N_Z_0? If yes, what am I doing wrong? (The config file has the correct TARGETKIND.)
If not, how can I do it?
--- (Edited on 6/9/2009 12:43 am [GMT-0500] by dsubbu) ---
http://www.voxforge.org/home/forums/message-boards/acoustic-model-discussions/mfcc_d_n_z_0-format
--- (Edited on 6/9/2009 12:52 am [GMT-0500] by nsh) ---
Hi,
I went through the link you posted.
>>as a workaround, I had to create MFCC files using MFCC_0_D feature format.
This I am (and was) able to do easily.
>>and then convert them to the desired target format (MFCC_D_N_Z_0) in the proto file and use the HCompV command to convert them to the correct feature format (as set out in step 6 of the VoxForge tutorial).
I do not quite understand how this works. Both operations seem to be disjoint since the MFCC_0_D to MFCC_D_N_Z_0 is a front end process and HCompV is used to initialize hmm parameters. Please let me know how to use this.
Thanks again.
--- (Edited on 6/9/2009 5:12 am [GMT-0500] by dsubbu) ---
>Both operations seem to be disjoint since the MFCC_0_D to
>MFCC_D_N_Z_0 is a front end process and HCompV is used to
>initialize hmm parameters.
It seems that the conversion is *not* actually done in the proto (as I had written in the article referred to in nsh's post), but in the config file. From Step 6 of the VoxForge Tutorial:
You also need a configuration file. Create a file called 'config' in your 'voxforge/manual' directory and add the following data:
TARGETKIND = MFCC_0_D_N_Z
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
Which is then used in HCompV:
$ HCompV -A -D -T 1 -C config -f 0.01 -m -S train.scp -M hmm0 proto
Thanks for catching that!
Ken
--- (Edited on 6/9/2009 5:31 pm [GMT-0400] by kmaclean) ---
Hi Ken,
I think there is a small misunderstanding there. I'll describe my situation more clearly.
The VoxForge tutorial has trained models using MFCC_0_D_N_Z type .mfc files, so the files I use to test how good the model is must also be in the same format.
I have a set of .wav files in which I have spoken only digits. Now I want to do a batch decode (not using Julian, as done in the tutorial) using HVite or any other capability offered by HTK.
When I convert them to .mfc, I need the config file to do a .wav to MFCC_0_D_N_Z conversion, but HCopy does not support that. My final requirement is a .mfc file from a .wav with MFCC_0_D_N_Z format. Sorry, I still can't figure out how HCompV helps in this case.
Thanks a lot!
--- (Edited on 6/10/2009 6:53 am [GMT-0500] by dsubbu) ---
Hi dsubbu,
> My final requirement is a .mfc file from a .wav with MFCC_0_D_N_Z format.
HCopy can't do that directly... usually the main reason you want to convert to this particular format is for training...
I think what you are trying to do is covered in the Testing Your Acoustic Model with HTK & Julius tutorial.
Basically you convert your audio to MFCC_0_D (see Step 4 - Coding the Data) and then specify "TARGETKIND = MFCC_0_D_N_Z" in your config (see Test Acoustic Model Using HTK) for the decode (this uses HVite, but it should be a similar process for HDecode...).
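A rough sketch of that two-step flow, with several assumptions: the file names config_code, config_decode, codetest.scp and test.scp, the test/ directory, and the HVite arguments (hmm15/macros, hmm15/hmmdefs, wdnet, dict, tiedlist, recout.mlf) are placeholders for whatever your own setup uses. Only the TARGETKIND line of the decode config is shown; the tutorial's config may carry more parameters.

```shell
# 1) Coding config for HCopy: WAV in, MFCC_0_D out (hypothetical file name).
cat > config_code <<'EOF'
SOURCEFORMAT = WAV
TARGETKIND = MFCC_0_D
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
EOF

# 2) Decoding config for HVite: the kind the acoustic model was trained with.
cat > config_decode <<'EOF'
TARGETKIND = MFCC_0_D_N_Z
EOF

# 3) Build the script files: "source target" pairs for HCopy, targets for HVite.
: > codetest.scp
: > test.scp
for wav in test/*.wav; do
  [ -f "$wav" ] || continue            # skip when the glob matched nothing
  mfc="${wav%.wav}.mfc"
  echo "$wav $mfc" >> codetest.scp
  echo "$mfc" >> test.scp
done

# 4) Code the audio, then decode. Both steps are skipped if HTK or inputs are missing.
if command -v HCopy >/dev/null 2>&1 && [ -s codetest.scp ]; then
  HCopy -A -D -T 1 -C config_code -S codetest.scp
fi
if command -v HVite >/dev/null 2>&1 && [ -s test.scp ] && [ -f wdnet ]; then
  HVite -A -D -T 1 -C config_decode -H hmm15/macros -H hmm15/hmmdefs \
        -S test.scp -i recout.mlf -w wdnet dict tiedlist
fi
```

The point of the split is that HCopy only writes MFCC_0_D, and the _N and _Z qualifiers are applied when HVite reads the files with the decode config.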
Ken
--- (Edited on 6/10/2009 12:47 pm [GMT-0400] by kmaclean) ---