VoxForge
Hi, I have been following this tutorial
http://www.speech.cs.cmu.edu/cmusphinx/moinmoin/AcousticModelAdaptation
to adapt an acoustic model to my voice, using recordings of the arctic transcription made in Audacity and exported to raw files.
I followed the naming of the arctic20 examples exactly when saving my voice data.
I was able to finish the adaptation by following the PocketSphinx section. But when I came to the Sphinx4 and Sphinx3 sections, I hit errors that I can't figure out.
For Sphinx4, I got an "initialization failed" error.
What I did was simply duplicate my PocketSphinx working directory and use the bw command below:
./bw \
-hmmdir wsj1 \
-moddeffn wsj1/mdef \
-ts2cbfn .cont. \
-feat s2_4x -cmn current -agc none \
-dictfn arctic20.dic \
-ctlfn arctic20.listoffiles \
-lsnfn arctic20.transcription \
-accumdir .
I changed -moddeffn to wsj1/mdef because the tutorial mentions that I don't need to unpack the binary mdef file. And of course .semi. has been changed to .cont.
If I keep -moddeffn as wsj1/mdef.txt, I get a "segmentation fault" error.
And here's my Sphinx4 output:
# ./bw \
> -hmmdir wsj1 \
> -moddeffn wsj1/mdef \
> -ts2cbfn .cont. \
> -feat s2_4x -cmn current -agc none \
> -dictfn arctic20.dic \
> -ctlfn arctic20.listoffiles \
> -lsnfn arctic20.transcription \
> -accumdir .
INFO: main.c(196): Compiled on Aug 26 2009 at 01:59:38
./bw \
-hmmdir wsj1 \
-moddeffn wsj1/mdef \
-ts2cbfn .cont. \
-feat s2_4x \
-cmn current \
-agc none \
-dictfn arctic20.dic \
-ctlfn arctic20.listoffiles \
-lsnfn arctic20.transcription \
-accumdir .
[Switch] [Default] [Value]
-help no no
-example no no
-hmmdir wsj1
-moddeffn wsj1/mdef
-tmatfn
-mixwfn
-meanfn
-varfn
-fullvar no no
-diagfull no no
-mwfloor 0.00001 1.000000e-05
-tpfloor 0.0001 1.000000e-04
-varfloor 0.00001 1.000000e-05
-topn 4 4
-dictfn arctic20.dic
-fdictfn
-ltsoov no no
-ctlfn arctic20.listoffiles
-nskip
-runlen -1 -1
-part
-npart
-cepext mfc mfc
-cepdir
-phsegext phseg phseg
-phsegdir
-outphsegdir
-sentdir
-sentext sent sent
-lsnfn arctic20.transcription
-accumdir .
-ceplen 13 13
-cepwin 0 0
-agc max none
-cmn current current
-varnorm no no
-silcomp none none
-sildel no no
-siltag SIL SIL
-abeam 1e-100 1.000000e-100
-bbeam 1e-100 1.000000e-100
-varreest yes yes
-meanreest yes yes
-mixwreest yes yes
-tmatreest yes yes
-mllrmat
-cb2mllrfn .1cls. .1cls.
-ts2cbfn .cont.
-feat 1s_c_d_dd s2_4x
-svspec
-ldafn
-ldadim 29 29
-ldaaccum no no
-timing yes yes
-viterbi no no
-2passvar no no
-sildelfn
-spthresh 0.0 0.000000e+00
-maxuttlen 0 0
-ckptintv
-outputfullpath no no
-fullsuffixmatch no no
-pdumpdir
INFO: main.c(255): Reading wsj1/mdef
ERROR: "model_def_io.c", line 452: ERROR version(wsj1/mdef) == "BMDF", but expected 0.3 at line 1.
FATAL_ERROR: "main.c", line 1054: initialization failed
Whereas for Sphinx3, I got an "Assertion `key != ((void *)0)' failed." error.
What I did was the same thing: I duplicated the PocketSphinx working directory to create an additional copy of the raw files,
then executed the bw command below:
./bw \
-hmmdir /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd \
-ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc none \
-dictfn arctic20.dic \
-fdictfn arctic20.filler \
-ctlfn arctic20.listoffiles \
-lsnfn arctic20.transcription -accumdir .
arctic20.filler was created based on the tutorial, in Linux file format.
And here's my output:
./bw \
-hmmdir /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd \
-ts2cbfn .cont. \
-feat 1s_c_d_dd \
-cmn current \
-agc none \
-dictfn arctic20.dic \
-fdictfn arctic20.filler \
-ctlfn arctic20.listoffiles \
-lsnfn arctic20.transcription \
-accumdir .
[Switch] [Default] [Value]
-help no no
-example no no
-hmmdir /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd
-moddeffn
-tmatfn
-mixwfn
-meanfn
-varfn
-fullvar no no
-diagfull no no
-mwfloor 0.00001 1.000000e-05
-tpfloor 0.0001 1.000000e-04
-varfloor 0.00001 1.000000e-05
-topn 4 4
-dictfn arctic20.dic
-fdictfn arctic20.filler
-ltsoov no no
-ctlfn arctic20.listoffiles
-nskip
-runlen -1 -1
-part
-npart
-cepext mfc mfc
-cepdir
-phsegext phseg phseg
-phsegdir
-outphsegdir
-sentdir
-sentext sent sent
-lsnfn arctic20.transcription
-accumdir .
-ceplen 13 13
-cepwin 0 0
-agc max none
-cmn current current
-varnorm no no
-silcomp none none
-sildel no no
-siltag SIL SIL
-abeam 1e-100 1.000000e-100
-bbeam 1e-100 1.000000e-100
-varreest yes yes
-meanreest yes yes
-mixwreest yes yes
-tmatreest yes yes
-mllrmat
-cb2mllrfn .1cls. .1cls.
-ts2cbfn .cont.
-feat 1s_c_d_dd 1s_c_d_dd
-svspec
-ldafn
-ldadim 29 29
-ldaaccum no no
-timing yes yes
-viterbi no no
-2passvar no no
-sildelfn
-spthresh 0.0 0.000000e+00
-maxuttlen 0 0
-ckptintv
-outputfullpath no no
-fullsuffixmatch no no
-pdumpdir
INFO: main.c(255): Reading /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mdef
INFO: model_def_io.c(587): Model definition info:
INFO: model_def_io.c(588): 133548 total models defined (48 base, 133500 tri)
INFO: model_def_io.c(589): 534192 total states
INFO: model_def_io.c(590): 6144 total tied states
INFO: model_def_io.c(591): 144 total tied CI states
INFO: model_def_io.c(592): 48 total tied transition matrices
INFO: model_def_io.c(593): 4 max state/model
INFO: model_def_io.c(594): 4 min state/model
INFO: s3mixw_io.c(116): Read /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mixture_weights [6144x1x8 array]
INFO: mod_inv.c(405): Norm failed for 3 mixw: 6 7 8
INFO: s3tmat_io.c(115): Read /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/transition_matrices [48x3x4 array]
INFO: mod_inv.c(297): inserting tprob floor 1.000000e-04 and renormalizing
INFO: s3gau_io.c(166): Read /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means [6144x1x8 array]
INFO: s3gau_io.c(166): Read /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances [6144x1x8 array]
INFO: gauden.c(181): 6144 total mgau
INFO: gauden.c(155): 1 feature streams (|0|=39 )
INFO: gauden.c(192): 8 total densities
INFO: gauden.c(98): min_var=1.000000e-05
INFO: gauden.c(170): compute 4 densities/frame
INFO: main.c(363): Will reestimate mixing weights.
INFO: main.c(365): Will reestimate means.
INFO: main.c(367): Will reestimate variances.
INFO: main.c(369): WIll NOT optionally delete silence in Baum Welch or Viterbi.
INFO: main.c(377): Will reestimate transition matrices
INFO: main.c(390): Reading main lexicon: arctic20.dic
INFO: lexicon.c(233): 174 entries added from arctic20.dic
INFO: main.c(402): Reading filler lexicon: arctic20.filler
bw: hash.c:254: hash_enter: Assertion `key != ((void *)0)' failed.
Aborted
Please help!
Thank you very much!
--- (Edited on 9/12/2009 12:09 am [GMT-0500] by degra) ---
> I changed -moddeffn to wsj1/mdef because the tutorial mentions that I don't need to unpack the binary mdef file. And of course .semi. has been changed to .cont.
The mistake here is that you took the wsj1 semicontinuous model for adaptation. Sphinx4 works with continuous models. Also, the feature type of those models is usually 1s_c_d_dd, so the -feat in your bw command line should be the same.
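A quick way to see which kind of mdef file you are pointing bw at: the log above shows bw reading "BMDF" where it expected the text version string 0.3, which suggests wsj1/mdef is in a binary format that this bw build does not parse. The sketch below only checks for that 4-byte tag. The function name and the tag constant are illustrative, taken from the error message, not from any Sphinx API.

```python
# Tag copied from the error message in the log ("BMDF"); treating it as the
# marker of a binary mdef file is an assumption based on that message.
BINARY_MDEF_TAG = b"BMDF"


def mdef_is_binary(path):
    """Return True if the file at `path` starts with the bytes b"BMDF"."""
    with open(path, "rb") as f:
        return f.read(len(BINARY_MDEF_TAG)) == BINARY_MDEF_TAG


def describe_mdef(path):
    """Return a short human-readable description of the mdef file type."""
    if mdef_is_binary(path):
        return path + ": starts with BMDF (binary), bw will likely reject it"
    return path + ": no BMDF tag, probably a text mdef"
```

For example, print(describe_mdef("wsj1/mdef")) before running bw would flag the file that caused the "initialization failed" error.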
> Whereas for Sphinx3, I got an "Assertion `key != ((void *)0)' failed." error.
Most likely you put something wrong into the filler file, an empty line for example.
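If you want to check the filler file for stray empty lines before running bw, something like the sketch below should do. The helper names are made up for illustration, and it assumes the file is plain text with one entry per line; note that the cleanup rewrites the file with LF line endings.

```python
def find_blank_lines(path):
    """Return the 1-based numbers of empty or whitespace-only lines."""
    with open(path, "r") as f:
        return [n for n, line in enumerate(f, start=1) if not line.strip()]


def strip_blank_lines(path):
    """Rewrite the file without blank lines; return how many were removed."""
    with open(path, "r") as f:
        lines = f.readlines()
    kept = [line.rstrip("\n") for line in lines if line.strip()]
    with open(path, "w", newline="\n") as f:
        # Every kept entry ends with exactly one newline, none are added after.
        f.write("".join(entry + "\n" for entry in kept))
    return len(lines) - len(kept)
```

Calling find_blank_lines("arctic20.filler") returns an empty list when the file is clean.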
--- (Edited on 9/12/2009 4:40 am [GMT-0500] by nsh) ---
Thanks for pointing out the mistake.
Now my new Sphinx4 adaptation command is as below, and I am able to generate the counts.
./bw \
-hmmdir /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd \
-moddeffn /root/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mdef \
-ts2cbfn .cont. \
-feat 1s_c_d_dd -cmn current -agc none \
-dictfn arctic20.dic \
-fdictfn arctic20.filler \
-ctlfn arctic20.listoffiles \
-lsnfn arctic20.transcription \
-accumdir .
As for Sphinx3, yes, you are correct: the filler file had an additional empty line at the end. After I removed the empty line, the bw command was able to generate the counts.
Now I have some confusion: if I need to do two or more adaptations on the same acoustic model, should I run the adaptation two or more times, or combine all the raw data and do it once?
If I do it twice, how can I point to two different mllr_matrix files for Sphinx3?
For Sphinx4, if I understand the tutorial correctly, the final output of the adaptation is the updated acoustic model, and config.xml should point to that latest AM?
If I need to do the adaptation again, should I start from the AM that has already been adapted once, or from the original one, combining all the new raw data to do it once?
TQ!
degra
--- (Edited on 9/13/2009 6:29 am [GMT-0500] by degra) ---
> Now I have some confusion: if I need to do two or more adaptations on the same acoustic model, should I run the adaptation two or more times, or combine all the raw data and do it once?
You should combine the data.
> If I do it twice, how can I point to two different mllr_matrix files for Sphinx3?
You can't do that.
> For Sphinx4, if I understand the tutorial correctly, the final output of the adaptation is the updated acoustic model, and config.xml should point to that latest AM?
Yes.
> If I need to do the adaptation again, should I start from the AM that has already been adapted once, or from the original one, combining all the new raw data to do it once?
You need to combine the data. It's the basic rule: the more data you have, the better your model is trained, assuming the data is closely related to your target task.
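Combining the data mostly means building one control file and one transcription file that cover all recording sessions, then running bw once on them. A rough sketch follows, assuming each session has a control file and a transcription file with matching line order, as in the arctic20 example. The function name and the file layout are hypothetical. Utterance names must be unique across sessions, so recordings that reuse the arctic20 names would need renaming first.

```python
def merge_sessions(sessions, out_ctl, out_lsn):
    """Merge (control_file, transcription_file) pairs into one pair of files.

    Returns the total number of utterances written.
    """
    seen = set()
    ctl_out, lsn_out = [], []
    for ctl_path, lsn_path in sessions:
        with open(ctl_path) as f:
            ids = [line.strip() for line in f if line.strip()]
        with open(lsn_path) as f:
            texts = [line.strip() for line in f if line.strip()]
        if len(ids) != len(texts):
            raise ValueError(ctl_path + " and " + lsn_path + " differ in length")
        for utt_id, text in zip(ids, texts):
            if utt_id in seen:
                raise ValueError("duplicate utterance id: " + utt_id)
            seen.add(utt_id)
            ctl_out.append(utt_id)
            lsn_out.append(text)
    # Blank lines are dropped on purpose, so no empty line ends up in the output.
    with open(out_ctl, "w") as f:
        f.write("".join(entry + "\n" for entry in ctl_out))
    with open(out_lsn, "w") as f:
        f.write("".join(entry + "\n" for entry in lsn_out))
    return len(ctl_out)
```

The merged files can then be passed to bw through -ctlfn and -lsnfn in place of the per-session ones.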
--- (Edited on 9/13/2009 11:29 am [GMT-0500] by nsh) ---
I would like to adapt the acoustic model in an ongoing manner. Can I say that model adaptation is a never-ending process?
If I keep accumulating raw data and combining it all for a single adaptation, one day the data will become very large, and this will definitely prolong the adaptation time from minutes to days...
When that day comes, we won't get the updated model until maybe a week later (assuming the adaptation takes 5 days to complete because of the very large data set). Is this the preferred way, or is there another way that I don't know about?
TQ very much!
degra
--- (Edited on 9/13/2009 9:57 pm [GMT-0500] by degra) ---
It's possible to do that in theory, but there is no corresponding implementation in CMU Sphinx. The issue is that the adaptation has no "memory": it always tries to closely match the data you submitted for adaptation. So a final part of the database that is incorrect, or not so close to your task, could break the previously trained model.
The way to solve this is to keep the previous MLLR transform, for example, and add a new one with some weight. This could be trivially implemented of course, but there is no ready-to-use code for this.
Also remember that with single-class MLLR there is not much difference between an hour and 10 minutes of adaptation data.
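To illustrate the weighting idea only: an MLLR transform maps each Gaussian mean to A*mean + b, so one simple way to "add a new one with some weight" is to interpolate the old and new (A, b) linearly. This is an assumed scheme, not something CMU Sphinx provides, and the sketch does not read or write the Sphinx mllr_matrix file format. All names here are hypothetical.

```python
def blend_transforms(old, new, weight):
    """Linearly interpolate two transforms given as (A, b).

    A is a list of rows, b is a list; `weight` in [0, 1] is the share given
    to the new transform (0 keeps the old one, 1 takes the new one).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    a_old, b_old = old
    a_new, b_new = new
    a = [[(1.0 - weight) * o + weight * n for o, n in zip(row_o, row_n)]
         for row_o, row_n in zip(a_old, a_new)]
    b = [(1.0 - weight) * o + weight * n for o, n in zip(b_old, b_new)]
    return a, b


def apply_transform(transform, mean):
    """Apply mean' = A * mean + b to one mean vector."""
    a, b = transform
    return [sum(x * m for x, m in zip(row, mean)) + bias
            for row, bias in zip(a, b)]
```

A small weight keeps the model close to what earlier data produced, so one bad recording session cannot overwrite it completely.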
--- (Edited on 9/14/2009 2:11 am [GMT-0500] by nsh) ---
Could you please attach the required files here? I am unable to get them from that page.
Namely, I am looking for these files:
--- (Edited on 12/19/2009 10:33 am [GMT-0600] by anp) ---