VoxForge
Hello,
I am trying to recognize some German speech using the model "cmusphinx-ptm-generic-de-r20180609", but the accuracy is only about 35%, which is too low.
I record a .wav file from a native German speaker using the following command:
rec -V1 -q -r 16000 -c 1 -b 16 -e signed-integer --endian little "test1.wav"
and then run the following commands:
pocketsphinx_batch -adcin yes -cepdir wav -cepext .wav -ctl test.fileids -lm "file.lm" -dict "file.dic" -hmm "acoustic_model_directory" -hyp test.hyp
word_align.pl test.transcription test.hyp
Is there something wrong? I don't think it is normal to get such low accuracy with the provided models.
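For readers comparing numbers: the sketch below shows one common way to compute a word error rate (WER) from a reference transcript and a hypothesis, using word-level edit distance. It assumes the 35% figure is a word accuracy, roughly 1 minus WER. The function name word_error_rate is made up for this illustration, and this is not the word_align.pl implementation.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)
```

For example, word_error_rate("das ist ein test", "das ist test") counts one deleted word out of four reference words.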
Hi mohemara92,
couple of remarks:
cheers,
guenter
Hi Guenter,
Thanks a lot for helping.
I have some questions related to your remarks:
Hi Mohamed,
as a rule of thumb, when you check your recording in Audacity, only a few samples should peak outside the ±0.5 range.
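If you would rather check this from a script than by eye in Audacity, a minimal sketch could look like the following. It assumes a 16-bit PCM WAV file like the one produced by the rec command earlier in the thread; the function name fraction_above and the idea of reporting a fraction are choices made for this illustration, not part of any Sphinx or VoxForge tooling.

```python
import array
import sys
import wave


def fraction_above(path, threshold=0.5):
    """Fraction of samples whose magnitude exceeds `threshold` of full scale.

    Assumes 16-bit PCM audio (2 bytes per sample).
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    if not samples:
        return 0.0
    limit = threshold * 32768  # full scale for signed 16-bit audio
    loud = sum(1 for s in samples if abs(s) > limit)
    return loud / len(samples)
```

Calling fraction_above("test1.wav") on a recording at a sensible level should give a small value; what counts as "few" is a judgment call, not a fixed cutoff.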
30 seconds is kind of the upper limit but most of the training material is much shorter, i.e. in the 5-12s range (typically a single sentence).
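To check the length of a test recording against that 30 second limit, a small sketch using only the Python standard library might look like this. The helper names wav_duration_seconds and is_reasonable_length are invented for this example.

```python
import wave


def wav_duration_seconds(path):
    """Duration of a WAV file: number of frames divided by the sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_reasonable_length(path, max_seconds=30.0):
    """True if the recording is no longer than `max_seconds`."""
    return wav_duration_seconds(path) <= max_seconds
```

Longer recordings can be split into single sentences before decoding, which brings them closer to the 5-12s utterances mentioned above.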
actually, I built the PTM models on request (I am not using CMU Sphinx in any of my projects), so I have very little experience with them. From what I see in the training results, it seems to me that you cannot use them as-is for general, unknown-speaker speech recognition. You will either have to adapt them to a specific voice or tailor the language model to a narrow domain to get useful results (or, even better: do both ;) )
you will have to switch to kaldi-asr to use those