Speech Recognition Engines

About "Julian Recognition Output"
User: Shawn
Date: 10/5/2007 1:32 am

Hi, I finished the whole training job.

But when I run Julian, the problems start: it only recognizes two words, such as "dial one", and ignores almost everything else I say.

I have tried many times, but the result is always the same.

I also downloaded the VoxForge tutorial, which lets me try Julian as well, but the problem is the same. Why is that?

--- (Edited on 10/5/2007 1:32 am [GMT-0500] by Visitor) ---

Re: About "Julian Recognition Output"
User: kmaclean
Date: 10/5/2007 8:29 am

Hi Shawn,

Assuming that you trained your acoustic model using recordings of your own voice, and there were no errors, your problem is likely related to your CMN (cepstral mean normalization) parameter not being set.  Julian needs about 5 seconds of your speech to adjust to your microphone levels (which should be set to the same level as in the speech recordings you used to create your acoustic models).

Julian tells you this when it starts up and says:

------------- System Info end -------------

        ************************************************************
        * NOTICE: The first input may not be correctly recognized *
        *         since no CMN parameter is available on startup.  *
        ************************************************************

This is telling you that Julian takes the cepstral mean of the last 5 seconds of speech as the initial cepstral mean at the beginning of each input.  When Julian first starts, it has no speech with which to calculate a cepstral mean, and thus cannot reliably recognize your first few utterances.  Once it gets enough speech to calculate a cepstral mean, your recognition results should improve noticeably.
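
To make the idea concrete, here is a rough Python sketch of a running cepstral mean. It is not Julian's actual code, just an illustration of the behaviour described above. The class name RunningCMN, the 100 frames-per-second rate, and the toy two-coefficient frames are all assumptions made for the example; only the 5-second window comes from Julian's behaviour as described here.

```python
from collections import deque

FRAMES_PER_SECOND = 100  # assumption: one feature frame every 10 ms
WINDOW_SECONDS = 5       # "the last 5 seconds of speech"


class RunningCMN:
    """Illustrative running cepstral mean normalization (hypothetical helper)."""

    def __init__(self, max_frames=FRAMES_PER_SECOND * WINDOW_SECONDS):
        # Only the most recent max_frames frames are kept.
        self.history = deque(maxlen=max_frames)
        # No CMN parameter is available at startup.
        self.mean = None

    def normalize(self, utterance):
        """Normalize one utterance with the mean of earlier speech, then update the mean."""
        if self.mean is None:
            # First input: nothing to subtract, so the frames stay unnormalized.
            normalized = [list(frame) for frame in utterance]
        else:
            normalized = [
                [c - m for c, m in zip(frame, self.mean)] for frame in utterance
            ]

        # Fold this utterance into the history and recompute the mean,
        # which becomes the initial mean for the next input.
        self.history.extend(utterance)
        if self.history:
            n = len(self.history)
            dims = len(self.history[0])
            self.mean = [
                sum(frame[d] for frame in self.history) / n for d in range(dims)
            ]
        return normalized


if __name__ == "__main__":
    cmn = RunningCMN()
    utterance = [[1.0, 2.0], [3.0, 4.0]]  # toy frames with two coefficients each
    print(cmn.normalize(utterance))  # first pass: returned unnormalized
    print(cmn.normalize(utterance))  # second pass: mean of earlier speech removed
```

In this sketch the first utterance comes back unchanged because there is no mean yet, which mirrors why the first input may be misrecognized; later utterances are shifted by the mean of the speech seen so far.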

A workaround (that I use for testing purposes) is to utter the same grammar sentence a few times until Julian recognizes it, then test with the other grammar sentences.  That way, Julian has time to adjust for your speech and microphone levels.

A better solution is to use the "-cmnsave filename" parameter (see the Julian manual) to record a representative average for your environment, and then use "-cmnload filename" and "-cmnnoupdate" so that Julian uses the CMN you saved and does not try to recalculate it on the fly.
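
As a rough sketch of that two-step workflow, the small Python script below only builds and prints the two command lines; it does not run Julian. The file names julian.jconf and cmn.dat are placeholders, the helper function names are made up for this example, and it assumes your setup starts Julian with "-C" pointing at a jconf file. Check the Julian manual for the exact options in your version.

```python
def cmn_save_command(jconf, cmn_file):
    """Command line for a calibration run that saves the CMN parameter to cmn_file."""
    return ["julian", "-C", jconf, "-cmnsave", cmn_file]


def cmn_load_command(jconf, cmn_file):
    """Command line for a normal run that loads the saved CMN and keeps it fixed."""
    return ["julian", "-C", jconf, "-cmnload", cmn_file, "-cmnnoupdate"]


if __name__ == "__main__":
    jconf = "julian.jconf"  # placeholder: your own Julian configuration file
    cmn_file = "cmn.dat"    # placeholder: where the saved cepstral mean is stored

    # Step 1: speak some representative sentences during this run.
    print(" ".join(cmn_save_command(jconf, cmn_file)))
    # Step 2: later runs start from the saved mean instead of from nothing.
    print(" ".join(cmn_load_command(jconf, cmn_file)))
```

The saved file is only representative as long as the microphone, its levels, and the room stay roughly the same, so it is worth repeating the first step if any of those change.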

You may also need to play with your microphone levels to get optimal recognition results.

Hope this helps, 

Ken 

--- (Edited on 10/5/2007 9:29 am [GMT-0400] by kmaclean) ---
