Acoustic Model Discussions

Is this a valid .grammar and .voca file? Newbie needs some help.
User: ariestav
Date: 10/5/2009 1:10 am

Hi There,

I'm trying to set up a simple "command / control" speech recognition system with Julius 4.1.2 and have ventured to create my own .grammar and .voca files.  I wanted to post them to the community to check that I created them correctly.  I basically want Julius to recognize only three words:

"go", "stop", and "play"


So here is my first attempt at my files:


.grammar file

S : NS_B ANY_COMMAND NS_E
ANY_COMMAND : COMMAND_V

 

.voca file

% NS_B
<s> sil

% NS_E
</s> sil

% COMMAND_V
GO          g ow
STOP       s t aa p
PLAY        p l ey


That's what I came up with.  Will the above files be read correctly by the interpreter program to make the .dfa file necessary to work with Julius?  I plan on using the VoxForgeDict file I downloaded from this site, as well as VoxForge's Acoustic Model file (hmmdefs).  Is there anything else I am missing before I begin testing Julius with this configuration?

Also, quick questions:

1.  Does the interpreter care about white space in the .grammar and .voca files?  Can I place tabs between the words and their phoneme definitions? 

2.  From what I recall from reading the documentation, Julian does not exist anymore in the latest release of Julius?  Julian got merged with Julius 4.1.2, correct?

3.  What is the Julius shell command to convert the .grammar and .voca files into a .dfa file?


Thanks for all your time and help!

Best,

Arie

--- (Edited on 10/5/2009 1:10 am [GMT-0500] by ariestav) ---

Re: Is this a valid .grammar and .voca file? Newbie needs some help.
User: kmaclean
Date: 10/5/2009 12:05 pm

Hi Arie,

>Will the above files be read correctly by the interpreter program to make the .dfa file necessary to work with Julius?

Did you try to compile it with mkdfa.pl (see Step 1 - Task Grammar for more info)?  Trial and error is an acceptable approach to creating grammars... I use it all the time  :)

>1.  Does the interpreter care about white space in the .grammar and .voca files?  Can I place tabs between the words and their phoneme definitions?

I don't think it cares about white space, so tabs should be fine, but try it out to confirm...

>2.  From what I recall from reading the documentation, Julian does not exist anymore in the latest release of Julius?  Julian got merged with Julius 4.1.2, correct?

Correct.

>3.  What is the Julius shell command to convert the .grammar and .voca files into a .dfa file?

mkdfa.pl (see Step 1 - Task Grammar).

Ken

--- (Edited on 10/5/2009 1:05 pm [GMT-0400] by kmaclean) ---

--- (Edited on 10/5/2009 10:42 pm [GMT-0400] by kmaclean) ---

Re: Is this a valid .grammar and .voca file? Newbie needs some help.
User: ariestav
Date: 10/5/2009 4:00 pm

Okay,

The files seem to have been generated fine:

/usr/local/bin/mkdfa.pl ngale
ngale.grammar has 2 rules
ngale.voca    has 3 categories and 5 words
---
Now parsing grammar file
Now modifying grammar to minimize states[-1]
Now parsing vocabulary file
Now making nondeterministic finite automaton[4/4]
Now making deterministic finite automaton[4/4] 
Now making triplet list[4/4]
3 categories, 4 nodes, 3 arcs
-> minimized: 4 nodes, 3 arcs
---
generated: ngale.dfa ngale.term ngale.dict
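
As a cross-check on the counts mkdfa.pl reports above ("3 categories and 5 words"), here is a minimal Python sketch that parses .voca-style text and counts categories and words. It is not how mkdfa.pl works internally. The function name `parse_voca` is hypothetical, and the sketch assumes that "% NAME" and "%NAME" both start a category and that spaces and tabs are interchangeable separators.

```python
def parse_voca(text):
    """Return {category: [(word, phones), ...]} from .voca-style text."""
    categories = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate categories visually
        if line.startswith("%"):
            # Assumption: "% NAME" and "%NAME" are both category headers.
            current = line[1:].strip()
            categories[current] = []
        else:
            if current is None:
                raise ValueError(f"word before any category: {line!r}")
            # split() treats runs of spaces and tabs alike; this is an
            # assumption about the format, not confirmed mkdfa.pl behavior.
            word, *phones = line.split()
            categories[current].append((word, phones))
    return categories


# The .voca content from the first post, with one tab added on the STOP line.
VOCA = """\
% NS_B
<s> sil

% NS_E
</s> sil

% COMMAND_V
GO          g ow
STOP\ts t aa p
PLAY        p l ey
"""

cats = parse_voca(VOCA)
n_words = sum(len(words) for words in cats.values())
print(len(cats), "categories,", n_words, "words")  # 3 categories, 5 words
```

The silence entries count as words too, which is why three commands give five words in total.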


Now, when I execute Julius-4.1.2 with these options, I am getting some errors:

# julius-4.1.2 -v ngale.dict -dfa ngale.dfa -h acoustic_model_files/hmmdefs
STAT: jconf successfully finalized
STAT: *** loading AM00 _default
Stat: init_phmm: Reading in HMM definition
Stat: rdhmmdef: ascii format HMM definition
Stat: rdhmmdef: limit check passed
Stat: check_hmm_restriction: an HMM with several arcs from initial state found: "sp"
Stat: rdhmmdef: this HMM requires multipath handling at decoding
Stat: init_phmm: defined HMMs:  8002
Stat: init_phmm: logical names:  8002
Stat: init_phmm: base phones:    44 used in logical
Stat: init_phmm: finished reading HMM definitions
STAT: making pseudo bi/mono-phone for IW-triphone
Stat: hmm_lookup: 1071 pseudo phones are added to logical HMM list
STAT: *** AM00 _default loaded
STAT: *** loading LM00 _default
STAT: reading [ngale.dfa] and [ngale.dict]...
Error: voca_load_htkdict: line 4: triphone "t-aa+p" not found
Error: voca_load_htkdict: the line content was: 2    [STOP]    s t aa p
Error: voca_load_htkdict: line 5: triphone "p-l+ey" not found
Error: voca_load_htkdict: the line content was: 2    [PLAY]    p l ey
Error: voca_load_htkdict: begin missing phones
Error: voca_load_htkdict: p-l+ey
Error: voca_load_htkdict: t-aa+p
Error: voca_load_htkdict: end missing phones
Error: init_voca: error in reading ngale.dict: 2 words failed out of 3 words
ERROR: failed to read dictionary "ngale.dict"
ERROR: m_fusion: some error occured in reading grammars
ERROR: Error in loading model

I don't understand why Julius says it cannot find the triphones p-l+ey and t-aa+p.  Why can it find the phones for GO?  I am using the hmmdefs file that is packaged with the QuickStart Julius from this site.  I also tried an hmmdefs from the most recent nightly build.  What should I do to troubleshoot this error?

 

--- (Edited on 10/5/2009 4:00 pm [GMT-0500] by ariestav) ---

Re: Is this a valid .grammar and .voca file? Newbie needs some help.
User: kmaclean
Date: 10/5/2009 4:12 pm

>phones p-l+ey, t-aa+p?

You need to use the phones that the VoxForge acoustic model was created with. Look in the pronunciation dictionary included in the QuickStart package and use those phones in your .voca file, omitting the sp at the end of each pronunciation.
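
The conversion described above (take the dictionary pronunciation and drop the trailing sp) can be sketched in Python as follows. This is only an illustration: the helper name `dict_line_to_voca` is hypothetical, and it assumes every dictionary line has the form "WORD [WORD] phones... sp".

```python
def dict_line_to_voca(line):
    """Convert 'WORD [WORD] ph ph ... sp' into a .voca entry 'WORD ph ph ...'."""
    fields = line.split()
    # fields[0] is the word, fields[1] is the bracketed output label,
    # and everything after that is the phone sequence.
    word, phones = fields[0], fields[2:]
    if phones and phones[-1] == "sp":
        phones = phones[:-1]  # drop the trailing short-pause phone
    return f"{word} {' '.join(phones)}"


# Dictionary lines in the pronunciation dictionary's format.
DICT_LINES = [
    "GO              [GO]            g ow sp",
    "STOP            [STOP]          s t aa p sp",
    "PLAY            [PLAY]          p l ey sp",
]

voca_entries = [dict_line_to_voca(line) for line in DICT_LINES]
print("% COMMAND_V")
print("\n".join(voca_entries))
```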

Ken

--- (Edited on 10/5/2009 5:12 pm [GMT-0400] by kmaclean) ---

Re: Is this a valid .grammar and .voca file? Newbie needs some help.
User: Visitor
Date: 10/5/2009 4:44 pm

Hi Ken,

Thanks for helping out.  Here are some excerpts from the file:

GNOSTIC         [GNOSTIC]       n aa s t ix k sp
GO              [GO]            g ow sp
GOAL            [GOAL]          g ow l sp
...
PLATES          [PLATES]        p l ey t s sp
PLAUSIBLE       [PLAUSIBLE]     p l ao z ax b ax l sp
PLAY            [PLAY]          p l ey sp
PLAYED          [PLAYED]        p l ey d sp
...
STOOPED         [STOOPED]       s t uw p t sp
STOP            [STOP]          s t aa p sp
STOPPED         [STOPPED]       s t aa p t sp

 

Here is my .voca file:

% NS_B
<s> sil

%NS_E
</s> sil

% COMMAND_V
GO g ow
STOP s t aa p
PLAY p l ey

I thought I defined them as per the VoxForge dictionary.  Am I missing something?  Perhaps I am using an incorrect hmmdefs file?  Do I need to install HTK?  I thought that I wouldn't have to do that if I already had the hmmdefs file.

I appreciate all your time and help!  Thanks Ken!

--- (Edited on 10/5/2009 4:44 pm [GMT-0500] by Visitor) ---

Re: Is this a valid .grammar and .voca file? Newbie needs some help.
User: kmaclean
Date: 10/5/2009 10:13 pm

In Step 10 - Making Tied-State Triphones we create an acoustic model using all the words in the VoxForge pronunciation dictionary, even though we don't actually have any (or enough...) recordings to create particular triphones. 

What the HTK acoustic model training process does is cheat... it says that certain "logical" triphones located in the pronunciation dictionary (for which we have no audio training data) are mapped to "physical" triphones in the acoustic model (triphones in the hmmdefs file created from actual speech recordings). 

The file that does this is the tiedlist file (see the Running Julian Live section of the tutorial).
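
To make the logical-to-physical mapping concrete, here is a simplified Python sketch. It is not Julius or HTK code. It builds word-internal triphone names in the left-center+right notation seen in the error messages, then resolves them through a tiedlist-style table in which a line holds either one name (a physical model) or two names (logical, then physical). The tiedlist excerpt, the set of model names, and the target t-aa+b are all hypothetical values chosen for illustration.

```python
def word_internal_triphones(phones):
    """Name each phone with its in-word left/right context, e.g. 't-aa+p'."""
    names = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""
        names.append(left + phone + right)
    return names


def parse_tiedlist(text):
    """Map logical name -> physical name; a one-name line maps to itself."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        mapping[fields[0]] = fields[1] if len(fields) > 1 else fields[0]
    return mapping


# Hypothetical excerpt: t-aa+p has no model of its own and is tied to t-aa+b.
TIEDLIST = """\
s+t
s-t+aa
t-aa+p t-aa+b
aa-p
"""
# Hypothetical set of model names actually defined in hmmdefs.
HMMDEFS_NAMES = {"s+t", "s-t+aa", "t-aa+b", "aa-p"}

tied = parse_tiedlist(TIEDLIST)
stop = word_internal_triphones("s t aa p".split())

# Without the tiedlist, the logical name is looked up directly and fails.
missing_without = [t for t in stop if t not in HMMDEFS_NAMES]
# With the tiedlist, each logical name is first mapped to a physical one.
missing_with = [t for t in stop if tied.get(t) not in HMMDEFS_NAMES]

print(stop)             # ['s+t', 's-t+aa', 't-aa+p', 'aa-p']
print(missing_without)  # ['t-aa+p']
print(missing_with)     # []
```

In this toy setup the direct lookup fails for exactly one triphone, and the lookup through the tiedlist resolves every name.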

Therefore, the command you should use to run julius is:

  $ julius-4.1.2 -v ngale.dict -dfa ngale.dfa -h hmmdefs -multipath -hlist tiedlist

The reason for the 'multipath' parameter is described in ticket #2.

Ken

--- (Edited on 10/5/2009 11:13 pm [GMT-0400] by kmaclean) ---

Re: Is this a valid .grammar and .voca file? Newbie needs some help.
User: ariestav
Date: 10/6/2009 9:55 am

Ken,

You are the man!  Thank you for that command line with the options.  You were correct, and my audio files are being recognized!  Wow, I didn't think I could do this with open source software.  Your project here is great, and I will definitely contribute my voice and time for the Acoustic Model.

I have a new question, but will start another thread for it.

Thanks so much for helping out!

Best,

Arie

--- (Edited on 10/6/2009 9:55 am [GMT-0500] by ariestav) ---
