VoxForge
--- (Edited on 3/10/2008 8:25 am [GMT-0500] by colbec) ---
--- (Edited on 3/10/2008 11:39 am [GMT-0500] by nsh) ---
Apologies for the terseness; here is a bit more detail:
For example, this morning I was playing with a grammar designed to query a database. I followed the rules as described in the Julius and HTK manuals to come up with a good selection of words (or so I thought!).
Clearly, it is important to choose the right words: they must be well differentiated so that the engine can distinguish clearly between them, and, as the tutorial shows, you achieve this by selecting the right mix of phonemes.
Building the right fundamental grammar and the best prompts to go with it can be a bit tedious, given the need to keep track of large numbers of words while avoiding the temptation to select from the top end of an alphabetically sorted list.
So I built a small Access database and read into it my own grammar and the list of words from the VoxForge lexicon.
This means I can quickly: check that my chosen words are in the lexicon; update my table of words with the appropriate phonemes; print out a .voca file; check which phonemes my word list does not contain, or where my current word mix is weak in phoneme coverage; isolate words from the lexicon that offer the needed phonemes; etc.
With a bit more work it could come up with suggested sentences, however meaningless, that still serve as complementary prompts.
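For anyone without Access, the same checks can be roughed out in a few lines of Python. This is only a sketch under assumptions: the tiny LEXICON below, its phoneme strings, and all function names are hypothetical and purely illustrative; a real run would load the word and phoneme lists from the VoxForge lexicon file instead.

```python
from typing import Dict, List, Set

# Hypothetical miniature lexicon (word -> phonemes). Entries are illustrative
# only and are not copied from the VoxForge lexicon.
LEXICON: Dict[str, List[str]] = {
    "SHOW": ["sh", "ow"],
    "LIST": ["l", "ih", "s", "t"],
    "FIND": ["f", "ay", "n", "d"],
    "ZERO": ["z", "ih", "r", "ow"],
    "JOIN": ["jh", "oy", "n"],
}


def words_not_in_lexicon(words: List[str], lexicon: Dict[str, List[str]]) -> List[str]:
    """Return the grammar words that have no entry in the lexicon."""
    return [w for w in words if w.upper() not in lexicon]


def covered_phonemes(words: List[str], lexicon: Dict[str, List[str]]) -> Set[str]:
    """Collect every phoneme used by the grammar words found in the lexicon."""
    covered: Set[str] = set()
    for w in words:
        covered.update(lexicon.get(w.upper(), []))
    return covered


def suggest_words(missing: Set[str], lexicon: Dict[str, List[str]]) -> List[str]:
    """Rank lexicon words by how many of the missing phonemes they supply."""
    gains = {w: len(missing & set(phones)) for w, phones in lexicon.items()}
    useful = [w for w, gain in gains.items() if gain > 0]
    return sorted(useful, key=lambda w: (-gains[w], w))


def voca_block(category: str, words: List[str], lexicon: Dict[str, List[str]]) -> str:
    """Format one category in Julius .voca style: '% NAME', then 'WORD phonemes'."""
    lines = [f"% {category}"]
    for w in words:
        key = w.upper()
        if key in lexicon:  # words missing from the lexicon are skipped
            lines.append(f"{key}\t{' '.join(lexicon[key])}")
    return "\n".join(lines)


if __name__ == "__main__":
    grammar_words = ["show", "list", "find", "query"]
    phoneset = {p for phones in LEXICON.values() for p in phones}
    uncovered = phoneset - covered_phonemes(grammar_words, LEXICON)
    print("Not in lexicon:", words_not_in_lexicon(grammar_words, LEXICON))
    print("Uncovered phonemes:", sorted(uncovered))
    print("Suggested words:", suggest_words(uncovered, LEXICON))
    print(voca_block("ACTION", grammar_words, LEXICON))
```

Here the "full" phoneset is simply taken to be every phoneme that appears anywhere in the lexicon, which is an assumption; a fixed phone list for the acoustic model could be substituted.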
--- (Edited on 3/10/2008 12:15 pm [GMT-0500] by colbec) ---
Well, actually the modern state of the art in grammar design includes many more things. Current command and control is more a matter of keyword spotting, with a low-probability phone loop for OOV (out-of-vocabulary) words, than of strict grammar selection. I wonder whether phoneset selection is really efficient here: the user can say anything, and you only need to catch the right phrase.
The Google Tech Talk at
http://video.google.com/url?docid=-1475844291453002082&esrc=rss_uds&ev=v&len=3686&q=Google%2B%22Google%2BTech%2BTalks%22%2Bduration%3Along&srcurl=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3D64S_b7An3p4&vidurl=http%3A%2F%2Fvideo.google.com%2Fvideoplay%3Fdocid%3D-1475844291453002082&usg=AL29H21f3BG33repUbe2PGbA_sDsxH0XZA
gives a brief introduction to this.
As for generating prompts from the grammar, HTK has a tool for that:
HSGen -l -n 200 wdnet dict
This generates 200 numbered random sentences from the word network wdnet, which should do everything you need.
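To illustrate the idea behind random prompt generation, here is a minimal Python sketch. It is not how HSGen is implemented: HSGen walks an HTK word network, whereas this toy version just picks a rule and then a word per category. The RULES and VOCAB tables and the function names are hypothetical.

```python
import random
from typing import Dict, List

# Hypothetical toy grammar: each rule is a sequence of word categories.
RULES: List[List[str]] = [
    ["ACTION", "OBJECT"],
    ["ACTION", "OBJECT", "FILTER"],
]

# Hypothetical vocabulary for each category.
VOCAB: Dict[str, List[str]] = {
    "ACTION": ["SHOW", "FIND"],
    "OBJECT": ["ORDERS", "CUSTOMERS"],
    "FILTER": ["TODAY", "YESTERDAY"],
}


def random_sentence(rng: random.Random) -> str:
    """Pick one rule at random, then one word for each category in it."""
    rule = rng.choice(RULES)
    return " ".join(rng.choice(VOCAB[category]) for category in rule)


def generate_prompts(n: int, seed: int = 0) -> List[str]:
    """Return n numbered random sentences (numbering loosely mimics HSGen -l)."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [f"{i}. {random_sentence(rng)}" for i in range(1, n + 1)]


if __name__ == "__main__":
    for prompt in generate_prompts(5):
        print(prompt)
```

Because the choice is uniform at each step, sentences from short rules and small categories will repeat often; that is fine for a sketch but worth keeping in mind if balanced phoneme coverage is the goal.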
--- (Edited on 3/10/2008 12:47 pm [GMT-0500] by nsh) ---
Thanks for the thoughts. I don't have a net connection that can handle video, so I cannot benefit from that.
I suppose it is still arguable that efficient grammar design and phoneset selection help to catch the right phrase earlier than otherwise.
--- (Edited on 3/10/2008 1:41 pm [GMT-0500] by colbec) ---