Acoustic Model Discussions

HTK and Sphinx training tutorials
User: DavidGelbart
Date: 4/3/2008 3:12 pm

Today I came across a paper by Keith Vertanen, who is involved in the speech version of Dasher (Speech Dasher):

http://www.inference.phy.cam.ac.uk/kv227/papers/baseline_wsj_recipes.pdf

It is a detailed explanation of how he set up HTK and Sphinx training for the Wall Street Journal (WSJ) corpus, with URLs for downloading the training scripts. He wrote, "My goal is to provide practical advice and results to researchers who are thinking of using HTK or Sphinx for real-time recognition on dictation-like tasks". Although WSJ is a proprietary corpus, his work could still be useful as a source of examples. His scripts support monophones, word-internal triphones, and cross-word triphones. He also says, "Many of the acoustic models used in the experiments described later in this paper are available for download".

He also has some language model training scripts at http://www.inference.phy.cam.ac.uk/kv227/ 

--- (Edited on 4/3/2008 3:12 pm [GMT-0500] by DavidGelbart) ---

Re: HTK and Sphinx training tutorials
User: DavidGelbart
Date: 4/3/2008 3:24 pm
Oh, and the language models are available for download too.

--- (Edited on 4/3/2008 3:24 pm [GMT-0500] by DavidGelbart) ---
