Speech Recognition Engines

Incorporating high-order N-grams for recognition with HTK
User: kometa_triatlon
Date: 11/20/2011 9:06 am
Views: 9773
Rating: 5

Hi everyone.

Sorry for the dumb question, but how can one use a 3-gram (built with LBuild, in ARPA/MIT-LL format) for recognition with HVite/HDecode?

I found the paper describing the HTK LVCSR system, which uses class-based interpolated 4-grams, but there is no recipe for how to do this.

As far as I know, HVite can only be supplied with a word lattice. I cannot build one with HBuild, because it works only with bigrams.

In the absence of a word lattice, HDecode is supplied with a language model (sic!); in this case it performs a full decoding.

However, the current version supports just bigram full decoding...

I read everything in the HTK book concerning language modeling and word networks, but found no clue how to incorporate the 3-grams and 4-grams I developed into speech recognition with HTK. I did not find anything in the HTK users mailing list archive either.

Thanks in advance.

--- (Edited on 11/20/2011 9:06 am [GMT-0600] by kometa_triatlon) ---
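
Before passing an ARPA-format model to a decoder, it can help to confirm which n-gram orders the file actually contains. The following is a minimal sketch, assuming the standard ARPA layout in which a `\data\` header lists one `ngram N=count` line per order. The function name `arpa_ngram_counts` and the toy header are made up for illustration and are not part of HTK.

```python
import re

def arpa_ngram_counts(lines):
    """Return {order: count} parsed from the header of an ARPA-format LM."""
    counts = {}
    in_header = False
    for raw in lines:
        line = raw.strip()
        if line == "\\data\\":
            in_header = True
            continue
        if not in_header:
            continue
        match = re.match(r"ngram\s+(\d+)\s*=\s*(\d+)$", line)
        if match:
            counts[int(match.group(1))] = int(match.group(2))
        elif line.startswith("\\"):
            # The first section marker (for example \1-grams:) ends the header.
            break
    return counts

# Toy header with invented counts, standing in for a real LM file.
toy_header = """\\data\\
ngram 1=4
ngram 2=7
ngram 3=5

\\1-grams:
""".splitlines()

counts = arpa_ngram_counts(toy_header)
max_order = max(counts)
print("orders present:", sorted(counts), "highest order:", max_order)
```

For a real model, the same function could be called on an open file object, since it only iterates over lines.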

Re: Incorporating high-order N-grams for recognition with HTK
User: TonyR
Date: 11/20/2011 9:31 am
Views: 95
Rating: 5

There's nothing to stop you writing code to convert a trigram into a lattice format, but unless you've got very low perplexity HVite will be too slow to be useful.

For HDecode TFM on page 267 says to use -w arpa.lm.

It is not correct to say HDecode "supports just bigram full decoding".

 

Tony

-- 
Dr Tony Robinson
Founder and owner: Cantab Research Ltd

 

--- (Edited on 20-November-2011 3:31 pm [GMT+0000] by TonyR) ---
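
The suggestion above (passing the ARPA model to HDecode with `-w`) can be sketched as a small command builder. Only `-w arpa.lm` comes from the reply itself. The other flags (`-H`, `-S`, `-i`) and the positional order (dictionary, then HMM list) are recalled from common HTK usage and should be checked against the HTK book for your version. All file names are hypothetical placeholders.

```python
import shlex

def hdecode_command(lm_path, hmm_defs, script_file, out_mlf, dictionary, hmm_list):
    """Build an HDecode argument list that loads an ARPA n-gram LM via -w.

    Every path here is a placeholder; flags other than -w are assumptions
    to be verified against the HTK book.
    """
    return [
        "HDecode",
        "-H", hmm_defs,      # HMM definitions (assumed flag)
        "-S", script_file,   # script file listing the test utterances (assumed flag)
        "-i", out_mlf,       # output transcriptions as an MLF (assumed flag)
        "-w", lm_path,       # ARPA-format language model, as stated in the reply
        dictionary,
        hmm_list,
    ]

cmd = hdecode_command(
    lm_path="arpa.lm",
    hmm_defs="hmmdefs",
    script_file="test.scp",
    out_mlf="recout.mlf",
    dictionary="dict",
    hmm_list="hmmlist",
)
print(shlex.join(cmd))
```

With HTK installed, the list could be handed to `subprocess.run(cmd, check=True)`. In practice a language model scale factor and pruning beams would normally be set as well, which this sketch leaves out.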

Re: Incorporating high-order N-grams for recognition with HTK
User: kometa_triatlon
Date: 11/20/2011 11:15 am
Views: 129
Rating: 7

Well, that is what I found in the HTK book v.3.4:

"The search space of the recognition process is defined by a model based network, produced from expanding a supplied language model or a word level lattice using the dictionary. In the absence of a word lattice, a language model must be supplied to perform a full decoding. The current version of HDecode only supports bigram full decoding." (page 248, HDecode reference).

Actually, I don't know how to convert an n-gram into a lattice...

And what is TFM?

 

Best,

Dima.

--- (Edited on 11/20/2011 11:15 am [GMT-0600] by kometa_triatlon) ---

Re: Incorporating high-order N-grams for recognition with HTK
User: TonyR
Date: 11/20/2011 11:53 am
Views: 69
Rating: 6

Dima,

Trust me - HDecode works with trigrams.   I have it running on a large number of cores right now using trigrams.

Make sure you have HTK 3.4.1.

Alternatively, run HDecode with bigrams, generate lattices, and rescore using higher order n-grams.  I do this for 5-grams for some of my work.

 

Tony

-- 
Dr Tony Robinson
Founder and Owner: Cantab Research Ltd

--- (Edited on 20-November-2011 5:53 pm [GMT+0000] by TonyR) ---
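
The two-pass idea above (decode with a bigram, then rescore with a higher-order n-gram) can be illustrated with a toy example. This sketch simplifies in two ways that should be kept in mind: it rescores an N-best list rather than a lattice, and it uses a fixed floor for unseen trigrams instead of proper backoff. In HTK itself, lattice rescoring is normally done with the HLRescore tool. All probabilities, acoustic scores, and the scale factor below are invented for illustration.

```python
# Toy trigram log10 probabilities (made-up numbers, not from a real model).
TRIGRAM_LOGPROB = {
    ("<s>", "recognize", "speech"): -0.3,
    ("recognize", "speech", "</s>"): -0.2,
    ("<s>", "wreck", "a"): -1.5,
    ("wreck", "a", "nice"): -0.9,
    ("a", "nice", "beach"): -0.7,
    ("nice", "beach", "</s>"): -0.4,
}
UNSEEN_LOGPROB = -4.0  # crude floor standing in for real backoff weights

def trigram_score(words):
    """Sum trigram log-probabilities over a sentence padded with <s> and </s>.

    The first word is only used as history, so its own bigram
    probability is ignored in this simplified version.
    """
    padded = ["<s>"] + list(words) + ["</s>"]
    total = 0.0
    for i in range(2, len(padded)):
        total += TRIGRAM_LOGPROB.get(tuple(padded[i - 2:i + 1]), UNSEEN_LOGPROB)
    return total

def rescore(nbest, lm_scale):
    """Re-rank (words, acoustic_score) pairs by acoustic + lm_scale * trigram score."""
    scored = [(acoustic + lm_scale * trigram_score(words), words)
              for words, acoustic in nbest]
    return sorted(scored, reverse=True)

# Hypothetical first-pass output: the acoustically better hypothesis comes first.
nbest = [
    (["wreck", "a", "nice", "beach"], -100.0),
    (["recognize", "speech"], -102.0),
]

ranked = rescore(nbest, lm_scale=10.0)
best_score, best_words = ranked[0]
print(" ".join(best_words))
```

Here the trigram score outweighs the small acoustic advantage of the first hypothesis, so the ranking flips after rescoring. A lattice-based rescoring applies the same combination of scores, but over all paths in the lattice rather than an explicit list.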

Re: Incorporating high-order N-grams for recognition with HTK
User: kometa_triatlon
Date: 11/20/2011 12:06 pm
Views: 902
Rating: 6

Thanks, I will try!

Hmm, I had not thought about rescoring; that is also an option. Thanks for the advice!

--- (Edited on 11/20/2011 12:06 pm [GMT-0600] by kometa_triatlon) ---

Re: Incorporating high-order N-grams for recognition with HTK
User: mariya celin
Date: 10/28/2014 4:40 am
Views: 299
Rating: 1

Hi.

Is HDecode for creating a network from any n-gram language model, or is it a recognition command like HVite?

 

--- (Edited on 10/28/2014 4:40 am [GMT-0500] by mariya celin) ---

Re: Incorporating high-order N-grams for recognition with HTK
User: Yasin Mhmd
Date: 3/2/2016 10:01 pm
Views: 363
Rating: 0

HDecode is like HVite: it is used for recognition (decoding).

To build an N-gram language model, you can check htkbook 203.

 

--- (Edited on 3/2/2016 10:01 pm [GMT-0600] by Visitor) ---

Re: Incorporating high-order N-grams for recognition with HTK
User: [email protected]
Date: 11/4/2017 10:43 am
Views: 1763
Rating: 0

Is there any way to use MLF files that contain coded text to generate an n-gram LM?

--- (Edited on 11/4/2017 10:43 am [GMT-0500] by ) ---
