Audio and Prompts Discussions

Multiple pronunciations and Automated Audio Segmentation Using Forced Alignment
User: btownshend
Date: 11/16/2008 2:16 am
Views: 5986
Rating: 2

I was trying out forced alignment with HTK as described in the "Automated Audio Segmentation Using Forced Alignment" document. Everything worked great, except that I noticed the VoxForge dictionary has multiple pronunciations for many words, marked with a (2) suffix on the word. When I run this process, the dictionary created for the forced alignment uses only the first pronunciation of each word. Is that intended, or is there a mismatch between this format and the lexicon format that HTK expects?

Brent

PS: Thanks for that great tutorial!

--- (Edited on 11/16/2008 2:16 am [GMT-0600] by Visitor) ---

Re: Multiple pronunciations and Automated Audio Segmentation Using Forced Alignment
User: kmaclean
Date: 11/16/2008 8:21 am
Views: 118
Rating: 3

Hi Brent,

>When running this process, the dictionary created for doing the forced
>alignment uses only the first pronunciations. Is that intended, or is there a
>mismatch here in the lexicon format that HTK expects?

HTK's "forced alignment" matches the phoneme sounds it hears to the words in the pronunciation dictionary (kind of like a 'reverse lookup'). So it seems that your recordings best match the first entry for each of the words that have multiple pronunciations.
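
If it turns out that a "(2)"-suffixed headword is being treated as a separate word rather than as an alternative pronunciation, one thing to try is stripping the suffix so that every variant shares the same headword (HTK dictionaries list alternative pronunciations as repeated entries for the same word). The following is only a rough sketch in Python, not part of the tutorial: the function name, the assumed line layout ("headword, then the rest of the entry") and the sample phones are illustrative assumptions, so check them against your actual dictionary file.

```python
import re

# Assumption: variants are marked by a headword ending in "(2)", "(3)", ...
# as described in the question above.
_VARIANT_SUFFIX = re.compile(r"\(\d+\)$")


def strip_variant_suffixes(lines):
    """Return dictionary lines with a trailing "(N)" removed from each headword.

    Each non-empty line is assumed to look like "<headword> <rest of entry>".
    Everything after the headword is passed through unchanged.
    """
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)
        headword = _VARIANT_SUFFIX.sub("", parts[0])
        rest = parts[1] if len(parts) > 1 else ""
        result.append(f"{headword} {rest}".rstrip())
    return result


# Made-up sample entries, for illustration only.
example = [
    "READ r iy d",
    "READ(2) r eh d",
    "THE dh ax",
]
normalized = strip_variant_suffixes(example)
for entry in normalized:
    print(entry)
```

With the entries merged under one headword, the aligner can choose between the alternatives for each occurrence of the word instead of seeing only one of them.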

Ken

--- (Edited on 11/16/2008 9:21 am [GMT-0500] by kmaclean) ---

Re: Multiple pronunciations and Automated Audio Segmentation Using Forced Alignment
User: kmaclean
Date: 11/16/2008 8:48 am
Views: 2401
Rating: 3

You might also be interested in a Perl audio segmentation script that I started on a while ago: Audiobook.pm. The documentation is inline Perldoc.

I did not add it to the Automated Audio Segmentation Using Forced Alignment page because it still needs some refinements... but I have successfully used it to segment speech text.

Ken

--- (Edited on 11/16/2008 9:48 am [GMT-0500] by kmaclean) ---
