VoxForge
Hi, when I ran Julian it gave me the following error... could anyone tell me what the problem is? Thank you.
include config: julian.jconf
###### check configurations
###### initialize input device
fragment size = 1024 bytes (10 msec)
AD-in thread created
###### build up system
Reading in HMM definition...(ascii)...limit check passed
defined HMMs: 50
logical names: 506 in HMMList
base phones: 44 used in logical
done
Making pseudo bi/mono-phone for IW-triphone...369 added as logical...done
reading [sample.dfa] and [sample.dict]...
Reading in dictionary...
line 18: triphone "*-z+ir" or biphone "z+ir" not found
line 18: triphone "z-ir+ow" not found
line 18: triphone "ir-ow+*" or biphone "ir-ow" not found
> 5 [ZERO] z ir ow
////// Missing phones:
*-z+ir or biphone z+ir
ir-ow+* or biphone ir-ow
z-ir+ow
//////////////////////
error in reading sample.dict: 1 words failed out of 18 words
ERROR: failed to read dictionary, terminated
Terminated
>error in reading sample.dict: 1 words failed out of 18 words
Add the word zero to your pronunciation dictionary, and re-run the script.
Hi Ken,
Zero is already there in the pronunciation dictionary. There is still the same problem.
>zero is already there in the pronunciation dictionary
Add a prompt in Step 2 of the Howto with the word zero, e.g.:
*/sample32 ZERO ZERO ZERO ZERO ZERO ZERO
record the new prompt in Step 3, and re-run the script in Step 4.
Thanks for your response.
Just to clarify what I'm trying to do: I am not interested in the words spoken, just the phonemes present in an audio file and their durations/timing information, i.e. the start and end of each phoneme.
Hi Ken,
Please could you give me some pointers on how to find the timing information of the phonemes, i.e. the start and end of each phoneme? I have to generate .lab files with the phone segmentation and timing for each recorded speech sample.
Thank you very much.
>Please could you give me some pointers on how to find the
>timing information of the phonemes.
If you are looking for something like this: aligned.out, then check out how to do forced alignment on this page: Automated Audio Segmentation Using Forced Alignment (Draft)
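Once you have an alignment output, getting .lab files is mostly a matter of converting the per-phone segments into start/end times. Below is a rough, untested Python sketch (not part of the Howto); it assumes each line of the alignment listing looks like "<start_frame> <end_frame> <phone>" with 10 msec frames, and writes HTK-style .lab times in 100 ns units. The file names and input format are just placeholders, so adjust the parsing and the units to whatever your aligner actually emits.

# align2lab.py - rough sketch: convert a phone alignment listing into an
# HTK-style .lab file.
# Assumed input format per line: <start_frame> <end_frame> <phone> [score...]
import sys

FRAME_SEC = 0.01        # assuming 10 msec frames
HTK_UNITS = 10000000    # .lab times are conventionally in 100 ns units

def convert(align_path, lab_path):
    with open(align_path) as src, open(lab_path, "w") as lab:
        for line in src:
            parts = line.split()
            if len(parts) < 3:
                continue                      # skip blank lines
            try:
                start_f, end_f = int(parts[0]), int(parts[1])
            except ValueError:
                continue                      # skip header lines
            phone = parts[2]
            start = int(round(start_f * FRAME_SEC * HTK_UNITS))
            # the +1 assumes inclusive end frames; drop it if your aligner
            # reports exclusive end frames instead
            end = int(round((end_f + 1) * FRAME_SEC * HTK_UNITS))
            lab.write("%d %d %s\n" % (start, end, phone))

if __name__ == "__main__":
    # e.g.: python align2lab.py aligned.out sample1.lab
    convert(sys.argv[1], sys.argv[2])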
Hi Ken,
Thank you very much for the quick response. Yes, I am trying to get a result like that. But the thing is, I am trying to do free alignment, not forced, so do you have any suggestions about that?
Thank you
>free alignment
How is this different from forced alignment?
Hi Ken,
In forced alignment the speech recognition system is given a priori information about the words and phonemes that are uttered. The system therefore looks for *all* phonemes in the words, even when speech is fast and sloppy and phonemes are dropped or mispronounced. Furthermore, in forced alignment the system also looks for, or forces, pauses between words even when the pauses do not exist. (No one really pauses between words.)
In free alignment, however, the system is not given a priori information about what words are uttered, and it is free to search for whatever phonemes are actually uttered. As such, pauses are only found where they truly exist.
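One way I was thinking of approximating this is phoneme recognition with a phone loop: instead of a word grammar, give the decoder a grammar in which every monophone of the acoustic model is its own "word", so it is free to output any phone sequence, and then take the phone timings of whatever it recognises. A very rough, untested sketch in Julian's .grammar/.voca format (the category and file names are made up, and the PHONE list would have to cover all 44 base phones):

phoneloop.grammar:
S    : NS_B LOOP NS_E
LOOP : PHONE
LOOP : LOOP PHONE

phoneloop.voca:
% NS_B
<s>    sil
% NS_E
</s>   sil
% PHONE
aa    aa
ae    ae
(... one entry per monophone in the acoustic model ...)

Would something along these lines, compiled with mkdfa.pl and run with Julian's phoneme alignment output (-palign, if I understand the options correctly), give usable phone timings?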
Thank you.