VoxForge
Hello,
I am using HResults to find out how well my HTK recogniser works. I am interested in getting the basic results for %WER, %Substitutions (S), %Deletions (D) and %Insertions (I), but also a detailed report of these S, D and I errors (e.g. the word 'he' has been substituted by the word 'she' 17 times, etc.).
I have read the HTK Book section on the HResults options, but I cannot find a way to do it. Is it possible? I am using a 65K vocabulary (the confusion matrix does not work well at that size).
Thank you in advance
>I am interested in getting base results concerning %WER,
>%Substitutions(S), %Deletions(D) and %Insertions(I),
Sorry, I don't understand what you are asking. Please rephrase your question.
Ok, I am sorry.
I was trying to say that I use HResults to get the word accuracy based on dynamic programming, as the HTK Book explains. I am interested in knowing the percentage of Deletions (D), Substitutions (S) and Insertions (I), which I get directly by typing:
HResults -n -A -T 1 -I original_mlf_files recout_files
Furthermore, I would like to get a summary indicating which words have been deleted, inserted and substituted, and how many times for each word (and, for substitutions and insertions, which words were actually recognized; e.g. the correct word was 'he' but the recognized word was 'she', and this happened 25 times out of 75 total substitutions).
I suppose that HTK has this information, since it is needed to calculate the total number of S, D and I words, but is there any way of writing it to a file?
I read the HResults options in the HTK Book, and at first I thought that the confusion matrix would help me, but as I use a 65K vocabulary, the matrix is too large to compute all the cross results.
Thank you for your help
> I would like getting a summarize indicating which words have been deleted
>and which words inserted and replaced, how many times for each word
>considered
Not sure if this can be done in HTK... best to ask on the HTK email list.
You could probably just write a script to do what you want, comparing the original_mlf_files to the recout_files.
Ok,
I have already thought about writing a script for it, but it is not easy because the alignment requires dynamic programming.
I really think there must be an easy way of getting this kind of result from HTK, because HResults needs it in order to calculate the number of insertions, deletions and substitutions.
I am waiting for a response from the HTK email list.
Thank you for your help and time
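For reference, the script suggested earlier in this thread can be fairly short. The following is only a minimal sketch in Python, not what HResults does internally: it assumes the reference and recognized transcriptions have already been read from the MLF files into lists of words (that parsing is not shown), and it uses a plain unit-cost edit distance, so the alignment may differ from the one HResults picks when several alignments are equally good. The names `align` and `tally` are made up for this example. Because the counts are kept in dictionaries, only the word pairs that actually occur are stored, which avoids the full matrix for a 65K vocabulary.

```python
from collections import Counter


def align(ref, hyp):
    """Align two word lists with unit-cost dynamic programming.

    Returns a list of (op, ref_word, hyp_word) tuples, where op is
    "C" (correct), "S" (substitution), "D" (deletion) or "I" (insertion).
    """
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover the operations.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            op = "C" if ref[i - 1] == hyp[j - 1] else "S"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("D", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("I", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def tally(pairs):
    """Count per-word errors over an iterable of (ref_words, hyp_words)."""
    subs, dels, ins = Counter(), Counter(), Counter()
    for ref, hyp in pairs:
        for op, r, h in align(ref, hyp):
            if op == "S":
                subs[(r, h)] += 1   # reference word r recognized as h
            elif op == "D":
                dels[r] += 1        # reference word r missing from output
            elif op == "I":
                ins[h] += 1         # extra word h in the output
    return subs, dels, ins


if __name__ == "__main__":
    # Toy data standing in for the contents of the two MLF files.
    pairs = [
        ("he is here".split(), "she is here".split()),
        ("the cat sat".split(), "the sat".split()),
        ("go home".split(), "go back home".split()),
    ]
    subs, dels, ins = tally(pairs)
    for (r, h), count in subs.most_common():
        print(f"S: '{r}' -> '{h}' x{count}")
    for w, count in dels.most_common():
        print(f"D: '{w}' x{count}")
    for w, count in ins.most_common():
        print(f"I: '{w}' x{count}")
```

The totals from such a script should be checked against the S, D and I figures that HResults reports before relying on the per-word breakdown.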