Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/5/2008 8:23 am
Views: 216
Rating: 7

Hi Dano,

>And I created a .po file (currently under review) on Launchpad (launchpad.net/voxforge). I will also set up the .po file for

>translation of the site and prompts if you want to.

That would be awesome. I'm still not 100% clear on how a Java .po file works (or how collecting translations on Launchpad works...), but the ability to add new languages without tinkering with code seems like a good thing to me!

Do you need access to the Subversion repository?

Ken

--- (Edited on 9/5/2008 9:23 am [GMT-0400] by kmaclean) ---

Re: Acoustic model 0.1.2
User: dano
Date: 9/6/2008 6:36 am
Views: 68
Rating: 7

Maybe this link is helpful?

http://www.gnu.org/software/autoconf/manual/gettext/Java.html

Unfortunately I don't have much Java experience, but this doesn't seem very difficult to me.
 
Daniël

--- (Edited on 06-09-2008 1:36 pm [GMT+0200] by dano) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/9/2008 11:35 am
Views: 69
Rating: 6

Hi Daniël,

 

>Maybe this link is helpful?

Yes, thanks,

I am glad to see that GNU gettext po files are implemented using Sun's own Java Internationalization mechanism.

>but this seems not very difficult to me

Well... as they say: "the devil is in the details."

Ken

--- (Edited on 9/9/2008 12:35 pm [GMT-0400] by kmaclean) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/2/2008 12:40 pm
Views: 83
Rating: 10

Hi Dano,

>http://www.dev.voxforge.org/projects/Main/ticket/366 is about

>other languages :)

oops...maybe I should read the thing... :)

The one I was thinking of is: ticket #376 - Nightly Build Acoustic Model Performance Decrease.

 

Ken

--- (Edited on 9/2/2008 1:40 pm [GMT-0400] by kmaclean) ---

Re: Acoustic model 0.1.2
User: nsh
Date: 9/6/2008 6:06 am
Views: 218
Rating: 8

Well, it's important to have clean data and quantitative tests; without them it's impossible to move forward.

Prompted by this discussion, I started training a Sphinx model. I suppose it will take a week on my machine, but we'll probably move training to the cluster.

I have already found the following problems in the prompts:

corno1979-10102006
kylegoetz-10122006
corno1979-10102006-NR - bad PROMPTS

mfread* - no PROMPTS, just prompts.txt

douglaid-20080205 vf-01 instead of vf-1

many PROMPTS files have ../../../Audio/MFCC/XXkHz_YYbit/MFCC_0_D/ inside

douglaid-20080203 - incorrect prompt line

mojomove411-20071102-poe/wav/iaf0007 KILTARTAN\342\200\231S - bad word
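
Two of the problems above (the MFCC path prefix inside PROMPTS and the octal-escaped word) look mechanical enough to normalise with a script. The following is only a minimal sketch: the function names are hypothetical, the prefix pattern is generalised from the single example path above, and replacing the decoded curly apostrophe with a plain one is an assumption about what the dictionary expects.

```python
import re

# Assumed pattern, generalised from "../../../Audio/MFCC/XXkHz_YYbit/MFCC_0_D/".
MFCC_PREFIX = re.compile(r"^(?:\.\./)+Audio/MFCC/[^/]+/MFCC_0_D/")
# Literal backslash followed by three octal digits that fit in one byte.
OCTAL_ESCAPE = re.compile(r"\\([0-3][0-7]{2})")


def decode_octal_escapes(text):
    """Turn literal sequences such as \\342\\200\\231 back into UTF-8 text."""
    data = bytearray()
    pos = 0
    for match in OCTAL_ESCAPE.finditer(text):
        data += text[pos:match.start()].encode("utf-8")
        data.append(int(match.group(1), 8))
        pos = match.end()
    data += text[pos:].encode("utf-8")
    return data.decode("utf-8", errors="replace")


def clean_prompt_line(line):
    """Strip the path prefix from the utterance id and repair the words."""
    utt_id, _, words = line.strip().partition(" ")
    utt_id = MFCC_PREFIX.sub("", utt_id)
    # Assumption: a typographic apostrophe should become a plain ASCII one.
    words = decode_octal_escapes(words).replace("\u2019", "'")
    return f"{utt_id} {words}"


if __name__ == "__main__":
    raw = "../../../Audio/MFCC/8kHz_16bit/MFCC_0_D/iaf0007 KILTARTAN\\342\\200\\231S"
    print(clean_prompt_line(raw))
```

The other problems in the list (missing PROMPTS files, wrong utterance ids, incorrect prompt lines) still need a manual look.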

I also compiled a list of problematic utterances for which alignment failed; it would be nice to review them:

douglaid-20080219/wav/vf11-07,
douglaid-20080219/wav/vf11-08,
douglaid-20080219/wav/vf11-11,
knotyouraveragejo-20080428-adv/wav/adv0231,
G-20080425-itf/wav/b0002,
xaviergonz-20080419-uje/wav/a0398,
xaviergonz-20080419-uje/wav/a0404,
ductapeguy-20070308b/wav/bab.0023,
peterwhy-20080503-win/wav/win0151,
chocoholic-20070524/wav/eti0091,
chocoholic-20070524/wav/eti0237,
anonymous-20080204-hnl/wav/ar-24,
anonymous-20080716-sfu/wav/a0340,
knotyouraveragejo-20080502-adv/wav/adv0280,
anonymous-20080630-lhi/wav/a0285,
gilrim-20080120-vgs/wav/b0415,
rjmunro-20080517-win/wav/a0236,
Toyo-20080229-ogz.zip/wav/a0104,
Toyo-20080229-ogz.zip/wav/a0105,
Toyo-20080229-ogz.zip/wav/a0106,
Toyo-20080229-ogz.zip/wav/a0108,
Toyo-20080229-ogz.zip/wav/a0111,
Toyo-20080229-ogz.zip/wav/a0112,
mjmm-20080526-hca/wav/b0074,
mjmm-20080526-hca/wav/b0075,
mjmm-20080526-hca/wav/b0076,
mjmm-20080526-hca/wav/b0077,
mjmm-20080526-hca/wav/b0078,
mjmm-20080526-hca/wav/b0079,
mjmm-20080526-hca/wav/b0080,
mjmm-20080526-hca/wav/b0081,
mjmm-20080526-hca/wav/b0082,
knotyouraveragejo-20070621-sci/wav/sci0150,
nestea247-20080301-sbn/wav/a0310,
corno1979-10102006-NR/wav/cc011,
corno1979-10102006-NR/wav/cc012,
corno1979-10102006-NR/wav/cc016,
corno1979-10102006-NR/wav/cc018,
corno1979-10102006-NR/wav/cc026,
corno1979-10102006-NR/wav/cc033,
corno1979-10102006-NR/wav/cc036,
corno1979-10102006-NR/wav/cc039,
Mark_Reynolds-20070531-cc/wav/cc-27,
cebidae-20080522-nsi/wav/b0385,
gilrim-20080120-ohc/wav/a0495,
gilrim-20080120-ohc/wav/a0500,
xenobyte72-20080530-pgo/wav/b0131,
kayray-20070611-ele/wav/ele0116,
chocoholic-20070612-eti33/wav/eti0278,
bloomtom-20080612-pfg/wav/a0401,
KnitGirl-20071113-dil/wav/b0274,
gilrim-20080120-uxi/wav/a0093,
gilrim-20080120-uxi/wav/a0094,
gilrim-20080120-uxi/wav/a0095,
gilrim-20080120-uxi/wav/a0096,
gilrim-20080120-uxi/wav/a0097,
robertburrelldonkin-20070918-vf16/wav/vf16-22,
cebidae-20080522-npq/wav/a0264,
cebidae-20080522-npq/wav/a0265,
cebidae-20080522-npq/wav/a0267,
Thomas-20080507-iya/wav/a0187,
vince-20071118-tez/wav/b0297,
gilrim-20080120-rzu/wav/rp-10,
vikramjb-20080416-cls/wav/a0398,
vikramjb-20080416-cls/wav/a0403,
vikramjb-20080416-cls/wav/a0404,
vikramjb-20080416-cls/wav/a0405,
vikramjb-20080416-cls/wav/a0406,
guilherme-20080123-pfh/wav/b0150,
knotyouraveragejo-20070620-sci/wav/sci0135,
anonymous-20080425-ojw/wav/b0363,
russellfeeed-20080211-upk/wav/b0025,
russellfeeed-20080211-upk/wav/b0026,
russellfeeed-20080211-upk/wav/b0027,
russellfeeed-20080211-upk/wav/b0028,
russellfeeed-20080211-upk/wav/b0031,
russellfeeed-20080211-upk/wav/b0033,
russellfeeed-20080211-upk/wav/b0034,
kayray-20070527-per07/wav/per0007,
kayray-20070527-per07/wav/per0014,
kayray-20070527-per07/wav/per0057,
kayray-20070527-per07/wav/per0071,
kayray-20070527-per07/wav/per0120,
kayray-20070527-per07/wav/per0141,
kayray-20070527-per07/wav/per0179,
kayray-20070527-per07/wav/per0231,
kayray-20070527-per07/wav/per0319,
kayray-20070527-per07/wav/per0335,
CptOatmeal-20080721-vnh/wav/a0426,
Joel-20080716-qoz/wav/b0074,
Joel-20080716-qoz/wav/b0075,
Joel-20080716-qoz/wav/b0076,
Joel-20080716-qoz/wav/b0077,
Joel-20080716-qoz/wav/b0078,
Joel-20080716-qoz/wav/b0080,
Joel-20080716-qoz/wav/b0081,
Joel-20080716-qoz/wav/b0082,
Joel-20080716-qoz/wav/b0083,
kayray-20070425-per04/wav/per0041,
kayray-20070425-per04/wav/per0073,
kayray-20070425-per04/wav/per0100,
kayray-20070425-per04/wav/per0105,
bloomtom-20080612-vya/wav/rb-31,
GrahamPhillips-20071111-oxp/wav/a0115,
GrahamPhillips-20071111-oxp/wav/a0117,
anonymous-20071127-rln/wav/a0575,
anonymous-20080318-eaq/wav/b0073,
jaiger-20061231-vf7/wav/vf7-25,
starlite-20070614-fur2/wav/fur0136
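
To decide where to start reviewing, it may help to count failures per submission, since a session with many failed utterances probably has a session-level problem (noise, shifted prompts) rather than a single bad recording. A rough sketch, assuming the list is pasted in exactly the "session/wav/utterance," form used above; the function name is made up for illustration:

```python
from collections import Counter


def failures_per_session(listing):
    """Count failed utterances per session in a comma-separated listing."""
    entries = [e.strip() for e in listing.replace("\n", ",").split(",")]
    # Everything before "/wav/" is treated as the session (submission) name.
    return Counter(e.split("/wav/")[0] for e in entries if e)


if __name__ == "__main__":
    sample = """
    douglaid-20080219/wav/vf11-07,
    douglaid-20080219/wav/vf11-08,
    douglaid-20080219/wav/vf11-11,
    G-20080425-itf/wav/b0002,
    """
    for session, count in failures_per_session(sample).most_common():
        print(f"{count:3d}  {session}")
```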

 

--- (Edited on 9/6/2008 6:07 am [GMT-0500] by nsh) ---

Re: Acoustic model 0.1.2
User: Visitor
Date: 9/6/2008 7:28 am
Views: 78
Rating: 9

I don't have a very fast Internet connection, so downloading takes a long time :( so I only looked at some of the recordings.

 

douglaid-20080219:

incorrect prompt lines (prompt 5 is skipped)

5 = 6

6 = 7

until douglaid-20080219/mfc/vf11-16 THE ADDED WEIGHT HAD A VELOCITY OF FIFTEEN MILES PER HOUR (15 and 16 are equal)

 

G-20080425-itf/wav/b0002 a little tap in the beginning

 

xaviergonz-20080419-uje: a0398 seems good; the recording of a0404 begins too late (the P of PERRAULT is not recorded).

 

ductapeguy-20070308b/wav/bab.0023 seems good.

 

peterwhy-20080503-win/mfc/win0151 seems good, but I think it contains two phrases, so he pauses for a while after LUNCH.

 

(peterwhy-20080503-win/mfc/win0150 NOR YOU EITHER IF YOU'VE GOT ANY SENSE AT ALL DON'T EVER REFER TO IT AGAIN PLEASE
peterwhy-20080503-win/mfc/win0151 NOW THEN HERE'S OUR BACKWATER AT LAST WHERE WE'RE GOING TO LUNCH LEAVING THE MAIN STREAM
peterwhy-20080503-win/mfc/win0152 THEY NOW PASSED INTO WHAT SEEMED AT FIRST SIGHT LIKE A LITTLE LAND LOCKED LAKE)

 

anonymous-20080204-hnl (sounds like breathing in during the first part)

 

anonymous-20080716 (little tap in sound)

 

anonymous-20080630-lhi (blows into the microphone)

--- (Edited on 9/6/2008 7:28 am [GMT-0500] by Visitor) ---

Re: Acoustic model 0.1.2
User: dano
Date: 9/6/2008 9:08 am
Views: 878
Rating: 8

It was me :)

douglaid-20080219 is very serious, as prompts 5 through 15 are all wrong.

--- (Edited on 06-09-2008 4:08 pm [GMT+0200] by dano) ---

Some additional files:

anonymous-20080630-lhi/wav/a0285 blows into the microphone

gilrim-20080120-vgs (all) very noisy, but comprehensible

rjmunro-20080517-win/wav/a0236 big tap

Toyo-20080229-ogz.zip very bad: noisy, and the speaker cannot speak English

mjmm-20080526-hca VERY noisy

nestea247-20080301-sbn begins with tap

corno1979-10102006-NR seems good, but aren't the prompts required to be in capitals rather than normal sentence case? (I don't know, but the other prompts were.)

Mark_Reynolds-20070531-cc/mfc/cc-27 AND LAID HER ON HER RIGHT SIDE THEN SARAH CONFIRMED THE VET'S DIAGNOSIS instead of

cc-27 AND LAID HER ON HER RIGHT SIDE THEN SARAH CONFIRMED THE VET'S DIAGNOSIS? (all prompts in this file)

cebidae-20080522-ns: same issue as above, but the speaker also says 'that' instead of 'last', and the last words are not spoken clearly.

--- (Edited on 06-09-2008 10:43 pm [GMT+0200] by dano) ---

--- (Edited on 06-09-2008 11:10 pm [GMT+0200] by dano) ---

Re: Acoustic model 0.1.2
User: nsh
Date: 9/6/2008 2:53 pm
Views: 3244
Rating: 8

Thanks Dano, indeed there is a high probability that the listed files are broken. The question is what we should do with them: remove them, add them as fillers, or something else?

Training went faster than I expected and I already have a model. You can download the Sphinx VoxForge model, with setup scripts and logs, here:

http://www.mediafire.com/?jxy1bkznozb

At least now we have an estimate of the model's accuracy. On the 1/10 test set, with a custom trigram LM trained on the test prompts, it gives the following results:

 TOTAL Words: 28112 Correct: 25767 Errors: 3158
TOTAL Percent correct = 91.66% Error = 11.23% Accuracy = 88.77%
TOTAL Insertions: 813 Deletions: 415 Substitutions: 1930
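
For anyone wondering how the three percentages relate to the raw counts: they follow from the usual word-level scoring definitions, where correct words are the reference words that were neither deleted nor substituted, and accuracy additionally penalises insertions. The sketch below just redoes that arithmetic with the counts from the report; the function name is illustrative and not part of any Sphinx tool.

```python
def score(total_words, insertions, deletions, substitutions):
    """Recompute word-level scoring figures from raw alignment counts."""
    errors = insertions + deletions + substitutions
    correct = total_words - deletions - substitutions
    return {
        "correct": correct,
        "errors": errors,
        "percent_correct": 100.0 * correct / total_words,
        "error_rate": 100.0 * errors / total_words,
        # Accuracy = 100% minus the error rate, so insertions count against it.
        "accuracy": 100.0 * (correct - insertions) / total_words,
    }


if __name__ == "__main__":
    result = score(total_words=28112, insertions=813,
                   deletions=415, substitutions=1930)
    print(result["correct"], result["errors"])
    print(f"{result['percent_correct']:.2f} "
          f"{result['error_rate']:.2f} {result['accuracy']:.2f}")
```

Note that percent correct and error rate do not sum to 100% because insertions add errors without removing correct words.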

Not bad, but I suppose we can raise the accuracy to 97% if we try to optimize training.

Here is another list of suspicious prompts:

douglaid-20080219/wav/vf11-07,
douglaid-20080219/wav/vf11-08,
knotyouraveragejo-20080426-adv/wav/adv0190,
knotyouraveragejo-20080426-adv/wav/adv0308,
kayray-20070611-leo/wav/leo0210,
knotyouraveragejo-20080502-adv/wav/adv0280,
Toyo-20080229-ogz.zip/wav/a0111,
mjmm-20080526-hca/wav/b0074,
mjmm-20080526-hca/wav/b0075,
mjmm-20080526-hca/wav/b0076,
mjmm-20080526-hca/wav/b0078,
mjmm-20080526-hca/wav/b0079,
mjmm-20080526-hca/wav/b0080,
mjmm-20080526-hca/wav/b0081,
mjmm-20080526-hca/wav/b0082,
leonMire-20080526-lev/wav/lev0063,
corno1979-10102006-NR/wav/cc020,
corno1979-10102006-NR/wav/cc029,
Mark_Reynolds-20070531-cc/wav/cc-27,
kayray-20070608-rhi/wav/rhi0094,
safi-20071118-swr/wav/b0216,
starlite-20070605-che/wav/che0142,
kayray-20070611-ele/wav/ele0262,
robertburrelldonkin-200709011-vf11/wav/vf11-26,
KnitGirl-20071113-dil/wav/b0274,
gilrim-20080120-uxi/wav/a0093,
gilrim-20080120-uxi/wav/a0096,
gilrim-20080120-uxi/wav/a0101,
ttm-20071024-poe/wav/js0002,
topherfangio-20080604-jvb/wav/a0105,
ductapeguy-20080423-ang/wav/sto0020,
tis-20080416-tou/wav/voy0155,
knotyouraveragejo-20080525-mt2/wav/mtn0261,
vikramjb-20080416-cls/wav/a0398,
vikramjb-20080416-cls/wav/a0399,
vikramjb-20080416-cls/wav/a0400,
vikramjb-20080416-cls/wav/a0402,
vikramjb-20080416-cls/wav/a0403,
vikramjb-20080416-cls/wav/a0404,
vikramjb-20080416-cls/wav/a0405,
vikramjb-20080416-cls/wav/a0406,
CptOatmeal-20080721-vnh/wav/a0426,
Joel-20080716-qoz/wav/b0074,
Joel-20080716-qoz/wav/b0075,
Joel-20080716-qoz/wav/b0076,
Joel-20080716-qoz/wav/b0077,
Joel-20080716-qoz/wav/b0078,
Joel-20080716-qoz/wav/b0080,
Joel-20080716-qoz/wav/b0081,
Joel-20080716-qoz/wav/b0082,
Joel-20080716-qoz/wav/b0083,
anonymous-20071127-rln/wav/a0575,
anonymous-20080318-eaq/wav/b0073,
anonymous-20080318-eaq/wav/b0078,
anonymous-20080318-eaq/wav/b0079,
jaiger-20061231-vf7/wav/vf7-25,

--- (Edited on 9/6/2008 2:53 pm [GMT-0500] by nsh) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/9/2008 11:00 am
Views: 86
Rating: 7

Hi nsh & Dano,

Good work guys! 

>The question is what should we do with them - remove, add as fillers,

>something else.

I will look at these (and any others you may have...) and either correct them (if it is just a section of audio that is causing problems) or move them to a "problem" directory in Subversion (and update the master prompts files), so we always have a list of the ones we removed.

thanks,

Ken

--- (Edited on 9/9/2008 12:00 pm [GMT-0400] by kmaclean) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/9/2008 11:07 am
Views: 117
Rating: 7

Hi nsh,

>Training went faster than I expected, I've got a model already, you can

>download sphinx voxforge model with setup scripts and logs here:

>http://www.mediafire.com/?jxy1bkznozb

Awesome!

I will add this to the downloads page.

thanks,

Ken

--- (Edited on 9/9/2008 12:07 pm [GMT-0400] by kmaclean) ---
