140 hours for each language or for all together?

User: Sergey
Date: 5/2/2008 8:49 am

Hello,

Does audio submitted in any language count toward the 140-hour goal, or should it be 140 hours for each language?

--Thanks,

Sergey

Re: 140 hours for each language or for all together?

User: kmaclean
Date: 5/2/2008 11:23 am

Hi Sergey,

> Does submission of audio in any language count toward the 140 hour goal,
> or should it be 140 hours for each language?

The metrics page applies only to English.

I don't know exactly how much speech is required for a good command-and-control acoustic model. The acoustic models used by Sphinx were trained on 140 hours of 1996 and 1997 HUB4 training data, so I just used that figure as a target. I assume the same would apply to other languages.

Note: nsh has mentioned that it's not really the amount of speech that is important, but the quality (accurate transcriptions, low noise, ...).

Ken
