VoxForge
That is the plan, but I want to get the processes and scripts down cold for English before tackling other languages. What other languages might you be interested in?
thanks,
Ken
Hi,
I'm interested in speech recognition. There are a lot of English-language speech recognition projects, and I want to adapt one of them to Lithuanian speech. This project interests me, so it would be exciting to consult with you.
In French! I'm not a programmer either, but if you need any help, I can contribute too, both with my own voice and by promoting contributions on French-speaking forums related to the Open Source movement, etc.
Perhaps I have good news for all the non-programmers out there. Setting up any new language actually requires quite a bit of work that can be done without any programming skills.
You can for instance figure out whether or not there exists a phonetic dictionary for your language. If so perhaps it's a good idea to start a new thread (called "language X") and post the link to the dictionary there.
If it doesn't exist, it unfortunately needs to be made!
What also needs to be made are prompt files containing short, non-copyrighted sentences such as those in the English prompt files. Preferably they should use modern spelling rules. It's also nice if they contain all the sounds of your language. You can even record some if you want and store them yourself, ready to be submitted when VoxForge is ready for "language X" (but make sure the quality is okay, because we don't want you to waste your time).
Finally, promotion is always good! If you have a blog, for instance, you can write a tiny bit about VoxForge and post a link (if you know how, use interesting keywords such as "open source", "speech recognition", etc. - in your language). Your blog/forum post might bring someone else to VoxForge to help you out with your language. You never know.
Robin
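For anyone who does want to script the "contains all the sounds of your language" check, here is a minimal sketch. It assumes a phonetic dictionary with one entry per line in the form "WORD PH1 PH2 ..." (whitespace-separated); your language's dictionary may well use a different layout. The function names load_dictionary and missing_phonemes are made up for illustration and are not part of any VoxForge tool.

```python
def load_dictionary(lines):
    """Parse lines of the assumed form 'WORD PH1 PH2 ...' into a dict."""
    lexicon = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            # Key on the upper-cased word; the rest of the line is its phonemes.
            lexicon[parts[0].upper()] = parts[1:]
    return lexicon


def missing_phonemes(prompts, lexicon):
    """Return the phonemes in the dictionary that no prompt sentence covers."""
    all_phonemes = {p for phones in lexicon.values() for p in phones}
    covered = set()
    for prompt in prompts:
        for word in prompt.split():
            # Strip simple punctuation; words not in the dictionary are skipped.
            key = word.strip(".,;:!?\"'()").upper()
            covered.update(lexicon.get(key, []))
    return all_phonemes - covered


if __name__ == "__main__":
    # Toy dictionary and prompt, purely illustrative.
    toy = load_dictionary(["CAT k ae t", "DOG d ao g", "SAT s ae t"])
    print(sorted(missing_phonemes(["The cat sat."], toy)))
```

Any phonemes it reports are sounds that your prompt sentences do not exercise yet, so you would add sentences containing words with those sounds.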
Well, there is a phonetic dictionary, I suppose, and a language model can be easily constructed. So you can start right now with recording transcribed speech. Any free text from gutenberg.org is acceptable. It should be split into sentences.
Once there is audio data, it's possible to train a model.
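Splitting a free text into sentences can be roughly sketched like this. It is a naive split on sentence-ending punctuation, so abbreviations such as "Mr." will be cut in the wrong place and the output should be checked by hand. The function name split_into_prompts and the max_words limit are assumptions for illustration, not VoxForge requirements.

```python
import re


def split_into_prompts(text, max_words=15):
    """Split raw text into short sentences suitable as recording prompts."""
    # Collapse line breaks and repeated spaces into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    # Drop empty pieces and sentences longer than max_words (assumed limit).
    return [s for s in sentences if s and len(s.split()) <= max_words]


if __name__ == "__main__":
    sample = "Hello there. How are\nyou today? Fine!"
    for prompt in split_into_prompts(sample):
        print(prompt)
```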
Hi ralfherzog,
Done ... I've added a new forum for submitting German Speech Files.
You should be able to find German translations of the GPL license on the fsf.org site - please include both German and English versions of it in your submission.
You might also want to talk to the folks at the Simon project (dialog manager that uses Julius). They are working on creating German Acoustic Models using HTK.
thanks,
Ken