VoxForge
Hi Daniël and Ken,
I'm starting another thread for this:
So I imagine the next step is to transcribe the prompts into phonemes and triphones, right?
That's a huge amount of work, but I know of a GPL phonetizer for French. Is there any problem with using it to start the job?
Going from phonemes to triphones will be easy, if I have understood correctly what you want.
Which phonetic alphabet do you need?
Samuel,
Evening Daniël,
Okay, I get your point about the coverage issue.
So actually I've done a quick transcription of prompts with espeak.
The transcription is not that bad, but I need to do some manual modifications.
Btw, without those modifications, here are the stats for the first set of prompts I pushed to the Listen section:
39 phonemes generated by espeak, with counts ranging from 1 to 3290
6883 triphones, with counts ranging from 1 to 240
But I think that's not correct, because right now a pause is counted as a phoneme.
Do we have to consider pauses as a phoneme to build triphones or not?
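To make the question concrete, here is a minimal sketch of how such counts could be computed with and without pauses. The function name `coverage_stats`, the `_` pause symbol and the two sample utterances are assumptions for illustration only; they are not the actual prompts or the script used for the figures above.

```python
from collections import Counter

PAUSE = "_"  # assumed pause symbol in the transcription


def coverage_stats(utterances, keep_pauses=True):
    """Count phonemes and triphones over a list of phoneme sequences."""
    phoneme_counts = Counter()
    triphone_counts = Counter()
    for phones in utterances:
        if not keep_pauses:
            # Dropping pauses joins the neighbouring words, so triphones
            # that span the former pause position are counted instead.
            phones = [p for p in phones if p != PAUSE]
        phoneme_counts.update(phones)
        # Every window of three consecutive symbols is one triphone.
        triphone_counts.update(zip(phones, phones[1:], phones[2:]))
    return phoneme_counts, triphone_counts


# Made-up sample transcriptions, only to show the call.
utterances = [
    "b o~ Z u R _ s a v a".split(),
    "s a v a b j E~".split(),
]
for keep in (True, False):
    phonemes, triphones = coverage_stats(utterances, keep_pauses=keep)
    print("pauses kept:", keep,
          "| distinct phonemes:", len(phonemes),
          "| distinct triphones:", len(triphones))
```

Comparing the two runs shows how much of the triphone inventory comes only from treating the pause as a phoneme.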
Thanks,
Samuel,
Hi there,
So to be perfectly sure I've understood, let's take an example with a pause between two words:
sentence : Alors, quand tout semble perdu, un homme se dresse.
phonemes : a l O R _ k a~ t u s a~ b l p E R d y _ 9~ n2 O m s @ d R E s
triphones : a+l a-l+O l-O+R O-R+_ R-_+k _-k+a~ k-a~+t a~-t+u t-u+s u-s+a~ s-a~+b a~-b+l b-l+p l-p+E p-E+R E-R+d R-d+y d-y+_ y-_+9~ _-9~+n2 9~-n2+O n2-O+m O-m+s m-s+@ s-@+d @-d+R d-R+E R-E+s E-s
Is that good ?
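The expansion above can be sketched in a few lines of Python. This is only an illustration of the left-center+right notation used in the example, with the pause `_` treated as an ordinary symbol; the helper name `to_triphones` is made up, and this is not how the acoustic model tools build their triphones.

```python
def to_triphones(phones):
    """Expand a phoneme list into left-center+right context labels."""
    labels = []
    for i, center in enumerate(phones):
        label = center
        if i > 0:
            # Prepend the left context, except for the first phoneme.
            label = f"{phones[i - 1]}-{label}"
        if i < len(phones) - 1:
            # Append the right context, except for the last phoneme.
            label = f"{label}+{phones[i + 1]}"
        labels.append(label)
    return labels


phones = "a l O R _ k a~ t u s a~ b l p E R d y _ 9~ n2 O m s @ d R E s".split()
print(" ".join(to_triphones(phones)))
```

Each phoneme yields exactly one label, so the output has as many entries as the phoneme line, and the first and last entries carry only one side of context.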
Samuel,
Hi Samuel & Daniël,
>Do we have to consider pauses as a phoneme to build triphones or not?
I'm not sure I understand why you are trying to create your own triphones... Triphones are generated automatically from your corpus using the acoustic model creation tools - see Step 9 of the VoxForge tutorial.
What you need is a pronunciation dictionary (see LIUM Tools - you need to check licensing though), some French prompt recordings (the French Speech Submission app is live - so you can submit speech that way, or record your own...), and just follow the VoxForge tutorial to create your own monophone acoustic model (Step 1 to Step 8) for HTK/Julius.
In order to create an HTK/Julius triphone-based acoustic model, you'll need a tree.hed script. I've never created one. For more information on how to create a tree.hed file for a new language, see the following links:
See also the HTK manual.
Note that the Sphinx acoustic model creation process creates its own "questions" - see this post for more information: Creating Sphinx acoustic models.
Ken
Hi Ken,
If I've understood Daniël's point, it's only to quantify the *language range* covered by the prompts.
But I remember your words: "... it is more an art than a science ..."
For the acoustic model, I've already found a 63k-word GPL dictionary and a 300k-word "GPL-inspired" one (I'll contact them later to be sure ...)
But, right now, I'll wait a few days to see if everything is alright and then I'll create a wiki page for VoxForge on Ubuntu.fr
Samuel,
Hi Samuel,
>it's only to quantify the *language range* covered by the prompts.
OK, that makes sense, but don't get too hung up on it at this point; you still need many more speech contributions from many different people.
Ken
Hi Ken,
Alright, my plan is to write a short text to advertise VoxForge (maybe with a video demo on YouTube) and to contact a few influential bloggers who have access to Planet aggregators, offering them my text as the basis for a post.
But before trying to give it a push, I'll wait a few days to see if everything is working well.
Are you okay with that?
From what I've done this evening, I feel the need to write another set of prompts.
Samuel,
Hi Samuel,
You need a good working demo if you want a well-visited YouTube video. Try Gnome Voice Control 0.3, for example, if you want something good to show.
This is a good example: http://nl.youtube.com/watch?v=GCSgkUnlGGA, although the creator's spoken English is poor. And if you want to promote VoxForge, you have to make it clear where viewers should go.
Daniël