VoxForge
I'm using Julius and am hoping someone could recommend a program to aid in training.
Essentially I'd like to say a phrase into the mic ten times and have it log to a file what it heard rather than what it thinks it heard.
My thoughts are to develop my own voca file based on my dialect, atmosphere, and specific hardware.
For instance I say:
How are you
How are you
How are you
How are you
How are you
How are you
How are you
How are you
How are you
How are you
And it (hopefully) logs:
hh aw aa r y uw
hh aw aa r y uw
hh aw aa r y uw
hh aw aa r y uw
hh aw aa r y uw
hh aw aa r y uw
hh aw aa r y uw
hh aw aa r y uw
hh aw aa r y uw
hh aw aa r y uw
Thank you for taking the time to read. I hope someone has a suggestion...
--- (Edited on 11/9/2020 11:14 am [GMT-0600] by Tom J.) ---
If you look at the Julius recognition output when you run Julius, you will see that it outputs phonemes it recognizes:
...
### Recognition: 2nd pass (RL heuristic best-first)
...
sentence1: <s> PHONE STEVE </s>
wseq1: 0 2 4 1
phseq1: sil | f ow n | s t iy v | sil
cmscore1: 1.000 1.000 1.000 1.000
score1: -16547.482422
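If scrolling through the full output is the problem, the phseq lines can be pulled out with a small script. This is only a minimal sketch, assuming the output keeps the "phseqN:" layout shown above; the function name extract_phseqs and the drop_silence option are made up for illustration and are not part of Julius.

```python
import re

# Matches lines such as "phseq1: sil | f ow n | s t iy v | sil".
PHSEQ_RE = re.compile(r"^phseq\d+:\s*(.*)$")


def extract_phseqs(lines, drop_silence=True):
    """Return one phoneme string per phseqN line found in the given lines."""
    results = []
    for line in lines:
        match = PHSEQ_RE.match(line.strip())
        if not match:
            continue
        # Words are separated by "|"; phonemes within a word by spaces.
        phones = match.group(1).replace("|", " ").split()
        if drop_silence:
            phones = [p for p in phones if p != "sil"]
        results.append(" ".join(phones))
    return results


# Sample copied from the recognition output quoted above.
sample = """\
sentence1: <s> PHONE STEVE </s>
wseq1: 0 2 4 1
phseq1: sil | f ow n | s t iy v | sil
cmscore1: 1.000 1.000 1.000 1.000
score1: -16547.482422
""".splitlines()

print(extract_phseqs(sample))  # ['f ow n s t iy v']
```

The function accepts any iterable of lines, so it could also be fed sys.stdin when the Julius output is piped into the script, and the results appended to a log file.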
--- (Edited on 11/9/2020 4:10 pm [GMT-0500] by kmaclean) ---
Thank you for your response. I knew that was there but was hoping to do a bit more in-depth comparison without having to scroll through a couple hundred lines.
I've written a REALLY crude API in C++; if there's no alternative I guess I could modify it for this.
--- (Edited on 11/12/2020 8:06 am [GMT-0600] by Tom J.) ---
I did log phoneme strings; it was as simple as a phoneme-to-phoneme dictionary.
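For anyone wondering what a phoneme-to-phoneme dictionary might look like, here is a rough sketch that writes .voca text in which every "word" is a single phoneme pronounced as itself. The category name PHONE, the function name phoneme_voca, and the short phoneme list are assumptions for illustration; the actual file used here was not posted, and a matching .grammar file (not shown) would still be needed.

```python
def phoneme_voca(phonemes, category="PHONE"):
    """Build .voca text where each entry maps a phoneme to itself."""
    lines = [
        "% NS_B",       # sentence-start silence category
        "<s> sil",
        "",
        "% NS_E",       # sentence-end silence category
        "</s> sil",
        "",
        f"% {category}",  # hypothetical category holding one entry per phoneme
    ]
    for phone in phonemes:
        # Entry format: WORD followed by its phoneme sequence.
        lines.append(f"{phone} {phone}")
    return "\n".join(lines) + "\n"


# Phonemes taken from the "how are you" example earlier in the thread.
print(phoneme_voca(["hh", "aw", "aa", "r", "y", "uw"]))
```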
At any rate I've taken this concept in a different direction and am attempting to recognize speech strictly phonetically to eliminate a big dictionary.
Rather than having two threads on the same concept I'll leave this link to the current thread here.
http://voxforge.org/home/forums/message-boards/general-discussion/log-phenome#COhJIx4B9e5N2SkZBIXFjw
--- (Edited on 11/20/2020 11:07 am [GMT-0600] by Tom J.) ---
I am planning to use Julius, but I am worried since I've read a lot about it and it seems complicated.
--- (Edited on 11/17/2021 12:11 am [GMT-0600] by richardsjenkins70) ---
It's not bad but much depends on the project. If you read through my other posts I used it in a fairly complicated chatbot without training.
Keep the voca file as small as possible.
--- (Edited on 11/18/2021 6:04 am [GMT-0600] by Tom J.) ---
Thanks Tom! I'm just starting my project and will try Julius. I hope it will work smoothly.
--- (Edited on 12/7/2021 12:15 am [GMT-0600] by chaples199) ---
This plan did not work; Julius wasn't able to return the phonemes on their own accurately. I am running Julius successfully, but it's a bit complicated. It's important to keep the dictionary (.voca) short so it has fewer incorrect words to match, so I ended up addressing it in C++ and gave my little robot the ability to stop and start Julius with a different vocabulary for different tasks.
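The stop-and-start idea could look roughly like the sketch below. It is written in Python rather than the C++ mentioned above, and it is not the code actually used on the robot. The class name RecognizerSwitcher, the task names, and the jconf file names are all hypothetical; the assumption is that each jconf file points at its own small grammar and .voca.

```python
import subprocess


class RecognizerSwitcher:
    """Runs one recognizer process at a time, one command line per task."""

    def __init__(self, task_commands):
        # Maps a task name to the argv list that starts the recognizer.
        self.task_commands = task_commands
        self.process = None
        self.current_task = None

    def stop(self):
        """Terminate the running recognizer, if there is one."""
        if self.process is None:
            return
        self.process.terminate()
        try:
            self.process.wait(timeout=5)
        except subprocess.TimeoutExpired:
            # The process ignored the polite request, so force it.
            self.process.kill()
            self.process.wait()
        self.process = None
        self.current_task = None

    def switch_to(self, task):
        """Restart the recognizer with the vocabulary for the given task."""
        if task == self.current_task:
            return
        command = self.task_commands[task]  # unknown task raises KeyError
        self.stop()
        # The caller is expected to read recognition results from process.stdout.
        self.process = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
        self.current_task = task


# Hypothetical tasks; the jconf names are placeholders, not real files.
TASK_COMMANDS = {
    "greetings": ["julius", "-C", "greetings.jconf"],
    "navigation": ["julius", "-C", "navigation.jconf"],
}
```

A caller would create RecognizerSwitcher(TASK_COMMANDS), call switch_to("greetings") when a conversation starts, and switch_to("navigation") when the task changes. Restarting costs a little start-up time on each switch, but every task then only has to match against its own short word list.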
Microphone selection is also important; I'm running a PS Eye camera and it works well. Noisy rooms are also a problem because Julius wants to match every sound with a word, so I did extensive string filtering so she wouldn't hear "are,are,are,are,are" when a fan or furnace was running, etc.
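As an illustration of that kind of string filtering, here is one simple rule: throw away any result in which the same word repeats several times in a row. This is a sketch under that assumption only; the real filtering was more extensive, and the names looks_like_noise, filter_sentence, and the max_run threshold are made up for the example.

```python
def looks_like_noise(words, max_run=3):
    """Return True when any word repeats max_run or more times in a row."""
    run = 0
    previous = None
    for word in words:
        current = word.lower()
        run = run + 1 if current == previous else 1
        previous = current
        if run >= max_run:
            return True
    return False


def filter_sentence(sentence, max_run=3):
    """Return the cleaned sentence, or None if it looks like background noise."""
    # Accept both space-separated and comma-separated word strings,
    # and drop the <s> and </s> sentence markers.
    words = [w for w in sentence.replace(",", " ").split() if w not in ("<s>", "</s>")]
    if not words or looks_like_noise(words, max_run):
        return None
    return " ".join(words)


print(filter_sentence("<s> PHONE STEVE </s>"))  # PHONE STEVE
print(filter_sentence("are,are,are,are,are"))   # None
```

A threshold of three is a guess; a phrase that legitimately repeats a word would need a higher value or an allow list.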
Look around VoxForge at my other posts for more information on how I did it.
--- (Edited on 12/7/2021 10:01 am [GMT-0600] by Tom J.) ---
Sad to hear that it didn't work. But thank you for the advice; I will surely keep it in mind.
--- (Edited on 5/6/2022 9:24 am [GMT-0500] by chaples199) ---
I don't agree much on this but you have a point.
--- (Edited on 5/20/2022 11:25 pm [GMT-0500] by Marie) ---