VoxForge
I came to this site looking for software which would allow me to dictate longer English texts with a large vocabulary and 95%+ accuracy. I take it that you are still a long way from achieving this - right? Too bad. :-/
You need to mention that JavaScript must be enabled; otherwise the "Attachment" entry and the "Browse..." button do not appear!
I'm not a native speaker of British English, but I think my accent is closer to British than to any other dialect.
Speaker Characteristics:
Gender: male;
Age range: adult;
Pronunciation dialect: British English (actually, non-native speaker, mother tongue is German)
Recording Information:
Microphone: cheap no-name, carbon;
Audio Card: Intel 82801DB-ICH4;
Audio Recording Software: Audacity rel 1.2.4;
O/S: Linux 2.6.17.9.
File Info:
File type: wav;
Sampling rate: 48kHz;
Sample rate format: 16bit;
Number of channels: 1.
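For anyone who wants to check that a recording really matches the File Info listed above (48 kHz, 16 bit, mono), here is a minimal sketch using Python's standard `wave` module. The function names `describe_wav` and `matches_file_info` are made up for illustration and are not part of any VoxForge tooling; the default values are simply the ones from this submission.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, bits_per_sample, channels, duration_seconds)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8  # sample width is reported in bytes
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, bits, channels, duration


def matches_file_info(path, rate=48000, bits=16, channels=1):
    """Check a WAV header against the expected values (hypothetical helper)."""
    actual_rate, actual_bits, actual_channels, _ = describe_wav(path)
    return (actual_rate, actual_bits, actual_channels) == (rate, bits, channels)
```

Calling `matches_file_info("rp-01.wav")` on one of the submitted files would be one way to use it; the file name here is only a guess based on the prompt IDs.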
Copyright (C) 2007 Richard Atterer
These files are free software; you can redistribute them and/or
modify them under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
These files are distributed in the hope that they will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
rp-01 When the sunlight strikes raindrops in the air,
rp-02 they act as a prism and form a rainbow.
rp-03 The rainbow is a division of white light into many beautiful colors.
rp-04 These take the shape of a long round arch, with its path high above,
rp-05 and its two ends apparently beyond the horizon.
rp-06 There is, according to legend, a boiling pot of gold at one end.
rp-07 People look, but no one ever finds it.
rp-08 When a man looks for something beyond his reach,
rp-09 his friends say he is looking for the pot of gold at the end of the rainbow.
rp-10 Throughout the centuries people have explained the rainbow in various ways.
rp-11 Some have accepted it as a miracle without physical explanation.
rp-12 To the Hebrews it was a token that there would be no more universal floods.
rp-13 The Greeks used to imagine that it was a sign
rp-14 from the gods to foretell war or heavy rain.
rp-15 The Norsemen considered the rainbow as a bridge
rp-16 over which the gods passed from earth to their home in the sky.
rp-17 Others have tried to explain the phenomenon physically.
rp-18 Aristotle thought that the rainbow was caused by
rp-19 reflection of the sun's rays by the rain.
rp-20 Since then physicists have found that it is not reflection,
rp-21 but refraction by the raindrops which causes the rainbows.
rp-22 Many complicated ideas about the rainbow have been formed.
rp-23 The difference in the rainbow depends considerably upon the size of the drops,
rp-24 and the width of the colored band increases as the size of the drops increases.
rp-25 The actual primary rainbow observed is said to be the effect of
rp-26 super-imposition of a number of bows.
rp-27 If the red of the second bow falls upon the green of the first,
rp-28 the result is to give a bow with an abnormally wide yellow band,
rp-29 since red and green light when mixed form yellow.
rp-30 This is a very common type of bow, one showing mainly red and yellow,
rp-31 with little or no green or blue.
--- (Edited on 1/20/2007 2:31 pm [GMT-0600] by atterer) ---
Notice: many prompts in "English Speech Files" were adapted from the prompt files contained in the CMU_ARCTIC speech synthesis database, which were in turn derived from out-of-copyright texts from Project Gutenberg, by the FestVox project at the Language Technologies Institute at Carnegie Mellon University.
Hi Richard,
thanks for the submission,
Yes, unfortunately, we have a long way to go to get a Free or Open Source Dictation application. The amount of audio required for dictation is in the hundreds of hours (and even when we have the required audio, we still need a Language Model with around 1 million words and a Dialog Manager to write the results in the target application).
Command and Control apps for the desktop have a much smaller vocabulary, and thus don't need as much audio. For example, the Sphinx group of Speech Recognition Engines uses Acoustic Models trained with around 140 hours of speech, and can get reasonably good results in a command and control environment. 140 hours of speech is our target for release 1.0 of the VoxForge Acoustic Model, so we still have a long way to go even for this amount of speech.
Thanks for the info about JavaScript needing to be enabled. I'll update the site to remind people to make sure it is enabled.
Your submission will be incorporated into the VoxForge Acoustic Model tonight. You can try out the results at this link - remember, the more speech you submit, the better your recognition results will be.
all the best,
Ken
--- (Edited on 1/20/2007 5:03 pm [GMT-0500] by kmaclean) ---