VoxForge
Speaker Characteristics:
Gender: male
Age range: adult
Pronunciation dialect: California
Recording Information:
Microphone: C-Media USB Headphone Set
Audio Card: Mac USB built-in
Audio Recording Software: Audacity rel 1.2.5
O/S: Mac OS X 10.4.8
File Info:
File type: wav
Sampling rate: 44.1kHz
Sample rate format: 32-bit float
Number of channels: 1
Copyright (C) 2006 Grant Hulbert
These files are free software; you can redistribute them and/or
modify them under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
These files are distributed in the hope that they will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
cc-01 Well, here's a story for you: Sarah Perry was a veterinary nurse
cc-02 who had been working daily at an old zoo in a deserted district of the territory,
cc-03 so she was very happy to start a new job at a superb private practice
cc-04 in north square near the Duke Street Tower.
cc-05 That area was much nearer for her and more to her liking.
cc-06 Even so, on her first morning, she felt stressed.
cc-07 She ate a bowl of porridge, checked herself in the mirror
cc-08 and washed her face in a hurry. Then she put on a plain yellow dress
cc-09 and a fleece jacket, picked up her kit and headed for work.
cc-10 When she got there, there was a woman with a goose waiting for her.
cc-11 The woman gave Sarah an official letter from the vet.
cc-12 The letter implied that the animal could be suffering from a rare form
cc-13 of foot and mouth disease, which was surprising,
cc-14 because normally you would only expect to see it in a dog or a goat.
cc-15 Sarah was sentimental, so this made her feel sorry for the beautiful bird.
cc-16 Before long, that itchy goose began to strut around the office like a lunatic,
cc-17 which made an unsanitary mess.
cc-18 The goose's owner, Mary Harrison, kept calling, "Comma, Comma,"
cc-19 which Sarah thought was an odd choice for a name.
cc-20 Comma was strong and huge, so it would take some force to trap her,
cc-21 but Sarah had a different idea.
cc-22 First she tried gently stroking the goose's lower back with her palm,
cc-23 then singing a tune to her. Finally, she administered ether.
cc-24 Her efforts were not futile. In no time, the goose began to tire,
cc-25 so Sarah was able to hold onto Comma and give her a relaxing bath.
cc-26 Once Sarah had managed to bathe the goose, she wiped her off with a cloth
cc-27 and laid her on her right side. Then Sarah confirmed the vet's diagnosis.
cc-28 Almost immediately, she remembered an effective treatment
cc-29 that required her to measure out a lot of medicine.
cc-30 Sarah warned that this course of treatment might be expensive -
cc-31 either five or six times the cost of penicillin.
cc-32 I can't imagine paying so much, but Mrs. Harrison - a millionaire lawyer -
cc-33 thought it was a fair price for a cure.
cc-34 Comma Gets a Cure and derivative works may be used freely for any purpose
cc-35 without special permission provided the present sentence
cc-36 and the following copyright notification accompany the passage in print,
cc-37 if reproduced in print, and in audio format in the case of a sound recording:
cc-38 Copyright 2000 Douglas N. Honorof, Jill McCullough & Barbara Somerville.
cc-39 All rights reserved.
--- (Edited on 12/29/2006 2:26 am [GMT-0600] by granthulbert) ---
Notice: many prompts in "English Speech Files" were adapted from the prompt files contained in the CMU_ARCTIC speech synthesis database, which were in turn derived from out-of-copyright texts from Project Gutenberg, by the FestVox project at the Language Technologies Institute at Carnegie Mellon University.
--- (Edited on 12/31/2006 1:05 am [GMT-0600] by granthulbert) ---
Hi Grant,
Don't worry, most of my submissions were in the minus range for quite a while!
I need to clarify on the web site that we are not looking for TV announcer quality voices (just listen to my voice recordings ... :) ) or perfect audio quality.
For Free Speech Recognition to work, we need a large variety of speech (from different people, with different dialects/accents, and using different prompt files with various phonemes and triphones) recorded in a variety of environments (rooms with echo, such as hardwood floors or tiles, and rooms with no echo, such as carpet, etc.) and on a variety of recording equipment (headset mics, desktop mics, built-in mics, and USB mics, integrated audio, audio cards ...).
A good acoustic model needs to be trained with speech recorded in the environment it is targeted to recognize. A post by David Gelbert explains this a bit better (see this link, scroll down until you get to his message).
The rating system as currently set out does not reflect these requirements (yet ...). I am to blame for this, because my initial approach was to try to collect as much "Clean Speech" as possible and then apply noise removal to the speech to be recognized before sending it to the speech recognition engine. In my experiments, however, noise removal on recorded speech noticeably degraded the Acoustic Model's performance.
Based on David Gelbert's comments, we need to move away from the "Clean Speech" approach and go for what I will call "The Good, the Bad and the Ugly" (my apologies to spaghetti western fans ...) approach to collecting speech - i.e. get as much speech as possible in its "natural environment", regardless of perceived quality, so that the acoustic model can recognize this speech in similar environments.
Ken
--- (Edited on 12/31/2006 11:24 pm [GMT-0500] by kmaclean) ---
Hi Grant,
Sounds great! This audio comes up as 32-bit float, so maybe you just had an incorrect setting in your previous submission.
There are also a few files that seem to be clipping at the 0.9/-0.9 level again, but I think it might be originating with your mic or audio card (before Audacity gets the sound), since the audio sounds OK. It is strange, because when I record with the volume too high, clipping causes very noticeable distortion.
Again, I don't think it is anything to be concerned about from an Acoustic Model creation perspective, since we need speech from as many different hardware recording configurations as possible.
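For anyone who wants to check their own recordings for this kind of flat-topped clipping, here is a minimal sketch. It assumes the samples have already been decoded into floats between -1.0 and 1.0 (reading the 32-bit float wav file is not shown). The function name, the tolerance and the minimum run length are illustrative choices of mine, not anything VoxForge uses; only the 0.9 level comes from the observation above.

```python
from typing import Iterable, List, Tuple


def find_flat_top_runs(
    samples: Iterable[float],
    level: float = 0.9,        # level where the waveform appears to flatten
    tolerance: float = 0.005,  # assumed: how close to the level counts as "at" it
    min_run: int = 3,          # assumed: shortest run treated as clipping
) -> List[Tuple[int, int]]:
    """Return (start_index, length) for each run of consecutive samples
    that sit at +level or -level (within tolerance) for at least min_run samples."""
    data = list(samples)
    runs: List[Tuple[int, int]] = []
    start = 0
    run_sign = 0  # +1 while in a run at +level, -1 at -level, 0 otherwise

    for i, s in enumerate(data):
        sign = 0
        if abs(abs(s) - level) <= tolerance:
            sign = 1 if s > 0 else -1
        if sign != run_sign:
            # The previous run (if any) ends here; keep it if it is long enough.
            if run_sign != 0 and i - start >= min_run:
                runs.append((start, i - start))
            start = i
            run_sign = sign

    # Close a run that lasts until the end of the recording.
    if run_sign != 0 and len(data) - start >= min_run:
        runs.append((start, len(data) - start))
    return runs


if __name__ == "__main__":
    demo = [0.0, 0.5, 0.9, 0.9, 0.9, 0.9, 0.4, -0.3, -0.9, -0.9, -0.9, 0.1]
    for start, length in find_flat_top_runs(demo):
        print(f"flat run of {length} samples starting at sample {start}")
```

A genuine peak only touches the level for a sample or two, so requiring a minimum run length keeps ordinary loud passages from being flagged. A file with many long runs at the same level is probably being limited somewhere before the recording software.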
all the best,
Ken
--- (Edited on 1/3/2007 3:07 pm [GMT-0500] by kmaclean) ---