Get a Spanish corpus - voxforge.org

Get a Spanish corpus

User: manuel
Date: 3/2/2010 5:33 am

Views: 4829
Rating: 15

Hi, I want to get a Spanish corpus. I'm developing an application with speech recognition, and it works perfectly with your English corpus. Now I want to do the recognition in Spanish. I know that I'll need to train the model with my voice, but how can I get the recorded voices that you have? I tried to get everything from the SVN, but I don't have access.

Of course, I want to collaborate and upload my recordings when I finish my work.

I'm currently recording at 48000 Hz. I know that when I want to integrate my recordings with yours I'll need to downsample them. But what is the reason for recording at that rate?

Thanks

Re: Get a Spanish corpus

User: nsh
Date: 3/2/2010 1:28 pm

Views: 238
Rating: 15

> how can I get the recorded voices that you have?

wget -N -nd -c -e robots=off -A tgz,html -r -np \
http://www.repository.voxforge1.org/downloads/es/Trunk/Audio/Main/8kHz_16bit/

> I'm currently recording at 48000 Hz. I know that when I want to integrate my recordings with yours I'll need to downsample them. But what is the reason for recording at that rate?

Are you asking why you yourself record at that rate? Nobody but you can answer that question.

Re: Get a Spanish corpus

User: Visitor
Date: 3/2/2010 3:05 pm

Views: 124
Rating: 13

I know why I record at that rate. What I want to know is why you record the voices at 16 kHz or 8 kHz. I read something about the Nyquist theorem.

Also, I suppose that I must downsample my recordings to integrate them with yours.
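
The downsampling step mentioned here can be sketched roughly as follows. This is only a minimal illustration, assuming the audio is already loaded as a list of float samples at 48 kHz and that the target is 16 kHz (an integer factor of 3). The function names `lowpass_taps` and `downsample` are hypothetical and not part of any VoxForge tooling; a low-pass filter is applied before discarding samples so that content above the new rate's limit does not alias.

```python
import math

def lowpass_taps(num_taps, cutoff):
    """Windowed-sinc low-pass FIR taps (hypothetical helper).

    cutoff is a fraction of the input sample rate, 0 < cutoff < 0.5.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (sinc); handle the x == 0 limit.
        if x == 0:
            h = 2 * cutoff
        else:
            h = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        # Hamming window reduces the ripple caused by truncating the sinc.
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(h * w)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC

def downsample(samples, factor=3, num_taps=63):
    """Low-pass filter, then keep every `factor`-th sample.

    With factor=3 this turns 48 kHz input into 16 kHz output (assumed rates).
    """
    # Cutoff a little below the new Nyquist limit (0.5 / factor of input rate).
    taps = lowpass_taps(num_taps, 0.45 / factor)
    half = num_taps // 2
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(samples):  # treat samples outside the signal as zero
                acc += t * samples[j]
        out.append(acc)
    return out

# Example: 0.1 s of a 1 kHz tone at 48 kHz becomes 1600 samples at 16 kHz.
tone = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(4800)]
print(len(downsample(tone)))
```

For real recordings a dedicated resampler (SoX, for example) is the more usual choice; the pure-Python loop above is slow and is meant only to show the filter-then-decimate idea.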

Re: Get a Spanish corpus

User: manuel
Date: 3/2/2010 3:12 pm

Views: 175
Rating: 14

Sorry, I was the one who wrote the previous post; I forgot to log in.

Re: Get a Spanish corpus

User: nsh
Date: 3/4/2010 3:02 pm

Views: 1685
Rating: 14

> What I want to know is why you record the voices at 16 kHz or 8 kHz.

VoxForge audio is recorded at various sample rates. Models are built at 16 kHz because that rate lets you decode both 16 kHz and 48 kHz audio (the latter downsampled) without a loss in performance. Telephone models are built at 8 kHz because that is the sampling rate of VoIP codecs.

> I read something about the Nyquist theorem.

I'm not sure how that is related.
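
As background for the rates discussed in this thread, the sampling theorem says that audio sampled at a given rate can represent frequencies only up to half that rate. The small sketch below just does that arithmetic for the three rates mentioned; the function name `nyquist_hz` is hypothetical, and tying this to how the models were designed is an assumption, not something stated in the thread.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency (Hz) representable at the given sample rate (hypothetical helper)."""
    return sample_rate_hz / 2

# Rates mentioned in the thread: telephone models, wideband models, the poster's recordings.
for rate in (8000, 16000, 48000):
    print(f"{rate} Hz sampling -> content up to {nyquist_hz(rate):.0f} Hz")
```

So an 8 kHz recording keeps content up to 4 kHz, which covers the traditional telephone speech band, and a 16 kHz recording keeps content up to 8 kHz.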
