General Discussion

Corpus Specification
User: odfi
Date: 6/13/2020 7:31 pm
Views: 3743
Rating: 0

Hi there,

I'm just wondering if you can possibly tell me more about the corpus?

My questions are:

  • In total, how many languages are there? And what are they?
  • How many speakers are there for each language and in total?
  • Also, how long (preferably in hours) is the total speaking time for each language?

Many thanks for your help and best wishes!

--- (Edited on 6/13/2020 7:32 pm [GMT-0500] by ) ---

Re: Corpus Specification
User: kmaclean
Date: 6/15/2020 10:02 am
Views: 111
Rating: 0

>how long (preferably in hours) is the total speaking time for each language?

Each language has metrics (or a link to them) on its download page.

--- (Edited on 6/15/2020 11:02 am [GMT-0400] by kmaclean) ---

Re: Corpus Specification
User: odfi
Date: 6/28/2020 12:36 pm
Views: 1669
Rating: 0

Thanks! That was really helpful. 

Do you also happen to know how I can check the total size of the audio files, without downloading all of them?

I'm looking to use the audio files for the 7 largest languages, and I was just wondering how many GB of storage will be enough.

--- (Edited on 6/28/2020 12:36 pm [GMT-0500] by odfi) ---
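
One possible way to estimate the total size without downloading anything is to send an HTTP HEAD request for each file and add up the reported Content-Length values. The sketch below is only an illustration under stated assumptions: the function names (head_content_length, total_size_bytes, to_gib) and the example.org URLs are hypothetical, and it assumes the download server answers HEAD requests and reports Content-Length, which not every server does.

```python
import urllib.request
from typing import Callable, Iterable, List, Optional, Tuple


def head_content_length(url: str, timeout: float = 30.0) -> Optional[int]:
    """Return the Content-Length reported for `url` by a HEAD request.

    Returns None when the server does not report a length.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Content-Length")
    return int(value) if value is not None else None


def total_size_bytes(
    urls: Iterable[str],
    fetch_length: Callable[[str], Optional[int]] = head_content_length,
) -> Tuple[int, List[str]]:
    """Sum the reported sizes of all `urls`.

    Returns (total_bytes, urls_with_unknown_size). `fetch_length` can be
    swapped out, for example to test the summing logic offline.
    """
    total = 0
    unknown: List[str] = []
    for url in urls:
        size = fetch_length(url)
        if size is None:
            unknown.append(url)  # server gave no Content-Length
        else:
            total += size
    return total, unknown


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return num_bytes / 1024**3


# Hypothetical usage; replace the placeholder URLs with real archive links:
#   urls = ["https://example.org/audio/archive1.tgz",
#           "https://example.org/audio/archive2.tgz"]
#   total, unknown = total_size_bytes(urls)
#   print(f"{to_gib(total):.2f} GiB, {len(unknown)} files of unknown size")
```

The figure is the compressed download size; if the archives need to be unpacked, the space required on disk will be larger.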
