VoxForge
From the Facebook AI website:
Facebook AI is releasing Multilingual LibriSpeech (MLS), a large-scale, open source data set designed to help advance research in automatic speech recognition (ASR).
MLS provides more than 50,000 hours of audio across eight languages: English, German, Dutch, French, Spanish, Italian, Portuguese, and Polish. It also provides language-model training data and pretrained language models along with baselines to help researchers compare different ASR systems. Because it leverages public domain audiobooks from the LibriVox project, MLS offers a large data set with a broad range of different speakers, and it can be released with a nonrestrictive license.
MLS is available on OpenSLR: