Contributing Audio and Phonemes to the German Model

Re: Contributing Audio and Phonemes to the German Model

User: guenter
Date: 12/13/2013 5:19 am

Views: 179
Rating: 0

...but that still will not address the underlying problem completely - there are many other "words" which have more than one correct pronunciation (let alone regional dialects...). E.g. English words which have German counterparts with the exact same spelling:

"Personal Computer" vs "Personal Sekretaerin"

abbreviations: some people will read "z.B." as "zetbe", others will read it as "zum Beispiel"

regional dialects: people from the south are more likely to read "richtig" as "richtik", which others will read as "richtich"

not sure how big the impact of all that is - but still, if we want really high-quality training material, we should have correct phoneme transcriptions of what people actually read, not of what they were supposed to read if they worked as a TV news anchor... ;)

Re: Contributing Audio and Phonemes to the German Model

User: Andreas
Date: 12/15/2013 11:06 am

Views: 72
Rating: 0

Hello all,

sorry for my late response:

Since we do not have much infrastructure right now, maybe agreeing on a simple common format for those efforts would help already? e.g. we could decide we want to use simple CSV, something along the lines of

submission,prompt,rating,comment

and simply post those here?
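To make the proposed format concrete, here is a minimal Python sketch that writes and reads rows with the four columns named above. Only the column names come from the proposal; the example row, the rating value "bad" and the helper names are made up for illustration.

```python
import csv
import io

# Column names as proposed in the thread.
FIELDS = ["submission", "prompt", "rating", "comment"]


def write_reviews(rows, handle):
    """Write review rows (dicts keyed by FIELDS) as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_reviews(handle):
    """Read the CSV back into a list of dicts, one per reviewed prompt."""
    return list(csv.DictReader(handle))


# Hypothetical row; ids and rating vocabulary are placeholders.
rows = [
    {
        "submission": "speaker-20130101-abc",
        "prompt": "de1-001",
        "rating": "bad",
        "comment": "last word cut off, lots of static",
    }
]

buffer = io.StringIO()  # stands in for a file that would be posted or shared
write_reviews(rows, buffer)
buffer.seek(0)
loaded = read_reviews(buffer)
print(loaded)
```

Using the csv module rather than joining strings by hand means that commas inside the comment column are quoted correctly.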

I think a central "place" like Google Docs/Drive would be the best solution. This kind of data storage should avoid the CSV synchronisation that would otherwise be necessary.

I'm working on a webcrawler right now but it has a long way to go.

As a Java/web developer I have already built a few smallish web crawlers. Maybe I can help you?

I also have a "bored" virtual server instance with 3 cores, 2 GB RAM and truly unlimited traffic which is waiting for a challenging task. Maybe it could be an additional piece of infrastructure where the crawler can run 24x7? I could also set up an SVN repository for data synchronisation (CSV files, ... ?)

I am also experienced in digital audio processing, so slicing audio files should not be a problem.

Well, I would really like to help. Where can I start?

(1) Submission rating? If someone explains to me how a "good" submission can be recognized, that might be a task for me :)

(2) Audio slicing? All I need is a short introduction. BTW, last week I also found a possible media source... Maybe worth a look.

(3) ...

I don't know, but I think we should agree on an efficient channel for coordination? (email, instant messaging, ...?)

Regards,

Andreas

Re: Contributing Audio and Phonemes to the German Model

User: Binh
Date: 12/16/2013 8:49 am

Views: 66
Rating: 0

(1) Submission rating? If someone explains to me how a "good" submission can be recognized, that might be a task for me :)

That's a really good question, and the answer depends on what you do. But I can tell you what "bad" submissions are.

www.messe2media.com/files/Aligning_Fehler.odt

Take a look at this file, which lists the files I checked.

We mostly look for mismatches between transcription and audio. This one, for example:

anonymous-20100302-huu/mfc/de8-089

PROMPT: DIE PROVINZ HAT 75 MILLIONEN EINWOHNER

SAID: DIE PROVINZ HAT 7 KOMMA 5 MILLIONEN EINWOHNER
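A PROMPT/SAID pair like this can be compared word by word. The following is a minimal Python sketch using the standard difflib module; it assumes the SAID text has already been typed up by a listener (or produced by some recognizer), which is not something the sketch does itself. The function name is made up.

```python
import difflib


def word_diff(prompt, said):
    """Return the word-level differences between the prompt and what was said.

    Each entry is (operation, words_in_prompt, words_in_said), where operation
    is one of 'replace', 'delete' or 'insert' as reported by difflib.
    """
    prompt_words = prompt.split()
    said_words = said.split()
    matcher = difflib.SequenceMatcher(a=prompt_words, b=said_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tag, " ".join(prompt_words[i1:i2]), " ".join(said_words[j1:j2])))
    return diffs


prompt = "DIE PROVINZ HAT 75 MILLIONEN EINWOHNER"
said = "DIE PROVINZ HAT 7 KOMMA 5 MILLIONEN EINWOHNER"
print(word_diff(prompt, said))
# → [('replace', '75', '7 KOMMA 5')]
```

An empty result means the two texts agree word for word; anything else is a candidate for the "transcription mismatch" label.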

In another instance the speaker stopped the recording too soon and the end of the last word was missing.

A heavy accent or a lot of static isn't necessarily a bad thing if you want to make your model more robust, but such files should maybe be marked as noisy or accent.

But we need some sort of checklist before we start checking again. I just checked files that were rejected by forced alignment to find out what happened, and ended up checking a lot of files twice.

P.S.: In case you wonder about files in my list without a PROMPT-SAID pair or any other comment: those files were rejected for no apparent reason.

Re: Contributing Audio and Phonemes to the German Model

User: guenter
Date: 12/16/2013 10:06 am

Views: 49
Rating: 1

I like the idea of having a checklist for common errors instead of a numeric rating a lot!

Maybe we can collect properties we would like to check for?

- noisy (would also include background noise/music)

- accent

- transcription mismatch, if possible also note correct transcription

- truncated

- audio distorted (e.g. clipping, much too quiet, etc.)
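As a rough illustration, the checklist above could be stored as a set of flags per recording. This is only a Python sketch: the class and field names are invented here, and nothing like this exists in the project yet. The example recording is the one from Binh's post.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import Optional


class Issue(Flag):
    """One flag per checklist item; flags can be combined with |."""
    NOISY = auto()                   # includes background noise/music
    ACCENT = auto()
    TRANSCRIPTION_MISMATCH = auto()
    TRUNCATED = auto()
    DISTORTED = auto()               # clipping, much too quiet, etc.


@dataclass
class Review:
    recording: str
    issues: Issue = Issue(0)         # no flags set means nothing was found
    corrected_transcription: Optional[str] = None

    def is_clean(self) -> bool:
        return not self.issues


review = Review(
    recording="anonymous-20100302-huu/mfc/de8-089",
    issues=Issue.TRANSCRIPTION_MISMATCH,
    corrected_transcription="DIE PROVINZ HAT 7 KOMMA 5 MILLIONEN EINWOHNER",
)
print(review.is_clean())
```

Flags rather than a single numeric rating keep the individual problems visible, so a recording can be both noisy and truncated at the same time.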

About aligning errors: at least for HTK, my experience so far is that the order in which you process training data matters a lot. For now, I am using the cleanest files first, then slowly working my way towards noisier ones - but I am no expert in this area, maybe people more knowledgeable in the field of speech recognition can comment here?
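The clean-first ordering described here could be sketched in Python like this, assuming every recording already carries a set of checklist tags. The recording ids and tags are placeholders, and counting tags is just one possible notion of "clean"; the sketch says nothing about whether this ordering really helps HTK.

```python
def clean_first(tagged):
    """Return recording ids ordered from fewest to most issue tags.

    `tagged` maps a recording id to the set of checklist tags assigned to it.
    Ties are broken alphabetically so the order is reproducible.
    """
    return sorted(tagged, key=lambda rec: (len(tagged[rec]), rec))


# Placeholder data for illustration only.
tagged = {
    "rec-c": {"noisy", "accent"},
    "rec-a": set(),
    "rec-b": {"truncated"},
}
order = clean_first(tagged)
print(order)
# → ['rec-a', 'rec-b', 'rec-c']
```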

Re: Contributing Audio and Phonemes to the German Model

User: Binh
Date: 2/3/2014 4:56 am

Views: 24
Rating: 0

As a Java/web developer I have already built a few smallish web crawlers. Maybe I can help you?

You sure can. Is there any way I can contact you directly? Well, my script so far is Perl-based. It hooks into Google for a certain search term and downloads the links it gets there.

Having a small framework that I only need to modify would certainly help. Any way I can contact you through email?

If so, write to me at [email protected].

Binh

Re: Contributing Audio and Phonemes to the German Model

User: Dr_Grilli
Date: 12/18/2013 9:47 am

Views: 78
Rating: 0

Addressing this pronunciation issue: there is a standard book for the training of actors, radio presenters etc. called "Der kleine Hey, Die Kunst des Sprechens". Anyone interested in training his or her ears to distinguish pronunciations and dialects should read it.

Re: Contributing Audio and Phonemes to the German Model

User: guenter
Date: 2/4/2014 3:52 am

Views: 50
Rating: 0

Ken,

I have uploaded new files to the FTP server, could you move them to the German model once again?

Thanks.

List of new files:

-rw----r-- 1 u41386649-librivox ftpusers 45608067 Feb 3 18:46 guenter-20131126-afn.tgz
-rw----r-- 1 u41386649-librivox ftpusers 23063436 Feb 3 18:46 guenter-20131126-afq.tgz
-rw----r-- 1 u41386649-librivox ftpusers 24256015 Feb 3 18:46 guenter-20131126-ftr.tgz
-rw----r-- 1 u41386649-librivox ftpusers 22394486 Feb 3 18:46 guenter-20131126-qah.tgz
-rw----r-- 1 u41386649-librivox ftpusers 36629977 Feb 3 18:47 guenter-20131126-xck.tgz
-rw----r-- 1 u41386649-librivox ftpusers 28233577 Feb 3 18:47 guenter-20140124-afn.tgz
-rw----r-- 1 u41386649-librivox ftpusers 29508954 Feb 3 18:47 guenter-20140124-ftr.tgz
-rw----r-- 1 u41386649-librivox ftpusers 30036033 Feb 3 18:47 guenter-20140124-qah.tgz
-rw----r-- 1 u41386649-librivox ftpusers 25927918 Feb 3 18:47 guenter-20140124-xck.tgz
-rw----r-- 1 u41386649-librivox ftpusers 29199306 Feb 3 18:48 guenter-20140125-afn.tgz
-rw----r-- 1 u41386649-librivox ftpusers 24780151 Feb 3 18:48 guenter-20140125-ftr.tgz
-rw----r-- 1 u41386649-librivox ftpusers 28504662 Feb 3 18:48 guenter-20140125-qah.tgz
-rw----r-- 1 u41386649-librivox ftpusers 29279477 Feb 3 18:48 guenter-20140125-xck.tgz
-rw----r-- 1 u41386649-librivox ftpusers 29115849 Feb 3 18:48 guenter-20140126-afn.tgz
-rw----r-- 1 u41386649-librivox ftpusers 30190623 Feb 3 18:48 guenter-20140126-afq.tgz
-rw----r-- 1 u41386649-librivox ftpusers 28193981 Feb 3 18:48 guenter-20140126-ftr.tgz
-rw----r-- 1 u41386649-librivox ftpusers 27928786 Feb 3 18:49 guenter-20140126-ofp.tgz
-rw----r-- 1 u41386649-librivox ftpusers 28498014 Feb 3 18:49 guenter-20140126-qah.tgz
-rw----r-- 1 u41386649-librivox ftpusers 27617023 Feb 3 18:49 guenter-20140126-xck.tgz
-rw----r-- 1 u41386649-librivox ftpusers 24763766 Feb 3 18:49 guenter-20140127-afn.tgz
-rw----r-- 1 u41386649-librivox ftpusers 26188447 Feb 3 18:49 guenter-20140127-afq.tgz
-rw----r-- 1 u41386649-librivox ftpusers 27152435 Feb 3 18:49 guenter-20140127-cwp.tgz
-rw----r-- 1 u41386649-librivox ftpusers 27828147 Feb 3 18:50 guenter-20140127-ftr.tgz
-rw----r-- 1 u41386649-librivox ftpusers 25691346 Feb 3 18:50 guenter-20140127-ofp.tgz
-rw----r-- 1 u41386649-librivox ftpusers 31955074 Feb 3 18:50 guenter-20140127-qah.tgz
-rw----r-- 1 u41386649-librivox ftpusers 26817288 Feb 3 18:50 guenter-20140127-sie.tgz
-rw----r-- 1 u41386649-librivox ftpusers 25318291 Feb 3 18:50 guenter-20140127-usn.tgz
-rw----r-- 1 u41386649-librivox ftpusers 29327631 Feb 3 18:50 guenter-20140127-vau.tgz
-rw----r-- 1 u41386649-librivox ftpusers 28645027 Feb 3 18:50 guenter-20140127-xck.tgz
-rw----r-- 1 u41386649-librivox ftpusers 29959817 Feb 3 18:51 guenter-20140127-yic.tgz
-rw----r-- 1 u41386649-librivox ftpusers 43450466 Feb 3 18:51 guenter-20140128-qah.tgz
-rw----r-- 1 u41386649-librivox ftpusers 29519500 Feb 3 18:51 guenter-20140130-ftr.tgz
-rw----r-- 1 u41386649-librivox ftpusers 28615450 Feb 3 18:51 guenter-20140130-qah.tgz
-rw----r-- 1 u41386649-librivox ftpusers 28641057 Feb 3 18:51 guenter-20140130-xck.tgz
-rw----r-- 1 u41386649-librivox ftpusers 37023860 Feb 3 18:51 guenter-20140131-ftr.tgz
-rw----r-- 1 u41386649-librivox ftpusers 31740366 Feb 3 18:51 guenter-20140131-qah.tgz
-rw----r-- 1 u41386649-librivox ftpusers 33904759 Feb 3 18:52 guenter-20140203-afn.tgz
-rw----r-- 1 u41386649-librivox ftpusers 44487141 Feb 3 18:52 guenter-20140203-ftr.tgz
-rw----r-- 1 u41386649-librivox ftpusers 29811999 Feb 3 18:52 guenter-20140203-qah.tgz
-rw----r-- 1 u41386649-librivox ftpusers 44522006 Feb 3 18:52 guenter-20140203-xck.tgz
-rw----r-- 1 u41386649-librivox ftpusers 36050007 Feb 3 18:52 guenter-20140204-ftr.tgz
-rw----r-- 1 u41386649-librivox ftpusers 35056162 Feb 3 18:52 guenter-20140204-qah.tgz

Re: Contributing Audio and Phonemes to the German Model

User: kmaclean
Date: 2/12/2014 8:49 am

Views: 37
Rating: 0

>I have uploaded new files to the FTP server, could you move them to the german model, once again?

done,

for any new submissions, please update the date in your license file

thanks,

Ken

Re: Contributing Audio and Phonemes to the German Model

User: guenter
Date: 2/14/2014 4:57 am

Views: 9
Rating: 0

ah, good point, thanks, will do :)

Two more questions:

- I have created a new, updated German audio model for CMU Sphinx (http://goofy.zamia.org/voxforge/de/) - this is based on our submission rating/tagging effort. Currently I am still polishing the model, but once it is ready for use I will open a new thread about it in the German forum. Would you be willing to host the new model somewhere on the official VoxForge servers?

- what is the current policy on using LibriVox audiobooks to generate submissions - if I create submissions based on LibriVox audiobooks, should I name and upload them using my personal account or should I create/use a different account?

Re: Contributing Audio and Phonemes to the German Model

User: nsh
Date: 2/14/2014 5:19 am

Views: 42
Rating: 1

> I have created a new, updated German audio model for CMU Sphinx (http://goofy.zamia.org/voxforge/de/) - this is based on our submission rating/tagging effort. Currently I am still polishing the model, but once it is ready for use I will open a new thread about it in the German forum. Would you be willing to host the new model somewhere on the official VoxForge servers?

It's better to use sphinxbase/sphinxtrain trunk to train such a model; it creates significantly more accurate models which are noise-robust too.

It's also better to train models with the LDA/MLLT transform; they are usually 20% more accurate.

For the models it's also better to provide test results so others can reproduce them and estimate model accuracy. For accurate test results it's better to have separate speakers in the test set; currently the model is overtrained on the speakers who contributed the majority of the recordings in the db.
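Keeping speakers separate between training and test data can be sketched in Python as below. The sketch assumes submission names follow the `<speaker>-<date>-<suffix>` pattern seen in the file listing earlier in this thread; `speakerb` and `speakerc` are invented names, and the function names and the 20% default are arbitrary choices for illustration, not sphinxtrain settings.

```python
import random
from collections import defaultdict


def speaker_of(submission):
    """Extract the speaker part of a '<speaker>-<date>-<suffix>' submission name."""
    return submission.rsplit("-", 2)[0]


def split_by_speaker(submissions, test_fraction=0.2, seed=0):
    """Split submissions so that no speaker appears in both train and test."""
    by_speaker = defaultdict(list)
    for sub in submissions:
        by_speaker[speaker_of(sub)].append(sub)

    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)       # reproducible shuffle
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])

    train = [s for s in submissions if speaker_of(s) not in test_speakers]
    test = [s for s in submissions if speaker_of(s) in test_speakers]
    return train, test


subs = [
    "guenter-20131126-afn",
    "guenter-20140124-ftr",
    "speakerb-20140101-abc",   # hypothetical
    "speakerb-20140102-abd",   # hypothetical
    "speakerc-20140105-xyz",   # hypothetical
]
train, test = split_by_speaker(subs)
print(len(train), len(test))
```

One caveat: submissions uploaded as "anonymous" would all be treated as a single speaker by this naming rule, although they may come from many different people.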
