VoxForge
Those following the audio tagging thread already know this, but here is the "official" announcement anyway:
Over the past few weeks I have (with a lot of help from this forum) computed an updated German VoxForge model for CMU pocketsphinx (SVN trunk).
The model is available for download here:
http://goofy.zamia.org/voxforge/de/
Some statistics about this model (more in voxforge.html and audio-stats.txt):
Hello
May I ask you to share the 16 kHz audio files with us?
I need the files for recognizing numbers. I can add them to my systems. Of course I will also make all audio samples generated for my acoustic model available to others.
A small remark.
Your submissions contain a number of foreign words, such as "Hidden Markov Modell". Although in practice they are of course part of everyday German usage, words like "Hidden" cause certain problems on our system (minor transition matrix problems).
In general, one should probably avoid foreign words for now. Otherwise: good work :thumbsup:
There are quite a number of papers that attempt to deal with such words, but I don't think any of them has been implemented yet.
Sorry for my late response - I have finally found the time to pack and upload the remaining audio files I produced over the past weeks.
Everything is available in voxforge / de / audio / main:
http://www.repository.voxforge1.org/downloads/de/Trunk/Audio/Main/16kHz_16bit/
It is very unclear to me why such words lead to problems - do they contain phonemes that such systems otherwise don't know?
In principle it shouldn't matter at all whether a word is a foreign word or not - especially since German is rather poor in words of its "own" (and even those are often of Latin or Greek origin ;) ) - it isn't even clear to me how a computer could make this distinction at all.
Be that as it may - at the moment I see no way for me to get by without foreign words, because I am planning towards a dialog system that should cope with everyday language as far as possible, and words like "computer" and "handy" (mobile phone) simply belong to that, even if they are foreign words.
It depends very much on which dictionary is behind it.
Let's take the example "Handy" right away.
Read it aloud once the German way and once the English way.
If I had to transcribe roughly how it sounds, I would use "händi". Of course, programs like espeak don't know that and turn it into something else. "Hand y" would be one variant, but I can't say for sure at the moment, since the combination "dy" hardly ever occurs in German non-foreign words.
So one would first have to create the English phoneme sequence and then convert it into an equivalent German phoneme sequence.
You get my point. As long as people also take over your dictionary and you have paid attention to this everywhere, it should actually work, but for people who develop something solely on the basis of the VoxForge audio material, foreign words in the submissions should be kept to a minimum.
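To make the two-step idea concrete, here is a minimal sketch in Python: take an English phoneme sequence and map it to rough German equivalents. The mapping table and the phoneme symbols on both sides are purely illustrative assumptions; they are not taken from the VoxForge dictionary or from espeak.

```python
# Hypothetical mapping from English (ARPAbet-style) phonemes to rough German
# equivalents. Both symbol sets are illustrative assumptions only.
EN_TO_DE = {
    "HH": ["h"],
    "AE": ["E"],   # English "a" in "handy", approximated by a short German "ä"
    "N": ["n"],
    "D": ["d"],
    "IY": ["i:"],
}


def map_phonemes(english_phonemes):
    """Translate an English phoneme sequence into German phonemes."""
    german = []
    for phoneme in english_phonemes:
        if phoneme not in EN_TO_DE:
            # Fail loudly instead of guessing a pronunciation.
            raise KeyError(f"no German equivalent defined for {phoneme!r}")
        german.extend(EN_TO_DE[phoneme])
    return german


# "handy" as an English speaker would pronounce it (assumed transcription).
handy_de = map_phonemes(["HH", "AE", "N", "D", "IY"])
print(" ".join(handy_de))
```

A real table would have to cover every English phoneme, and some of them have no close German counterpart, so the choice of substitutes is a judgment call.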
An updated model is now available at
http://goofy.zamia.org/voxforge/de/
The model covers all German VoxForge submissions as of yesterday (except the openpento submissions, which will require a more sophisticated noise model). The dictionary is up to 15423 words now.
Statistics:
total 34930 files, total length: 2791.65min
reviewed 34930 files, reviewed length: 2791.65min (100% done)
good 32856 files, good length: 2624.69min ( 94% good)
unique words in all submissions: 14951, unique words in reviewed good submissions: 14848

Data per user:
ALI : 0.83min 67 words
AdrianTovar : 5.70min 445 words 0.0% ts werr
BRwgt : 8.43min 693 words 0.0% ts werr
Black_Galaxy : 0.69min 69 words 0.0% ts werr
Crazo : 2.20min 203 words 6.5% ts werr
DanielNeuhaus : 3.43min 372 words 14.6% ts werr
Defense : 1.48min 127 words 0.0% ts werr
DirkSchnelleWalka : 1.41min 142 words 0.0% ts werr
FrankJger : 0.96min 63 words
HeavensRevenge : 28.97min 2755 words 1.0% ts werr
Hornochse : 2.23min 281 words 0.0% ts werr
J_N : 4.11min 379 words 14.8% ts werr
Komeran : 1.62min 149 words 0.0% ts werr
LinuxFan : 0.66min 68 words 0.0% ts werr
LucasK : 3.27min 311 words 0.0% ts werr
M11 : 3.19min 294 words 100.0% ts werr
M12 : 2.93min 267 words
M20 : 4.81min 430 words 0.0% ts werr
M28 : 4.87min 413 words
M32 : 6.08min 537 words
M35 : 7.29min 560 words 100.0% ts werr
MackyMesser : 2.68min 222 words 0.0% ts werr
Manu : 221.94min 19842 words 0.6% ts werr
Me : 0.92min 69 words 0.0% ts werr
MingTran : 0.00min 7 words
Nevs : 0.16min 12 words
OssiDlz : 0.85min 79 words 0.0% ts werr
RBwgt : 0.84min 63 words 0.0% ts werr
RainCT : 0.43min 37 words
Rebbidebbi : 2.93min 279 words 0.0% ts werr
Spacefish : 1.55min 156 words 0.0% ts werr
Steltek : 1.72min 231 words 0.0% ts werr
Styx85 : 0.90min 86 words
Susi : 0.93min 59 words 0.0% ts werr
Tazy : 0.09min 8 words
TheLinuxist : 1.68min 150 words 0.0% ts werr
Thomas : 4.90min 415 words 0.0% ts werr
UrbanCMC : 2.71min 215 words 0.0% ts werr
alexander : 0.71min 72 words 0.0% ts werr
andromeda : 0.97min 72 words
anonymous : 71.12min 6018 words 4.4% ts werr
b166er : 0.23min 22 words
cairn : 19.39min 1569 words 4.7% ts werr
cib : 0.06min 10 words
computing : 11.01min 1269 words 3.2% ts werr
doogent : 0.80min 73 words
dturing : 1.91min 212 words 0.0% ts werr
freuerin : 1.33min 146 words 0.0% ts werr
geh_weida : 0.07min 8 words
geon : 7.21min 743 words 1.0% ts werr
gero : 0.71min 64 words 33.3% ts werr
gouppe : 2.66min 127 words 20.0% ts werr
grisch : 0.86min 71 words 0.0% ts werr
guenter : 600.73min 61527 words 0.7% ts werr
ich : 4.18min 335 words 0.0% ts werr
jcp : 1.87min 131 words 0.0% ts werr
justmoon : 1.40min 162 words 0.0% ts werr
laserman : 2.88min 216 words 0.0% ts werr
lasser : 22.35min 2166 words 1.0% ts werr
locoloco : 1.65min 155 words 0.0% ts werr
m0rbid : 0.21min 24 words 18.2% ts werr
marv : 5.26min 432 words 2.3% ts werr
mjw : 4.32min 433 words 2.2% ts werr
mk : 0.97min 76 words
mr123 : 0.79min 52 words
mweinelt : 1.26min 170 words 15.4% ts werr
pszacherski : 8.65min 903 words 0.0% ts werr
ralfherzog : 1408.82min 94211 words 0.8% ts werr
rebecca : 7.33min 580 words 12.0% ts werr
rmmg : 43.02min 3454 words 1.7% ts werr
robin : 0.04min 4 words
rwthafu : 4.89min 498 words 0.0% ts werr
sagef : 2.91min 204 words 3.2% ts werr
schiebi : 0.09min 8 words
sgottwald : 0.80min 77 words 0.0% ts werr
stephsphynx : 0.07min 9 words
steviehs : 1.91min 141 words 0.0% ts werr
suther : 0.88min 64 words 0.0% ts werr
thhoof : 14.09min 1069 words 0.0% ts werr
thisss : 1.87min 151 words 0.0% ts werr
timiobaumann : 0.72min 77 words 0.0% ts werr
timobaumann : 14.95min 1532 words 7.4% ts werr
tolleiv : 5.22min 584 words 1.4% ts werr
wmwie : 1.13min 89 words 0.0% ts werr

Total: test set has 14899 words 176 errors => 1.2%
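For anyone who wants to check the summary percentages, here is a small Python sketch that recomputes them from the totals above. It assumes "ts werr" is simply test-set errors divided by test-set words; the statistics script itself is not shown in this thread, so that reading is an assumption.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of reviewed audio (in minutes) that was marked as good.
good_share = percent(2624.69, 2791.65)

# Overall test-set word error rate: errors divided by words in the test set.
word_error_rate = percent(176, 14899)

print(f"good audio: {good_share:.0f}%")
print(f"test set word error rate: {word_error_rate:.1f}%")
```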
Thank you very much, guenter, for the work and so much effort :-)
I made a base model for simon from this model and attached it here.
This is also linked from KDE-Files.org
Thanks again.
EDIT: New version :-)
Thanks for your effort! Tiny problem: the demonstration script in your archive refers to model_parameters/voxforge.cd_cont_4000, but the directory is called model_parameters/voxforge.cd_cont_3000. A question: Can I use your files with g2p ( https://code.google.com/p/phonetisaurus/ ) somehow? The main problem is that I want to create my own dictionary, but g2p only has training data for the English language. You already provide a .dic file, but I want to make my own with only the words I need. Additionally, I have some special (key-)words which aren't in your dictionary, and I don't want to transcribe them manually ;-) Examples are Cubie, Nibobee and Julya.
> Can I use your files with g2p ( https://code.google.com/p/phonetisaurus/ ) somehow?
Yes, you can use the dictionary in phonetisaurus. You could probably just try it. You might want to convert the format though (pay attention to the number of spaces between the word and its phonemes).
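A minimal sketch of such a conversion in Python, in case it helps. It assumes the .dic file has one entry per line (word, whitespace, then space-separated phonemes), marks alternate pronunciations as "word(2)", and that the target tool wants the word and the phonemes separated by a single tab. Please check the phonetisaurus documentation for the exact input format it expects; the function name and the sample entries below are made up for illustration.

```python
import re


def convert_dic_lines(lines, wanted=None):
    """Normalise dictionary lines to 'word<TAB>p1 p2 ...'.

    wanted: optional set of lowercase words to keep; None keeps everything.
    """
    out = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        # Drop alternate-pronunciation markers such as "word(2)".
        word = re.sub(r"\(\d+\)$", "", parts[0])
        if wanted is not None and word.lower() not in wanted:
            continue
        out.append(word + "\t" + " ".join(parts[1:]))
    return out


# Made-up sample entries with irregular spacing.
sample = ["hallo   h a l o:", "hallo(2)  h a l o", "", "welt\tv E l t"]
converted = convert_dic_lines(sample, wanted={"hallo"})
print("\n".join(converted))
```

The `wanted` filter only trims the training data to a word list; words that are missing from the dictionary would still have to come out of the trained g2p model.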