VoxForge
I need an open source speech recognition model for my business. I can contribute 100,000+ dictations and associated transcriptions to the project (medical reports with all patient and physician information redacted). I also need a very bright programmer/technical person who is into speech recognition to work for money to build a speech model for my company. This person would be responsible for managing the dictations and the contributions to the VoxForge project.
I don't know a lot about open source; we are big users of several commercial applications. However, the commercial applications are not as good as they could be, and they are all closed black boxes, so we cannot get at them to do better. We just spend LOTS of money to get what we get. We know we can do better: we have a lot of technical resources, but we need a strong engineer who knows HMMs and how to build a real application with them. Anybody interested? Drop me an email. Totally confidential.
>I can contribute 100,000+ dictations and associated transcriptions to the
>project (medical reports with all patient and physician information redacted)
While I appreciate the offer, I am not sure of the copyright implications of such a donation.
Even though identifying info has been removed, someone might object to having their voice released publicly, and since copyright takes hold automatically, I am not sure how we can legally use this corpus.
Did all the participants whose voices were used in your corpus assign their copyrights to your organization?
>i also need a very bright programmer/technical person who is into speech
>recognition to work for money to build a speech model for my company.
Nickolay at Nexiwave is your best bet.
Ken
If voice files specific to medicine are required to create data sets specific for medical dictation, could we set up a way in which individual physicians could donate their own speech and text on voxforge (sans patient data obviously)?
I've tried in the past to incorporate Sphinx into FreeMED (also GPL) but was unable to come up with sufficient voice and text files to even start the project. VoxForge seems the perfect venue to make this happen.
Thoughts?
Dan
Hi Dan,
>If voice files specific to medicine are required to create data sets specific
>for medical dictation
The main non-software components used by speech recognition engines (such as Sphinx or Julius) are: a language model and an acoustic model.
For dictation, in addition to the speech recognition engine, another software component would be required: a way to continuously improve the acoustic model by updating the standard acoustic model with speech from the user, a process called 'adaptation' (Nickolay talks about the importance of implementing adaptation for dictation here).
We would only be addressing the speech audio requirements for creating acoustic models if we were to collect medical speech.
Depending on the sentences you provide, these might be used for language model creation too.
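To give a rough idea of what "language model creation" from sentences means, here is a minimal sketch that counts word pairs (bigrams) in a few transcriptions and estimates how likely one word is to follow another. The sample sentences are made up, the function names are only illustrative, and there is no smoothing; a real toolkit would do much more than this.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count word bigrams, padding each sentence with <s> and </s> markers."""
    counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        counts.update(zip(words, words[1:]))
    return counts


def bigram_probability(counts, first, second):
    """Maximum-likelihood estimate of P(second | first), with no smoothing."""
    total = sum(c for (w1, _), c in counts.items() if w1 == first)
    return counts[(first, second)] / total if total else 0.0


# Made-up transcriptions standing in for redacted dictations (assumption).
transcripts = [
    "the patient denies chest pain",
    "the patient reports mild chest pain",
]
counts = bigram_counts(transcripts)
print(bigram_probability(counts, "chest", "pain"))
print(bigram_probability(counts, "patient", "denies"))
```

The more domain-specific sentences are available, the better such estimates reflect how physicians actually phrase things, which is why submitted medical text could be useful beyond the audio itself.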
I am not sure if there are open source dictation systems that implement user adaptation of acoustic models: the EvalDictator dictation dialog manager might do this...
>could we set up a way in which individual physicians could donate their
>own speech and text on voxforge
Yes, we could start with a list of common medical words and phrases in a separate 'medical' section (similar to how each language has its own 'read' page). We would target this to physicians, since they would presumably know how to correctly pronounce this specialized vocabulary.
Once we get a few hours of speech, we could start on a pronunciation dictionary (I would need your help for that) so that people could generate their own acoustic models based on the submitted medical speech.
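As a rough sketch of the first step toward such a dictionary, the snippet below lists the prompt words that have no pronunciation entry yet. It assumes dictionary lines of the form "WORD PH1 PH2 ..." (similar to the CMU Pronouncing Dictionary layout); the entries, the prompt and the function names are all made up for illustration.

```python
def load_dictionary_words(lines):
    """Collect headwords from dictionary lines of the form 'WORD PH1 PH2 ...'."""
    words = set()
    for line in lines:
        parts = line.split()
        if parts:
            words.add(parts[0].lower())
    return words


def missing_words(prompts, known_words):
    """Return the prompt words, sorted, that have no dictionary entry yet."""
    seen = set()
    for prompt in prompts:
        for token in prompt.lower().split():
            word = token.strip(".,;:!?\"()")
            if word:
                seen.add(word)
    return sorted(seen - known_words)


# Illustrative entries and prompt only (assumptions, not real project data).
dictionary_lines = ["PATIENT P EY SH AH N T", "THE DH AH", "HAS HH AE Z"]
prompts = ["The patient has bilateral pneumothorax."]

known = load_dictionary_words(dictionary_lines)
print(missing_words(prompts, known))
```

The resulting list is the set of specialized terms that someone with medical knowledge would need to supply pronunciations for.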
Ken