Dialog Managers

simon goes alpha
User: bedahr
Date: 1/18/2008 9:36 am
Views: 20018
Rating: 45

Hi!

 simon released its first alpha today.

You can download the .tar.bz2 at sourceforge (http://sourceforge.net/project/showfiles.php?group_id=190872).

 

It _should_ be able to:

-----------------------------------------------------------------------------------------

  • Import Hadifix and Wiktionary dictionaries
    both including pronunciations (with automatic X-SAMPA conversion when importing a Wiktionary dictionary) and terminals
  • Compile the language model
    we cheated and looked at your model-creation script. But we rewrote all those little Perl/Bash scripts in C and it is now even a bit faster *brag* :P
  • Record training samples and automatically add them to your prompts file
    Training texts can be automatically generated by selecting words to train or imported from .txt files or from the internet (the internet part is for now only a feature-demonstration as there is only one text available online)
  • Maintain a main and a shadow dictionary/vocab
    for performance reasons: only the "used" words are in the main dictionary/vocab
  • Add new words with a shadow-dict lookup
    you can import many dictionaries and keep them in your shadow dict, where they don't impact performance, and simon automatically looks up new words there when you try to add them
  • Remove words
    Either move them (back) to the shadow lexicon or delete them completely, which also deletes all training samples containing that word
  • Import training data
    Can import training data if the files are appropriately named, for example this_is_a_test.wav. All samples are processed with a configurable "effect-stack" that applies normalisation, downmixing etc. before importing (this is pretty much untested - I wouldn't recommend it for real-life use for now)
  • Type whatever Julius recognizes
  • Simulate shortcuts
    you can for example assign "simon run" to Alt+F2
  • Run Programs / Open places
    you can import programs / places through a convenient wizard
  • Simulate mouse clicks
    simon Desktopgrid will present you with a grid of 9 areas which can then be narrowed down to click precisely what you want - pretty basic for now but it works :)
    This also includes fake AND real transparency (when used on a composite-enabled desktop; configurable)
  • Import Grammar from personal texts
    This is especially useful after importing huge dictionaries
  • Merge Terminals
    To make it easy to import dictionaries and integrate them with the current grammar
  • Password protection for vital system parts
    as we are developing for children you can lock down everything that is not needed for normal operation
  • Guided First-Run-Setup
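
To illustrate the Desktopgrid item in the list above, here is a minimal sketch in Python of how narrowing a 9-area grid could work. It is not simon's actual code: the row-major numbering (1 = top-left, 9 = bottom-right), the names Rect, narrow and click_point, and the choice to click the centre of the final area are all assumptions made for the example.

```python
from typing import Iterable, NamedTuple, Tuple


class Rect(NamedTuple):
    """A screen area in pixels (hypothetical helper type)."""
    x: float
    y: float
    width: float
    height: float


def narrow(rect: Rect, cell: int) -> Rect:
    """Return the sub-area for cell 1..9, assuming row-major numbering."""
    if not 1 <= cell <= 9:
        raise ValueError(f"cell must be between 1 and 9, got {cell}")
    row, col = divmod(cell - 1, 3)
    cell_width, cell_height = rect.width / 3, rect.height / 3
    return Rect(rect.x + col * cell_width,
                rect.y + row * cell_height,
                cell_width,
                cell_height)


def click_point(screen: Rect, cells: Iterable[int]) -> Tuple[float, float]:
    """Narrow the screen once per chosen cell and return the centre of the result."""
    area = screen
    for cell in cells:
        area = narrow(area, cell)
    return (area.x + area.width / 2, area.y + area.height / 2)


if __name__ == "__main__":
    # Saying "one", then "nine" on a 900x900 screen ends in a 100x100 area.
    print(click_point(Rect(0, 0, 900, 900), [1, 9]))
```

Each spoken number shrinks the target area to a ninth of its size, so two or three numbers are usually enough to reach a single button.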

 

Stuff that is not working but should be:

-------------------------------------------------------

  • Synchronising the language models with juliusd - for now the only way is to start juliusd on the same machine as simon and synchronise it manually (this will take some time though)
  • Using AT-SPI and MSAA to pick up other applications' GUIs and let simon control them
 

Stuff in active development (besides bugfixing, etc.):

---------------------------------------

  • Making the GUI of simon completely controllable by voice - this is achieved by an extension of Qt (SimonTableWidget, SimonListWidget, etc.). It is already working (a bit) but there are obvious bugs...

 

However, all this hasn't _really_ been tested with a real model and the list is more like a "what would work if there were no bugs at all" list.

You offered us your help some time ago - and we'd really appreciate it if you could try out the current state of simon and give us your comments. Please keep in mind that this is still an alpha version, so it is not really stable and ABSOLUTELY not suited for production use. Think of it as a (hopefully impressive ^^) tech demo.

Looking forward to your feedback!

 

-- bedahr (aka Peter Grasch)

Project administrator

--- (Edited on 1/18/2008 9:50 am [GMT-0600] by bedahr) ---

Re: simon goes alpha
User: kmaclean
Date: 1/23/2008 10:16 am
Views: 608
Rating: 39
KDE-apps page

--- (Edited on 1/23/2008 11:16 am [GMT-0500] by kmaclean) ---

Re: simon goes alpha
User: bedahr
Date: 6/23/2008 3:55 am
Views: 301
Rating: 26

Second alpha is out the door and imho a great release.

Tested it yesterday from scratch and generated a complete speech model with a simple grammar and a few words, which opened a web browser, new tabs and new windows and surfed to Google when told to do so. The whole thing was (at least in my opinion) very easy and took me (_with_ explanations) about 30 minutes, and simon didn't crash _once_ on me :).

Anyways, don't want to brag, but I thought I'd check in and maybe encourage some testers to check out the new alpha as it really is a huge advancement over the first one.

Also, this new version has been completely translated to English (simon and Juliusd) and is shipped as a source package or as a precompiled Windows binary. This should make testing a lot easier for our Windows folks :).

Of course there are still bugs to sort out, missing features and issues to solve but all in all this is (again - imho) a great release.

Check it out!

Download

Homepage

 

 -- bedahr

 

ps.: If you like this project, please consider a small donation: Support us!

--- (Edited on 6/23/2008 3:55 am [GMT-0500] by bedahr) ---

Simon - precompiled Windows binary
User: ralfherzog
Date: 6/23/2008 1:32 pm
Views: 200
Rating: 25
Hello bedahr,

Thank you for offering a "precompiled Windows binary."  I installed this yesterday.  And it looks promising.

--- (Edited on 2008-06-23 1:32 pm [GMT-0500] by ralfherzog) ---

Re: Simon - precompiled Windows binary
User: bedahr
Date: 6/23/2008 3:30 pm
Views: 173
Rating: 28

@ ralfherzog:

Sadly the comments on your blog (http://speech.blau.in/?p=30) are disabled so I am going to say this here (but it is on-topic): 

 

> So to use this program successfully, there are several
> additional programs needed. I need the HTK toolkit, and Julius.
> And there are further components necessary. I think I will stop
> the installation now. Or should I continue? At the moment, I
> am not sure. I think, that I will hit the next button.
> I won’t publish a screenshot from the next step. But it is about
> HTK programs HDman, HCopy, and several other programs. I
> think (but I am not sure) that it is necessary to tell Simon the
> path on which location those programs are installed. A few
> months ago, I made some first steps with HTK and Julius, but
> everything was pretty complicated. At the moment, I am
> reading a few pages in the HTK book, everything is very
> abstract. And it takes a lot of time to get involved.

 

The beauty of simon is, that - if everything works well - the end-user never, ever (really :)) sees HTK or Julius at work.

Trust me - hit the next button :D

I am currently creating some instructional videos that outline the setup and the main workflows, so those should be up soon.

Good luck testing simon everybody!

 

-- bedahr

--- (Edited on 6/23/2008 3:30 pm [GMT-0500] by bedahr) ---

Videos about Simon would be great
User: ralfherzog
Date: 6/23/2008 4:33 pm
Views: 220
Rating: 31
Hi bedahr, some instructional videos would be great.  It would be easy to watch a video (e.g. a screencast made with Wink) that explains how to start and use the main functions of Simon.

--- (Edited on 2008-06-23 4:33 pm [GMT-0500] by ralfherzog) ---

Re: Simon - precompiled Windows binary
User: nsh
Date: 6/24/2008 3:09 am
Views: 291
Rating: 29

> The beauty of simon is, that - if everything works well - the end-user never, ever (really :)) sees HTK or Julius at work.

Hm, that's quite interesting. Are you distributing HTK binaries with simon?

--- (Edited on 6/24/2008 3:09 am [GMT-0500] by nsh) ---

Re: Simon - precompiled Windows binary
User: bedahr
Date: 6/24/2008 3:48 am
Views: 182
Rating: 29

> Hm, that's quite interesting. Are you distributing HTK binaries with simon?

No, as the HTK licence does not allow redistribution.

What I meant was, that simon uses the supplied binaries (that the user still has to install himself but that is not really hard) to compile the model and everything.

 

The end-user does not even have to know what the HTK toolkit _is_ as long as he knows how to download a binary version of it and point simon to the executables.
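
To make "point simon to the executables" a bit more concrete, here is a minimal sketch in Python, not simon's actual code, that checks whether the HTK tools can be found before trying to compile a model. The tool list only contains HDMan and HCopy, the two programs named earlier in this thread. The helper names locate_tools and missing_tools and the PATH-style lookup are assumptions for the example.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# The two HTK programs mentioned in this thread; a real setup needs more.
HTK_TOOLS = ("HDMan", "HCopy")


def locate_tools(tools: Iterable[str] = HTK_TOOLS,
                 search_path: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Map each tool name to its full path, or None if it cannot be found.

    search_path is a PATH-style string; None means "use the PATH
    environment variable".
    """
    return {tool: shutil.which(tool, path=search_path) for tool in tools}


def missing_tools(located: Dict[str, Optional[str]]) -> List[str]:
    """Return the sorted names of all tools that were not found."""
    return sorted(name for name, path in located.items() if path is None)


if __name__ == "__main__":
    located = locate_tools()
    missing = missing_tools(located)
    if missing:
        print("Please install HTK and tell me where these are:", ", ".join(missing))
    else:
        print("All HTK tools found:", located)
```

A check like this lets the setup wizard ask only for the paths it could not work out on its own.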

 

-- bedahr 

 

 

--- (Edited on 6/24/2008 3:48 am [GMT-0500] by bedahr) ---

Re: Simon - precompiled Windows binary
User: nsh
Date: 6/24/2008 3:41 pm
Views: 183
Rating: 23

> What I meant was, that simon uses the supplied binaries (that the user still has to install himself but that is not really hard) to compile the model and everything.

I see, thanks. Well, in a long term it doesn't look like a good decision.

Also I wanted to know, in theory generic model adapted for speaker must be much better than speaker-dependent one trained from a few utterances.  At least for English where good models like wsj are available.

--- (Edited on 6/24/2008 3:41 pm [GMT-0500] by nsh) ---

Re: Simon - precompiled Windows binary
User: bedahr
Date: 6/25/2008 12:45 am
Views: 192
Rating: 27

> I see, thanks. Well, in a long term it doesn't look like a good decision.

Why? What do you mean?

 

 

> Also I wanted to know, in theory generic model adapted for speaker must be much better than speaker-dependent one trained from a few utterances.  At least for English where good models like wsj are available.

Of course. There are generic models for English, German, etc. which would work way better for normal recognition for you and me. But as stated on the website, the project specifically targets handicapped people. Our two testers, for example, suffer from spastic disabilities which limit their motor skills - they can't write. The disability also affects their voice and puts conventional solutions out of the question.

Or think about stroke victims who can only mumble a few distinct sounds. No problem - after an hour or so of training they can use simon to e.g. control their web browser.

 

As for the "few utterances": The long-term goal is to replace or at least accompany the therapy of the affected. Take for example children with spastic disabilities: Their therapy in schools is to draw circles and other figures to ultimately improve their motor functions. The sad part of this: it doesn't work. (The founder of this project is a teacher in a special school for handicapped children)

If we could use the time spent on ineffective motor therapy (in school alone), we would have at least a few hundred hours which could easily be used to continuously improve the model of the affected. But for this, simon has to be accepted as an alternative solution and be easy enough for teachers and people who are not technically savvy to use.

 

Hope this helps to understand a few core design principles of the software.

 

But if you want to use simon with other models - go ahead! If you can convert a model to the HTK format, you are free to use whatever you want. You can for example also use the VoxForge corpus with simon.

 

-- bedahr 

--- (Edited on 6/25/2008 12:45 am [GMT-0500] by bedahr) ---
