VoxForge
--- (Edited on 4/ 3/2007 11:44 am [GMT-0500] by serial_strat) ---
Hi serial_strat,
The Metrics link on the VoxForge Download page gives statistics on the VoxForge Speech Corpus.
The HDMan output from the previous night's Acoustic Model Creation run is in the HTK.tgz tar file, which is updated nightly. Go to the /HTK/AMCreate_scripts/logs/ directory and open the Step2_HDMan1_log file.
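If you would rather not unpack the whole archive by hand, something like the following rough Python sketch should pull the log text straight out of the tar file. It assumes you have already downloaded HTK.tgz locally; the function name read_log_from_tar is made up for illustration, and the log is matched by file name only because the exact member path inside the archive is an assumption.

```python
import tarfile


def read_log_from_tar(tar_path, log_name="Step2_HDMan1_log"):
    """Return the text of the first archive member whose name ends with log_name.

    The member is matched by suffix because the exact path prefix inside
    the archive (e.g. "HTK/AMCreate_scripts/logs/") is an assumption.
    """
    # "r:*" lets tarfile detect the compression (gzip for a .tgz) by itself.
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(log_name):
                fh = tar.extractfile(member)
                return fh.read().decode("utf-8", errors="replace")
    raise FileNotFoundError(f"{log_name} not found in {tar_path}")


# Example (assumes HTK.tgz is in the current directory):
#   print(read_log_from_tar("HTK.tgz"))
```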
Last night's (April 3, 2007) summary statistics are as follows:
Dictionary Usage Statistics
---------------------------
Dictionary     TotalWords  WordsUsed  TotalProns  PronsUsed
VoxForgeDict       129528       4367      129545       4371
dict                 4367       4367        4371       4371
4367 words required, 0 missing
New Phone Usage Counts
---------------------
1. ax : 1762
2. sp : 4369
3. ae : 579
4. b : 476
5. l : 1270
6. ow : 411
7. n : 1638
8. d : 1005
9. m : 711
10. t : 1336
11. iy : 858
12. s : 1388
13. aa : 536
14. z : 560
15. er : 752
16. ix : 722
17. ey : 447
18. ao : 324
19. r : 1308
20. sh : 276
21. aw : 98
22. ng : 366
23. ah : 269
24. v : 337
25. k : 993
26. dx : 319
27. uw : 301
28. eh : 782
29. p : 659
30. ch : 169
31. jh : 223
32. w : 299
33. y : 160
34. ih : 584
35. f : 457
36. ay : 367
37. g : 309
38. th : 134
39. hh : 238
40. dh : 59
41. uh : 86
42. zh : 40
43. oy : 53
44. sil : 2
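To see which phones are thinly covered, the list above can be parsed and sorted. This is only a sketch: the helper names parse_phone_counts and rarest are invented for illustration, and the sample text is a handful of lines copied from the list above. Keep in mind that, as far as I understand HDMan, these counts are occurrences in the pronunciations of the words used, not a direct measure of how much audio there is for each phone.

```python
import re


def parse_phone_counts(log_text):
    """Parse lines such as '1. ax : 1762' into a {phone: count} dict."""
    pattern = re.compile(r"^\s*\d+\.\s+(\S+)\s*:\s*(\d+)\s*$")
    counts = {}
    for line in log_text.splitlines():
        match = pattern.match(line)
        if match:  # lines that are not phone counts are skipped
            counts[match.group(1)] = int(match.group(2))
    return counts


def rarest(counts, n=5):
    """Return the n phones with the lowest counts, lowest first."""
    return sorted(counts.items(), key=lambda item: item[1])[:n]


# A few lines copied from the log excerpt above.
sample = """\
1. ax : 1762
2. sp : 4369
40. dh : 59
42. zh : 40
43. oy : 53
44. sil : 2
"""

counts = parse_phone_counts(sample)
print(rarest(counts, 3))  # [('sil', 2), ('zh', 40), ('oy', 53)]
```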
>estimate of the "distance" to a working threshold?
There is no real estimate of a working threshold other than 140 hours of speech, which corresponds to the number of hours of speech used for the Sphinx group's Acoustic Models.
Note that "working threshold" is relative. Dictation applications would require much more than 140 hours of speech audio; that figure is more of a target for Command and Control and Telephony IVR type applications.
Ken
--- (Edited on 4/ 3/2007 9:02 pm [GMT-0400] by kmaclean) ---
--- (Edited on 4/ 3/2007 9:06 pm [GMT-0400] by kmaclean) ---