From: Eric S. Johansson
Subject: Re: [Accessibility] Re: Can you help write a free version of HTK?
Date: Mon, 12 Jul 2010 04:23:14 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.4) Gecko/20100608 Thunderbird/3.1
On 7/12/2010 12:42 AM, Bill Cox wrote:
In particular, I know practically nothing about speech recognition, Simon's implementation, and such, so I do make mistakes out of ignorance.
Let me suggest getting yourself a copy of NaturallySpeaking. Standard is good enough for ordinary dictation. If you want to start adding on tools like Dragonfly or Vocola, you'll need to go with Preferred. By the way, lesson 1 about speech recognition: microphones matter more than you can possibly imagine. I am what speech recognition vendors call "a goat". The vast majority of microphones out there do not work well with my voice. I need something from VXI. Personally, I think that VXI works well for a large number of people, but that's only my opinion.
It is not uncommon for a speech recognition user to spend hundreds, if not a couple of thousand, dollars on microphones in the first couple of years, searching for the mic that matches their voice. It sucks, but that's life for the hand crippled.
Also, if you're not comfortable with running Windows even for training purposes, there's been a tremendous amount of progress made with Wine in making it run NaturallySpeaking. A friend of mine (Susan Cragin) has been instrumental in testing and validating support. Unfortunately, the support isn't quite good enough and could use more help.
This is one of those measures that will get us another step closer to a fully free speech recognition environment. If we can free NaturallySpeaking from the Windows context and run it in Wine, then the only proprietary component is NaturallySpeaking itself. Not perfect, I realize, but significantly better than keeping Windows in the picture. It encapsulates the recognition engine and sets the stage for eventually replacing the speech recognition core.
Would Sphinx do as good a job as Julius? I haven't dived into any of the Sphinx code, but I have tried several Sphinx-based programs in the past, and all of them had trouble recognising speech at a rate that would prove useful for real work. Simon is basically the only mostly open source program that has impressed me with its recognition rate, though I don't understand the details of why this is the case. If switching to Sphinx would not hamper the recognition rate, I'd support such a move.
It's not just recognition rate, it's also accuracy. The Sphinx systems have a really great recognition rate and accuracy if you're building an IVR system (fixed grammar, under 5000 word vocabulary). Once you get to about 20,000 words, not so good. The word error rate is documented on the Sphinx pages, as is the performance.
--- eric