Re: [Accessibility] Call to Arms
From: Eric S. Johansson
Subject: Re: [Accessibility] Call to Arms
Date: Mon, 26 Jul 2010 14:44:25 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.7) Gecko/20100713 Thunderbird/3.1.1
On 7/25/2010 10:52 PM, Richard Stallman wrote:
> > I was speaking shorthand. It's not an add-on to
> > NaturallySpeaking. It is an add-on to the communications framework
> > between recognition application and user application.
>
> Something like that might be independent enough of the recognizer
> to be a valid project. But there ARE free software packages for
> speech recognition. So people should develop it to work with them.
> If users can also run it with NaturallySpeaking, that is ok,
> as long as we don't suggest it.
Sorry, I really have to correct this, and correct it hard.
***there are no usable large-vocabulary continuous speech recognition
engines out there today***
From what I can tell, Simon is the closest, and it's still pretty far off. Sphinx is a
great tool for keeping grad students busy. To keep recognition accuracy high enough,
you need to keep your vocabulary in the 1000-word range. I spoke with the Sphinx-4
developer about using it when I was part of the open-source speech recognition
initiative, and he told us it's IVR only: don't even think about using it for dictation.
When we did a survey of all the available packages, the closest one we found was
the MIT dugout package. Its creator admitted it was missing all of the
language modeling, acoustic modeling, etc. that it needed, and even so it was better
than all the alternatives.
The first step in this whole process should be collecting a corpus for
training and experimenting with different recognition parameters. You need to
have one before you can ship a working recognition system. Hopefully you can get
it done in a couple of years. Dragon Systems took something on the order of a year
or two, with a heavy interview/recording schedule, for the baseline, and then kept
gradually improving it.
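For a sense of what the very first step of corpus collection looks like in code, here is a bare-bones sketch: prompt a speaker, record the utterance, and save the audio/transcript pair for later acoustic-model training. The sounddevice/soundfile libraries and the prompt list are my assumptions for illustration; a real corpus effort needs thousands of speakers and vastly more text, which is why it takes years.

```python
# Sketch: the humblest possible corpus-collection loop.  Prompt the speaker,
# record the utterance, and save a matching audio/transcript pair that an
# acoustic-model trainer could consume later.
import sounddevice as sd
import soundfile as sf

RATE = 16000          # 16 kHz mono is typical for speech corpora
SECONDS = 6           # fixed recording window per prompt (illustrative)
prompts = [
    "the quick brown fox jumps over the lazy dog",
    "open the file and search for the next error",
]

for i, text in enumerate(prompts):
    input(f"Press Enter, then read aloud: {text!r}")
    audio = sd.rec(int(SECONDS * RATE), samplerate=RATE, channels=1, dtype='int16')
    sd.wait()                                   # block until the recording finishes
    sf.write(f"utt_{i:04d}.wav", audio, RATE)   # audio file
    with open(f"utt_{i:04d}.txt", "w") as f:    # matching transcript
        f.write(text + "\n")
```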
> However, I think we should not include such things in THIS project,
> because we need to focus energy on the goal of making those free
> recognizers better. For us, replacing important proprietary software
> takes priority over advancing the capabilities of software.
Richard, you use the language of someone who has no clue how difficult
this problem is. I've been friends with speech recognition developers and living
with speech recognition for 15 years, and I have an idea of the problems they've
encountered. I don't know how to solve them, nor do I fully understand them,
but I've learned enough to have a clue. I'm not saying you don't know anything
about the problem, but the language and expectations you express frighten
me, because expectations like these have been responsible for the
failure of more than one speech recognition tools project, something much less
complex than the recognizer / language model / acoustic model / audio
processing / predictive search engine / correction system / training pipeline that
goes into a full recognition system. And we still haven't started talking about how
badly screwed up the Linux audio sound system is. If you want good speech
recognition, you may well need to rewrite the entire audio system to make it work
well for speech recognition. This really is a big chunk of work you are
biting off.
Ask yourself this question: why was NaturallySpeaking the only large-vocabulary
continuous speech recognition product on the market? (Hint: it's really f-n hard,
and it's a small market.)
I want this to succeed, but it has to have realistic expectations and, most
importantly, serve the needs of the users, because unlike any other project you've
ever been on, the users are the most important thing. Yes, I know this runs
counter to the Free Software Foundation philosophy, but being injured and working with
other injured people, I can't see myself looking at this project in any other
way but compassionately. Doing otherwise is just wrong according to my
spiritual/ethical/moral/greedy self-interest foundations.
I really apologize for being blunt. You have been one of my heroes for a long
time, but I am willing to kick even my heroes in the shins if I think they are
going really wrong, and I think you are going really wrong. If it would help any,
I could come down for lunch and talk about some of these issues the next time
you're in Boston. If I remember the location correctly, we could probably ask
the guy (gs) to the left of your office to join us and act as a
moderator/referee :-). He knows me through ATMoB.
For the meantime, I'm going to have to drop this, but it is extremely important.
I am willing to help out with requirements for the toolset and basic thinking
about what the user needs to do, until we get to the point where I can write code
using speech recognition. Are you okay with that?
--- eric
PS, not sure where to fit this in, but it's an example to think about. VR-mode is
a bridge between Emacs and NaturallySpeaking. It gives full voice control and
editing control in Emacs, the way NaturallySpeaking does in proprietary programs. If it
worked more consistently, I would be using Emacs instead of the proprietary programs
that currently work better with speech recognition.
Here's another thing: if I had VR-mode working, I would be able to write a
moderate amount of Python code with a bare-bones recognition system.
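To make the "bridge" idea concrete, here is a minimal sketch of the editor side of a recognizer-to-editor bridge in the spirit of VR-mode: the recognizer, free or proprietary, connects over a local socket and sends one recognized utterance per line, and the editor side decides what to do with it. The port number and the line-per-utterance protocol are invented here purely for illustration; this is not the actual VR-mode protocol.

```python
# Sketch of the editor-side listener in a recognizer-to-editor bridge.
# The recognizer process connects and writes one recognized utterance per
# line; a real bridge would hand each utterance to the editor (insert text,
# run a command, make a correction).  Here we simply echo it.
import socketserver

class UtteranceHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for line in self.rfile:
            text = line.decode('utf-8', errors='replace').rstrip('\n')
            print(f"recognized: {text}")

if __name__ == "__main__":
    # Port 7155 is an arbitrary choice for this sketch.
    with socketserver.TCPServer(("127.0.0.1", 7155), UtteranceHandler) as server:
        server.serve_forever()
```

The point of an interface like this is that the recognizer behind it can be swapped without touching the editor side, which is exactly the incremental migration I describe next.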
This is the kind of incremental migration to a freer software environment that
I'm hoping for. First you modify the applications; then, once you have a proper bridge
design, you pull out the evil proprietary stuff and replace it with good free
stuff. Right now, it's proprietary software all the way. I have no choice if I
want to work or play. I really hate it, and I want to be working with free
software that works well for disabled users.