Technology

Dictation Software for Linux? 20

Yottabyte84 asks: "As a student, I often need to type up papers; however, like most people, I talk faster than I type. I've lately been looking to get a dictation program, but I don't wanna boot Windows every time I need to use it. IBM has a commercial version of ViaVoice for Linux and a free SDK. I'm unclear what the SDK can do, and would be willing to buy the commercial product if the SDK doesn't fit my needs. What I'd really like to be able to do is give spoken text input to the Linux apps I already use, but I could live with being stuck with a simple included word processor. Have any of you tried ViaVoice or the SDK? How well did they work for you?"
This discussion has been archived. No new comments can be posted.


  • however, like most people, I talk faster than I type.


    I should hope so! Incredibly fast typists max out around 90 words per minute. Sit down and read 90 words out loud, and tell me how long it takes. 15 seconds? :-)


    If you only read aloud as fast as you typed, you'd probably ride the short bus

    • Hmmm, I guess I'm "superhuman" then?
      I average above 105 all the time, 120 on good days, and over 130 if I'm really trying to get something typed out fast (like in the middle of a game of Quake)
      • How many mistypes do you make, though? It doesn't matter during a game of Quake if it coms out a byt wron gbut professional typists don't make mistakes. That's the 90wpm.
  • I suspect the SDK requires the commercial package for the actual implementation. SDKs often just provide bindings and documentation for APIs, not the implementation itself. In other words, the SDK may be useless without the full product.
  • by slashkitty ( 21637 ) on Wednesday November 07, 2001 @07:36PM (#2535231) Homepage
    In a word, they are pretty good, but not perfect. The commercial version includes the teaching portion where you talk for like twenty minutes as it learns your voice. I've only tried it in its own little app, which is not a real good word processor, but it's good for entering text. I would read paragraphs of newspaper articles as fast as I could, and it was nearly perfect. It would miss names and other things you would expect it to miss. It could even play back the audio of what you just read in /Your Voice/ or its own TTS engine. It wasn't as good when I tried to feed it other stuff. I guess it's geared to corporate / news speak.

    I've also developed some test apps w/ the SDK. It's not as good for free text, but could handle special commands and vocabularies. Things like automating mp3 playing and turning on and off lights would be good for that. You should try it out if that's what you're looking for!
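
    A minimal sketch in Python of why a small command vocabulary is an easier target than free text. It does not touch the ViaVoice SDK at all: it assumes some recognizer has already handed back a text string, and the names here (`COMMANDS`, `match_command`, the action labels) are hypothetical, made up for illustration. With only a handful of allowed phrases, a slightly misheard result can still be snapped to the nearest valid command, which is not possible with open-ended dictation.

```python
import difflib

# Hypothetical command vocabulary (illustrative only, not from the ViaVoice SDK).
COMMANDS = {
    "play music": "mp3_play",
    "stop music": "mp3_stop",
    "lights on": "lights_on",
    "lights off": "lights_off",
}


def match_command(heard, cutoff=0.6):
    """Return the action for the closest known phrase, or None if nothing is close."""
    # Normalize case and whitespace before comparing.
    phrase = " ".join(heard.lower().split())
    # Fuzzy-match against the small fixed vocabulary.
    close = difflib.get_close_matches(phrase, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[close[0]] if close else None
```

    Calling `match_command("Lights of")` would snap the misheard phrase to the "lights off" entry, while text that resembles none of the phrases returns `None` and can simply be ignored.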

    • Thanks a lot. Any idea why the SDK isn't as good at plain dictation (what I want to use it for most of the time)? Have you tried out Dragon NaturallySpeaking, and if so, do you think it would be worth paying another $100 for it and running it in Wine or VMware?
      • I haven't tried Dragon's product; there is a review of it somewhere. I won the VV at LinuxWorld. I wouldn't have paid for it. When it comes down to it, I can type much better than I can speak. ;-) The SDK didn't work as well because of the learning, and maybe vocabulary? I think that if you have the commercial version, you can transfer the files to an SDK project, but I didn't try that.

        The "ViaVoice Dictation Run Time Kit for Linux V3" seems to be the same as the version you buy, minus the headset (which makes a huge difference) ... it should at least have the free text dictation to try out.

  • like most people I talk faster than I type

    The usual problem isn't talking faster than you type, it's talking faster than you think :)

    Hmmm, maybe I can type faster than I think, too....
    • Well, the problem with most composition is that it's more formal than regular conversation.

      "Hey guy, howzit goin'?"
      "Not too bad, dude, you?"

      Not much meaning there, but a moment to consider what it is that you really want to say to your friend. However, that doesn't work too well in written correspondence.

      I might say things like "Didja hear about the new doodad down the street" in regular conversation, but I wouldn't write it that way (except right now, as a special circumstance *grin*).

      Proper written composition requires more thought than regular conversation. I have tried dictation before, both with a live secretary and via a dictaphone. Neither works nearly as well as just typing what I want to say.

      Of course, I am used to using a keyboard, and I can type much faster than I (or anyone else) can write long-hand. Until one can type such that "the keyboard disappears" (great expression, that!) then dictation or long-hand writing may indeed be faster. But for an experienced typist, there's no substitute for simply sitting down and typing up what it is that you want to say.
  • typping thiss using viavoicee do you need acuracy? If sso dont.
    #End use
    I have tried several options for voice translation to text. They work alright if you do not want any degree of accuracy. This goes for the Windows versions as well as *nix. If you speak like the talking heads on TV then it would be alright, but if you have even a slight accent, then type. The other problem I have found is that editing the document takes longer than if I were typing it from scratch. We tend to read the way we think, so missing errors is very easy. But on the bright side, you can have fun confusing the software.
    #begin again
    superka UNKNOWN
    #end
    superkalifraaagi(unknown)
  • . . .is to learn how to type faster. Mavis Beacon Teaches Typing is a great tool for this, and it's also pretty fun. We use it to teach typing skills at the school where I work.

    I've supported many administrative assistants and remember when voice to text software first came out. We tried a number of solutions, and for even moderate typing speeds (60-90wpm) we found they were much more efficient if they just typed, rather than tried to dictate.

    I also remember that recently the Director of our school was tired of dictating letters to tape to be transcribed by her assistant. (This was about 5-6 months ago.) We tried Dragon NaturallySpeaking and she didn't like it at all: typos, having to speak unnaturally, the weird feeling of talking to the computer (don't know why that was an issue since she talked to the tape. She's the boss).

    Anyway. I'd invest some time in building your typing skills. It will have a higher payoff in the end.
  • problems? (Score:2, Insightful)

    by Roadmaster ( 96317 )
    I had several issues while trying ViaVoice and Dragon NaturallySpeaking.

    You definitely should give it a try before committing.

    When you're typing, the kind of mistakes you make are of the kind known as "typos", where you mistype one letter for another. Those are hard to spot but in the end they don't matter that much.

    With voice recognition, you have BIG mistakes: the engine mistaking one word for a totally different one, or garbling complete sentences. That is big, and actually requires you to go through the entire document and fix the errors.

    Also, most voice recognizers I've seen aren't too adept at handling changes of mind. When you're speaking, you tend to make mistakes, because you changed your mind, or you meant something else, or whatnot. Recognizers aren't too forgiving of these situations. You'll end up with more crap in your document that you'll have to fix. In order to provide a coherent flow of speech for the recognizer, you basically have to be reading it from somewhere else, which pretty much defeats the purpose of the recognizer. If it's already printed, why not OCR it instead? It's bound to be much more accurate and faster. And if you have to write your stuff by hand before dictating it into the computer, you gain nothing either, because I guess you can type faster than you can handwrite :)
    • A great use, however, would be for dictation of handwritten material. OCR isn't nearly as accurate for handwritten material as it is for typed, and, as you said, a person can speak continuously with few errors when reading from something (assuming they don't have to constantly stop and say "what is that!?").

      I bet a lot of places would find that useful, including teachers (lesson plan books), secretaries (daily logs/calendars), students (lots of handwritten notes), etc.

      Just a thought.

      --MonMotha
    • If it's already printed, why not OCR it instead?

      It depends on the application. The *best* use I've yet seen for voice recognition (besides hands-free cellphone dialing, which is mostly just recognizing numbers) was on a loading dock. They got invoices and bills of lading from lots of sources, on lots of forms, with the information in lots of different places. They would just read the stuff they needed in the order their system needed it (shipper name, address, etc.; invoice number; part numbers; whatever). They got used to the various forms and could quickly find and read the desired info. I suppose you could teach an OCR program to recognize all the different forms, but what about the ones with coffee and jelly stains, handwritten forms, etc? Voice recognition was way cool for this app.

  • The major problem with most of the "voice recognition" solutions is that spoken English is fundamentally different from written English. I would almost venture that one (the written form) is more right-brain symbol oriented whereas the other (spoken) is more left-brain.

    Case in point: My above statement sounds pseudo-scientific. I have an excellent grasp of written English, whereas my spoken English usually leaves people wondering how a moran like myself qualified for Mensa.

    My suggestion: If you are expecting someone to read your completed work, you should take the extra time to type it manually. It will make your document easier to understand. If you don't believe me, just pick up a transcript from television and see how well the spoken word reads off a page. Yuck.
    • Actually, reading, writing, and speaking are all left brain activities. Most people would argue that written language is more so, as the syntax is much more formalized.

      Granted, there are right-brain parts as well: something like deciphering vocal paralanguage/intonation might be more right-brained. Assimilating metaphors contained in poetry would probably be more right-brained.

      But in general, the left hemisphere handles all the linguistic tasks.
  • I haven't tried it out, but I've had my eye on this project for a while. http://www.speech.cs.cmu.edu/sphinx/
