Open Source Software

Open Source Transcription Software?

sshirley writes "I am beginning to do some interviews with family members and will do some audio journals for genealogy purposes. I would really love to be able to run the resulting MP3 or WAV files through some software and get a text file out. I know that software like this exists commercially. But does this exist in the open source world?"

  • Dear aunt, (Score:5, Insightful)

    by Anonymous Coward on Tuesday July 20, 2010 @06:53PM (#32971954)

    let's set so double the killer delete select all.

    Seriously, transcribe it manually... automatic speech recognition just doesn't work. And can never work, because much of the time the only reason humans can understand each other is by making informed guesses based on context, which a computer program cannot do.

  • Re:Dear aunt, (Score:5, Insightful)

    by Kenoli ( 934612 ) on Tuesday July 20, 2010 @07:07PM (#32972116)
    A program capable of "making informed guesses based on context" seems perfectly plausible, though that's not part of speech recognition per se.
  • Re:Dear aunt, (Score:2, Insightful)

    by ThomConspicuous ( 1004135 ) on Tuesday July 20, 2010 @07:19PM (#32972242)
    It's already being done in medical dictation, where the recordings are also double-checked by transcriptionists. It speeds up workflow immensely, even with the human verification in place.

    I even witnessed an East Indian doctor with a heavy accent dictate normally and have the software pick up everything stated. He was pleasantly surprised.

    It works.
  • Re:Dear aunt, (Score:3, Insightful)

    by conchubhair ( 1453303 ) on Tuesday July 20, 2010 @07:24PM (#32972294)
    The problem you are describing (continuous speech recognition) is not solved yet. Even the best state-of-the-art technology is not going to be perfect, and having two speakers will make it even less useful.

    If you really need the stuff transcribed, you can pay for online services to transcribe it (if they offer really good quality transcription, they are most likely using humans) or you can transcribe it yourself (you can buy software to help speed up the transcription process - including a foot pedal to pause/play the audio, e.g. http://www.nch.com.au/scribe/ [nch.com.au]).

    My company does a lot of work in speech recognition, and we have tried most of the companies that offer transcription. Some of them even provide APIs so you can code something up. The best fully automatic, commercially available transcription I have seen is from Yap Inc. (http://yapme.com/). If the speaker doesn't have a crazy accent and speaks at a normal level and pace you can get great results, but like all fully automatic transcription it can get things wrong. The benefit of Yap is that you can get back the confidence scores and alternates for each word, so if you had a dictionary of your own commonly used words you could pick out a better transcription. You pay by the word for transcription (it is a small amount, but it will add up if you're doing hours of audio).

    If you're willing to wait, the technology is improving all the time, so you could archive the audio for now and return to have it transcribed in a few years. If you need this done now and want something you can actually read, then your cheapest option is to do it yourself, and maybe invest in some software to speed it all up. Unless you have a lot of time on your hands and access to a lot of transcribed audio to build the language models, using any software at home is not worth your while.
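To make the "confidence scores and alternates" idea concrete, here is a minimal Python sketch of re-ranking with a personal word list. The data layout (one list of `(word, confidence)` pairs per word position), the `pick_words` name, and the `boost` value are all assumptions for illustration. This is not Yap's actual API or response format.

```python
from typing import List, Set, Tuple

# Hypothetical shape: one entry per word position, each holding the
# recognizer's candidate words with a confidence between 0 and 1.
WordAlternates = List[Tuple[str, float]]


def pick_words(slots: List[WordAlternates],
               vocabulary: Set[str],
               boost: float = 0.3) -> str:
    """Choose one candidate per position, favoring words in `vocabulary`.

    `vocabulary` is expected to contain lowercase entries (family names,
    place names, and so on). `boost` is an arbitrary illustrative value.
    """
    chosen = []
    for alternates in slots:
        if not alternates:
            continue  # the recognizer returned nothing for this position

        def score(item: Tuple[str, float]) -> float:
            word, confidence = item
            bonus = boost if word.lower() in vocabulary else 0.0
            return confidence + bonus

        best_word, _ = max(alternates, key=score)
        chosen.append(best_word)
    return " ".join(chosen)


slots = [
    [("aunt", 0.9), ("ant", 0.1)],
    [("mini", 0.55), ("Minnie", 0.45)],
    [("moved", 0.8), ("mooed", 0.2)],
]
print(pick_words(slots, vocabulary={"minnie"}))
```

A fixed bonus is the crudest possible choice. It only helps when the right word already appears among the alternates, and a large bonus will happily override a correct high-confidence word.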
  • Got kids? (Score:5, Insightful)

    by Kral_Blbec ( 1201285 ) on Tuesday July 20, 2010 @07:52PM (#32972554)
    Pay them a buck per page and they learn some family history along the way. Problem solved.
  • Re:Dear aunt, (Score:3, Insightful)

    by theheadlessrabbit ( 1022587 ) on Tuesday July 20, 2010 @08:00PM (#32972616) Homepage Journal

    let's set so double the killer delete select all.

    Seriously, transcribe it manually... automatic speech recognition just doesn't work. And can never work, because much of the time the only reason humans can understand each other is by making informed guesses based on context, which a computer program cannot do.

    ...a computer program cannot do yet

  • Re:Dear aunt, (Score:1, Insightful)

    by Anonymous Coward on Tuesday July 20, 2010 @09:31PM (#32973312)

    You're referring to software that someone is -trained- to use for a specific purpose. That's not the same as a general-purpose voice recognition program.

  • Re:CMU Sphinx (Score:4, Insightful)

    by notthepainter ( 759494 ) <oblique&alum,mit,edu> on Tuesday July 20, 2010 @10:05PM (#32973490) Homepage

    Both options are just back-ends, you'll have to write a front-end. However, it shouldn't be too hard to do that

    Actually, it can be rather hard to do that. I was one of the founders of MacSpeech, and there is a surprisingly large set of details you have to deal with: punctuation, capitalization, etc. Of course, since you wouldn't be making a commercial product, much of the gloss need not be coded. But once you have the engine (the part that takes the audio source and converts it to text), you still have a large amount of work left over.
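As a rough illustration of the kind of front-end detail mentioned above, here is a toy Python sketch that turns spoken punctuation words into symbols and capitalizes sentence starts. The token table and the `tidy` function are hypothetical and not taken from MacSpeech, Sphinx, or any other product. It assumes the engine emits lowercase words with punctuation dictated aloud.

```python
# Hypothetical table of dictated punctuation words (single tokens only).
SPOKEN_PUNCTUATION = {"period": ".", "comma": ","}
SENTENCE_END = {"."}


def tidy(raw: str) -> str:
    """Attach dictated punctuation and capitalize the start of each sentence."""
    out = []
    capitalize_next = True  # the first word of the text starts a sentence
    for token in raw.split():
        mark = SPOKEN_PUNCTUATION.get(token.lower())
        if mark is not None:
            if out:
                out[-1] += mark  # glue the symbol onto the previous word
                if mark in SENTENCE_END:
                    capitalize_next = True
            continue
        if capitalize_next:
            token = token[:1].upper() + token[1:]
            capitalize_next = False
        out.append(token)
    return " ".join(out)


print(tidy("we moved to ohio in 1952 period my father comma a farmer comma stayed behind period"))
```

Even this tiny version shows where the work piles up: "ohio" stays lowercase because nothing knows it is a proper noun, and a speaker who says "period" as an ordinary word gets a stray full stop.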

  • Re:CMU Sphinx (Score:5, Insightful)

    by Crudely_Indecent ( 739699 ) on Wednesday July 21, 2010 @09:28AM (#32976844) Journal

    "... unless you're not a programmer."

    I am a programmer, but we're all sometimes out of our element.

    I found a need for modifications to an open source application a few years ago. Rather than spend my time reading the source code to understand how the application worked, I decided to contact the developer. A few emails and a couple of days later, the project developer made the modifications for me and $500 for himself. The world then gained additional functionality in the open source application - everyone wins.

    Some people forget, this is how many open source applications survive.

    Your analogy is outlandish! If someone wants to drive a car to work, they buy a car. If they want a shark fin on the roof, they go to a custom body shop. If they want a killer stereo, they go to a stereo shop. If they want it to be pink and yellow like yours, they go to a paint and body shop. If they can do these things on their own, they'll do it. The difference being that if the car was open source, doing these things wouldn't void the warranty.

    "Open-source is free only if your time has no value." - Jamie Zawinski

    I offer an alternative viewpoint:

    Open source is free if you truly understand freedom.

    I'm free to use the application. I'm free to modify it. I'm also free to recognize my limitations and pay someone else to do these things for me.

  • Re:CMU Sphinx (Score:3, Insightful)

    by Bill, Shooter of Bul ( 629286 ) on Wednesday July 21, 2010 @05:41PM (#32983640) Journal

    Blechkt. That's how I feel about your post. This is a site for nerds. Nerds are often adept at doing nerdy things. Like writing software.

    Now, if your mom asked you, then yes, a reply of "You only need to write a front end to this speech engine" is indeed inappropriate.

    Your post, and the replies to it, really reflect more on how you view the general Slashdot audience than anything else.

  • Re:CMU Sphinx (Score:3, Insightful)

    by Crudely_Indecent ( 739699 ) on Monday July 26, 2010 @04:26PM (#33036068) Journal

    The commercial app does exist, and it's a per-use app that is controlled by a dongle and subscription (hint, more than $500 - plus usage).

    Sticking it to the man has nothing to do with it, unless by "it" you mean money and by "the man" you mean my pocket.

    Of course, any commercial developer will gladly make a custom app for $, but I guarantee that it will be more than $500. The developer did have plans to add the functionality...eventually. My $500 made it happen right now.

    It was certainly silly of me to make over $50k using the newly modified software that I paid $500 for. That's only 9900% profit, so, you're absolutely right....I made a serious mistake.
