Ask Slashdot: Effective, Reasonably Priced Conferencing Speech-to-Text?

Posted by samzenpus on Monday December 30, 2013 @03:03PM from the keep-talking dept.

First time accepted submitter DeafScribe writes "Every year during the holidays, many people in the deaf community lament the annual family gathering ritual because it means they sit around bored while watching relatives jabber. This morning, I had the best one-on-one discussion with my mother in years courtesy of her iPhone and Siri; voice recognition is definitely improving. It would've been nice if conference-level speech-to-text had been available this evening for the family dinner. So how about it? Is group speech to text good enough now, and available at reasonable cost for a family dinner scenario?"

captions (Score:5, Insightful)

by phantomfive ( 622387 ) writes: on Monday December 30, 2013 @03:13PM (#45820779) Journal

Go find some YouTube videos with auto-captioning. That is the upper limit on the quality you will get with today's technology.

Good luck.

Re:There isn't any... (Score:5, Insightful)

by TWX ( 665546 ) writes: on Monday December 30, 2013 @03:19PM (#45820863)

video transcribers also quite expensive
Based on what I get on my TV when I press the Mute button, they really shouldn't be...

Re:There isn't any... (Score:5, Insightful)

by jettoblack ( 683831 ) writes: on Monday December 30, 2013 @03:40PM (#45821103)

There's no perfect solution, but something that works for 60% might already be better than nothing.
I work in the closed captioning industry, and I'd say anything less than 95% accuracy is actually WORSE than nothing. Automatic Speech Recognition (ASR) has no concept of context or situational awareness. The mistakes it makes tend to be concentrated not in the simple, common words and phrases but in the nouns, especially proper nouns: names of people, places, companies, products, etc. Even at 80% accuracy, which is quite good for the current best speaker-independent ASR systems, you're looking at 2 words out of every 10 being substituted with the wrong word, which can completely change the meaning of a phrase. Imagine the chaos if (major news network)'s closed captioning reported some celebrity or politician as saying "I'm not a fan of Jews." when they actually said "I'm not a fan of juice." (That would be 83% accurate: 5 of 6 words correct!) Wars have been started over one misheard word out of a thousand; imagine how bad 200 out of 1000 would be.
Here's an article about a HUMAN transcription error that caused a pretty major ruckus. Now imagine this kind of problem being an order of magnitude worse:
http://www.people.com/people/article/0,,20693447,00.html [people.com]
People who lost their hearing later in life tend to do better with high-error-rate ASR because they know what words sound like and can figure out easy substitutions (e.g., juice vs. Jews, election vs. erection), but people who were born deaf or lost their hearing before language acquisition cannot easily make these substitutions in their heads because they don't "hear" the word sounds when they read them.
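
For anyone who wants to check the accuracy figures quoted above, here is a minimal Python sketch of word-level accuracy computed from a word-level edit distance. It is an illustration only, not how any particular captioning vendor scores its output; the function names `word_errors` and `word_accuracy` are made up for this example, and it assumes simple whitespace tokenization with case folding.

```python
def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    # Assumption: plain whitespace tokenization, case-insensitive comparison.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # word dropped by the ASR
                            curr[j - 1] + 1,      # extra word inserted by the ASR
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Fraction of reference words transcribed correctly (1 - word error rate)."""
    errors, total = word_errors(reference, hypothesis)
    if total == 0:
        raise ValueError("reference transcript is empty")
    return 1 - errors / total


spoken = "I'm not a fan of juice."
captioned = "I'm not a fan of Jews."
print(f"{word_accuracy(spoken, captioned):.0%}")  # prints 83%
```

One substituted word out of six gives the 83% figure, and the score says nothing about which word was wrong, which is exactly the problem described above.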
