Ask Slashdot: Effective, Reasonably Priced Conferencing Speech-to-Text?

Posted by samzenpus
from the keep-talking dept.
First time accepted submitter DeafScribe writes "Every year during the holidays, many people in the deaf community lament the annual family gathering ritual because it means they sit around bored while watching relatives jabber. This morning, I had the best one-on-one discussion with my mother in years courtesy of her iPhone and Siri; voice recognition is definitely improving. It would've been nice if conference-level speech-to-text had been available this evening for the family dinner. So how about it? Is group speech to text good enough now, and available at reasonable cost for a family dinner scenario?"
This discussion has been archived. No new comments can be posted.

  • by TWX (665546) on Monday December 30, 2013 @03:30PM (#45820963)
    I have a suggestion of a test for you, to demonstrate why it's impractical at absolute best.

    Take ten or so friends to a restaurant. It's fine if you're the only patrons there, so that it's relatively quiet. Seat everyone along two sides of a long table, and put a person at each end. Seat yourself in the middle of one of the long sides. Now, as your party is served, attempt to pay attention to all of the conversation going on among the friends. You'll probably find that the friends break into three or four distinct conversations, with some people floating between conversations depending on what's being talked about. Now, in turn, try to focus on or participate in every distinct conversation at the table.

    Even for someone with good hearing, this will be a difficult task. With as few as four people it's possible to have two distinct conversations going on in parallel, and with six people there are almost guaranteed to be at least some moments with two simultaneous conversations.

    Unless a family runs its dinners under parliamentary rules for who has the floor, it would be almost impossible for software to successfully monitor and differentiate so many speakers, even if the hardware were ideally installed so that each speaker could be sampled individually. Fully able-bodied humans struggle with this even after years of experience sorting through the chatter, so I don't see how software is going to make it work. I also don't see how the hearing-impaired individual is going to be able to read fast enough to keep up with that many simultaneous conversations and really enjoy the experience, all while eating.
  • by Sarten-X (1102295) on Monday December 30, 2013 @03:39PM (#45821087) Homepage

    More fun audio tricks:

    Look at each person speaking as you're trying to listen to them, then look at someone else while still trying to follow the same person's conversation. Usually, a few moments after looking away, you'll find the other conversations are more distracting. Your brain is trying to match up what you're hearing with the mouths moving in front of you.

    Also, put in one earplug and close your eyes, so you lose spatial awareness. Again, the voices will blend together much more. That's because the brain also uses spatial cues (visual placement, stereo hearing) to separate sound sources.
