Software Based Echo Cancellation?

tcyun asks: "I am helping to put together a small studio for a project at my workplace which will require some audio mixing. We have been able to find software solutions (often times open source) for almost all of our needs except for echo cancellation. I have done the requisite searches and have found a large number of hardware based echo cancellation devices, but have not found a purely software based solution. Is anybody aware of one?"

"For some more information, my office is trying to get a small system up and running that will allow multiple locations to video conference together. We have some specific requirements and have a fairly good handle on the entire video part of the problem.

However, we are running into problems with parts of our audio mix. The first issue is something that (I believe) is called 'mix minus.' This means that in a group conference, speakers do not have audio sent back to their location. (This is important for various psychology and network latency related issues.) There are several hardware based solutions that are available and we have some software based options.

The larger problem is echo cancellation. As many people may need to speak at once (and to avoid the requirement of having individuals constantly muting their microphones), we would like an echo cancellation component. The ideal would be a software solution that we could run locally, perhaps in conjunction with the same code running on the remote systems. However, most of the solutions we have found are hardware based (DSPs, ASICs, etc.).

The technology used on the studio side as well as the host side will involve various operating systems. We are trying to avoid avoid relying on specific OTC hardware solutions (namely, sound cards) as we would like to be able to create a solution that would function over time, particularly as specific hardware solutions tend quickly to horizon. So, having nice code that could be compiled on different systems would be a plus. Ideally, we would like to minimize the amount of hardware necessary, so an echo cancellation algorithm that could run in conjunction with other processes would be nice, but it is not a requirement."

This discussion has been archived. No new comments can be posted.

  • by vladkrupin ( 44145 )
    Wouldn't the echo cancellation itself propagate over time and create artifacts in the sound? Or am I totally misunderstanding the issue altogether?
    • If you sent the audio from each source as an individual (multiplexed, packetized) stream, then at each source you could simply not play the stream (any packets) originating from that source.
      Simple and effective, but it would require sending more data than mixing one version at a central location and sending that version to all connections (but that would be silly anyway due to lag).
      But generally you wouldn't have more than one person talking at a time, and when it happens, if you are using a decent compression scheme targeted at voices (look at the compression schemes used in digital mobile phones, both GSM and CDMA), this shouldn't be too much of a bandwidth problem anyway.
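The "don't play your own stream" idea above is what the original question calls mix minus. A minimal Python sketch, assuming each site's audio arrives as an equal-length list of float samples keyed by a site name (the function name and data layout are illustrative, not taken from any conferencing package):

```python
def mix_minus(frames):
    """Return, for each site, the sum of every OTHER site's samples.

    frames: dict mapping a site name to a list of float samples.
    All lists are assumed to have the same length.
    """
    sites = list(frames)
    n = len(frames[sites[0]]) if sites else 0
    # Sum all sites once, then subtract each site's own contribution.
    total = [sum(frames[s][i] for s in sites) for i in range(n)]
    return {s: [total[i] - frames[s][i] for i in range(n)] for s in sites}
```

Computing the full sum once and subtracting keeps the cost linear in the number of sites, whether the mix is built centrally or at each endpoint.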
  • There's a really good software solution for this HERE... [sharpsav.com]

    HTH !

    • But that link doesn't point to a software solution at all. That's a hardware device, and it doesn't do noise/echo cancelling, it does videoconferencing. I'm sorry to say that you're a little off.
      • But that link doesn't point to a software solution at all. That's a hardware device, and it doesn't do noise/echo cancelling, it does videoconferencing. I'm sorry to say that you're a little off.

        Whoops! My bad.... I meant to post THIS [speechpro.com] link...

  • Hardware Audio Tools (Score:5, Informative)

    by saveth ( 416302 ) <cww@deWELTYnterprises.org minus author> on Wednesday May 08, 2002 @01:16PM (#3485427)
    The reason you're finding more hardware tools than software tools for echo cancellation, among other things, is that the telecommunications industry demands these sorts of things with much more fervor than the average consumer. Echo cancellation devices (for example, a codec with echo cancellation built in, running on a DSP) are used extensively in cellular telephones, voice routers, and this sort of thing. Your best bet, in this respect, is to find a company that is willing to release the source code to the software that is running on your hardware.

    Alas, I do not know of any software, especially open source or free, that provides a full suite of audio processing utilities. Why is it that you're against using hardware, in the first place? Too expensive? Those are the breaks.
    • You don't want echo cancellation, you want an acoustically dead studio.

      Put foam on the walls.

      • Of course, that won't help with the problem of multiple microphones picking up one person's voice at different times. That's probably where most of the echo is coming from in this situation.
        • But those are not echos. Those are real signals picked up by other microphones. It might be difficult to DSP that.
          • I haven't played with it, but I don't think it'll be difficult. Each microphone records its intended signal plus a delayed version of what each other microphone is recording. There will be strong correlation between the undesired sounds going into a particular microphone and the microphone at some other point. You should definitely be able to do this adaptively (though there's no guarantee your PC will be able to do it in real time).
            • I think it is important to keep in mind, that unless they are in the Superdome the two feet distance separating the speakers is not going to cause any "echo". I concur with the notion that the room is causing the problems, not the hardware. In fact, if multiple mikes were picking up any given speaker, the signal would actually be of a higher amplitude.
              • thorbo is right. The speed of sound at sea level is something like 700 miles per hour (Mach 1). If I've done the math right, that is 1,026.67 feet per second. A 10-foot separation will cause a 10 millisecond delay. I believe a 50-millisecond delay is heard as echo. Can anyone help about this?
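The arithmetic in the comment above can be checked with a few lines of Python. This is only a sketch: it takes the comment's round figure of 700 mph as given (the commonly quoted sea-level value is nearer 760 mph), and the function name is made up for illustration.

```python
MPH_TO_FEET_PER_SECOND = 5280 / 3600  # feet per mile / seconds per hour


def acoustic_delay_ms(distance_ft, speed_mph=700.0):
    """Time in milliseconds for sound to travel distance_ft feet."""
    speed_fps = speed_mph * MPH_TO_FEET_PER_SECOND
    return 1000.0 * distance_ft / speed_fps
```

With these numbers a 10-foot separation gives a little under 10 ms, and a 50 ms delay would need a path difference of roughly 51 feet, which supports the point that microphones a few feet apart are unlikely to produce an audible echo on their own.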
          • Those are real signals picked up by other microphones. It might be difficult to DSP that.

            I can't understand what the guy is asking about, but if this is the problem, the true answer is to use directional mics pointed at people's heads, and put a noise gate on EACH MICROPHONE. In fact, if you really want to do good recording, you should have a noise gate and compressor (and perhaps an expander) on each microphone BEFORE it goes into the mixer. Yes, it costs a lot, but you can get an eight channel block of compressor/gates for $2000 or so.

            Alternatively, you can use an automatic mic mixer which picks the loudest mic automagically, but these are really only appropriate for use in a live speech situation, not recording. They tend to sound a bit clunky.
      • by Ooblek ( 544753 )
        I believe the echo he is referring to is the echo that happens in the latency from the time their voice is spoken to the time it makes the round trip from the other side.

        The term "Mix Minus" does not apply here. It is generally used in post production where you lay off your audio mix minus the voice over. You probably also don't need echo cancellation software. Just put a mixing board on each side of the conference call. Mix the local side's voice with the remote side's voice on each side. So they hear themselves as they speak and don't hear themselves from the remote side. Use directional microphones so that the loud speaker on the remote side can't be heard in the remote microphones. You could also require everyone to wear headphones I suppose. (Probably wouldn't be popular.)

        • Just FYI: a minus mix is often used, for example, in news operations where a satellite feed is being used. The remote reporter gets a minus mix, which is program audio minus their own voice. Because of the distance (22,000 miles out and back, twice: once to the station and once back), propagation alone should add up to roughly half a second, and encoding can add more. When you're trying to talk, listen to the producer or director talk to you, and interact with the reporters in the studio, hearing your own voice delayed that badly isn't a good idea. It's not the program audio (what you hear in your house) that's minus anything, but the signal to the reporter on the other end of the satellite.
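A quick check of the propagation part of that satellite delay, as a Python sketch. It assumes the comment's round figure of 22,000 miles per leg, four legs in total (up and down, twice), and signals travelling at roughly the speed of light in vacuum; encoding, buffering, and switching delays are not modeled.

```python
SPEED_OF_LIGHT_MILES_PER_SECOND = 186_282  # approximate, in vacuum


def propagation_delay_s(leg_miles=22_000, legs=4):
    """Seconds spent purely in transit over `legs` hops of `leg_miles`."""
    return legs * leg_miles / SPEED_OF_LIGHT_MILES_PER_SECOND
```

On these assumptions the transit time is a bit under half a second, which is already more than enough to make hearing your own voice come back disorienting.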
    • I was amazed at all the processing going on inside the DSP back when I tested, and later debugged/coded, modems for a living. When you mentioned echo cancellation, it was the first thing that popped into my head.

      A logical extension of this for your application would be to try to get your hands on some source code from a "Soft" modem. The idea was to move the most intensive processing out of the DSP and onto the PC processor since they were, in theory, becoming powerful enough to handle all the operations in real-time. Actual performance of these types of modems is a completely separate story, but the echo cancellation code is out there somewhere. At this point it should just be a matter of getting your hands on it.
    • Your best bet, in this respect, is to find a company that is willing to release the source code to the software that is running on your hardware.

      You mean when hell freezes over?

      And why anyway? Echo cancellation is not that hard. A lot of open source 3D graphics software is much more sophisticated than echo cancellation. I think the reason why there are few open source implementations is because few people want it. As others have pointed out, open source telephony software contains this.

  • Noise gate (Score:3, Informative)

    by joshwa ( 24288 ) on Wednesday May 08, 2002 @01:17PM (#3485432) Homepage Journal

    As many people may need to speak at once (and to avoid the requirement of having individuals constantly muting their microphones),

    Why not just install a noise gate at the microphone inputs?

    For the non-audio-inclined slashdotters, a noise gate sets a minimum sound level threshold before the signal is transmitted.

    • I don't know why this post was modded "Troll", but such is life.

      I believe some sort of noise gate would be the best solution and the easiest to implement. Since it appears that there isn't any open or free software to accomplish your goal, this would probably be the cheapest, hardware-wise. If you haven't purchased mics yet, there are many that include filters and echo elimination built in.

      Give it some thought.
    • Re:Noise gate (Score:3, Informative)

      by Anonymous Coward
      >>For the non-audio-inclined slashdotters, a
      >>noise gate sets a minimum sound level threshold
      >>before the signal is transmitted.

      I'm no audio engineer, but it's obvious it wouldn't work.

      Person A talks into Microphone A - also picked up with a delay in Mic B.

      a) You want to cancel the echo from Mic B, so you use a noise gate. It works as long as I don't talk loud enough to cross the threshold on Mic B. Given that this is in a conference setting, the mic of the person next to me is going to pick me up without that huge a difference from the mic right in front of me. What's the difference between me speaking loudly and the person next to me speaking somewhat softly? Not much, and the gate doesn't know any different.

      b) To screw things up we only have to get over the threshold. So I'm speaking AND a neighbor starts to speak. So the sound at his mic is his voice + my voice. The total is over the threshold. Again, the gate does nothing.
      • Re:Noise gate (Score:2, Interesting)

        by goofy183 ( 451746 )
        Actually a noise gate would be very effective. I am involved in our college's technical sound & lighting organization. We run pro-level audio systems for concerts and such here. As was suggested, a gate would cut the mic channel once the signal went below a certain level. Your worry about the sound coming from the conference room speakers triggering the gate is semi-valid. You would have to do a bit of trial and error in each room to set the dB level at which to cut off the signal. I figure that if we can use gates on a drum set at a concert, with house speakers putting out 120+ dB, and still have the gates pass signal only when the set is being played, it should be workable in a conference room.

        So I know the original poster may not want to use hardware, but gates are your best bet. If there is any way to have each person's mic on its own channel into the computer, you may be able to write some simple audio processing software to mute a channel once the signal is below a certain level.
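The "simple audio processing software" suggested above could look something like the following Python sketch of a block-based noise gate. Everything here is an assumption made for illustration: samples are floats in the range -1.0 to 1.0, the level is measured as RMS in dB relative to full scale, and the default threshold and block size are placeholders that would need the per-room trial and error described above. A practical gate would also add attack and hold times so that word endings are not chopped.

```python
import math


def rms_db(block):
    """RMS level of a block in dB relative to full scale (1.0)."""
    if not block:
        return float("-inf")
    mean_square = sum(x * x for x in block) / len(block)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def noise_gate(samples, threshold_db=-40.0, block_size=160):
    """Pass blocks at or above threshold_db unchanged; mute quieter blocks."""
    out = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        if rms_db(block) >= threshold_db:
            out.extend(block)
        else:
            out.extend([0.0] * len(block))
    return out
```

As the objections elsewhere in the thread point out, a gate like this only decides on level, so it cannot tell a loud neighbor from a quiet talker at the same microphone.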
      • If they are using two microphones that are not omni-directional, but instead uni-directional, then the gate is the perfect thing for them. Every microphone has a different sweet spot. For the uni-directional mic you can have rather loud noises all around it... as long as the sound isn't directed into the sweet spot, it isn't going to get picked up much at all. Thus the sound gate would most definitely kill any noise made at Mic 2 from the person at Mic 1.
      • Actually, there's no echo problem with the next person's mike picking up your voice. It's the mic at the other side of a large room that is a problem, and a noise gate would solve that problem, unless the person at the other side of the room was also talking, which would open the noise gate.

        - Eric My web page [invisiblerobot.com]
      • Several others have written back to say that a noise gate would be effective, and I agree; but there is another solution involving slaving several noise gates together that might help in the conference setting. I believe they were developed for press conference type events for television (being in television myself, you'd think I'd know). It's an automatic mic mixer (one example [shure.com]) that basically opens the mic that best picks up a source. I'm not sure how it would handle several people talking at once; I think it's smart enough to tell the difference, but I've never used them myself (in most cases a person can do such a thing so much better than a machine can). Anyway, this is probably way too expensive, suggests using special mics, and is not a software solution; but it's something to consider. ~ibennetch
  • by Myshkin ( 34701 ) on Wednesday May 08, 2002 @01:17PM (#3485433)
    sed 's/^echo/#echo/' /etc/inetd.conf >/etc/inetd.new
    mv /etc/inetd.new /etc/inetd.conf
    kill -HUP $(ps -ef | grep 'root.*inetd' | grep -v grep | awk '{print $2}')

    no more echo
  • Asterisk PBX (Score:5, Informative)

    by Anonymous Coward on Wednesday May 08, 2002 @01:18PM (#3485435)
    There's an excellent open-source PBX called Asterisk. Among other things, it provides an MMX-optimized echo-canceller. Look here [asteriskpbx.org]
  • Tough Problem (Score:3, Informative)

    by mellifluous ( 249700 ) on Wednesday May 08, 2002 @01:18PM (#3485437)
    Maybe someone at /. will find an answer for you, but I would be surprised to see this implemented in any kind of stand alone SW package. Because it is a specialized real-time application requiring fast feedback, it makes sense to implement it as an embedded system (i.e. in hardware).
  • by WndrBr3d ( 219963 ) on Wednesday May 08, 2002 @01:18PM (#3485439) Homepage Journal
    Back in the day when 56k modems were taking off, there was a large piece of software people were coding into drivers called 'Ring Cancelation'.

    These were added because when you send data down an analog line at high speeds, you begin to hear an audible sound which sounds like ringing. The modem drivers needed to be able to tell the difference between this ringing sound and the actual data.

    I think a good place to start if you cannot find any software is perhaps hacking these drivers or something along those lines.

    It's a good start at least. Hope this helps :-)
  • we need more (Score:3, Insightful)

    by RealisticWeb.com ( 557454 ) on Wednesday May 08, 2002 @01:20PM (#3485451) Homepage

    This is a great question you are asking, and I would love to see a good answer. The shame of it is, I'm expecting to see a bunch of posts in response to this saying "If you need one then write it yourself".

    Is it just me, or does it seem like the open source offerings for things related to audio/video are lacking in general? I wish I had time to make improvements myself, or the money to contribute to the developers, but it seems like we need more in this area to be able to be more competitive with proprietary solutions.

    • Yeah. like the great video/audio editing applications that are available on Windows. Oh wait, those suck too.
      I think the problem with making really robust audio/video applications is that it's difficult to do.

      There are some interesting projects in the OSS world in this area, but they are new and not yet mature.

      Like everything, it takes time for things (e.g. software applications) to take shape and be worth something to anyone other than a niche crowd.

      That's the other problem. Audio/video applications in the OSS world are not in as high demand as things like development tools and web servers.

      I predict in the next year or so there will be some pretty good OSS apps in this area.

      • I use Cakewalk, it's ok. But it's not like some amazing application.

        Come to think of it I have yet to use any audio application on any platform that works to my liking (some things are best done the old-fashioned way). A friend of mine uses a Mac to do all his mixing (records on a 4-track) and he loves it.

        Linux will get something in the next year or so that works ok, about the same that windows has now.

        There is no doubt in my mind that certain areas of the Linux desktop will trail the commercial side.

        You just have to know the limitations of the OS and decide whether it makes sense for what you are doing.

        For top-notch audio, go with a hardware solution for recording and use your PC to mix.
    • Just a rundown:
      There is currently the ALSA Project [alsa-project.org], the MusicKit [musickit.org] (a MIDI and realtime DSP framework from the NeXT world), and . This trio, together, would be great for writing full-featured music applications. Now we just need to do it. :-) [sourceforge.net]

      I'm considering tackling this problem soon if I have time.

    • Are there any closed source software echo cancelation utils already done? Are they practical?

      People are arguing that the software solutions are too limited or too expensive to implement, so that hardware echo cancellation is what we have now.
  • You're probably not going to find a free software echo canceler. I've been working on audio processing for a cell phone design and AEC (automatic echo cancelation) is budgeted for a lot of time and money. It's a pretty involved signal processing function that is well-suited to optimization on a specialized processor, hence the number of solutions you're seeing for ASICs and DSPs. The bottom line is that I would suspect that you're probably going to have to pay for it.
  • Classic application! (Score:3, Informative)

    by spaceyhackerlady ( 462530 ) on Wednesday May 08, 2002 @01:25PM (#3485482)

    Echo cancellation is a classic application of adaptive filters. Every reference ever published on the subject discusses it. I like Haykin's book [mcmaster.ca] myself.

    I just did a search on Google [google.com] and came up with 4000 references.

    The underlying theory is pretty hairy, but the implementation of an algorithm like LMS is straightforward.

    ...laura

    • Laura's right: you'll find the maths and the algorithms for echo cancellation in most textbooks on adaptive filtering. Check out the July 1999 issue of the IEEE Signal Processing Magazine (it shouldn't be too hard to get hold of it, most university libraries' engineering section should have it) -- it is an issue dedicated to "Adaptive Algorithms and Echo Cancellation". All the maths and algorithms you need are discussed there. Yes, you do need a good background in linear algebra to follow the underlying theory, but the algorithms should be easier to implement, and you're likely to find source code for most of them on the web (LMS filtering is used in many other applications too).

      Echo cancellation is a common design problem in hands-free telephone systems and conference systems; there is lots of literature on the subject. See the references in the articles I mention above.

      • by Anonymous Coward
        An LMS implementation of a noise canceller is about ten lines of Matlab (or Siglab) code. It will look something like this:

        % ref is your noise reference
        % ypri is your input signal mixed with the noise

        alpha = 0.2; % Time constant.. play with this
        L = 31; % Filter length. Must be odd so the centered window below has integer indices (ref is assumed to be a row vector). Play with this too.

        W_adap=0*(1:L); %initialize filter weights to zero
        pow = 1; % initial input power estimate
        beta=alpha/L; % normalize the time constant

        % You'd be doing this forever %
        for n = ((L-1)/2)+1:(length(ref)-(L-1)/2)

        % e is the cleaned up signal
        e(n) = ypri(n) - W_adap*ref((n-(L-1)/2):(n+(L-1)/2))';

        % mu is your update coefficient
        mu = alpha/L/pow;

        % W_adap are your filter weights.
        % This updates them
        W_adap = W_adap + mu*e(n)*ref(n-(L-1)/2:n+(L-1)/2);

        % This updates your power estimate, which you
        % will use for the next cut of mu
        pow = (1.-beta)*pow + beta*abs(ref(n))^2;
        end

        %% An SNR estimate
        % snrtot =sum(e.^2)/sum(ypri.^2);

        This may not be state of the art, but it will give you a very noticeable improvement. You may want to shift the center of your filter in time a little.

        There's a lot of variations on this problem, so do check out Haykin (or recent literature) if this doesn't do it for you.

        Luck,
        Jordan
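For comparison, here is a causal variant of the same normalized LMS idea as a Python sketch. It is not a port of the Matlab above: it uses only past and present reference samples (which is what a live echo canceller has available), normalizes by the energy in the filter window rather than a running power estimate, and all names and default values (`far`, `mic`, `taps`, `alpha`, `eps`) are illustrative assumptions.

```python
def nlms_cancel(far, mic, taps=32, alpha=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of `far` from `mic`.

    far: reference signal (what the loudspeaker plays), list of floats.
    mic: microphone signal containing an echo of `far`, list of floats.
    Returns the residual (echo-reduced) signal, one sample per input pair.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for x, d in zip(far, mic):
        history = [x] + history[:-1]
        # Echo estimate: FIR filter over the recent reference samples.
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        # Normalized step: scale by the energy currently in the window.
        energy = eps + sum(h * h for h in history)
        step = alpha * error / energy
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual
```

Pure Python like this is far too slow for real-time use at audio rates; it is meant only to show the update rule. With a noise-like reference and a short synthetic echo path the residual shrinks quickly, but real rooms, double talk, and moving microphones are much harder.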
  • Are you talking about acoustic or electrical echoing? Have you taken a look at ITU G.168?
  • Microsoft! (Score:2, Interesting)

    by genka ( 148122 )
    It is as far from open source as it gets, but Windows Messenger includes voice features and has software echo cancellation. Little known fact: Messenger 4.6 does not need Passport for operation; in closed environments any SIP server will do just fine. Also, the encoded media stream it outputs uses standard encoders and the standard RTP protocol. So Messenger can interoperate with many applications.
  • by Troy Baer ( 1395 ) on Wednesday May 08, 2002 @01:27PM (#3485496) Homepage

    The Access Grid [accessgrid.org] is a project started at Argonne National Lab's Math and Computer Science Division [anl.gov] to build a mostly open videoconferencing system over the Internet, using multicast audio and video streaming. You may want to take a look at their technology to see if they have ideas you can use.

    Anyway, a "node" on the Access Grid consists of a room with at least three computers: a multihead box running Win2k for display to several video projectors, a computer running Linux for audio capture and playback, and another running Linux for video capture. The audio capture machine usually runs into a Gentner AP400, which does echo cancellation as well as phone bridging.

    I don't know of anybody who has software that does this; sorry.

    --Troy
    • We just set up our AG node this past month... we do echo cancelation with a nice fancy box from Gentner that costs several thousands of dollars ;). The really annoying thing is that you program it over an RS-232 port-- from windows. The rest of the AG software is linux, though... vic and rat (for video and audio) respectively. It's fun stuff... check out accessgrid.org
  • by ttyp0 ( 33384 ) on Wednesday May 08, 2002 @01:28PM (#3485502) Homepage
    I remember when I worked at Tellabs we had a product, EC-8000 Digital Echo Canceller [tellabs.com] Might be worth a look.
  • by Seth Finkelstein ( 90154 ) on Wednesday May 08, 2002 @01:29PM (#3485508) Homepage Journal
    Am I misunderstanding the question? A Google search for "echo cancellation" software [google.com] turns up quite a bit.

    Notably, a lead such as: http://www.nist.gov/speech/tests/ctr/h5e_97/echocan.htm [nist.gov]

    The echo cancelling software (ec_v2.5.tar.gz) that is applied to telephone data, may be obtained from Mississippi State University.

    The LDC has provided a perl script (mu_ec.perl) that will take a sphere-headered, 2-channel mu-law waveform file as input, apply the MSU/ISIP echo cancellation software, and produce a sphere-headered, 2-channel mu-law waveform file as output.

    Sig: What Happened To The Censorware Project (censorware.org) [sethf.com]

    • This is a good start. Note that the perl script linked to above only provides raw data to the ec.exe binary, but the source code is linked to on that page. Also, there is more information and the source code at http://www.isip.msstate.edu/projects/speech/software/legacy/fir_echo_canceller/ [msstate.edu]. Nevertheless, consider:

      * In running the echo canceller on sparcs (ss20, SPARCserver-1000), it takes between 3 and 4 times realtime to operate.


      Now a Pentium III 800 will probably run it in a fraction of the time for an SS20, say 1/2 realtime to 1/4 realtime. But if it is for a mixing project, there will be several streams to process. I wonder if the cost of having to use a dedicated computer for software processing will outweigh the cost of dedicated DSP hardware?
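The trade-off in the comment above comes down to simple arithmetic, sketched here in Python. The realtime factors are the comment's own guesses, not measurements, and the function name is made up for illustration.

```python
import math


def streams_per_cpu(realtime_factor):
    """How many streams one CPU can keep up with.

    realtime_factor: CPU seconds needed per second of audio
    (3.0 means three times slower than realtime, 0.25 means four times faster).
    """
    return math.floor(1.0 / realtime_factor)
```

At 3 to 4 times realtime a machine cannot keep up with even one stream; at the guessed 1/2 to 1/4 realtime it could handle two to four streams with nothing left over for anything else.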
  • A common algorithm (I believe it's used in modems) is the Widrow Noise Filter. I had to write one in a CS class. It's a type of neural net. The general algorithm for noise filtering is sort of machine intensive, so it's reserved for high end $$$ equipment (if I recall the lecture properly). This filter is a quick write though, and will get you set straight.
  • by Ludwig668 ( 469536 ) on Wednesday May 08, 2002 @01:34PM (#3485545)
    Check out Analog devices [analog.com]; their 2181 demo has echo cancellation as a part of the included software; source included.
  • if you're not averse to working out what you need to do on your own (there's a lot of literature out there on this subject), then i suggest PD (this [pure-data.org] is a nice place to start).

    runs on windows (NT/etc.) and linux, OS X port is in the works, IRIX is also supported. it's fast and very flexible, and will support a simpler solution - like gating, matrixed switching (if you've got separated inputs at hand), whatever - if you find one. the pd-list is quite supportive.

  • We are trying to avoid avoid relying on specific OTC hardware solutions (Emphasis added.)

    Looks like this message was dictated with intermittent echo!
  • by ultramk ( 470198 ) <ultramk@noSPAm.pacbell.net> on Wednesday May 08, 2002 @01:40PM (#3485588)
    ...is because it doesn't exist.

    Realtime processing, AFAIK, be it audio or video, is astonishingly processor-intensive. It doesn't surprise me that DSPs are being used for this reason: they may be the only thing that can cut it in a cost-effective manner.

    i.e. you may be able to build a high-end workstation, and write some real-time software to handle this task, but since it probably wouldn't be able to do anything else at the same time doesn't that qualify as a hardware solution?

    Instead of going to extreme lengths to remove echoes, perhaps you just need to work harder to prevent them in the first place? Pro audio mags have tons of ways to reduce echo and other unwanted effects in small (usually home) studios. Have you looked into this?

    Michael-
    • A DSP is just a CPU with one or twelve little two-step and array-math hacks in it. Any CPU that's 2X faster in FLOPs can do the same thing with ordinary arithmetic code.

      There are lots of new CPUs that are faster than lots of 5-year-old DSPs.

      --Blair
      "But then Microsoft puts the code in a directory somewhere under C:\Windows and kills the market."
    • We implemented an echo cancellation algorithm for a GSM phone. It required somewhere around 2-4 MIPS from our 26 MIPS budget. You should be able to find some app notes you can use to implement on a very inexpensive TI or ADI development kit.

      DSPs are historically very very slow in frequency when compared to other embedded CPUs (although this is far from true today). Until the past few years, DSPs have been lingering in the 10s of MHz range, while embedded MCUs were venturing into the 100 MHz range.

      You can probably implement on any MCU that runs around 20MHz. This should provide you the low latency you need and more than ample MIPS capacity.

      best regards,
      mega
  • by jmv ( 93421 ) on Wednesday May 08, 2002 @01:41PM (#3485604) Homepage
    Look here for my echo-cancellation code:
    http://speex.sourceforge.net/audio/sndio.tgz

    It's bundled with open-sound calls to read and write audio in real-time, while removing acoustic echo from the input. There's not much doc, but the test2.c program is quite simple. Feel free to contact me at jean-marc.valin@hermes.usherb.ca. Note that there's no real project (sourceforge or other) associated with it, but if you find it useful, I may create one.
  • by hidden ( 135234 ) on Wednesday May 08, 2002 @01:58PM (#3485710)
    1) Use directional microphones, or else throat mikes. This will make the neighbour's microphone only pick someone up very quietly, if at all.

    2) If there is still some echo problem, it should be quiet enough that simple (software) noise gates should solve the problem.

  • by Aquaman616 ( 131268 ) <bhall&figleaf,com> on Wednesday May 08, 2002 @02:03PM (#3485746) Homepage Journal
    I've been hearing about some new technology from Macromedia that might make your life a *lot* easier. Apparently the Flash 6 plugin supports hooking into both webcams and mics (after the user OKs it) as well as special socket-based connections to a new piece of server software codenamed TinCan. In addition they've talked about the server supporting shared objects as well.

    From what it seems you're able to put code on both the client and server and both are based on ECMAscript. This would let you do a lot more than nearly every other solution I've ever seen. I don't know when the server is supposed to be released, but if you check up on the recent interviews with MMs CTO Jeremy Allaire on C|Net or The Register you'll see that they seem to be hinting that it will be available later this year.
  • by Anonymous Coward on Wednesday May 08, 2002 @02:04PM (#3485751)
    Most solutions offered by Ditech, Telogy, etc. cancel the electrical echo caused by an impedance mismatch at the 2-to-4-wire hybrids in the analog part of the Old Telephone Network. You seem to be developing a packet-based videoconferencing system, which has no hybrid in it, so you must want to cancel acoustic echo, caused by reflection of the sound produced by the speaker-phone on the walls of a conference room.
    This is a very hard problem, because you have to model the environment of each conference room. You will have to estimate mathematically (with the LMS algorithm, for example) the echo response on a tail of at least 128ms for each room, which would take at least a few minutes to one hour on a P4 2GHz system.
    And what if a door is suddenly closed in the conference room? Or what if the speaker phone is moved? You will have to re-model your echo response each time that happens, because the geometry of the room will have changed.
    The solution is surely not a software echo cancellation system, at least not before 2010.
    Think about a hardware solution, DSPs or ASICs (http://www.octasic.com)
  • You just need a better microphone. Use a teleconference unit from Plantronics. As a side comment, it looks like you're reinventing the wheel. There are a lot of off-the-shelf video conferencing systems out there that'll probably cost less than all your TIME spent tinkering.
  • From my own experience with video conferencing and speaker phones, the biggest problem stems from the balance between getting enough microphone gain to pick up the quietest speaker and reducing both kinds of feedback/echo (single-site feedback / cross-site echo.) The obvious solution is to move each output device closer to its corresponding input device.

    In other words, either (1) exchange the room speakers for earpieces or headphones or (2) exchange the room microphones for lapel or boom microphones. Clearly the degenerate case has everyone essentially speaking into a separate telephone receiver which probably defeats the purpose of the system altogether.

    Of course, it would be way cooler to have a setup where the room microphones are aware of the room speaker output and automatically cancel it out. The trouble is it's way cooler because it's so difficult to do.
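    A setup where the microphone side "knows" the speaker output is usually built as an adaptive filter fed with the speaker signal as a reference. The following is a minimal sketch of that idea using normalized LMS, not anything from this thread: the names `Aec`, `aec_init`, `aec_process`, the tap count `AEC_TAPS`, and the step size `mu` are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define AEC_TAPS 256          /* hypothetical echo tail length, in samples */

typedef struct {
    float w[AEC_TAPS];        /* adaptive estimate of the speaker-to-mic path */
    float x[AEC_TAPS];        /* recent speaker samples, newest at x[0] */
} Aec;

/* Zero the filter taps and the speaker history. */
void aec_init(Aec *a)
{
    size_t i;
    assert(a != NULL);
    for (i = 0; i < AEC_TAPS; i++) {
        a->w[i] = 0.0f;
        a->x[i] = 0.0f;
    }
}

/* Process one sample pair: `speaker` is what the room loudspeaker plays,
 * `mic` is what the room microphone picks up. Returns the mic sample
 * with the estimated speaker echo subtracted. */
float aec_process(Aec *a, float speaker, float mic)
{
    const float mu = 0.5f;    /* NLMS step size (assumed); 0 < mu < 2 */
    const float eps = 1e-6f;  /* avoids division by zero during silence */
    float echo = 0.0f, power = eps, err, g;
    size_t i;

    assert(a != NULL);

    /* Shift the speaker history and insert the newest sample. */
    for (i = AEC_TAPS - 1; i > 0; i--)
        a->x[i] = a->x[i - 1];
    a->x[0] = speaker;

    /* Predict the echo and measure the reference power. */
    for (i = 0; i < AEC_TAPS; i++) {
        echo  += a->w[i] * a->x[i];
        power += a->x[i] * a->x[i];
    }

    err = mic - echo;

    /* Normalized LMS update of every tap. */
    g = mu * err / power;
    for (i = 0; i < AEC_TAPS; i++)
        a->w[i] += g * a->x[i];

    return err;
}
```

    Call `aec_init` once, then `aec_process` for every sample. The sketch ignores the hard parts raised elsewhere in this discussion: people talking at both ends at once, long tails, and rooms that change while the filter is adapting.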

  • Just say to yourself "There is no echo" until you believe it.

  • I'd love to see a list of all the other software you have found.
  • by teamhasnoi ( 554944 ) <teamhasnoi AT yahoo DOT com> on Wednesday May 08, 2002 @02:42PM (#3486020) Journal
    One (good) omnidirectional condenser mic in the center of the room; everything will be in phase and mono. Send this signal to a noise gate to cut out paper rustling, and then to a compressor (hardware or software). I'd guess a ratio of 1:10 (or less), with a threshold of -20 dB (give or take), and a soft limiter would do it. This will equalize the volumes between the loud drunk salesguy and the quiet intern. Educate the members of the meeting that they need to speak confidently.
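    The gate-then-compressor chain above can be sketched as a gain computation in decibels. This is only an illustration: it reads the "1:10" figure as a ratio of 10 (10 dB over the threshold in gives 1 dB over out), and the function names, the gate threshold, and the `GATE_CLOSED_DB` constant are hypothetical.

```c
#include <assert.h>

#define GATE_CLOSED_DB (-120.0)   /* hypothetical gain used for "muted" */

/* Static compressor curve in decibels: levels above the threshold are
 * scaled down by `ratio`; levels at or below it pass unchanged. */
double compress_db(double in_db, double threshold_db, double ratio)
{
    assert(ratio >= 1.0);
    if (in_db <= threshold_db)
        return in_db;
    return threshold_db + (in_db - threshold_db) / ratio;
}

/* Gain change (in dB) to apply at a given input level: the noise gate
 * mutes anything below `gate_db`, the compressor handles the rest. */
double dynamics_gain_db(double level_db, double gate_db,
                        double threshold_db, double ratio)
{
    if (level_db < gate_db)
        return GATE_CLOSED_DB;                 /* gate closed */
    return compress_db(level_db, threshold_db, ratio) - level_db;
}
```

    With a -20 dB threshold and a ratio of 10, a full-scale (0 dB) shout gets about 18 dB of gain reduction while a -30 dB talker is left alone. A real unit would also smooth the level estimate with attack and release times, which this sketch leaves out.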

    I guess I don't see why NOT routing the audio back would be a problem, or maybe I don't understand the question.

    Otherwise, save your paper towel rolls, and hand them out before a meeting. I don't do this for a living, so YMMV.

    • You don't understand the problem. The echo comes from the signal being sent to the other end, picked up by the mic at the other end, and sent back.

      The delay is on the order of 1500 ms.

      You need a big (~100k point) adaptive FIR to cancel it. Getting a PC to do it in real time is challenging, and there won't be many resources left to do other things. We have a Gentner at work, and it is fine. If you need >2 channels it would be cheaper than the equivalent PC hardware, and it's much easier to set up.
  • well, since you asked for a 'soft'-ware solution, i suggest using foam padding on the walls surrounding your conference room. altho i don't understand why you're so bent on the material being soft. foam is soft and from my very-basic understanding of acoustics, it dampens and reduces echo. wait, oh... that software!?
  • by vivekb ( 111127 )
    IBM has a software based solution (unfortunately only available as a Windows DLL) that is very impressive. I evaluated it for a project, but it wound up costing too much. Still, you could try contacting their lab in Israel [ibm.com].

    I doubt you will find an open source echo canceller, since acoustic echo cancellation is pretty difficult (and has generated many, many patents). Nearly everyone uses a different, proprietary algorithm.

    If you want to make one yourself, set aside about 10 months.

  • by tuj ( 303347 )
    Assuming you want software, I would think the best way to go would be realtime Csound. If you're not familiar with the Csound language, it's very powerful for DSP and was designed for manipulation of audio. It runs on most platforms and processors. Different implementations of Csound designed for realtime use exist for both CISC and RISC architectures. Assuming you can design the appropriate algorithm for echo cancellation, Csound may be ideal.

    Another option, also assuming you can design the algorithm, would be SuperCollider, which is another audio processing language that tends to have better realtime performance than Csound. It only runs on Macs, though.

    Finally, in terms of techniques, you might think about 'shooting' the various rooms (by recording a balloon popping) and using the resulting impulse data to remove (rather than add, as in conventional convolution) the echoes on voice audio from that same room. FFT might work here also, although it would probably be less effective. SuperCollider can do realtime convolution; Csound might be able to, depending on how high a sample rate you need.
  • by Petrus ( 17053 ) on Wednesday May 08, 2002 @03:20PM (#3486269)
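    #define MAX_ECHO_DURATION 1024   /* echo tail in samples; value assumed for illustration */
    #define AdaptationRate 0.001f    /* LMS step size; must stay below 2/(taps x signal power) */

    // Basic adaptive LMS FIR algorithm.
    float EchoCancellation(float Sample)
    {
        static float History[MAX_ECHO_DURATION];  /* past samples, newest first */
        static float Coef[MAX_ECHO_DURATION];     /* adaptive FIR taps, initially 0 */
        float EchoAmpl = 0.0f;
        float Error;
        int i;

        /* Estimate the echo as the FIR-filtered sample history. */
        for (i = 0; i < MAX_ECHO_DURATION; i++)
            EchoAmpl += History[i] * Coef[i];

        Error = Sample - EchoAmpl;

        /* LMS tap update; shift the history from the oldest end so that
           no sample is overwritten before it has been used. */
        for (i = MAX_ECHO_DURATION - 1; i >= 0; i--)
        {
            Coef[i] += AdaptationRate * Error * History[i];
            if (i > 0)
                History[i] = History[i - 1];
        }

        History[0] = Sample;
        return Error;
    }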

    That's all the "basic" science.
    You might find that for 40 kHz and a 250 ms echo this is too computationally intensive for a single Pentium. You may need some 1200 MIPS.

    You may then:
    1. Use an Athlon ;-)
    2. Convert it to pointer arithmetic
    3. Convert it to integer arithmetic
    4. Skip some samples for echo estimation, sometimes
    5. Contact me to use a more clever algorithm (IIR?)
    (Petrus.Vectorius@ied.com)
  • by Lumpy ( 12016 ) on Wednesday May 08, 2002 @03:22PM (#3486288) Homepage
    If you are creating your studio, then you need to make the studio fix the problem first. Don't try to compensate for a crappy studio in the recording hardware/software.

    #1- Sonex, Sonex, Sonex. If you don't have Sonex, or the crappy Sonex copy, or even just carpet on the walls (yes, wall carpet looks good), along with the roughest-texture ceiling tiles you can buy at the Home Depot (or better yet the $90,00 a 2-foot-square cityscape audio ceiling tiles), then you are wasting your time. It takes very little to make a room acoustically deadened to the point that properly set up microphones won't pick up any perceptible echo. (Note: if you have your mics set so your artists or voice talent are farther than 3 inches from the P-popping screen, then you have them set wrong. Also, don't let the talent talk quietly; make them talk or sing loud to overcome room acoustics.)

    Start with the low tech, then add your high-tech band-aid filters.
    • I don't think he really means he has a problem with echoes off the walls... I think he means that when someone on the local side of the conference speaks and his voice is played back through the speakers at the remote side, it'll be picked up by the microphone at the remote side and sent back to the local side. Depending on how loud the speakers are, this could cause a feedback loop.

      I've had the same thing happen when using one of those internet voice chat programs. If one end uses speakers rather than a headset, you'll hear an echo of yourself talking. If both ends use speakers, crank up the volume and have fun with the feedback :0

  • Pure Data [pure-data.org] is a real-time sound-manipulation program that runs on Linux, Windows, and Mac OS X. You'll be able to design any kind of sound processing algorithm you like, but be warned: only Linux gives you input-output latency as low as 3 milliseconds. In Windows you'll have to settle for 300 ms unless you buy some fancy audiophile sound card that supports Steinberg's ASIO.
    • Creative Labs' Audigy supports ASIO and is targeted for the general user ($100 retail). It may even be suitable for the purpose described here.
  • by AlaskanUnderachiever ( 561294 ) on Wednesday May 08, 2002 @03:42PM (#3486404) Homepage
    Hell, I helped build one. And while there is a LOT of noise cancellation and "echo reduction" software on the market (Cool Edit Pro has a few nice plug-ins), the sound quality after applying such a filter could at best be called "fair". Unfortunately your best solution is to find a high quality mic with a bit of noise cancellation (and the higher end ones can be "tuned" with a hardware equalizer) and just suck it up and BUY THE FOAM. I know it's ugly. I know it's a pain in the ass. I know it's only effective if the studio is designed well, but nothing that I have personally seen (well under 40k that is) beats the stuff. Acoustic dampening foam is your cheapest option that will still maintain audio quality to a reasonable degree.
  • http://www.intel.com/software/products/perflib/
  • ...it is quite clear that this gentleman is talking about feedback and not room acoustics.

    In my experience such hardware or software solutions do more harm than good because they cannot distinguish between wanted and unwanted sound (unlike one's auditory system).

    Taking the time to set levels and proper mic/speaker placement should be sufficient. But to totally remove the problem then headphones really are the simplest option - just don't forget to turn the speakers off too! ;) In this case then the use of high quality mics/headphones is totally unnecessary. This also offers the users individual monitoring control.
  • If I understand correctly, by "echo cancellation", you mean that, when several folks are talking at once, you want each person to hear everyone except themselves. If that's the situation, I would suggest flipping each speaking person's outbound signal 180 degrees out of phase with their inbound signal. That way, their own voice will cancel itself out in their personal conversation "mix". Make sense? To flip an audio signal out of phase, you delay it by half of its period. In other words, if your signal is 15 kHz, that's 15,000 waves per second. Or, 1 wave per 1/15000 of a second. If you mix in another copy of that signal which has been delayed by 1/30000 of a second, the two signals will be 180 degrees out of phase, and will cancel each other out. It is important that you only flip the speaking person's voice out of phase with the incoming signal, so that her voice is only removed from her inbound mix.
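    The "everyone except themselves" mix described above can be sketched per sample as follows. Note the swapped-in technique: this sketch inverts polarity (a plain subtraction) instead of using the half-period delay, because a fixed delay lines up 180 degrees out of phase at only one frequency while speech covers many. The function name `mix_minus` and its layout (one sample per site) are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Build one output sample per site: the sum of every site's input
 * except that site's own. Subtracting a site's own sample from the
 * full mix is the same as adding it back with inverted polarity. */
void mix_minus(const float *in, float *out, size_t n_sites)
{
    float total = 0.0f;
    size_t i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < n_sites; i++)
        total += in[i];
    for (i = 0; i < n_sites; i++)
        out[i] = total - in[i];
}
```

    This only removes a talker's own direct signal from the mix sent back to them. It does nothing about their voice coming back after it has been played through a remote loudspeaker and picked up by the remote microphone.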
  • Guys.... Actually this does not require any computing power at all.. Get some highly directional microphones (directional cancellation)...and if that doesn't work, well, you can always get two microphones very near each other; you point one at the sound source (the person speaking) and one away (use two Shure SM-57s)....Using the balanced audio pins 2 and 3, you hook pin 3 of the away mic to pin 2 of the person mic, and likewise with the other two, then hook the combination up to your input...voilà! Analog, non-powered background noise cancellation using the miracle of 180 degree phase cancellation. This equates to.... person + noise - noise = person talking... person + feedback - feedback = person.. I know this is hard to type out, but it's a technique I've been using in the pro-audio world since about 1980....and it still works...of course you could (and I usually do) use a very small mixer with variable levels on both mics.... that way you have some control over the '3 dimensional' 'sound' of the input....natural reflections from the source and natural slight delays and echoes from the room's environment. John
  • As someone "in the business," I thought I'd share my experiences with echo cancellation in major label recordings.

    I must admit that I've never dealt with echo cancellation as a large obstacle. It is generally something taken care of by the sound engineers along with everything else. And artists: when selecting an engineer, don't pay attention to name-dropping or even reputation if you haven't heard his (or her) work yourself. In this business, a DAT is worth a thousand words, and listening to what your engineer has done in the past will not only help you avoid difficult hires, but will give you context when working with this person.

    As for digital effects, I have had some experience with Akai products, but mostly regarding live performances. Again, I tend to leave the hard labour to my engineer(s). After all, I do not ask them to sing songs for me.

  • In your terminal emulator configuration, deselect "local echo".
  • There are two problems with a software-based echo canceller:
    • It's seriously compute intensive. You need two buffers: a circular echo buffer containing the last N samples, where N is the sample frequency times the longest echo you want to cancel, and another buffer of N samples which describes the echoes you are hearing. The echo cancellation to be applied to each sample is the sum of the products of the elements in the two buffers. For radio you want to be sampling at 20 kHz (giving a 10 kHz Nyquist cut-off), so if you want to cancel a 0.1 second echo then you have to do 2K multiplications per sample, at 20 kHz, giving 40e6 multiplications per second. Conference phones (the kind of thing that sits on a table) can do this more cost effectively by handling only 8k samples/sec, which gives much saner numbers. In parallel with this you have to be doing a much slower (say, 10 or 20 times per second) bit of DSP to tell you what the echoes are as people move around and change the acoustic properties of the room. I'm not sure how that's done, but it probably involves an FFT somewhere.
    • The whole thing is the hardest of hard real time. The echo correction for each sample has to be fed back into the audio at exactly the right time to be out of phase with the original echo. If you get the time wrong then you will be in phase for at least one frequency in the echo, and that frequency will be amplified around the loop.
    This is simply not a job for stock hardware on a stock OS. It needs a dedicated DSP. Also, I suspect you can save on the DSP by doing the buffers and multiplication in analogue hardware, but I don't know if that is cost effective in practice.

    Paul.
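    The two-buffer arrangement and the cost estimate in the first point above can be sketched as follows. This is an illustration under assumptions, not a full canceller: the function names and the `head` indexing convention are hypothetical, and the slower step that works out the echo description is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Multiplications per second for a direct sum-of-products canceller:
 * (taps per sample) x (samples per second), where taps = rate x tail. */
double multiplies_per_second(double sample_rate_hz, double echo_tail_s)
{
    double taps = sample_rate_hz * echo_tail_s;
    return taps * sample_rate_hz;
}

/* Echo estimate for the current sample: the sum of products of the
 * circular history buffer and the echo-description buffer.
 * `head` indexes the newest sample in `history`; older samples sit at
 * decreasing indices, wrapping around the end of the buffer. */
float echo_estimate(const float *history, const float *echo_coef,
                    size_t n, size_t head)
{
    float sum = 0.0f;
    size_t k;

    assert(n > 0 && head < n);
    for (k = 0; k < n; k++) {
        size_t idx = (head + n - k) % n;   /* sample from k steps ago */
        sum += history[idx] * echo_coef[k];
    }
    return sum;
}
```

    For the figures quoted above, `multiplies_per_second(20000.0, 0.1)` comes to about 40e6, and the same 0.1 second tail at 8000 samples per second comes to about 6.4e6.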
