Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

Posted by timothy
from the wax-cylinders-all-the-way dept.
New submitter mni12 writes "I have been working on a Bayesian Morse decoder for a while. My goal is to have a CW decoder that adapts well to different ham radio operators' rhythm, sudden speed changes, signal fluctuations, interference, and noise, and that decodes Morse code accurately. While this problem is not as complex as speaker-independent speech recognition, there is still a lot of human variation where machine learning algorithms such as Bayesian probabilistic methods can help. I posted a first alpha release yesterday, and despite all the bugs one brave ham has already reported success. I would like to collect thousands of audio samples (WAV files) of real-world CW traffic captured by hams via some sort of online system that would allow hams not only to upload captured files but also to provide relevant details such as their callsign, date & time, frequency, radio / antenna used, software version, comments, etc. I would then use these audio files to build a test library for automated tests to improve the Bayesian decoder's performance. Since my focus is on improving the decoder and not on building a digital audio archive service, I would like to get suggestions for any open source (free) software packages, online services, or any other ideas on how to collect a large number of audio files effectively without putting much burden on alpha / beta testers to submit their audio captures. Many available services require registration and don't support metadata or aggregation of submissions. Thanks in advance for your suggestions."

  • by symes (835608) on Thursday December 26, 2013 @12:24PM (#45788087) Journal

    Picasa came to mind - this service supports audio files and, last time I looked, allows you to share stuff. Although I should add that it has been a while since I looked at this service. Compliments on your clearly written post... days of /. gone by

  • by dbc (135354) on Thursday December 26, 2013 @12:25PM (#45788095)

    So.... I guess you've never heard of Skimmer, the various remote receivers out there, and the SDRs that people are using to record large swathes of shortwave spectrum? You know people have been working on the problem for a while, as in decades? Skimmer decodes multiple streams of Morse at once. Wake me when your stuff outperforms Skimmer.

    • Re:Skimmer (Score:4, Informative)

      by mni12 (451821) on Thursday December 26, 2013 @12:56PM (#45788265) Homepage

      I am using CW Skimmer fairly actively - in fact I have been corresponding with Alex, VE3NEA, who wrote CW Skimmer. He gave me the idea of pursuing a Bayesian framework [blogspot.com] as I have been progressing in developing a well-working CW decoder. The main difference here is that I am focusing on improving FLDIGI [w1hkj.com], which is open source software, while CW Skimmer is a commercial software package. I do agree with you that CW Skimmer does a great job decoding multiple streams simultaneously. Once the algorithm works, decoding multiple streams [blogspot.com] is not that difficult.

      • by dbc (135354)

        So it isn't that hard to record 200 kHz wide segments of an HF band using something like this: http://rfspace.com/RFSPACE/SDR-IQ.html [rfspace.com] How many hours of test audio do you need? If you had a few volunteer owners of SDRs do some recording for you, you would have a large test base quickly. Unless I completely misunderstand the scale of the test base that you are going after. But it seems to me that 100 or 200 hours should not be difficult to get from volunteers -- and 200 hours times 200 kHz is a lot of CW a

        • by mni12 (451821)

          I have two SDR receivers myself and use them actively. The problem is not the volume of data but having a data set with enough variability to find the limits where the decoder stops working correctly. I integrated the decoder into FLDIGI in the hope that other hams would try it out and report back [eham.net] when they observe conditions where the decoder stops working.

          I have also created many synthetic Morse files with different speed and Signal-to-noise ratio in order to plot the performance of th [blogspot.com]

          • by plover (150551)

            So to get this variety, what you really need is a network of volunteers with SDRs set up in listening posts around the globe. I think you'll get the most participation by making a solution as turn-key as possible for volunteers. Perhaps what you could do is to wrap up a software package that you could distribute to all these people. It could install the SDR drivers, and run the capture program on the appropriate frequencies. Set up a server where your volunteers can upload their captures. Set up the ca

          • by dbc (135354)

            Synthetic files at least have the advantage of automatically generated expected results files to go with them. Coming up with a good noise model seems to be the hard part. Perhaps recording off-the-air noise and summing it into clean computer-generated CW receive audio is a way to get a reasonable start on a channel model that has a more realistic HF noise model. Fading is easy enough to add to a propagation model, which could be extended to auroral flutter and backscatter. Of course, you also need a sen

  • You can collect a lot of Morse code traffic in the wild. Just get yourself a good HF receiver with some filtering (notch filter and a DSP). Set up a dipole as your receive antenna, with each leg cut to 1/4 the wavelength of the band you will be monitoring. Here is a handy band plan [arrl.org] to guide you to where you will be able to find Morse code, which is normally called CW for continuous wave communications.

    I recommend this over any attempt to collect samples directly from hams. I know I do morse code differently when using the

  • by Dishwasha (125561) on Thursday December 26, 2013 @12:37PM (#45788181)

    Write an article and submit to ARRL's QST [arrl.org] and join and post to the AMSAT mailing lists [amsat.org] as there are quite a few keys there as well. Talk to your local amateur radio club and get the word out and you might even talk to your area coordinator.

    • Probably the best idea yet. Your signal to noise (so to speak) is going to be much higher there than on Slashdot......

  • I wrote a similar application in the late 1980's using a backpropagation neural net, and it was difficult to complete.

    Asking for volunteer submissions is the easiest and obvious answer. There is a group of commercial operators at http://www.radiomarine.org/ [radiomarine.org] who might have tapes for you.

    Some of the CONET project recordings feature morse, but the ones that I have heard sound mechanically generated.

    Finally, you can collect for yourself. HF is a desert these days, the last time I tried the only hams were acti

  • Try HMMs (Score:5, Informative)

    by SnowZero (92219) on Thursday December 26, 2013 @01:12PM (#45788373)

    The thesis you are basing your work on is from 1977; while no doubt current when it was written, there has been a lot of work on human signal decoding since then.

    I'd strongly suggest looking at Hidden Markov Models:
        http://en.wikipedia.org/wiki/Hidden_Markov_model [wikipedia.org]
    While some recent methods have gone beyond HMMs for speech recognition, that's been the baseline "good" solution for the past decade.

    Since this is a binary signal problem another approach to consider would be Markov Random Fields (MRFs) which could be used as an initial de-noising pass or even as a full decoder if you set the cost functions right.

    Your idea of user adaptation is pretty reasonable, but my guess is the primary thing that matters would be an overall speed scaling. IOW for good decoding you probably just need to normalize the average letter rate between users.

    Good luck.
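
As a rough illustration of the HMM idea above, here is a minimal sketch of Viterbi decoding over a two-state model (key up / key down) applied to noisy binary samples. Everything in it is an assumption made for illustration: the function name viterbi_on_off, the symmetric transition probability p_stay, and the per-sample reliability p_correct are not taken from any decoder mentioned in this thread.

```python
import math


def viterbi_on_off(observations, p_stay=0.95, p_correct=0.8):
    """Return the most likely key-up (0) / key-down (1) sequence.

    observations: list of noisy 0/1 samples (e.g. a thresholded envelope).
    p_stay:       assumed probability that the key state does not change
                  between two consecutive samples.
    p_correct:    assumed probability that a sample matches the true state.
    """
    if not observations:
        return []
    states = (0, 1)
    # Work in log space so long sequences do not underflow.
    trans = {(a, b): math.log(p_stay if a == b else 1.0 - p_stay)
             for a in states for b in states}
    emit = {(s, o): math.log(p_correct if s == o else 1.0 - p_correct)
            for s in states for o in states}

    # Uniform prior over the initial state.
    score = {s: math.log(0.5) + emit[(s, observations[0])] for s in states}
    back = []  # back[t][s] = best predecessor of state s at step t + 1

    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: score[p] + trans[(p, s)])
            new_score[s] = score[prev] + trans[(prev, s)] + emit[(s, obs)]
            pointers[s] = prev
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these assumed parameters, a single flipped sample inside a long tone costs less to explain as noise than as two key transitions, so it gets smoothed out. A real decoder would more likely use continuous emission models and states for dit, dah and the different gap lengths.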

    • by mni12 (451821)

      Thanks @SnowZero. I have looked at HMMs and in fact I wrote a simplistic decoder version using RubyHMM just to learn more about how HMMs really work. You would be surprised by the mathematical rigor of the original thesis [archive.org]. Many of the ideas are very relevant today, just much easier to implement with the current generation of computers.

      The current decoder actually uses a Markov model - the software calculates conditional probabilities based on a 2nd order Markov symbol transition matrix. The framework itself allows to

      • You might also like to have a look at this paper on using HMMs to convert a (continuous) chromatographic signal into (discrete) base pairs "calls" during DNA sequencing: Link [mit.edu]. The problem seems similar to the one you are working on, in many respects.
  • A friend pointed this out to me the other day:

      https://archive.org/details/SsMarineElectricWoohSos [archive.org]

    • by mni12 (451821)

      I did listen to parts of the conversation between WOOH, NMN, LJKR and other boats in the vicinity. Scary indeed.
      BTW - FLDIGI had a hard time decoding this correctly, partly because the signal quality was so poor.
      Thanks for sharing.

  • Obviously you realize there are differences in how people send CW. While I applaud your drive to make a smarter decoder - the reality is that you need to make sure it works on live traffic. So in that respect, you should hook it into some kind of SDR software like HRD or even make your own that can decode multiple streams of CW. If you don't have a radio, I suggest maybe a SoftRock receiver?

    1. It gives you actual live conversations with all the mistakes and alterations. Not everyone uses computer genera
    • by mni12 (451821)

      @jfalcom -- I do realize the differences between live traffic and recordings. The example links I provided above demonstrated a live feed from the ARRL W1AW code bulletin on 12/24 at 3.58105 MHz that I decoded using an experimental version of FLDIGI v3.21.75 connected via SignaLink USB to an Elecraft KX3 radio.

      However, there is a difference between debugging software and listening to live feeds. I posted this question to figure out ways to get a test set of boundary conditions captured by other hams so that I cou

  • This is a problem that affects all kinds of machine learning. It is always very difficult to collect enough samples to teach good recognition skills, whether it is handwriting, speech or, as in this case, Morse code. I'm wondering whether some open library that could be uploaded to for this kind of thing might already exist, or if not, whether it might be a good idea.
  • by eyenot (102141)

    Couldn't you just create a computer generator for this audio, that uses a PRNG to intersperse pauses and other variations? You could create a much wider variety of conditions to put your parser through by controlling how much variation is in the length of each beep, pauses between beeps, pauses between letters. You could create a really bungling case or create a perfect case, and anything in between. Why not just do that?
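
A minimal sketch of the generator idea above, assuming the conventional timing ratios (dah = 3 dits, gaps of 1, 3 and 7 dits) and the "PARIS" convention that one dit lasts 1.2 / wpm seconds. The function name keying_durations and the Gaussian jitter model are illustrative assumptions only; real operators' timing may well follow a different distribution.

```python
import random

_LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
_CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- -. --- .--. "
          "--.- .-. ... - ..- ...- .-- -..- -.-- --.. "
          "----- .---- ..--- ...-- ....- ..... -.... --... ---.. ----.").split()
MORSE = dict(zip(_LETTERS, _CODES))


def keying_durations(text, wpm=20.0, jitter=0.0, seed=None):
    """Return a list of (key_down, seconds) events for `text`.

    jitter is the assumed relative standard deviation of every element and
    gap length: 0.0 gives machine-perfect code, larger values a sloppier
    "fist". Characters outside A-Z and 0-9 raise KeyError.
    """
    rng = random.Random(seed)  # seeded PRNG makes test cases reproducible
    dit = 1.2 / wpm

    def vary(units):
        # Scale the nominal length by a random factor, never below 10 %.
        factor = max(0.1, rng.gauss(1.0, jitter)) if jitter > 0 else 1.0
        return units * dit * factor

    events = []
    for w_index, word in enumerate(text.upper().split()):
        if w_index > 0:
            events.append((False, vary(7)))          # gap between words
        for l_index, letter in enumerate(word):
            if l_index > 0:
                events.append((False, vary(3)))      # gap between letters
            for e_index, element in enumerate(MORSE[letter]):
                if e_index > 0:
                    events.append((False, vary(1)))  # gap inside a letter
                events.append((True, vary(1 if element == "." else 3)))
    return events
```

Because the text is known, each generated file comes with its expected decode for free; sweeping jitter from 0 upward with a fixed seed gives a repeatable range from the perfect case to the bungling one.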

    • by mni12 (451821)

      Great idea and in fact I have been using this strategy to create a number of different synthetic test cases. I have synthetic audio files with various Signal-to-Noise levels, with different speeds and so on. The variable timing (rhythm) is more difficult to simulate as there is no clear distribution (like Gaussian) to use as a model. Only if you aggregate over many users and normalize by speed you can start to observe some sort of Gaussian distribution in dits and dahs. I wrote about this problem wh

  • by Dan East (318230) on Thursday December 26, 2013 @04:56PM (#45790245) Homepage Journal

    First off, thank you Slashdot UI, for having me retype this whole thing again.

    I did this back in the early 90s with my Amiga. The hardware interface consisted of a transistor, filter capacitor, and variable resistor (I don't remember the exact design I came up with) to interface to the Amiga's joystick port (which used standard Atari controller wiring). I wrote the software decoder in Blitz Basic, and it used a scrolling window of 20-30 seconds over which it would average the pulses to determine the current dit and dah length. Any pulses deviating significantly from the current dit and dah length indicate a likely change in operator (one station finished keying and the other began their response), and the window would be positioned using that as the edge point.

    The system worked extremely well, and was far more accurate than my AEA PK-232MBX when it came to decoding morse code. It decoded most anything I threw at it. Decoded output was sometimes delayed until it had received enough code to determine the current transmission rate and style, and then it would output a chunk of text at one time as it decoded the whole buffer at once. Then it would output real-time until a deviation in dit-dah lengths had been exceeded and the window repositioned so the dit and dah length could be recalculated.

    There are two discrete problems to address, and it sounds like you're lumping them together, which may not be a good way to proceed. First is the audio filtering / notch filter which tries to isolate a specific Morse code signal out of other transmissions in the adjoining frequencies and general background noise. The other is simply decoding of the Morse code message. Ideally, step 1 should be the analog portion, and step 2 should be purely digital.
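
The scrolling-window scheme described above might look roughly like the following sketch. It is not a reconstruction of the original Blitz Basic program: the window is counted in pulses rather than seconds, and the names split_dits_dahs and decode_marks as well as the tolerance value are assumptions made for illustration.

```python
def split_dits_dahs(durations):
    """Return (dit, dah) mean lengths, or None if only one kind is present."""
    ordered = sorted(durations)
    if len(ordered) < 2 or ordered[-1] < 2 * ordered[0]:
        return None
    # The largest ratio between neighbouring sorted durations is taken as
    # the boundary between the short (dit) and long (dah) clusters.
    cut = max(range(1, len(ordered)), key=lambda i: ordered[i] / ordered[i - 1])
    shorts, longs = ordered[:cut], ordered[cut:]
    return sum(shorts) / len(shorts), sum(longs) / len(longs)


def decode_marks(pulses, window=20, tolerance=0.5):
    """Classify key-down durations (seconds) as '.' or '-'.

    Returns (symbols, resets). symbols[i] stays None if a pulse could never
    be classified; resets lists the indices where a pulse deviated too far
    from the current estimate and the window was restarted, which hints at
    a change of operator.
    """
    symbols = [None] * len(pulses)
    resets = []
    buf = []         # (index, duration) pairs inside the current window
    estimate = None  # current (dit, dah) lengths, once known

    for i, d in enumerate(pulses):
        if estimate is not None:
            nearest = min(estimate, key=lambda ref: abs(d - ref))
            if abs(d - nearest) > tolerance * nearest:
                resets.append(i)         # window edge: start over here
                buf, estimate = [], None
        buf.append((i, d))
        buf = buf[-window:]              # keep only the most recent pulses
        new_estimate = split_dits_dahs([dur for _, dur in buf])
        if new_estimate is not None:
            estimate = new_estimate
        if estimate is not None:
            dit, dah = estimate
            # Output is delayed until an estimate exists, then flushed.
            for j, dur in buf:
                if symbols[j] is None:
                    symbols[j] = "." if abs(dur - dit) <= abs(dur - dah) else "-"
    return symbols, resets
```

As in the description, pulses are held back until the window has seen both dits and dahs and are then classified in one chunk. One limitation of this sketch is that a new operator whose dah happens to be close to the previous operator's dit will not trigger a reset.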

  • CW is dead, buddy.

    Dead as in "There are few people left on the planet who actively work CW on a high proficiency level without using a keyboard and a screen reader".

    Today you can see ham shacks without a CW keyer as a norm, and if you see a CW keyer, the owner only in rare cases can go beyond 20wpm without breaking a sweat, making lots of errors all along the way and getting frustrated at hearing others do perfect CW, albeit with a keyboard.

    To give you a sense of scale: There are no more than roughly 4-500
