Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?
New submitter mni12 writes "I have been working on a Bayesian Morse decoder for a while. My goal is to have a CW decoder that adapts well to different ham radio operators' rhythm, sudden speed changes, signal fluctuations, interference, and noise, and that decodes Morse code accurately. While this problem is not as complex as speaker-independent speech recognition, there is still a lot of human variation where machine learning algorithms such as Bayesian probabilistic methods can help. I posted a first alpha release yesterday, and despite all the bugs, one brave ham has already reported success. I would like to collect thousands of audio samples (WAV files) of real-world CW traffic captured by hams via some sort of online system that would allow hams not only to upload captured files but also to provide relevant details such as their callsign, date & time, frequency, radio / antenna used, software version, comments, etc. I would then use these audio files to build a test library for automated tests to improve the Bayesian decoder's performance. Since my focus is on improving the decoder, not on building a digital audio archive service, I would like suggestions for any open source (free) software packages, online services, or other ideas on how to collect a large number of audio files effectively without putting much burden on the alpha / beta testers who submit their audio captures. Many available services require registration and don't support metadata or aggregation of submissions. Thanks in advance for your suggestions."
Re:Try the NSA (Score:5, Informative)
They like collecting stuff
Nah, amateur radio transmissions are some of the most boring conversations known to man (and I am a ham radio operator). No sex, drugs and rock and roll - no eavesdropping. Besides, we're mostly harmless.
Back to the topic. Because band use is prescribed, i.e., there are frequencies that are just CW (and others for phone, digital, or whatever), it would seem an easy job to just record a band for a while to grab some samples. Use a software-defined receiver (to allow for easy scripting) and work the grey line [qsl.net] in your area. Even if your software isn't tuned well yet, I would hazard a guess that it is smart enough to tell CW from radio noise. Use that to start and stop the file. You probably don't need WAV; that's sort of overkill for CW. Even cruddy ol' MP3 ought to give you more than enough headroom for further processing.
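The start/stop idea can be sketched roughly as follows. This is a minimal illustration, not anything from FLDIGI or CW Skimmer: it assumes the receiver audio is already available as a list of float samples, that the CW tone sits near 700 Hz, and that the function names, the 0.25 threshold, and the hang time are all made-up values to be tuned. It uses the Goertzel algorithm to measure how much of a block's energy sits at the tone frequency.

```python
import math


def goertzel_power(samples, sample_rate, freq):
    """Squared DFT magnitude of one block at the bin nearest `freq`."""
    n = len(samples)
    k = round(n * freq / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def tone_present(samples, sample_rate, freq=700.0, min_fraction=0.25):
    """True if a large share of the block's energy is at `freq`.

    The ratio is about 1.0 for a clean tone on the bin and about
    2 / len(samples) for white noise. The threshold is an assumption.
    """
    total = sum(x * x for x in samples)
    if total == 0.0:
        return False  # pure silence
    fraction = 2.0 * goertzel_power(samples, sample_rate, freq) / (len(samples) * total)
    return fraction >= min_fraction


def active_spans(flags, hang=20):
    """Turn per-block tone flags into (start, stop) block index ranges.

    A span stays open until `hang` consecutive blocks show no tone, so the
    gaps between dots, dashes, and words do not split one transmission.
    `stop` is exclusive.
    """
    spans, start, quiet = [], None, 0
    for i, flag in enumerate(flags):
        if flag:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= hang:
                spans.append((start, i - quiet + 1))
                start, quiet = None, 0
    if start is not None:
        spans.append((start, len(flags) - quiet))
    return spans
```

With 8 kHz audio cut into 400-sample (50 ms) blocks, calling tone_present on each block and feeding the resulting flags to active_spans gives the block ranges worth writing to a file. A real recorder would probably also scan several candidate tone frequencies rather than assume one.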
Re:Skimmer (Score:4, Informative)
I use CW Skimmer fairly actively - in fact, I have been corresponding with Alex, VE3NEA, who wrote CW Skimmer. He gave me the idea of pursuing a Bayesian framework [blogspot.com] as I have been working toward a well-functioning CW decoder. The main difference here is that I am focusing on improving FLDIGI [w1hkj.com], which is open source software, while CW Skimmer is a commercial software package. I do agree with you that CW Skimmer does a great job decoding multiple streams simultaneously. Once the algorithm works, decoding multiple streams [blogspot.com] is not that difficult.
Try HMMs (Score:5, Informative)
The thesis you are basing your work on is from 1977; while no doubt current when it was written, there has been a lot of work on human signal decoding since then.
I'd strongly suggest looking at Hidden Markov Models:
http://en.wikipedia.org/wiki/Hidden_Markov_model [wikipedia.org]
While some recent methods have gone beyond HMMs for speech recognition, HMMs have been the baseline "good" solution for the past decade.
Since this is a binary signal problem, another approach to consider would be Markov Random Fields (MRFs), which could be used as an initial de-noising pass or even as a full decoder if you set the cost functions right.
Your idea of user adaptation is pretty reasonable, but my guess is that the primary thing that matters is an overall speed scaling. In other words, for good decoding you probably just need to normalize the average letter rate between users.
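To make the suggestion concrete, here is a minimal sketch of a two-state HMM (key up / key down) decoded with the Viterbi algorithm. Everything specific in it is an assumption for illustration: the input is taken to be a tone envelope already scaled to roughly 0..1 per frame, emissions are Gaussians centred on 0 and 1, and p_stay and sigma are invented numbers. On a 1-D chain like this, the same computation is also the MAP solution of a chain MRF with a fixed penalty for switching state, so it doubles as the de-noising pass. A full decoder would need more states (dot, dash, and the various gaps).

```python
import math


def viterbi_key_states(envelope, p_stay=0.95, sigma=0.4):
    """Most likely key-up (0) / key-down (1) sequence for an envelope.

    `p_stay` is the assumed probability of staying in the same state from
    one frame to the next; `sigma` is the assumed emission noise.
    """
    if not envelope:
        return []
    means = (0.0, 1.0)
    log_stay = math.log(p_stay)
    log_switch = math.log(1.0 - p_stay)

    def log_emit(state, x):
        # Gaussian log-likelihood; the constant term is the same for both
        # states and is dropped.
        return -((x - means[state]) ** 2) / (2.0 * sigma ** 2)

    # Uniform prior over the initial state.
    score = [log_emit(s, envelope[0]) for s in (0, 1)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for x in envelope[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            if stay >= switch:
                best, prev = stay, s
            else:
                best, prev = switch, 1 - s
            new_score.append(best + log_emit(s, x))
            ptr.append(prev)
        score = new_score
        back.append(ptr)

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

For an envelope such as [0.1, 0.0, 0.9, 1.0, 0.2, 1.0, 0.9, 0.1, 0.0, 0.0], a plain 0.5 threshold would split the key-down run at the 0.2 dropout, while the transition cost keeps it in one piece.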
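A rough sketch of that speed normalization, under stated assumptions: the key-down durations (in seconds) have already been extracted, dashes are nominally three dots long, and the input contains both dots and dashes. The function names are hypothetical. The words-per-minute conversion uses the standard PARIS timing, where one dot lasts 1.2 / WPM seconds.

```python
from statistics import median


def estimate_dot_length(mark_durations):
    """Estimate the dot duration (seconds) from key-down durations.

    Splits marks into dots and dashes at the midpoint between the shortest
    and longest mark, then lets both groups vote on the unit length.
    Assumes the input contains both dots and dashes.
    """
    lo, hi = min(mark_durations), max(mark_durations)
    threshold = (lo + hi) / 2.0
    units = [d if d < threshold else d / 3.0 for d in mark_durations]
    return median(units)


def normalize_durations(durations, dot_length):
    """Express durations in dot units so different speeds look alike."""
    return [d / dot_length for d in durations]


def words_per_minute(dot_length):
    """PARIS-standard speed for a given dot length in seconds."""
    return 1.2 / dot_length
```

After this step, the same character sent at 10 WPM and at 20 WPM yields approximately the same sequence of 1s and 3s, so whatever model sits downstream only has to cope with each operator's rhythm rather than their absolute speed.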
Good luck.