Bayesian Filtering Outside of Email?
clonebarkins asks: "Is anybody out there using Bayesian filtering for stuff other than to get rid of spam? For example, how useful would Bayesian filtering be to identify news stories/blog entries in the RSS feeds I monitor? Is there any software out there using Bayesian filtering to do this sort of thing already? Are other types of filters better for these purposes?" What other areas can you think of where Bayesian filtering may prove useful?
Re:Bayesian isn't the right approach (Score:4, Insightful)
There are "clustering" techniques which attempt to identify similar bunches of data, without respect to any pre-determined bins, but they are not as useful for programmatically dealing with information. This is simply because you don't know in advance what the clusters will contain, so you cannot make assumptions about what you will want to do with each cluster.
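To make the clustering point concrete, here is a rough sketch (not any particular package's method) of a single-pass "leader" clustering over headlines, using word overlap (Jaccard similarity) as the similarity measure. The function names, the sample headlines, and the 0.3 threshold are all made up for illustration.

```python
def jaccard(a, b):
    """Word-set overlap: size of intersection divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(texts, threshold=0.3):
    """Greedy single-pass clustering (hypothetical helper).

    Each text joins the first cluster whose first member is similar
    enough; otherwise it starts a new cluster. No labels are involved.
    """
    clusters = []  # each cluster is a list of (text, word_set) pairs
    for text in texts:
        words = set(text.lower().split())
        for members in clusters:
            leader_words = members[0][1]
            if jaccard(words, leader_words) >= threshold:
                members.append((text, words))
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([(text, words)])
    return [[text for text, _ in members] for members in clusters]


# Made-up headlines, purely for illustration.
headlines = [
    "linux kernel release announced",
    "new linux kernel release",
    "spam filter accuracy improves",
    "bayesian spam filter accuracy",
]
groups = cluster(headlines)
for i, group in enumerate(groups):
    print(i, group)
```

The output is just group 0, group 1, and so on. Nothing tells the program which group is "kernel news" and which is "spam filtering news", which is exactly why it is hard to attach an automatic action to a cluster.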
Classification systems are used when you WANT to fit things into one of a number of bins, each of which you have already decided how to handle (e.g. SPAM - delete, From Mistress - show now, From Boss - file for later, From Debt collector - return "Deceased", etc.). Bayesian filtering is simply one form of classification.
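For anyone who wants to try the classification idea on something other than spam, here is a minimal sketch of a naive Bayes text classifier with add-one smoothing. It is an illustration under simplifying assumptions (whitespace tokenizing, tiny made-up training messages), not how any particular spam filter does it. The class name, the training texts, and the ACTIONS table are hypothetical; the bins and actions are borrowed from the examples above.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Toy multinomial naive Bayes over whitespace-split words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bin -> word -> count
        self.doc_counts = Counter()              # bin -> number of training texts
        self.vocab = set()                       # every word seen in training

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.doc_counts.items():
            # Work in log space: log prior plus sum of log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in words:
                # Add-one smoothing so an unseen word never gives log(0).
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# The bins are decided up front, and so is what to do with each one.
ACTIONS = {"spam": "delete", "boss": "file for later"}

nb = NaiveBayes()
nb.train("cheap pills buy now", "spam")
nb.train("win money now click here", "spam")
nb.train("quarterly report due friday", "boss")
nb.train("please review the budget report", "boss")

label = nb.classify("budget report friday")
print(label, "->", ACTIONS[label])
```

Swap the bins for "interesting" and "boring" and train it on RSS headlines you have already read, and you have the basic shape of the feed filter the question asks about.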
For more information and ideas, check out KD Nuggets [kdnuggets.com]
Nice work on the newsbot, BTW.