Software Technology

Bayesian Filtering Outside of Email? 54

clonebarkins asks: "Is anybody out there using Bayesian filtering for stuff other than to get rid of spam? For example, how useful would Bayesian filtering be to identify news stories/blog entries in the RSS feeds I monitor? Is there any software out there using Bayesian filtering to do this sort of thing already? Are other types of filters better for these purposes?" What other areas can you think of where Bayesian filtering may prove useful?
This discussion has been archived. No new comments can be posted.

  • Re:Nyuk Nyuk Nyuk (Score:4, Interesting)

    by NanoGator ( 522640 ) on Tuesday March 30, 2004 @12:37AM (#8711040) Homepage Journal
    " There have been a few "attacks" on slashdot which could have been prevented by simply blocking 'repeat' posts. "

    Filtering out GNAA posts would be nice. Not that I've run into it lately, but there was a story a couple of months back that had nearly 1,000 GNAA posts. Impressive organization on the part of the trolls, but it did take a while to suss out. (I wonder how many mods burned up mod points that night...)
  • by Jayfar ( 630313 ) on Tuesday March 30, 2004 @12:51AM (#8711125)
    See their technology overview [autonomy.com]. I believe they have a number of (ugh!) patents on Bayesian text analysis. They were founded by a Dr. Michael Lynch to productize research he did at Cambridge U.
  • by OnyxRaven ( 9906 ) on Tuesday March 30, 2004 @02:04AM (#8711500) Homepage
    For my Senior Project, I'm working on something that could use the Bayes method to classify webpages as 'good' or 'bad' for a proxy- or bridge-based connection-filtering or bandwidth-limiting application.

    Now, obviously for webpages it's a bit easier to say 'good' or 'bad', but this app (www.bandwidtharbitrator.com) already has some regular expressions for apps like Kazaa and BitTorrent, in the hopes of limiting their bandwidth. I wonder if a Bayesian system could be adapted to this domain? I considered it, but the person in charge of that part of the project is using a diff-like method (which I find silly).

    Are there easy-to-plug-into APIs and libraries that we could use to do all the 'hard work'? Is SpamBayes up to the task?
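
For readers wondering what the 'hard work' amounts to, here is a minimal sketch of a two-class naive Bayes text classifier with add-one smoothing. It is not SpamBayes or its API, and the class name `NaiveBayes`, the 'good'/'bad' labels, and the sample page text are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


class NaiveBayes:
    """Tiny multinomial naive Bayes over words (illustrative only)."""

    def __init__(self):
        self.word_counts = {"good": Counter(), "bad": Counter()}
        self.doc_counts = {"good": 0, "bad": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase and split into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def log_scores(self, text):
        vocab = set(self.word_counts["good"]) | set(self.word_counts["bad"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            # Smoothed log prior for the class.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(counts.values())
            for word in self.tokenize(text):
                # Add-one smoothing; the extra +1 reserves mass for unseen words.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(vocab) + 1)
                )
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical training data standing in for labelled page text.
nb = NaiveBayes()
nb.train("free music download kazaa torrent", "bad")
nb.train("university lecture notes physics", "good")
print(nb.classify("torrent download"))
```

Page text is only one possible feature source. Classifying raw traffic the way the regular expressions do would need a different tokenizer, which this sketch does not attempt.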
  • Control algorithms (Score:5, Interesting)

    by lindelof ( 606257 ) on Tuesday March 30, 2004 @05:35AM (#8712237)
    I work at the Building Physics Laboratory [lesowww.epfl.ch] in Lausanne, Switzerland, and I am investigating the possible use of Bayes' theorem in the building control field. The idea is to classify situations as bad or good based on feedback from the occupants and have the system learn from its mistakes.

    Consider, for instance, the total amount of sunlight hitting your computer screen. Most people would like an automatic system to control their window blinds to keep that amount to an acceptable level, but the system cannot know a priori what that level will be for a given user. So we let the system set the blinds to a setting deemed acceptable for the average user and use the user's manual interventions to build up a list of bad settings, corresponding to the setting immediately before the intervention, and good settings, corresponding to the setting immediately after the intervention.

    The system will then attempt to minimize the probability of the user rejecting its settings by applying Bayes' theorem.

    I've done only preliminary exploration of this idea so far but the results are encouraging, and we plan to do a full-scale experiment this summer.
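
The post gives no implementation details, so the following is only a rough sketch of how the good/bad bookkeeping could feed Bayes' theorem. It assumes a single control variable discretized into a few blind positions. The names `SETTINGS`, `BlindPreference`, `record_intervention`, `p_reject` and `best_setting` are hypothetical and not taken from the poster's system.

```python
from collections import Counter

# Hypothetical discretized blind positions, in percent closed.
SETTINGS = [0, 25, 50, 75, 100]


class BlindPreference:
    def __init__(self):
        self.bad = Counter()   # settings the user overrode
        self.good = Counter()  # settings the user chose instead

    def record_intervention(self, before, after):
        # The setting just before a manual change counts as bad,
        # the setting just after it counts as good.
        self.bad[before] += 1
        self.good[after] += 1

    def p_reject(self, setting):
        """P(bad | setting) via Bayes' theorem with add-one smoothing."""
        n_bad = sum(self.bad.values())
        n_good = sum(self.good.values())
        k = len(SETTINGS)
        like_bad = (self.bad[setting] + 1) / (n_bad + k)     # P(setting | bad)
        like_good = (self.good[setting] + 1) / (n_good + k)  # P(setting | good)
        prior_bad = (n_bad + 1) / (n_bad + n_good + 2)
        prior_good = 1 - prior_bad
        evidence = like_bad * prior_bad + like_good * prior_good
        return like_bad * prior_bad / evidence

    def best_setting(self):
        # Pick the setting the user is least likely to reject.
        return min(SETTINGS, key=self.p_reject)


prefs = BlindPreference()
prefs.record_intervention(before=50, after=75)
prefs.record_intervention(before=50, after=75)
prefs.record_intervention(before=25, after=75)
print(prefs.best_setting(), round(prefs.p_reject(50), 2))
```

A real controller would presumably condition on measured quantities such as the sunlight reaching the screen rather than on the blind position alone. That is left out here to keep the arithmetic visible.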

  • Kind of ... (Score:3, Interesting)

    by pen ( 7191 ) on Wednesday March 31, 2004 @01:23PM (#8726514)
    I run a submission-based web site [phrise.com] that, at times, gets a lot of duplicate (or very similar) submissions. I have a basic Bayesian script break each new submission into words and flag it if it's too close to something else.
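
The post does not say how "too close" is scored. As a stand-in, this sketch flags near-duplicates using Jaccard similarity over word sets, which is a plain set-overlap measure rather than a Bayesian model. The function names, the sample submissions, and the 0.6 threshold are assumptions for illustration.

```python
import re


def words(text):
    # Break a submission into a set of lowercase word tokens.
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by total distinct words across both texts."""
    wa, wb = words(a), words(b)
    if not wa and not wb:
        return 1.0  # two empty texts are treated as identical
    return len(wa & wb) / len(wa | wb)


def find_near_duplicates(new_submission, existing, threshold=0.6):
    # Return every stored submission whose overlap meets the threshold.
    return [old for old in existing if jaccard(new_submission, old) >= threshold]


# Hypothetical stored submissions.
existing = [
    "The quick brown fox jumps over the lazy dog",
    "A completely different submission",
]
flagged = find_near_duplicates("the quick brown fox jumps over the lazy dog!", existing)
print(flagged)
```

Comparing every new submission against every stored one is linear in the size of the archive, which is fine for a small site but would need an index at larger scale.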
