Smart Spam Filtering For Forums and Blogs? 183

Posted by timothy
from the world-will-beat-a-path-to-your-door dept.
phorm writes "While filtering for spam on email and related media seems to be fairly productive, there is a growing issue with spam on forums, message boards, blogs, and other such sites. In many cases, sites use prevention methods such as captchas or question-and-answer challenges to try to restrict input to human visitors. However, even with such safeguards (and especially with most forms of captcha being cracked fairly often these days), spammers are becoming an increasing nuisance. While searching for plugins or extensions to SpamAssassin and the like, I have had little luck finding anything not tied to the email framework. Google searches for PHP-based spam filtering tend to come up with mostly commercial and/or email-related filters. Does anyone know of a good system for filtering spam in general messages? Preferably such a system would be FOSS, with a daemon component (accessible by port or socket) to offer quick response times."

  • Akismet (Score:5, Informative)

    by seifried (12921) on Sunday December 28, 2008 @05:49PM (#26252239) Homepage
    Akismet
    • Second that! (Score:5, Informative)

      by _merlin (160982) on Sunday December 28, 2008 @06:03PM (#26252349) Homepage Journal

      Akismet [akismet.com] is the best thing for blog spam prevention ever. I can't believe you've never stumbled across it before. It uses statistical analysis to identify spam, and the more people use it, the better it gets. If everyone used it, the blog spammers would just disappear because their attacks would be completely ineffective.

      • Re:Second that! (Score:5, Informative)

        by seifried (12921) on Sunday December 28, 2008 @06:09PM (#26252417) Homepage
        Add to which it has an API/etc. It really is what you should be using.
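For readers wondering what calling that API looks like, here is a minimal sketch in Python using only the standard library. It assumes Akismet's REST `comment-check` endpoint and its commonly documented form fields (`blog`, `user_ip`, `user_agent`, `comment_content` and so on); the key, site URL and the two function names are placeholders made up for this example, so check the details against Akismet's own documentation before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_comment_check(api_key, blog, user_ip, user_agent, author, content):
    """Build (url, body) for a comment-check call. Names here are illustrative."""
    url = f"https://{api_key}.rest.akismet.com/1.1/comment-check"
    body = urlencode({
        "blog": blog,                # front page URL of the site being protected
        "user_ip": user_ip,          # IP address of the commenter
        "user_agent": user_agent,    # the commenter's browser User-Agent string
        "comment_type": "comment",
        "comment_author": author,
        "comment_content": content,
    }).encode("utf-8")
    return url, body


def is_spam(url, body, timeout=5):
    """POST the check; the service is expected to answer 'true' for spam."""
    request = Request(url, data=body, headers={"User-Agent": "forum-filter-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip() == "true"
```

Keeping the request building separate from the network call makes it easy to log or unit-test what would be sent without contacting the service.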
      • Re:Second that! (Score:5, Interesting)

        by Indefinite, Ephemera (970817) on Sunday December 28, 2008 @06:21PM (#26252499)
        The difficulty in evaluating Akismet - I speak not as a user but as someone who ended up apparently blacklisted and having to try their appeals system - is that everyone I see praising it is by definition the kind of person who pays attention to the filter and therefore will train it effectively. Since your average wordpress.com user more likely lets false positives pile up, I'd love to know how effective it is for people who don't wonder how effective it is.
        • Re:Second that! (Score:5, Informative)

          by _merlin (160982) on Sunday December 28, 2008 @06:55PM (#26252727) Homepage Journal

          I've used it for a few years now. In that time, it has caught tens of thousands of spam comments. It has missed about ten spam comments (i.e. allowed them through). It has misidentified two legitimate comments as spam. Yes, I realise I'm keeping an eye on it, and someone who doesn't may not notice that it's causing problems for them. But the stats are pretty good in my case. I'm aware of the allegations of corruption and using it to gag people, but that hasn't affected me yet.

        • by sfbanutt (116292)

          Hmm.. I've run akismet for a couple of years now and have never had a false positive. It's missed a few spams, but never marked a legit post as spam.

          • I have - once. Not bad.

          • Re:Second that! (Score:5, Informative)

            by sfbanutt (116292) on Sunday December 28, 2008 @11:49PM (#26254495) Homepage

            I just noticed a handy Akismet stats link in the latest version. I've been running Akismet since October 2006, in that time there have been 26,575 comments on my blog, of which 26,302 were spam(!). It missed 25 spam comments that had to be manually moderated and passed 273 legit comments. There have been no false positives. Personally, I think that's a pretty darn good record.

      • by Ihmhi (1206036)

        How does it handle the problem of "CAPTCHA farms" in India and China where they get $2 per 1,000 or so spam messages they post by hand?

        • That comment about the "CAPTCHA farms" reminds me of a recent experience I've had with the official Steam forums for GTA4. I was trying to search for posts by people who were having the same problem as I was, but the text I had to enter was lime green on a bright background. I couldn't figure it out for the life of me. And getting a new one only made it harder to read.

          My question is: If I can't read them, am i a robot?
          • Re: (Score:3, Funny)

            by kv9 (697238)

            My question is: If I can't read them, am i a robot?

            well, let's see. would you injure a human being or, through inaction, allow a human being to come to harm?

  • I always thought (Score:2, Informative)

    by davebarnes (158106)

    Re-Captcha was fairly effective and easy to install and useful.

  • D.I.Y. (Score:2, Insightful)

    by Zsub (1365549)

    Or am I misunderstanding what FOSS really is about?

    • Re:D.I.Y. (Score:4, Informative)

      by Korin43 (881732) on Sunday December 28, 2008 @06:27PM (#26252533) Homepage
      Yes. The point of FOSS is that one person can do it and no one else needs to do it again unless they want to make it better. This guy is looking for a solution, and the solution already exists. He would be wasting his time if he did it himself.
    • Re: (Score:2, Insightful)

      by Trahloc (842734)
      Not everyone is a programmer, some of us assist in less direct ways.
      • by Firehed (942385)

        It's not even a matter of programming skills. If you look at why spam never gets through to your Gmail inbox, it's because Google has a database of billions if not trillions of messages to run analysis on. When you have a twelve-digit sample size to work with, matching can be done much more accurately than with a couple hundred messages. It's pretty easy to slap together a system where you manually flag messages as good or bad. Being able to call $akismet->isCommentSpam() and have everything done for

        • Re: (Score:2, Interesting)

          by mysidia (191772)

          This suggests a solution... Instead of using the web for comment submission: use SMTP.

          A user who wants to submit a comment answers a captcha, and clicks a "submit" button.

          An e-mail address is displayed for them to send their comment to.

          They e-mail their comment, which goes to somemailboxname+blahblah@gmail.google.com

          If Google doesn't consider it spam, then the message gets forwarded to a secret mailbox on the blog server.

          A script running on the blog server parses the message, determines what t

    • by dubl-u (51156) *

      Or am I misunderstanding what FOSS really is about?

      If your first instinct is to build it yourself, then yes, you are kind of missing what FOSS is really about. To jointly improve shared solutions, you first have to find the solutions that are already out there.

    • For one large site I took an open source Bayesian filter and customized it. This site was large enough to get spam that's only posted there, so a DIY Bayesian filter worked extremely well. They have staff to remove spam and illegal content, so the filter simply aided the staff, who were able to train the filter very quickly.

      However, this solution would be useless without enough content and without people properly training the filter. If you get generic spam in a common scenario then a more generic soluti

  • I've been thinking about modifying my VSDB software [bigattichouse.com] to do something like this...
  • There are a number of things you can try:

    For a small site I helped set up, they went to complete SSL and client certificates, where users had to obtain a cert from Verisign or Comodo before they would get access. This stopped spam, and one can obtain a client cert for free or a low cost. However, this can't be done for most forums or blogs.

    For larger sites, a lot have ended up moving to an approval type of system where a human approves the creation of the user, then a limit on how many posts a first time

    • And when you get the $5 - $10 in PayPal from a scammer using someone else's credit card number, your PayPal account usually gets suspended if not flat out closed.
  • mollom not so free (Score:3, Interesting)

    by jeffstar (134407) on Sunday December 28, 2008 @06:07PM (#26252391) Journal

    mollom [mollom.com]

    i discovered this one through drupal. I thought it was completely free but apparently for high traffic sites it isn't.

    I think all your user generated content is sent to them and checked for spaminess against the other submissions they are receiving and they give you back a rating.

    • by yelvington (8169)

      I use Mollom because of its excellent integration with Drupal. It's free for up to 100 legitimate posts per day, 30 euros/month after that.

      It works very well for stopping spambots without annoying real humans (which a plain captcha will do).

      Human spammers still slip through, but when you delete their work, it's fed back to the Mollom database, protecting you and others from repeats.

  • by loony (37622) on Sunday December 28, 2008 @06:09PM (#26252407)

    Any method you use can be broken. Your only chance is to reduce the likelihood that your site is worth the effort.

    Basically, if you use a common solution - no matter of FOSS or commercial - then there will be a thousand other sites that use it too. This attracts attackers because they know when they hack it once, they can re-use it.

    However, if you handcode something, no matter how primitive, it likely lasts a lot longer because nobody bothers hacking into your site...

    Of course that doesn't work if you have a large site like myspace - there, a single site is worth the effort by itself.

    Anyway - then there are two things - a really fast moving animated gif and silly things where you ask people to identify items usually work.
    I help out with a site that randomly takes five pictures of cats and dogs and it asks you to identify which of the images contains the highest number of kittens... We barely ever get spam through - and that with almost 20K attempted submissions by non-humans a day makes us pretty happy

    Peter.

    • by Sir_Lewk (967686)
      Sounds like security through obscurity to me; if someone actually tries to target *your site* (which will happen if it's popular enough) then chances are it'll be broken in no time.
      • Which is true - a specialized attack will succeed - but for smaller, personal sites, the spammers won't bother.

    • by dattaway (3088) * on Sunday December 28, 2008 @06:59PM (#26252761) Homepage Journal

      However, if you handcode something, no matter how primitive, it likely lasts a lot longer because nobody bothers hacking into your site...

      Simply renaming the .php files worked 100% for me.

    • Any method you use can be broken. Your only chance is to reduce the likelihood that your site is worth the effort.

      That's only one approach. The other approach is to increase the response time dramatically: once you've been spammed, if you can clean up very quickly and reconfigure to prevent similar attacks in future, then visitors are unlikely to notice anything. This is the key advantage of statistical spam filters, as they make it relatively painless to respond and reconfigure to handle whole classe

    • Re: (Score:2, Interesting)

      by edmazur (958154)

      I'll second this.

      My friend runs a smaller site and was having a problem with forum spam. He edited the registration page to include a checkbox that said something along the lines of "check this box if you are not a bot". His problems went away instantly. Obviously this does not scale well, but for smaller sites being targeted randomly by automatic spam crawlers, it appears to be very effective.

    • by KermodeBear (738243) on Sunday December 28, 2008 @11:12PM (#26254327) Homepage

      I have a very simple, small site that I run that allows small comments. It was fine until the spam bots found it. Anyways, I just added a simple question about the background color of the site, which must be correct in order for the comment to be posted. I haven't had a single issue since (except for the occasional troll, but what can you do about that).

      The nice thing about something like this, a handmade thing, is that the spammers won't bother 'breaking' it. As the parent mentions, the spammers are attacking the common solutions - so a little home grown bit will work wonders.

      • by Kalriath (849904) *

        Anyways, I just added a simple question about the background color of the site, which must be correct in order for the comment to be posted. I haven't had a single issue since (except for the occasional troll, but what can you do about that)

        Oh, and those pesky colourblind people. But screw them, eh?

        Anything based on something a human may not be able to solve sucks.

        • Funny that you mention it, but I am red-green color blind myself.

          Perhaps the background of the site is white - which everyone should be able to see, hm? (o;

        • What if his background is white? Even if it's not, color blind people are not helpless. They are allowed to drive with colored traffic lights you know.
      • I did the same thing for our small, semi-private forums awhile back. I added about six questions on either the subject matter of the site or the colors on the page, or how to best dispatch spammers. Our spammers immediately went to 0, since it would actually take a tiny bit of human interaction to create a login.
  • by WebmasterNeal (1163683) on Sunday December 28, 2008 @06:11PM (#26252443) Homepage
    I have a series of 4 tests to block spam on my website. So far it has stopped over 30,000 attempts in the last year.

    Test one is, does the last name = the first name. For some reason almost all spammers do this.

    Second, do they use a keyword from a list of about 15 words.

    Third, do they fill out a hidden inputbox? This is sort of the reverse captcha.

    Finally do they use more than 4 "http" in a post. Almost all comment spam is an SEO effort to increase their pagerank.
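The four tests above could be sketched roughly as follows. This is only an illustration, not the poster's code: the keyword list is invented (the post mentions about 15 words without naming them), and the function and field names are assumptions.

```python
import re

# Hypothetical keyword list; the original list of about 15 words is not given.
SPAM_KEYWORDS = {"viagra", "casino", "payday"}
MAX_LINKS = 4


def failed_tests(first_name, last_name, body, hidden_field):
    """Return the names of the heuristic tests this submission fails."""
    failures = []
    # Test 1: first name and last name are identical.
    if first_name.strip().lower() == last_name.strip().lower():
        failures.append("same_name")
    # Test 2: the body contains a blacklisted keyword.
    words = set(re.findall(r"[a-z0-9]+", body.lower()))
    if words & SPAM_KEYWORDS:
        failures.append("keyword")
    # Test 3: the hidden input box, which humans never see, was filled in.
    if hidden_field.strip():
        failures.append("hidden_field")
    # Test 4: more than four occurrences of "http" in the post.
    if body.lower().count("http") > MAX_LINKS:
        failures.append("too_many_links")
    return failures
```

Returning the list of failed tests rather than a bare yes/no makes it easier to see which rule is doing the work and which one produces false positives.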
    • Hidden Input Box (Score:5, Informative)

      by waldoj (8229) <waldo@nosPaM.jaquith.org> on Sunday December 28, 2008 @06:31PM (#26252557) Homepage Journal

      Third, do they fill out a hidden inputbox? This is sort of the reverse captcha.

      This is really a very good test. As others have mentioned in this thread, it's the sort of thing that spammers will circumvent if it becomes widespread, but for now it's great.

      There's something else I've found to be really quite effective: deliberately misnaming my form fields. For instance, give the input field that's labelled "First Name" an input name of "phone number." Humans don't use input names to determine what text to enter, but spambots do. Then check that input: if the first name field contains a phone number, you know you've got yourself a spammer.

      I've used solely the combination of these two things to run one of my websites for two years now, and I get a vanishingly small amount of spam.
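A server-side check for the misnamed-field trick might look like the sketch below. The field name `phone_number` and the phone-number pattern are assumptions chosen for illustration; the post does not say how the value is tested.

```python
import re

# The visible label says "First Name", but the input's name attribute is a decoy.
DECOY_FIELD_NAME = "phone_number"  # hypothetical name, chosen for this sketch

# Digits, spaces and common phone punctuation only, at least seven characters.
PHONE_LIKE = re.compile(r"^[\d\s()+.\-]{7,}$")


def looks_like_bot(form):
    """True when the decoy-named first-name field holds a phone-like value."""
    value = form.get(DECOY_FIELD_NAME, "").strip()
    return bool(PHONE_LIKE.match(value))
```

A human who reads the label types a name, which never matches the pattern; a bot that reads the input name tends to fill in digits.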

      • Bad Idea (Score:5, Insightful)

        by erlehmann (1045500) on Sunday December 28, 2008 @08:15PM (#26253325)

        As someone who once used text browsers, I can only advise everyone not to do this - it breaks accessibility at a fundamental level: I got banned from a forum once because they mislabeled fields.

        What however, works really great for comment spam is a simple question like "What is the name of Barack Obama ?".

        • Re: (Score:2, Funny)

          by Anonymous Coward

          Barack HUSSEIN Obama.

          Sorry. Had to. Just a little jab at people who feel the need to point that out all the time.

          Anonymous Coward to prevent a serious Karmic Backlash from people who can't take a joke.

        • What however, works really great for comment spam is a simple question like "What is the name of Barack Obama ?".

          I know, I know! It's Bin Laden! ~

          And yes, unfortunately, you have to take that sort of thing into account (though I guess it depends on the kind of forum/blog that you're running).

        • by waldoj (8229)

          I'm afraid you misunderstand me. Again, only the field name is affected, not the label for the field. I've used text-only browsers regularly since 1994 (Mosaic over a 14.4k modem, Lynx, and now Links), and I'm yet to encounter one that displays the name element of an input field to the user.

      • by gr8dude (832945)

        What about people who rely on screen readers?

    • My 3 tests also work (Score:5, Interesting)

      by lalena (1221394) on Sunday December 28, 2008 @07:02PM (#26252787) Homepage
      I have implemented something similar, but I haven't been checking the number of blocked messages. All I know is that I used to get spam, and now I haven't gotten any for years. I use this for the forums and the Contact Us page.

      My rules are:
      1) The text boxes for things like name and subject are actually called junk.
      2) There are hidden textboxes called name and subject (1 hidden by javascript and one by CSS) that if they are populated the post is ignored.
      3) A third hidden field is the result of a simple javascript math equation that is checked on the server side. If the value is wrong, the post is thrown out.

      As others have said, if your site is small these types of things are good enough to prevent spam because the spammers won't bother to figure it out. These concepts would never work for any of the larger sites or 3rd party forum software.
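The server side of rules 2 and 3 could be sketched like this. It assumes the page's script adds two numbers that the server embedded in the form and writes the sum to a hidden field called `js_answer`; that field name, and `junk1`/`junk2` for the real inputs, are made up for the example, and how the server remembers the two operands is left out.

```python
def accept_post(form, operand_a, operand_b):
    """Apply the hidden-field and math checks to a submitted form.

    operand_a and operand_b are the numbers the server embedded in the page
    for the script to add; storing them per visitor is outside this sketch.
    """
    # Rule 2: the decoy fields named "name" and "subject" must come back empty.
    if form.get("name", "").strip() or form.get("subject", "").strip():
        return False
    # Rule 3: the script-computed answer must match what the server expects.
    try:
        answer = int(form.get("js_answer", ""))
    except ValueError:
        return False
    return answer == operand_a + operand_b
```

Rule 1 needs no server logic beyond reading the real values from `junk1` and `junk2` instead of the obvious names.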
      • by lalena (1221394) on Sunday December 28, 2008 @07:05PM (#26252809) Homepage
        As a follow up to myself, I didn't come up with these ideas on my own. I read them on Slashdot a couple of years ago.
      • by skeeto (1138903)
        Your hidden text boxes will break your site for text browsers, screen readers, and other users that don't want to run your Javascript (like Firefox with NoScript). I wouldn't say this is an acceptable solution.
        • by cyborch (524661)
          People using noscript got themselves into that mess... If you use noscript you are most likely able to figure out when to disable noscript and a part of a small enough minority to not really matter. Sorry.

          Also, any screen reader unable to ignore tags which are hidden by css will have so very many problems with standard pages that noone will be using it anyway...
    • Re: (Score:2, Insightful)

      Oh I also forgot, if you have a static URL that your form posts to, it is a good idea to rename that page every now and then, especially if it is getting a tremendous amount of spam. Also you can do a check to see if the referring URL is on your own domain as a lot of spammers are posting from a copied version of your form.
      • by Magic5Ball (188725) on Sunday December 28, 2008 @09:02PM (#26253579)

        Background: One of my sites is a custom job which kills a spam comment every 3 seconds or so, and has done so consistently for the past four years.

        OP's suggestions are very good, especially limiting the number of 'http's. We've given up on the keyword lists since they are costly to maintain and aren't as effective as some other methods.

        Currently, the most effective kill rules for us are:
        1) We write the client's IP address, the ID of the thing being commented on, and random stuff to a cookie from the legitimate page from which the client clicked the "post reply" link. If the IP address doesn't match, or if the ID is missing, or if the parameter for the random junk isn't in the cookie, then fail. This rule traps non-browser scripts and limits spam throughput, but does not affect humans.

        2) The client's IP address is a hidden form variable. If that IP address does not match the IP from which the POST originates, fail. This rule traps the browser-based scripts, and operators who proxy through botnets for testing.

        These two rules catch all but about two spam-like messages a month (spam operator not using proxies to test their scripts), and have mislabeled two legitimate messages (from a local ISP's poorly-configured proxy) in the last three years.

        There are other things at play, such as salted hashes of the above, and some other heuristics on hidden and unused fields which sort and categorise the spam for our own research (including point of origin, topic, etc.). One finding is that IP/geographic blacklists are ineffective. I'll post new findings and methods in another two years.

        I'm also evil in that the apparent failure modes are non-deterministic, and include such things as random HTTP response codes, random modes of connection failure, and spam messages that apparently go through, but are only visible for the IP that posted them, or for one minute after they are posted.

        Your move, "RosarioRush".
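One way to combine the cookie rule with the "salted hashes" mentioned above is a signed token that binds the client IP and the item ID together. The sketch below uses an HMAC for that; it is an assumed construction rather than the poster's actual scheme, and `SECRET_KEY`, `issue_token` and `token_is_valid` are hypothetical names.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; in practice load it from configuration.
SECRET_KEY = b"change-me"


def issue_token(client_ip, item_id):
    """Create the value written to the cookie when the reply page is served."""
    nonce = secrets.token_hex(8)  # the "random stuff"
    payload = f"{client_ip}|{item_id}|{nonce}"
    signature = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def token_is_valid(token, post_ip, item_id):
    """Check the token came from us and matches the IP and item of the POST."""
    parts = token.split("|")
    if len(parts) != 4:
        return False
    client_ip, token_item, nonce, signature = parts
    payload = f"{client_ip}|{token_item}|{nonce}"
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison so the signature cannot be guessed byte by byte.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    return client_ip == post_ip and token_item == str(item_id)
```

A script that POSTs directly without first loading the reply page has no token, and one that fetches the page through one address and posts through another fails the IP comparison.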

        • Re: (Score:3, Interesting)

          by liquidpele (663430)
          Ever seen http requests hit your computer coming *from* slashdot? That is one of their anti-spam techniques. They will sometimes request a file on their domain called ok.txt [slashdot.org] to see if your IP is an anonymous proxy, and won't let you post anonymously if it works.
  • HTTPBL (Score:2, Interesting)

    by Anonymous Coward

    Project Honeypot's HTTPBL has been good to me:

    See: www.projecthoneypot.org/httpbl.php
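http:BL is queried over DNS, so a lookup needs very little code. The sketch below assumes the query and answer layout as commonly described for the service (access key, reversed IP octets, then `dnsbl.httpbl.org`; an answer of `127.<days>.<threat>.<type>` with bit 4 of the type marking comment spammers). Treat those details as assumptions to verify against Project Honey Pot's documentation; the function names are made up here.

```python
import ipaddress
import socket


def httpbl_query_name(access_key, ip):
    """Build the DNS name to look up for an IPv4 visitor."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join([access_key, *reversed(octets), "dnsbl.httpbl.org"])


def parse_httpbl_answer(answer):
    """Decode an answer such as '127.3.5.4' into named fields."""
    first, days, threat, visitor_type = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError(f"unexpected http:BL answer: {answer}")
    return {
        "days_since_last_activity": days,
        "threat_score": threat,
        "is_suspicious": bool(visitor_type & 1),
        "is_harvester": bool(visitor_type & 2),
        "is_comment_spammer": bool(visitor_type & 4),
    }


def lookup(access_key, ip):
    """Return the decoded listing, or None when the IP is not listed."""
    try:
        answer = socket.gethostbyname(httpbl_query_name(access_key, ip))
    except socket.gaierror:
        return None  # no DNS record means the address is not in the list
    return parse_httpbl_answer(answer)
```

A forum could then, for example, hold posts for moderation when `is_comment_spammer` is set and the threat score is above some threshold of its choosing.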

  • by Todd Knarr (15451) on Sunday December 28, 2008 @06:21PM (#26252501) Homepage

    The fastest way is probably to just slow down user registration. Permit anonymous posting, but make it moderated/screened by default (ie. not visible to other users until the forum owner flags it as OK). When a user goes to register (so they can get their posts visible immediately), do not send them the confirmation e-mail immediately. Batch your confirmations up and send them out twice a day at odd times (ie. not midnight and noon, something like 3:47am and 3:47 pm) (you could do it 4 times a day, but not much faster than that since the idea's to introduce a delay in the registration process). Make sure to tell the user on the registration screen what sort of time-frame they can expect their confirmation to arrive in. Ordinary users who plan on using the forum long-term won't be inconvenienced much by this. Spammers... won't tolerate the delay, they want to get their message in fast and get out. With their automated scripts they might not even notice things are failing. Also, don't include a direct confirmation link in the e-mail. Include a URL to a form and make the user copy-and-paste the confirmation number from the e-mail. That'll be trivial for humans, but not easy for an automated script to handle without human assistance.

    None of that will stop a determined spammer, but most of them are more interested in volume than anything else and they won't bother spending time/effort on just one forum when they could hit 10 others instead.

  • YAWASP for wordpress (Score:3, Informative)

    by zimtmaxl (667919) on Sunday December 28, 2008 @06:22PM (#26252509) Homepage
    There is a well working semi-dynamic plugin for wordpress. It has served me well. It is called YAWASP and you can find it here: http://wordpress.org/extend/plugins/yawasp/ [wordpress.org]. The author also describes the common problems & shortfalls with traditional captcha-like methods.
  • "I am a robot" field (Score:5, Informative)

    by casualsax3 (875131) on Sunday December 28, 2008 @06:23PM (#26252511)
    The ZSNES boards employ a neat trick: http://board.zsnes.com/phpBB2/profile.php?mode=register&agreed=true [zsnes.com]

    It's got a field that says "I am a robot" checked off by default. A human should obviously see that and uncheck it. Those registrations that come in with it checked are blackholed. It's definitely cut down on the SPAM accounts since they enabled it.

  • ...there are companies out there that use a Bayesian filter to sort posts into low scoring and high scoring, and then they have their employees manually sort through the high scoring messages.
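To make the "Bayesian filter plus human review" idea concrete, here is a minimal naive Bayes scorer with add-one smoothing. It is a teaching sketch under simple assumptions (word tokens only, two classes), not what any of the companies mentioned actually run, and the class and method names are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesScorer:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one moderated post; label is 'spam' or 'ham'."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text):
        """Return a score between 0 and 1; higher means more spam-like."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Add-one smoothing on the prior and on each word likelihood.
            log_score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                log_score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_scores[label] = log_score
        # Turn the two log scores into a probability, guarding against overflow.
        diff = log_scores["ham"] - log_scores["spam"]
        if diff > 700:
            return 0.0
        return 1 / (1 + math.exp(diff))
```

Posts scoring above a chosen threshold would go to the staff queue, and each staff decision can be fed back through `train` so the filter improves as it is used.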
  • Message board spam. (Score:5, Informative)

    by JWSmythe (446288) * <jwsmytheNO@SPAMjwsmythe.com> on Sunday December 28, 2008 @06:32PM (#26252573) Homepage Journal

        I had a similar problem in the comments area of my site. It was all fun and games, until one day I checked, and there were something like 1000 spams for every real message.

        I wrote my own system to deal with it. It's not very hard, assuming you know how your site works (of course you do, right?)

        I ended up making two blacklists. One was for words and phrases. The spammers tend to post (and repost, and repost) the same crap. My blacklist rules had some simple regular expressions that I could run queries with. Like, "%http://%spamsite%" and "%v%gra%". You get the idea. The second list was IP's that were known spammers.

        At the time, I allowed both anonymous comments, and comments from logged in users. I eventually did away with the anonymous comments, as they were a headache. This was the best cure.

        So, when my script ran (once a minute), if it matched a message, it would delete the message, and append the IP to the IP blacklist. If it was posted by a user account, the user account got suspended, so they could no longer log in, nor post.

        After its detection and cleanup run, it then ran back over the IP list, and pruned out every post by that IP. Sometimes they'll do practice runs saying silly things like "nice site". I thought they were real user compliments at first, until I saw the same posting verbatim coming from the same IP to multiple news stories, and then that IP would start spamming later.

        Some people will argue that the IP cleanup run was not nice, polite, or even fair. People use proxies. Sure, they do. We got a lot of abuse from anonymous proxies, and no real messages from them. The spammers didn't seem to like to use AOL.

        When I implemented this, I posted a very brief description of what I was starting ("We're starting advanced anti-spam protection"), with an apology for real messages that were deleted. I never received one complaint about real comments disappearing.

        How brutally you do it is really up to you. I built my method by manually doing it for a while, and then letting the script do it on its own. Occasionally, I would have to go in and add new words and/or site names to the words blacklist.

        I noticed the spammers hit more common software more often. It's worth it for them to make automated systems to abuse a piece of software that's deployed on tens of thousands of sites. When I rewrote my site from scratch, then abuses dropped down to 0 for a long time. Now, they manually submit "news" items which are just ads for their own sites. It appears to be manual, and since we won't run them as news stories (our editorial staff decides what does or doesn't show up as news, and if it needs to be edited first), they give up pretty quickly.
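The two-blacklist sweep described above can be sketched as follows. The patterns quoted in the post use `%` as a wildcard, which looks like SQL `LIKE` syntax, so this example translates them to regular expressions; that translation, the in-memory comment list and the function names are assumptions made for illustration.

```python
import re


def like_to_regex(pattern):
    """Translate a %-wildcard pattern into a compiled, case-insensitive regex."""
    parts = (".*" if ch == "%" else re.escape(ch) for ch in pattern)
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


# The two example patterns quoted in the post.
WORD_BLACKLIST = [like_to_regex(p) for p in ("%http://%spamsite%", "%v%gra%")]


def sweep(comments, ip_blacklist):
    """Return the ids of comments to delete and extend ip_blacklist in place.

    comments is a list of dicts with "id", "ip" and "body" keys.
    """
    # Pass 1: any comment matching a word pattern blacklists its IP.
    for comment in comments:
        if any(rx.fullmatch(comment["body"]) for rx in WORD_BLACKLIST):
            ip_blacklist.add(comment["ip"])
    # Pass 2: prune every comment from a blacklisted IP, including the
    # harmless-looking practice posts such as "nice site".
    return [c["id"] for c in comments if c["ip"] in ip_blacklist]
```

Note that a loose pattern like `%v%gra%` also matches innocent text such as "very grateful", so patterns this broad are worth testing against real comments before letting a script delete on its own.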

    • I like honeypot links that blacklist anyone who clicks it. Seems to take out spam spiderbots effectively, until they learn how to avoid the honeypot links.

  • TypePad antispam is a great alternative to Akismet.

  • by wangi (16741)

    On phpbb boards I run the most productive things are:

    1. Do not allow external links in profile of newly registered / non validated users

    2. Do not allow registrations with gmail.com email addresses

    3. Ensure "valid" timezone and country settings are selected by users.

    L/

    • Re:gmail (Score:5, Insightful)

      by siyavash (677724) on Sunday December 28, 2008 @08:12PM (#26253295) Journal

      "Do not allow registrations with gmail.com email addresses"

      That is one of the most stupid things I heard this year.

      • by wangi (16741)

        You'd be surprised. It is trivial for spammers to get a gmail account. Anyone who genuinely wants to contribute to a forum will have another email address, if not they will be able to explicitly email...

        • Re:gmail (Score:4, Informative)

          by shutdown -p now (807394) on Monday December 29, 2008 @09:58AM (#26257285) Journal

          You'd be surprised. It is trivial for spammers to get a gmail account.

          It's no less trivial than getting a Hotmail account, a Yahoo! account, or any of the many thousands of free webmail providers out there.

          Even so, I suspect that the majority of casual Internet users today actually have that sort of email account, based on personal experience. If you start blocking them, you're blocking most legit users, too. Unless it's a technical forum - and even in this case it's silly to block GMail, as many techies use that.

          Anyone who genuinely wants to contribute to a forum will have another email address

          Why? I for one don't have one - I use my GMail one everywhere - and I contribute to a lot of forums.

          if not they will be able to explicitly email

          Translate, please. Explicitly email what where, and how is that going to help?

      • by 1u3hr (530656)
        "Do not allow registrations with gmail.com email addresses" That is one of the most stupid things I heard this year.

        We haven't done it, but I'm tempted. At the forum I moderate we get a dozen spammers (mostly human drones in New Delhi or Beijing, from their IPs, not bots) attempting to sign up every day, which are manually checked. Almost all use GMail addresses.

  • I run a site for my Renaissance faire guild to talk at and plan things. We had tons of message board spam until I implemented a simple solution: a password is required to register. If not entered, registration fails. The password is posted elsewhere on the site in my case, but you could communicate it only to people who need access if the site is small enough.

  • by macraig (621737) <mark.a.craigNO@SPAMgmail.com> on Sunday December 28, 2008 @07:31PM (#26252991)

    The comment- and trackback-spam blocking techniques in Pivot blogging software are, from my limited personal experience, 100% effective. There's even an extension that uses the enormous Project Honeypot database (http:BL) to weed out IP addresses of identified harvesters and comment spammers. That's just for entertainment, though, since the basic techniques are completely effective.

  • Mollom (Score:2, Redundant)

    by kbahey (102895)

    Mollom [mollom.com] is free for low to medium traffic sites. They have plugins [mollom.com] for the major CMSes out there (Drupal, Joomla, Wordpress, and a bunch of others).

    It is relatively new, but I use it on several sites and it works well. See the score card [mollom.com] for some fun.

    The founder of Mollom is Dries Buytaert, the founder of Drupal, the CMS.

  • by Ye_Gads (985366)

    I rarely see spam here...or is it just quickly modded down to oblivion?

  • Take a look at StopForumSpam.com [stopforumspam.com]. I've got it installed on a vBulletin forum and it works very, very well to prevent spambots from registering. Every now and then one sneaks through, but it's a lot less than I was seeing before.

  • I'm using the old "Fake Textarea" trick. If anyone fills in the fake textarea, the post is rejected. The fake textarea comes up first, but is hidden with CSS. I also modified the forum software so that the fake text field has the same form name as what the forum traditionally uses for the real field.
    I'm also using this in conjunction with blocking posts containing URLs from guests or users with no posts.

    Of course, this is all useless against Stock Ticker symbol spammers.

  • ... 90% of all spam would be eliminated.
    • Hm, that's an interesting approach to fighting spam, and it's right on time with all that government throwing money around in the USA... maybe you should try convincing Obama that it would be a good social project to invest money in, too?

  • hey there

    if you want to filter for humans simply present a bunch of images and have the person spot the cat among the dogs

    then apply the spam filtering (simple stuff really works; you can even just use SpamAssassin plugins for content) to get rid of the spammers posting URLs and rubbish, and deny based on IP if you catch spam, unless they contact you somehow

    regards

    John Jones

    http://www.johnjones.me.uk [johnjones.me.uk]

  • Recently, one of my users got infected with some spam-spewing bot malware which resulted in my company being listed at least four RBLs. It is annoying, but I can't hold it against the list services as I use them myself in my own filtering.

    I have to wonder if RBLs of some sort could also be applied to web browsing especially on forums? But since most people are on dynamic IP addresses, I can only assume that without some very clever ideas to go along with it (perhaps some sort of cumulative scoring + finge

  • The phpbb forum I administer fell victim to spammers more than a year ago so I tried a bunch of MODs that implemented a couple of changes to foil attempts of automatically registering. Spam slowed down a bit, but still was strong enough to be a problem... it seems that whatever script spammers use to post in phpBB already implements most standard MODs.

    PhpBBs own Captcha is no good either... ... So what I did was implement my own validation, which requires to enter a fixed word ("Dragon") in a text box. It's

  • Considering the complexity of the Internet, I have real and increasing difficulty understanding how the spammers manage to survive. They require an entire chain of support services to stay in business. Not just ISPs who let them access the Web, but also hosting services to hold their websites, DNS providers, and the domain registrars. They need lots of help to link their spamvertised websites to the spam, just on the minimal chains. (I've noticed that more complicated chains seem to be less frequent these d

  • That's exactly what Sblam! [google.com] does.

    It's a PHP-based filter for web forms that detects spam based on content (Bayesian filter + specific rules) and behavior, and uses 3rd party blacklists.

    It's absolutely transparent to the user (well, 99.8% of them).

  • I run a blogging site. When the spammers discovered it, I started getting several thousand automated spam comments per day.

    I solved the problem (ie. absolutely no automated spam) with a two-step process:

    First, I wrote a REALLY quick text analysis script in PHP which looked for the presence of links and other suspicious text. This reduced the spam by 95%, with no false positives.

    Since I had to keep examining the spam that got through and improving the filter, I wanted a system that didn't require constant ma
