Open Source Filtering?
David Guichard asks: "Maybe I've just missed it, but has there been any talk or action on an open source Internet filter? I'm thinking of something that would allow libraries and schools to comply with the law, but would not hide the list of forbidden sites and would allow complete local control, and certainly would not track user browsing. I realize a lot of people wouldn't want to get anywhere near this on principle, but it seems like a winner to me. For example, would junkbuster satisfy the law already? What is missing that the law requires?" If you have to have some form of filtering in place, better an open solution than a closed one.
Cliff, you've done it again. (Score:2)
Just post it to "Ask Slashdot", where no one will ever see it.
Bravo.
How about doing it right then?? (Score:2)
Look here [slashdot.org], and do a text search for "How about doing it right then??"
Doing it wouldn't be that difficult. The Squid proxy has a good number of filters for banner ads and such. You would just need to swap out the list of banner ads with your filtered sites.
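As a rough illustration of the swap described above, a Squid domain blacklist is just an ACL pointing at a file of domains. This is a minimal sketch; the file path and list name are placeholders, not a real published blocklist:

```
# squid.conf fragment: deny requests to any domain listed in the file.
# /etc/squid/blocked_domains.txt holds one domain per line, e.g.
#   .example-blocked-site.com
acl blocked_sites dstdomain "/etc/squid/blocked_domains.txt"
http_access deny blocked_sites
http_access allow all
```

The same `dstdomain` mechanism that people use for banner-ad lists works unchanged for any other domain list; only the file contents differ.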
Re:Two reasons why not (Score:2)
Not every obscure pornographic site has to be included -- just the ones that matter. I think we could take a lot of the wind out of the sails of pro-censorship people if just the fraudulent sites could be blocked. Sure, if someone wanted to see porn they still could, because a large portion of the porn wouldn't be blocked. They'd get what they asked for, and I don't think that's a big deal.
This is a significant problem. Right now there are fundamentalists who deface library books because they are opposed to the books. It would be easy to abuse a volunteer-based system similarly. Part of what would help, I think, is the openness of the system. It should be just as easy to submit a complaint about a blocked website as it is to block a website, and it should be easy to find out where your website stands without participating in any blocking yourself. If this were combined with a reputation system for the suggesting moderators, then the worst abusers could be isolated.
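One way the reputation idea above could isolate abusers is by weighting each volunteer's vote by their track record. This is only a sketch; all names, weights, and the threshold are invented for illustration:

```python
# Reputation-weighted blocking vote, assuming a volunteer moderation
# system like the one described above. Moderators with a history of bad
# suggestions carry almost no weight, so one abuser can't force a block.

def should_block(votes, reputations, threshold=3.0):
    """votes: list of (moderator, direction) pairs, +1 = block, -1 = unblock.
    reputations: dict mapping moderator -> weight in [0, 1].
    Unknown moderators get a neutral default weight of 0.5."""
    score = sum(direction * reputations.get(mod, 0.5)
                for mod, direction in votes)
    return score >= threshold

# A serial abuser with near-zero reputation can't push a site over
# the threshold, even with ten votes:
votes = [("abuser", +1)] * 10
reps = {"abuser": 0.05}
print(should_block(votes, reps))  # False: 10 * 0.05 = 0.5 < 3.0
```

A handful of trusted moderators, by contrast, would clear the threshold easily, which matches the goal of making good-faith participation count more than volume.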
Two reasons why not (Score:3)
First, the whole point of censorware is that you can't get around it. If you have a choice of whether to run it or not, it might be searching, filtering, categorizing, whatever, but it's not censorware.
The idea of an "open" solution which is forced upon people is a little silly. Apart from the philosophical absurdity, censorware can never work on an open-source operating system without stringent physical controls as well.
(Recall the first rule of security: anyone who has physical access to your machine has the potential to compromise it. This may be as simple as booting from floppy!)
Second, making up a blacklist of porn sites is trivial if you just want to list the ones who want to be listed. Use RSACi. It's already built into your browser. Almost all porn sites rate with RSACi, and they want to be blacklisted, because it helps immunize them from prosecution for providing porn to kids (or at least that's the perception).
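For reference, an RSACi self-rating is delivered as a PICS label in a page's header, which is what the browser's built-in filtering reads. A simplified example follows; the rating values here are made up, and real labels usually carry extra fields (author, date, and so on):

```
<!-- Illustrative PICS/RSACi label; numbers are 0-4 for
     nudity (n), sex (s), violence (v), and language (l) -->
<meta http-equiv="PICS-Label" content='(PICS-1.1
  "http://www.rsac.org/ratingsv01.html" l gen true
  r (n 4 s 4 v 0 l 0))'>
```

Sites that want to be recognized as adult content simply publish high ratings, which is exactly the self-selection the parent comment describes.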
If you want to make up a blacklist of sites which don't want to be blacklisted, you have a fight on your hands. It's a phenomenal amount of work to scan the web. Consider the massive server farms and pipes of unholy size that Google or Alta Vista have to use to spider the web. Who's going to volunteer to set up a similar installation to spider porn sites?
If you think you're just going to provide a way for volunteers to send in "hey, I found another porn site" URLs, don't be silly. Most of those submissions are going to be RSACi-rated; almost all the rest will be overlap. The web is huge. Porn is about 1% of it. One percent of huge is still huge.
And then, the big question: who's going to make decisions about these allegedly porn (but not self-rated) sites? Some human being has to categorize them, or you'll be no more accurate than the existing closed-source blacklists (which is to say, laughably inaccurate).
That takes time, and with millions of new or changed pages on the web every hour, do the math and figure out how much time you can expect to get out of your volunteers. How many dollars of free labor does this hypothetical project depend on? Do porn-hating geeks really hate porn that much, that they'll sit in front of a monitor all day for free and surf porn sites?
Short version: if it were easy to do, someone already would have done it. In fact there already exist several places that keep an "open" list of porn sites which can be dropped into any Squid proxy. Most of them are years old and will never be maintained again:
Click the "Latest" link, which is there "just to show that someone is using it!" Note that the "latest" additions to the blacklist include such obscure sites as playboy.com, and such recent new sites as dailydirt.com (domain registered on Jan 12, 1998).
Jamie McCarthy
filtering domains & mail (Score:1)
You can't. You have to rely on the consumer to make the choice;
if you don't, you are restricting free speech.
Simply put, you have to make the choice on what to block, OR trust someone to make the choice for you.
Now personally I trust no one to do that for me -- I would rather see the odd silly thing than have a whole bunch of dictionaries censored because they contain banned words.
Funny if you think of it like that (-;
regards
john jones
Linux Filtering (Score:1)
Linux firewalling also allows user-level filters -- packets can be directed to programs for filtering.
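One common way to route packets through a filtering program, as described above, is transparent proxying: the firewall redirects outbound web traffic to a local proxy that does the actual filtering. A sketch using the ipchains syntax of the day (addresses and ports are examples only):

```
# Redirect all web traffic from the local subnet to a filtering
# proxy (e.g. Squid) listening on port 3128 of this machine.
ipchains -A input -p tcp -s 192.168.1.0/24 -d 0.0.0.0/0 80 -j REDIRECT 3128
```

The proxy then sees every request and can apply whatever user-level filtering logic it likes; clients need no configuration at all.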
a short answer (Score:2)
Still, a good grammar engine should be able to figure out that there's either no text (red alert), or that there is text but it's formatted strangely with a lot of color codes -- as when someone uses something like the Pixel Transformer from http://www.545studios.com.
The important, if vague, promise of a grammar engine is that you could also use it as a content filter. Filter out your annoying co-worker. Filter out advertising. Filter out anything that you don't like. Sure, it will turn several of us into digital hermits, but I'll take that risk.
grammar-based filtering, not keyword (Score:3)
What the @#!! is grammar-based filtering?
It's where the parsing engine has enough intelligence to figure out what's going on. What the subtleties are. What the nuances are. If there are any double-entendres or hidden meanings.
Then, and only then, can you use the computer to make value-based decisions using fuzzy rules about whether or not the content should be seen. And once that happens, I'll gladly use filtering. Why? Because I'll be able to filter out advertisements at a minimum.
Re:grammar-based filtering, not keyword (Score:1)
Re:Two reasons why not (Score:1)
Try SquidGuard (Score:1)
While it's a work in progress, it works with Squid.
Nicolas Petreley recommends it.
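For anyone curious what using it looks like, squidGuard hooks into Squid as a redirector and matches requests against local domain/URL lists. A minimal sketch (all paths and list names are examples):

```
# In squid.conf, hand each URL to squidGuard for checking:
#   redirect_program /usr/local/bin/squidGuard
#
# squidGuard.conf:
dbhome /usr/local/squidGuard/db

dest blocked {
    domainlist blocked/domains
    urllist    blocked/urls
}

acl {
    default {
        pass !blocked all
        redirect http://localhost/blocked.html
    }
}
```

Because the lists are plain local files, this is exactly the "complete local control, no hidden list" setup the original question asks about.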
Yes. Do it right. (Score:1)
The argument against filtering in schools and businesses is not so much 'I should be able to look at porn at work or in my high school computer lab' as 'The current filtering technology blocks useful, informational, and educational sites while not blocking much of the material it was intended to block'. The solution to the latter argument is a filter that works (as hypothetical as that may be).
Making a filter that works is not a trivial task. There are many companies out there that have spent lots of time, money, and resources making filters that don't work right. Good luck.
Similar technologies available (Score:2)
Why not use similar technologies for web sites? Just maintain a list of IPs, domains, and specific URLs which should be filtered. What SHOULD happen, though, is some sort of categorization and rating system. In other words, under category "sex" you might have a rating of "1" for partially nude/suggestive pictures and "10" for explicit stuff. The service would have to provide guidelines as to how to rate the URLs.
Taking this example further, one would implement a Slashdot-like moderating system to give URLs "negative karma", where the administrators of the networks using the filtering system have the opportunity to place their votes on which stuff they want hidden most.
On the user's end, the network admins could have the ability to screen based on category and rating (like, filter category Sex with negative karma above 4), and the ability to override the rating of a particular site if they feel that it was marked unfairly (or get user complaints about a bad filter).
This system will obviously be very dependent on good guidelines and good participation on the part of the network admins. Obviously a free system wouldn't be able to afford to have full-time staff finding stuff to filter, but the good part about this is the list would be dynamic. Perhaps the database could be automagically downloaded weekly from a central repository in a cron job somewhere, giving the network the latest and greatest of the filters. Again, the overrides the admin put in place at the user's end would take effect, so any updates to the overridden site's rating will be ignored.
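The category/rating/override scheme described above can be sketched in a few lines. This is only an illustration; the ratings, categories, and domains are all invented:

```python
# Per-URL category ratings (0-10) as they might arrive from the
# central repository's weekly download.
RATINGS = {
    "example-site.com": {"sex": 7},
    "health-info.org":  {"sex": 2},
}

def is_blocked(domain, thresholds, overrides=None):
    """thresholds: e.g. {"sex": 4} blocks anything in category 'sex'
    rated above 4. overrides: {domain: True/False} set by the local
    admin; these always win over the downloaded ratings."""
    overrides = overrides or {}
    if domain in overrides:
        return overrides[domain]
    ratings = RATINGS.get(domain, {})
    return any(ratings.get(cat, 0) > limit
               for cat, limit in thresholds.items())

print(is_blocked("example-site.com", {"sex": 4}))          # True
print(is_blocked("health-info.org", {"sex": 4}))           # False
print(is_blocked("example-site.com", {"sex": 4},
                 overrides={"example-site.com": False}))    # False
```

Note how the override check comes before the rating lookup, which is what makes local admin decisions survive each weekly update of the central list.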