Ask Slashdot: How Can I Stop Security Firms From Harvesting My Data?
Slashdot reader Unpopular Opinions requests suggestions from the Slashdot community:
Lately a boom of companies decided to play their "nice guy" card, providing us with a trove of information about our own sites, DNS servers, email servers, pretty much anything about any online service you host.
Which is not anything new... Companies have been doing this for decades, except as paid services you requested. Now the trend is basically anyone can do it over my systems, and they are always more than happy to sell anyone, me included, my data they collected without authorization or consent. It's data they never had the rights to collect and/or compile to begin with, including data collected thru access attempts via known default accounts (Administrator, root, admin, guest) and/or leaked credentials provided by hacked databases when a few elements seemingly match...
"Just block those crawlers"? That's what some of those companies advise, but not only does the site operator have to automate it themself, not all companies offer lists of their source IP addresses or identify them. Some use multiple/different crawler domain names from their commercial product, or use cloud providers such as Google Cloud, AWS and Azure, so one can't just block access to their company's networks without massive implications. They also change their own information with no warning, and many times, no updates to their own lists. Then, there is the indirect cost: computing cost, network cost, development cost, review cycle cost. It is a cat-and-mouse game that has become very boring.
With the rise of concerns and ethical questions about AI harvesting and learning from copyrighted work, how are those security companies any different from AI, and how could one legally put a stop to this?
Block those crawlers? Change your Terms of Service? What's the best fix... Share your own thoughts and suggestions in the comments.
How can you stop security firms from harvesting your data?
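For anyone who does go the "block those crawlers" route, the range check itself is the easy part; keeping the list current is the hard part the submitter describes. A minimal sketch, assuming the scanner operator publishes its source ranges at all. The ranges below are reserved documentation prefixes used as hypothetical stand-ins, not any real company's addresses, and `is_blocked` is an illustrative name:

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges
# (RFC 5737 and RFC 3849), standing in for whatever a scanner publishes.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(addr: str) -> bool:
    """Return True if addr falls inside any blocked range."""
    ip = ipaddress.ip_address(addr)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed-family comparisons simply evaluate to False.
    return any(ip in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    for candidate in ("192.0.2.77", "203.0.113.5", "2001:db8::1"):
        print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

This does nothing about scanners running from shared cloud ranges, which is exactly the case where blocking by network has the "massive implications" mentioned above.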
nothing new (Score:3)
Re: (Score:2)
Just a heads-up, mate. "spruik" is a specifically Australian word. A lot of the seppos on here won't grok it.
Re: (Score:2)
Re: (Score:2)
people here can find anything to whinge about.
Re: (Score:2)
Re: (Score:2)
Was hardly a whinge.
Re: nothing new (Score:1)
Re: (Score:2)
Re: (Score:1)
Take your services offline (Score:5, Insightful)
Re: Take your services offline (Score:4)
Re: Take your services offline (Score:2)
Re: (Score:2)
If your company or your team don't know how to properly secure your web site, you might not be paying enough for your developers. Security these days is not optional, you have to know what you're doing. If you go cheap, you'll get developers who write crappy code that leaks.
Re: (Score:1)
Re: (Score:2)
Move all your stuff to a private network and use a VPN to get to the private network. Then only allow the very specific public services you need to expose in.
Unless you're doing something really weird, you should be able to limit your public exposure to some web services - and even then to just a couple of apps. Keeping that (plus the VPN) patched and updated ought not be too hard. I'd personally stay away from hosting any public SMTP or DNS services - they're ten a penny to have them hosted for you, and it
Re: (Score:2)
THIS. Make data you don't want public.. not public.
Easy (Score:5, Funny)
How Can I Stop Security Firms From Harvesting My Data?
Do not connect any of your systems to the Internet.
Change Society. (Score:2)
This is just one of the fine grain settings that needs to be changed. Capitalism is the greed cancer that infects this country and mostly the world now. Until you change that, data harvesting will be just one of many plagues of our society.
Re: (Score:2)
Ah, yes. Capitalism. Collecting data is a phenomenon unique to market-based economies and their governments. The Stasi, for instance: what mighty capitalists they were. Today we have China and its Social Credit Score: another expression of the ebil capitalism.
Re: (Score:2)
I just hope you do
Re: (Score:1)
I didn't say shit about Socialism, especially not MY Socialism. I prefer no -isms. You do your thing and I do mine. I don't tell you what that is and you don't tell me what mine is. See? It's fucking simple as flushing a toilet.
Re: (Score:2)
They don't have Jews in Nicaragua, so they're going for the Catholics instead.
at what point do these become DDOS? (Score:4, Insightful)
A friend carefully monitors his network, and sees a fair amount of this security scanning. Of course, that sets off the alarms he's added to his systems, clogging up logfiles and generally chewing through both bandwidth and server. At some point, this moves beyond "fair use" into "unfair use," but I don't know where to draw this line. Seems to me that "responsible" security scanning should be infrequent and probably announced ahead of time. But I could see arguments the other way.
And of course, the only way to distinguish a 'security scan' from a 'vulnerability scan' is to look at the originator IP and draw conclusions from that, which we know is not really authoritative.
Re: (Score:2)
Re: (Score:3, Insightful)
And of course, the only way to distinguish a 'security scan' from a 'vulnerability scan' is to look at the originator IP and draw conclusions from that, which we know is not really authoritative.
'Security' or 'vulnerability' scanning implies a contract is in place and therefore are actions/services that someone has requested to be done.
Otherwise, it doesn't really matter where it originated from. They're unwanted and/or unauthorized.
Re: (Score:2)
A spotlight is perceived as necessary because there is no light being cast, so each individual shines their own spotlight so they can see. Perhaps it would be better to have each ISP perform full scans and then publish the data openly? Democratize access to the information to reduce wasted resources.
"But my pants are down!", says the system owner.
"Your pants were already down, but previously, only the bad guys noticed.", says the ISP.
Maybe the internet is not for you (Score:2)
Since they admitted to it, charge them. (Score:2)
Put clear Terms of Service on your website (or anywhere else you control the info, I'd sugge
Re: (Score:2)
Put clear Terms of Service on your website (or anywhere else you control the info, I'd suggest also adding a X-Terms-of-Service-Fees: header in all your webserver's HTTP responses pointing to a relevant link) that you charge a fee (just pull any number out of thin air, like $7500USD) for any info scraped for any sort of commercial purposes, and when they admit to it by cold-calling you (as they just did), inform them of the immediate charge, payable within 30 minutes, with by-the-minute accrual interest of 3% after that time.
I would love to see you argue for this in a court of law.
The judge will probably laugh harder at you than at a sovereign citizen.
Re: (Score:3, Informative)
I would love to see you argue for this in a court of law. The judge will probably laugh harder at you than at a sovereign citizen.
Why?
Why is it Big Tech can hit users over the head with their convoluted and lawyered-up ToS in courts, sometimes using the CFAA, but that anyone else can't based on their own ToS of their choosing?
What's the difference between Unpopular Opinions' business stance vs. Big Tech's?? Are they not entitled to the same equal rights in law?
Please, enlighten us, by issuing a minimum 1000-word essay detailing your arguments.
Re: (Score:3)
Why?
Because to be valid a contract has to have shown clear agreement between two parties and clicking on a button does not generally show such agreement. This is a VERY good thing because, while you might think it would be neat to try this trick on large companies, those large companies will be able to play far meaner similar tricks on all of us if this were legal.
Why is it Big Tech can hit users over the head with their convoluted and lawyered-up ToS in courts
They can't in most courts outside the US - they tend to get thrown out for the reasons stated above. This is a good thing.
Re: (Score:2)
Why is it Big Tech can hit users over the head with their convoluted and lawyered-up ToS in courts, sometimes using the CFAA, but that anyone else can't based on their own ToS of their choosing?
You're begging the question. The actual answer is much of what Big Tech writes in their ToS is equally unenforceable, and that's despite Big Tech imposing the ToS as a barrier to functionality rather than a note that is ignored and doesn't request specific consent.
The GP's post is completely unenforceable in court.
Terms of Service: By having read the above post you implicitly agree to send me $10000000 and change your official name to Sebby.
WHERE IS MY LAWSUIT, TROLL?? (Score:2)
WHERE IS MY LAWSUIT [slashdot.org], TROLL??
Re: (Score:3)
I would love to see you argue for this in a court of law. The judge will probably laugh harder at you than at a sovereign citizen.
Why?
Why is it Big Tech can hit users over the head with their convoluted and lawyered-up ToS in courts, sometimes using the CFAA, but that anyone else can't based on their own ToS of their choosing?
What's the difference between Unpopular Opinions' business stance vs. Big Tech's?? Are they not entitled to the same equal rights in law?
Please, enlighten us, by issuing a minimum 1000-word essay detailing your arguments.
Because shrink-wrap contracts have been literally unenforceable in most civilised countries for decades now. You cannot be held subject to a contract you did not see before purchase, nor are unreasonable terms enforceable. I can say that by reading this post you must donate $5 to the Here For Life charity, there is zero way for me to enforce that and no court in the UK or Australia would even hear it, I wouldn't even have the chance to be laughed out of court.
TOS's are just CYA's these days, their sole p
Re: (Score:2)
You said it yourself. They have the convoluted and lawyered-up ToS. By inference, we must conclude that you, on the other hand, do not.
Ever heard of the Golden Rule? "Whoever has the Gold gets to make the Rule."
Anything that might be said beyond that is nothing more than argument for argument's sake.
Re: (Score:2)
I would love to see you argue for this in a court of law. The judge will probably laugh harder at you than at a sovereign citizen.
Why?
Because you have no money.
Why is it Big Tech can hit users over the head with their convoluted and lawyered-up ToS in courts, sometimes using the CFAA, but that anyone else can't based on their own ToS of their choosing?
Because they have lots of money.
What's the difference between Unpopular Opinions' business stance vs. Big Tech's??
Money.
Are they not entitled to the same equal rights in law?
In theory, yes. In practice no.
Please, enlighten us, by issuing a minimum 1000-word essay detailing your arguments.
WTF? One word will do. Money.
Re: (Score:2)
Would any of that stand up in court? I'm not an expert on US contract law, but in the UK sending someone random terms in places they are unlikely to look tends not to go down well in court. The same applies to unreasonable terms designed to heavily favour one party, such as a 30 minute payment deadline and 3%/minute interest rate.
What you really need to do is lobby your politicians to adopt a GDPR-like law. Then you can force companies to get permission for harvesting, force them to disclose what they have,
You don't... (Score:2)
... The public ate all the DRM tech over the last 26 years since the rise of the internet in the mid '90s, first with MMOs then with Steam. The entire industry can use telecom to steal software on an industrial scale from the computer-illiterate masses, and that means no privacy for you. If you bought Windows 10 or have ever purchased anything requiring user names or login accounts, you're too stupid to be using computers. Everyone knew in 1997 when Ultima Online was released the average gamer and PC user
Re: (Score:2)
thanks for the good belly laugh
Re: (Score:2)
thanks for the good belly laugh
Enjoy the future where Valve owns everything and you own nothing because you're too stupid and irrational to understand Silicon Valley tech company history. Why would I listen to someone who enjoys paying money for broken software? AKA Steam is malware; MMOs were just PC games with the networking multiplayer ripped out and coded fraudulently.
my sister (Score:1)
I want guys to stop fingering my sister. But she keeps letting them. How do I get guys to stop fingering my sister?
Re: (Score:3)
You should be able to block the finger protocol at your firewall.
Re: (Score:2)
type "man finger" to find out all about it ;-)
Re: (Score:2)
If anyone wants to steal my joke, go ahead. But you have to take coofercat's improvement too. I'm almost ashamed I didn't think of it myself.
Say goodbye to Kim Kardashian on TikTok and (Score:2)
ditch all your other social media accounts.
Who needs Kim K. to show her ass in order to lure you into handing over all your data?
Set up your own Tor node and download all the ass for a lifetime completely anonymously.
So stop, stop selling your public data publicly? (Score:5, Insightful)
I work on a public, C2B system with about 50 million MAU. The security firms scanning our servers are a drop in the bucket compared to script kiddies and legitimate users trying to do ill-advised things (automation, but badly). The worst offenders are the marketing/analytics bots. We found one that had a browser plugin feeding the users' data back to its servers, which were trying to scrape the pages in near real time. Enough users had it installed that it was triggering security alerts, because the bot was trying to access pages that required authentication. After repeated attempts to ask them to knock it off, we just blocked them. We even debated showing an alert to affected users but decided that would just as easily backfire, with people accusing us of invading their privacy ("How did you know I had this plugin!?").
Long story short, the internet is a wretched hive of scum and villainy. If you expose anything to the internet, you have to put up with this crap.
Re: (Score:2)
the internet is a wretched hive of scum and villainy. If you expose anything to the internet, you have to put up with this crap.
With that attitude, yes you do.
I have no idea what this post is even about. (Score:2)
Someone should have bounced this back to the poster for some editing. Don't get me wrong, this is Slashdot, the bar is not set super high. But at least have an intelligible thesis for your post.
Re: I have no idea what this post is even about. (Score:2)
They are not "harvesting your data" (Score:2)
They just look at the public surface, i.e. the stuff _you_ chose to publish. If you do not want others to see it, stop publishing. If you just want to do, say, email relaying or the like, implement IP restrictions, add port-knocking, require a VPN log-in or do some of the other well-known things to not make services generally available. But anything published is fair game, within the restrictions of copyright and intellectual property and, sometimes, privacy laws.
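For readers who have not met port-knocking: the service stays closed until a client touches a secret sequence of ports in order, within a time limit. A rough sketch of just the bookkeeping, not a working firewall. `KnockTracker`, the port numbers and the 10-second window are all made up for illustration, and a real setup would hook this logic into packet filtering rather than call it by hand:

```python
import time


class KnockTracker:
    """Tracks per-source progress through a secret port sequence."""

    def __init__(self, sequence, window=10.0, clock=time.monotonic):
        self.sequence = list(sequence)  # ports that must be hit, in order
        self.window = window            # seconds allowed for the whole sequence
        self.clock = clock              # injectable clock, handy for testing
        self._progress = {}             # src -> (next index, time of first knock)

    def knock(self, src, port):
        """Record one knock; return True once src completes the sequence."""
        now = self.clock()
        idx, started = self._progress.get(src, (0, now))
        if now - started > self.window:
            idx, started = 0, now       # too slow: start over
        if port == self.sequence[idx]:
            idx += 1
            if idx == len(self.sequence):
                self._progress.pop(src, None)
                return True             # caller would now open the real port
            self._progress[src] = (idx, started)
        elif port == self.sequence[0]:
            self._progress[src] = (1, now)  # wrong port, but a valid first knock
        else:
            self._progress.pop(src, None)   # wrong port: forget this source
        return False


if __name__ == "__main__":
    tracker = KnockTracker([7000, 8000, 9000])
    for p in (7000, 8000, 9000):
        opened = tracker.knock("198.51.100.9", p)
    print("opened" if opened else "still closed")
```

Knocking hides a service from casual scans; it is not a substitute for authentication on the service behind it.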
Deny By Default Network Policy (Score:1)
"not all companies offer lists of their source IP addresses or identify them"
By default, deny all traffic to and from your network and only allow what you need to get it working and have all allow rules expire after some amount of time so you don't get too comfortable with anywhere specific.
And there's a whole class of service providers whose services you should not use. Some of them are extremely popular.
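The "allow rules expire" idea above can be sketched in a few lines. This is only an in-memory illustration of the policy, assuming nothing about any particular firewall; `ExpiringAllowlist` and its methods are hypothetical names, and the clock is injectable so the expiry can be exercised without waiting:

```python
import time


class ExpiringAllowlist:
    """Deny by default; allow entries lapse after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._rules = {}  # (source, port) -> expiry timestamp

    def allow(self, src, port, ttl):
        """Permit src to reach port for ttl seconds from now."""
        self._rules[(src, port)] = self._clock() + ttl

    def permits(self, src, port):
        """Return True only if an unexpired allow rule exists."""
        expiry = self._rules.get((src, port))
        if expiry is None:
            return False                  # no rule: default deny
        if self._clock() >= expiry:
            del self._rules[(src, port)]  # rule lapsed: clean it up
            return False
        return True


if __name__ == "__main__":
    acl = ExpiringAllowlist()
    acl.allow("203.0.113.10", 443, ttl=3600)
    print(acl.permits("203.0.113.10", 443))  # allowed while the rule is live
    print(acl.permits("203.0.113.10", 22))   # never allowed, so denied
```

The point of the TTL is that every exception has to be consciously renewed, so stale holes close themselves.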
Get a clue (Score:2)
If you want them to stop harvesting your data, hire one of them because you clearly don't know what you're doing.
It isn't your data (Score:2)
It isn't your data. It's data about you. Big difference.
Don't block them. That's painless. We want pain. (Score:2)
Just block those crawlers
I'll block at the drop of a hat on personal systems - though I can't on production things.
But really, wouldn't it be more apropos to cause them pain, suffering, and woe?
A few ditties with tc do the trick, slowing their network traffic down to one packet every 30 seconds.
Haven't thought about it too much, but some off-the-cuff build-out milestones:
Ensure that the setup (adding the qdisc, class) is only done once, then the cascade of tc filter add dev blah blah blah.
quick ditty (alias or /usr/local/bin
Re: (Score:2)
The "Tarpit" extension from IPTables add-ons [inai.de] project might be what you are looking for. It
TARPIT
Captures and holds incoming TCP connections using no local per-connection resources.
TARPIT only works at the TCP level, and is totally application agnostic. This module will answer a
Re: (Score:2)
TC doesn't require patching which to my mind is more elegant and sustainable across servers. (TARPIT last I looked required patches)
Granted I don't use it in production but my mindset is by default set for "at scale" and I didn't remove my lazy thinking. Those that "just want results" would likely be happier with your choice than mine.
Re: (Score:2)
There's no (longer?) kernel patching required for xtables
the cloud makes security so goddamn tedious (Score:2)
My list of blocked User-Agents grows daily, from obvious bots to ancient browser versions. Can't just block the IPs because lots of useful services are using the cloud too, like getting certs from Let's Encrypt. Scientology is now hiding behind the Amazon veil, when they send mail they use somerandombullshit@aws or whatever as the envelope sender so I have to scan the DATA headers in order to block their intergalactic propaganda at the server level.
Years ago I started watching the internet static, the rando
Simple answer (Score:3)
You can't.
Even if you could obtain some compliance within a given country, the Internet is international and people from various less liked jurisdictions couldn't care less about what rules you might have.
Once the data is out there, it's out there. You can't ever have any confidence that nobody will notice that a given port is open or a given service is buggy, and that this fact won't spread through various parties, including things like underground forums most people don't know even exist.
Re: (Score:3)
You people really don't like history, but you could learn a thing or two. The "wild west" doesn't exist anymore. People do not stand for lawless freedom, because it's inefficient and tyrannical. If you want to have any say in the way the internet will be, you have to acknowledge that this free-for-all exploit-anything-public isn't acceptable and will cease to be one way or another.
"Oh, nevermind me, I was just checking that your front door is locked, to put that information in a searchable database that any
Re: (Score:2)
Your post makes absolutely no sense and does not reflect reality. vadim_t's point is that the "wild west" does exist through much of the world. You can call it inefficient and tyrannical all you like, but that doesn't make it go away. There is nothing we can do to stop script kiddies all over the world from probing systems. Security is the only solution.
You are crazy if you think you can argue the problem out of existence.
Re: (Score:2)
The story is not about underground forums in foreign countries. It's about this: "a boom of companies decided to play their "nice guy" card, providing us with a trove of information about our own sites, DNS servers, email servers, pretty much anything about any online service you host". Shodan.io for example is a Seattle, WA, company. Step one: legislate.
Re: (Score:2)
Legislate it in what country? Even if the answer is "every country" that won't stop the bad actors from doing it. How many of these unsolicited offers are from legit security companies anyway?
Re: (Score:2)
The Internet is less of a wild west than it used to be, but it's still pretty wild.
There are plenty of big countries like Russia where the authorities won't care at all about any of your local rules. That may not be fun, but it's a fact. If somebody from Russia runs a scan on you, and then distributes information in Russian forums, there's pretty much nothing you can do about that.
Even less now with the Ukraine situation, where Russia is heavily sanctioned and even less inclined to be friendly to other countries
Search engines? (Score:2)
How many times does this need to be repeated? (Score:2)
There is not, and never has been, such a thing as "privacy" online. Any and all interactions, by anyone, with any online service or function, may be intercepted, reviewed, catalogued...and monetized.
This will be on the midterm.
Dude, that's how the Internet works (Score:2)
What about copyright infringement? (Score:1)
Re: (Score:1)
for comments. Sorry!
What about copyright? (Score:1)
If you haven't noticed, HTML5 made it EASIER to parse a website. It clearly defines tags for articles. This is for SEO and... yup, crawlers.
The internet is made to be scraped, that's the whole point of a captcha, but even for that there is a market to bypass them. On top of that, proxy services exist that will provide you with thousands of proxies to roll through when scraping. The o