
Should You Break TOS Because Work Asks You?

An anonymous reader writes "My boss recently assigned me a project that was all his idea, with two basic flaws that would require me to break multiple web sites' Terms of Service (TOS). Part requires scraping most of the site, parsing the data and presenting it as our own without human intervention. While we're safe on copyright issues, clearly scraping like this is normally not allowed. At times it might also put a load on those sites. The other is, for lack of better words, a 'load balancing' part that requires using multiple free accounts instead of purchasing space and CPU time for less than $2,000 USD per month. The boss sees it as 'distributed' computing when in reality it's 'parasitic.' My question is: am I wrong about the ethics? If I do need to walk, how best can I handle it without damaging my reputation and future employment opportunities?"
  • by eldavojohn ( 898314 ) * <eldavojohn@noSpAM.gmail.com> on Monday October 27, 2008 @09:16AM (#25525833) Journal

    My question is am I wrong about the ethics?

    You don't even have to ask that question; this isn't even one of those interesting cases or gray areas. What you're planning to do is wrong, even though you could probably escape any legal ramifications. It sounds pretty clear that this site profits from these overpriced accounts for information that you obviously value at some amount. Getting it for free (regardless of the TOS) could put you at some risk of litigation. Using the term "load balancing" or even "distributed computing" is hilariously misplaced here.

    If I do need to walk how best can I handle it without damaging my reputation and future employment opportunities?

    Look, I understand what it's like to be looking for a job when the economy is bad. If there are forces keeping you pinned to this employer, I don't know of them. What I would retort with is "How can you keep working this job without damaging your reputation and future employment?" I mean, are you going to put in your resume that you coded a technically innovative but bandwidth-stealing parasitic botnet to duplicate content from a website that normally asks for a monthly payment to access it at that volume?

    I would suggest you propose the $2k/month route and if your boss balks at it, start interviewing with other companies. If you have to leave and you're worried about being blacklisted as a 'whistleblower' (and your boss just might be that kind of guy) then tell him it's for monetary reasons that you're leaving and wish him the best of luck in his future scams.

    • by Lumpy ( 12016 ) on Monday October 27, 2008 @09:20AM (#25525907) Homepage

      I say do it, but EMAIL your boss with your concerns and then continue.

      When the shit hits the fan you have documentation to throw him under the bus hard and watch the wheels crush him.

      Honestly, it's all about CYA in the business world. If your boss tells you to do something unethical or wrong, document it every way you can and hold onto that so you can hand him over.

      Why do you have any loyalty to them? They have none for you.

      • by SatanicPuppy ( 611928 ) * <Satanicpuppy.gmail@com> on Monday October 27, 2008 @09:31AM (#25526097) Journal

        All that will happen is that the site in question will blacklist your scraping application. I work for a media organization, and we deal with this stuff all the time. It's far more cost efficient for us to simply whack the application than to try and track down the jokers. It's actually pretty trivial to nail an automated scraper: they're obvious on the logs.

        So the few times I've had someone ask me to do this sort of scraping, my response is usually that sure, fine, it works, but it's very easy to spot on the logs, and the information is very likely to become unavailable at unpredictable intervals.

        In the long run, it's usually pretty futile to scrape in the first place. When you're stealing content just to drive traffic, you tend to have a crappy site. The only time I ever did a professional scraping app that was "justified" and "legal", the victim was another business unit within the same corporation, and we had every right to the data that they "couldn't" compile for us.

        • by Hognoxious ( 631665 ) on Monday October 27, 2008 @09:37AM (#25526197) Homepage Journal
          Would it be possible to detect the scraper in real-time and redirect it to some fake/spoof data? It needn't be goatse. But it could be!
          • Re: (Score:3, Informative)

            by yincrash ( 854885 )
            yes
            • by TheLink ( 130905 ) on Monday October 27, 2008 @11:29AM (#25527863) Journal
              And could prove to be very amusing for a future slashdot submission if they encounter a BOFH.

              There are just so many things that could be done.

              They're planning on taking data from some site and pumping it to others and they have _ZERO_ assurance that it's going to be good data and continue to be good data.

              When you do stupid stuff like this, if you're not careful very bad things could happen (SQL injection, maybe even malware slipped in) and they could just go "nope not us", and while you could try to sue them it's pretty darn hard to prove since you requested the "bomb", and it only appears once and never appears again.

              If you're lucky it's just going to be goatse/tubgirl.

              If you're not, it could be a lot worse. Just imagine the BOFH thinking "What should I do today to them and their users" and rubbing his hands with glee.

              Just slightly tampered data will be bad enough.
          • by Andy Dodd ( 701 ) <atd7NO@SPAMcornell.edu> on Monday October 27, 2008 @09:43AM (#25526291) Homepage

            Maybe not in real time, but once someone detected a scraper at a given IP, they could easily change their site to feed that IP fake data instead of blocking it.

            If I were in the scrapee's position, I'd probably do that because it's the best way to attack the scraper. From order of least effort on the scrapee's part to most:
            1) Blocking it makes it obvious to the scraper that they've been found out, and they'll work around it, then you'll need to block them again, on and on the cat-and-mouse game goes.
            2) Feeding them mostly good data but with lots of inaccurate information scattered about is nearly impossible for them to detect until it has irreparably damaged their reputation and/or caused them to make bad decisions based on the data.
            3) Suing them is a pain in the butt, even more effort than 2)
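Option 2 above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `FLAGGED_IPS` set, the `serve_price` function and the 5% shift are all hypothetical, and the thread does not say how any real site implements this. The shift is derived from a hash of the IP and the item, so a flagged scraper always sees the same wrong value for the same item, which also makes the altered value recognizable if it later turns up elsewhere.

```python
import hashlib

# Hypothetical: IPs that have already been identified as scrapers.
FLAGGED_IPS = {"203.0.113.7"}


def perturb(value: float, ip: str, key: str, max_shift: float = 0.05) -> float:
    """Shift value by at most +/- max_shift (as a fraction), deterministically per (ip, key)."""
    digest = hashlib.sha256(f"{ip}:{key}".encode()).digest()
    # Map the first 4 bytes of the hash to a number in [-1, 1).
    unit = int.from_bytes(digest[:4], "big") / 2**31 - 1.0
    return round(value * (1.0 + unit * max_shift), 2)


def serve_price(ip: str, item_id: str, true_price: float) -> float:
    """Return the real price to normal visitors and a slightly wrong one to flagged IPs."""
    if ip in FLAGGED_IPS:
        return perturb(true_price, ip, item_id)
    return true_price
```

Ordinary visitors are untouched, which limits the damage from a false positive to the flagged addresses themselves.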

            • by alta ( 1263 ) on Monday October 27, 2008 @10:21AM (#25526881) Homepage Journal

              I'm for giving them entirely bogus data that would cause them to lose customers. Not sure exactly what kind of site we're talking about, but if a customer goes looking for chicken soup recipes and ends up getting porn... I think your boss will realize that they're on to you and won't suggest stealing from them any longer.

              • by Mr. Droopy Drawers ( 215436 ) on Monday October 27, 2008 @10:30AM (#25527039)

                Reminds me of a time when an Ebay'er was pointing to images on my website for an automotive auction. Didn't ask us or give us credit for the images. So, his example of "recently restored examples" became a photo of a '63 Imperial being loaded into a crusher.

                How's that for Crushing the Competition?!

                • by Free the Cowards ( 1280296 ) on Monday October 27, 2008 @11:08AM (#25527547)

                  Somebody once pointed at a picture of a frosted birthday cake on my web site from a forum. So I grabbed my image editor and built a special edition of the cake just for him, where the frosting read "Don't link to my images!"

                  I also have a specially crafted JPEG which is under 1000 bytes but which produces a 20,000x20,000 pixel image filled with black. It will totally screw up the layout of any page linking to it if they haven't entered an explicit size for the tag.

            • by interiot ( 50685 ) on Monday October 27, 2008 @11:17AM (#25527683) Homepage
              I really agree with this. If someone is already going to the effort of writing a lot of scraping code, it's already worth it to them to buy one of those $10-15/month shell accounts online that have SSH access. SSH gives them the ability to forward local TCP requests to that remote IP [proxy.dcu.ie], their scraping app just has to have the ability to use a SOCKS proxy. This means scrapers have a proxy IP that 1) doesn't show up on any of the open-proxy DNSBLs, and 2) is fast and reliable enough for them to get real work done. And if you block them, they just pay another $10-15 to get another reliable IP.
          • by SatanicPuppy ( 611928 ) * <Satanicpuppy.gmail@com> on Monday October 27, 2008 @09:49AM (#25526397) Journal

            Should be. It depends on what kind of data they're downloading, and whether they're just crawling link by link and hoovering up everything, or whether they're looking for something specific.

            Either way, spiders and scrapers usually have programmed scan intervals which have no relation to an actual human's browsing...or they just hit the page as hard as they can, but that is so easy to block that almost no one does it that way. Even if they add a little randomness, it's only efficient to run a scraper if it's hitting every few seconds at max, and even the most ADD user won't keep that up.

            Ironically, the easiest way to nail 'em is to put up a subset of "no robots" pages; if the robots crawl those pages, blacklist 'em. Every legitimate spider will respect those files.

            Otherwise, if you're running a site with a ton of data, and something is crawling it sequentially, you can absolutely redirect their queries to whatever you want. I'd be wary of doing something cute (if you can call goatse "cute") for fear that you'll have an occasional false positive and redirect a user from a high bandwidth location to that site.
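The "no robots" trap described above might look something like this minimal sketch. It assumes the site lists a path such as `/trap/` under `Disallow` in its robots.txt and never links to it anywhere a human would click; the path and the `HoneypotBlacklist` class are made up for illustration.

```python
# Hypothetical paths that robots.txt marks as Disallow and that no human-facing page links to.
DISALLOWED_PREFIXES = ("/trap/",)


class HoneypotBlacklist:
    """Blacklist any client that requests a path robots.txt told crawlers to avoid."""

    def __init__(self, disallowed=DISALLOWED_PREFIXES):
        self.disallowed = tuple(disallowed)
        self.blacklisted = set()

    def observe(self, ip: str, path: str) -> bool:
        """Record one request; return True if this IP is (now) blacklisted."""
        if path.startswith(self.disallowed):
            self.blacklisted.add(ip)
        return ip in self.blacklisted
```

A well-behaved spider never requests the trap path, so only crawlers that ignore robots.txt end up on the list.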

            • Re: (Score:3, Insightful)

              by orclevegam ( 940336 )
              Depends a lot on how they're doing the scraping. It's not terribly hard to write a scraper that fairly realistically duplicates human behavior, although as you pointed out if they're using it to feed their own processes it does put some demands on how often etc. it's forced to run which could make it stand out from normal activity. Of course, given 20 or so of these bots all scraping from different IPs, so long as you balanced their duty cycles so they were all offset from each other you could have scrapin
          • by level4 ( 1002199 ) on Monday October 27, 2008 @10:38AM (#25527129)

            Definitely possible!

            Any company with a website that contains "regularly updated data that might be interesting for competitors" has probably already got some kind of anti-scraping system in place. This guy's boss thinks he's being clever and original - of course he's not, any company with a site of any value and popularity has already seen this a million times.

            What they return basically depends on the mentality of those who work there. The "by the book" professional types will just blackhole the IP or return a "too many visits from this IP" page.

            Companies with a more BOFH type guy in charge might very well start "playing" with the data. Instead of the "too many visits" page you might find yourself getting a page with some of the data changed around randomly. Believe me, there are *many* people around who think it is just the height of comedy to fuck with people who are basically stealing their stuff anyway.

            They will turn it into a game - and, when the erroneous data turns up on the thieving web site (if that's what this guy's company is running), a few screenshots of that site with the modified data suddenly becomes pretty good evidence in a court, if they're of the "legal remedy" persuasion.

            Scraping data is a last resort, not the first thing you try. Forget the ethics - the fact he's working for a company willing to be that insanely cheap and stupid in the first place should be a signal to run far, far away in itself.

            • Re: (Score:3, Interesting)

              by orclevegam ( 940336 )

              Believe me, there are *many* people around who think it is just the height of comedy to fuck with people who are basically stealing their stuff anyway.

              You say that like it's a bad thing. Now where did I put my cattlepro... I mean cable tester.

              Scraping data is a last resort, not the first thing you try. Forget the ethics - the fact he's working for a company willing to be that insanely cheap and stupid in the first place should be a signal to run far, far away in itself.

              Seconded. I used to think my managers were daft, then I started reading thedailywtf.com and I gained a much greater appreciation of exactly how bad things can actually be. From the description this guy gives, he's definitely dealing with someone well on his way to ending up on that site.

          • by theaveng ( 1243528 ) on Monday October 27, 2008 @10:38AM (#25527133)

            >>>The other is, for lack of better words, a "load balancing" part that requires using multiple free accounts instead of purchasing space and CPU time for less than $2,000 USD per month. The boss sees it as "distributed" computing when in reality it's "parasitic".
            >>>

            Can someone explain what this means? Multiple free accounts of what? Gmail? I'm confused.

            Since scraping is detectable, I would follow this course of action:
            - tell the boss you think "we'll get caught"
            - if boss appears to want to fire you, then go ahead and do the action, but ask for him to put it in writing
            - note on the order you think it's a bad idea; keep original for yourself and hand copy to boss
            - write the program
            - (optional) from your home computer (using an anonymous account), tell the website what your program does, and explain you would have been fired if you had not complied with your boss's wishes, but feel it's unethical to scrape data.
            - watch as Boss looks like fool when website with stolen bandwidth decides to bar his company's access
            - if fired, hire lawyer and sue the company for unjustified dismissal
            $ profit

            • by SatanicPuppy ( 611928 ) * <Satanicpuppy.gmail@com> on Monday October 27, 2008 @10:48AM (#25527273) Journal

              The example that would leap to my mind is a number of services that allow you to "map" an ip address to a geographic location...I use one of those for my job search homepage, and it only allows ~200 queries a day for the "free" account...It would be plenty useful to have as a free service (targeted advertising), and if you set up enough "free" accounts, you could use it that way.

              Since I'm doing all my job searching away from where I'm currently living, I use mine to make sure that my job searching page always looks "under construction" for people who live where I live. My boss actually checks it occasionally, I guess to make sure I'm not trying to leave.

            • Re: (Score:3, Insightful)

              from your home computer (using an anonymous account)

              And an anonymous IP address through Tor [torproject.org] or the like, just to be safe.

            • The first steps are fine, but I would not recommend that you take the optional step of blowing the whistle unless you really feel strongly about the site or people you "victimize" and see it as your moral responsibility.

              If you accept the job and then turn around and blow the whistle you have acted maliciously against your employer. They may have questionable morality but the fact is that you have agreed to work for and be loyal to them, don't sink to their level. They might even have legal grounds to sue

              • Re: (Score:3, Interesting)

                by quanticle ( 843097 )

                They may have questionable morality but the fact is that you have agreed to work for and be loyal to them, don't sink to their level. They might even have legal grounds to sue you if they find out since you clearly have willingly sabotaged their business.

                Since when did going in to work require you to hang up your morals and ethics at the door? If your employer is doing something unethical, many would argue that you're obligated (morally) to blow the whistle on it, since to do otherwise allows people to profit from unethical examples - setting a bad precedent. If your employer is violating a contract, you'd call them out on it. And that's exactly what a Terms of Service agreement is - a contract specifying the terms by which you may use the other site.

          • by Misch ( 158807 ) on Monday October 27, 2008 @12:59PM (#25529559) Homepage

            It's happened. ESPN connived a way to get to another site's private database [theregister.co.uk] and reported the data as its own. The website injected some fake data, which ESPN picked up and reported, and ESPN was caught.

        • by GrpA ( 691294 ) on Monday October 27, 2008 @09:56AM (#25526523)

          I've also had similar requests in the past, and in both cases I did the work. I considered the requests, decided they were ethical (even if somewhat unusual) and so did it. That's something you're going to have to figure out for yourself - whether you're going to do it or not.

          I've been on the other side of the fence also...

          If you're relying on data for commercial use, putting yourself in a position where you need that data is a risky thing...

          I had a scraper once come after me. I caught them - as the previous poster pointed out, it's easy... I didn't block them. I captured and redirected their requests so I could control what they got and, well, sent them some information that made them look really, really stupid. They were angry, but there wasn't much they could do.

          They were just enthusiasts - they had no business risk in their application suddenly failing.

          Let your boss know the risk he is facing and then ask him if he really wants to risk being caught and shut down unexpectedly, or worse, finding someone has poisoned his data.

          It's just not good for business.

          GrpA

          • by MindKata ( 957167 ) on Monday October 27, 2008 @10:30AM (#25527033) Journal
            "It's just not good for business."

            I find this discussion yet another interesting insight into the (lack of) ethics of some company bosses. I've often found, to my surprise, that the ethics of sales people, marketing people and bosses are at times very different from those of programmers and other workers in a company. Some time ago Slashdot discussed "Ethics in IT", and it's interesting how well it fits with this discussion. Here's the link; it becomes especially relevant once you get to the part that discusses how some bosses lack empathy towards others...
            http://slashdot.org/comments.pl?sid=448546&cid=22377570 [slashdot.org]

            Some bosses have contempt for other people, so considering this kind of unethical business behaviour is well within their usual thinking.
          • Re: (Score:3, Insightful)

            by interiot ( 50685 )

            The thing is, you can get away with a lot more low-level scraping than you think. If it's something where you don't need to load significantly more pages than an average surfer (you just need to repeat it several times a day), it isn't necessarily going to stick out in the logs that much. And a lot of admins just don't have the time to analyze their logs (Wikipedia allows hotlinking of their images, for instance... combined with the fact that anyone can upload any picture, this is ripe for abuse. But th

            • Re: (Score:3, Interesting)

              by onepoint ( 301486 )

              I have a site which I get paid very well to manage under contract. I have traps all over the place.
              one of my traps is for email scrapers ( php script that loads about 5000 names randomly over 50 pages )
              another page I love is my "crap" page: when an IP hits 20 times in 2 minutes, it loads up and asks if you are a human or a computer. Humans end up on the recaptcha page and on their merry surfing way; computers end up in the crap section, which is 100% non real data at bargain prices and very specific key word ph
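The "20 hits in 2 minutes" trigger can be sketched as a sliding-window counter per IP. The poster describes a PHP setup that is not shown, so the Python below is only an illustration of the idea; the `HitCounter` class and its parameter names are hypothetical, with the thresholds taken from the comment.

```python
from collections import defaultdict, deque


class HitCounter:
    """Flag an IP once it has made max_hits requests within the last window_seconds."""

    def __init__(self, max_hits=20, window_seconds=120.0):
        self.max_hits = max_hits
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def record(self, ip, now):
        """Record a hit at time `now` (in seconds); return True if the IP should be challenged."""
        recent = self.hits[ip]
        recent.append(now)
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.max_hits
```

When `record` returns True, the site would show the human-or-computer check instead of the requested page.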

              • Re: (Score:3, Insightful)

                by interiot ( 50685 )

                So a dedicated scraper would change IPs, write some code to detect and avoid the potholes, and then resume scraping.

                There are lots of decent ways to detect scrapers or hotlinkers. But I haven't seen any idea yet from either side (web admins, or scrapers/hotlinkers) that can't be bypassed with enough work. It really seems like it's something of an arms race [paperlined.org].

                But the arms race hasn't progressed very far, even for attractive targets with some amount of money (porn sites), because it's just a lot of work

          • Make sure you use your boss's name and email for all contact information on the user accounts you set up for the scraping.

        • by garcia ( 6573 ) on Monday October 27, 2008 @10:00AM (#25526579)

          So the few times I've had someone ask me to do this sort of scraping, my response is usually that sure, fine, it works, but it's very easy to spot on the logs, and the information is very likely to become unavailable at unpredictable intervals.

          Depends on how you do it. I tend to use tor and a random wait time between gets to bring down the data over a few hours (up to a few days) and in one instance, because the URLs were easily guessed, I randomized the list to make it seem as if the hits were going to pages all over the place. I was never banned for any scraping activity that I have done.

          In the long run, it's usually pretty futile to scrape in the first place. When you're stealing content just to drive traffic, you tend to have a crappy site. The only time I ever did a professional scraping app that was "justified" and "legal", the victim was another business unit within the same corporation, and we had every right to the data that they "couldn't" compile for us.

          It's not futile. Scraping provides a plethora of information in a useful format from places that aren't willing (or are unable) to provide data in the necessary format. I used scraped course schedule data from MnSCU to develop a weekly report that showed how many courses were filled at other area institutions. It was to our competitive advantage to have this information and while it was publicly available, the system wouldn't provide it to us in the DW. I used that data for a wider variety of reports than I originally intended, and it would not have been possible otherwise.

          While I wish that the data had been provided in a better format for my use, it wasn't and that's what made scraping necessary. Plus *I* was the one who got to determine what information I was allowed to glean from the data rather than whatever the system decided was appropriate for our needs.

          • Re: (Score:3, Informative)

            Yeah, but unless you're running that list across a botnet, the IP addresses are a giveaway.

            Even if you are running it across a botnet it's pretty easy to pick out the patterns using some pretty trivial statistical hacks...If you graph bot traffic it looks like a heartbeat; even if you randomize the access times they don't match "human" numbers (unless you add so much random that it ceases to be an efficient scraper...If you could hire a guy to browse the site and write down the data faster than you can scra
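One possible version of the "heartbeat" check mentioned above is to look at how regular the gaps between requests are. The sketch below uses the coefficient of variation (standard deviation divided by mean) of the gaps; the function names and the 0.3 threshold are assumptions picked for illustration, not values from the thread, and a scraper that adds enough randomness would slip past it, as the thread itself notes.

```python
from statistics import mean, pstdev


def interval_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive request times."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None  # not enough data to judge
    average = mean(gaps)
    if average == 0:
        return 0.0
    return pstdev(gaps) / average


def looks_automated(timestamps, cv_threshold=0.3, min_requests=10):
    """Guess that traffic is a bot if its request spacing is suspiciously regular."""
    if len(timestamps) < min_requests:
        return False
    cv = interval_cv(sorted(timestamps))
    return cv is not None and cv < cv_threshold
```

Human browsing tends to come in bursts separated by long pauses, so its gaps vary far more than those of a script on a timer.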

        • Re: (Score:3, Insightful)

          by dsoltesz ( 563978 ) *

          If you've gotten to the point of asking Slashdot, you know the answer: it's unethical and you need to be looking for a new job if you can't get this resolved.

          These first three responses are probably all you need. Start with talking face-to-face with the boss, outline the ethical and technical problems (focus on the technical "ya know Mr. Boss, this is gonna eventually break") and propose a better solution. Follow up with e-mail summarizing the meeting (definitely document).

          If you can't get the boss to

      • Re: (Score:3, Interesting)

        by DrLang21 ( 900992 )
        Agreed. This is a case of CYA. I would also consider discussing it with HR, depending on the reputation that your HR group has for protecting internal whistle blowing activity.
        • by cerberusss ( 660701 ) on Monday October 27, 2008 @10:21AM (#25526885) Journal

          I'd advise against discussing it with HR. I've encountered the following situation: I talked to an HR manager about something that obviously should've remained confidential. However, that same HR manager was part of the management team and thus had two hats on. She proceeded to inform the management team, to my astonishment.

          I've come to the conclusion that HR is just a staff department and owes allegiance to, you guessed it, the management team. Not you.

          • by nabsltd ( 1313397 ) on Monday October 27, 2008 @10:56AM (#25527401)

            Unless it's something about you, personally, then an HR employee has no requirement to keep it "confidential".

            In other words, if you are talking about your health insurance, your personal information, etc., that's not general "company business" and shouldn't be spread around. But, if you bring them some information about someone doing something that could be detrimental to the business, they really do have an ethical requirement to pass that along. A good HR team would know not to bring your name into it if you are "snitching" until it was absolutely necessary, but sometimes that happens sooner than you would like.

            What you were thinking of is a company ombudsman [wikipedia.org]. These people are somewhat like your "lawyer" within the company, and are there for the employees. What they would do is explain to you your options (go to management yourself, let them do it and respect your anonymity as far as possible, never divulge your name even if that means the complaint can't proceed, etc.), and then help you implement them.

      • by mea37 ( 1201159 ) on Monday October 27, 2008 @10:27AM (#25526981)

        And if you ever wondered, when <insert crisis here> broke, how things could go so horribly wrong... it's because of people who think like this guy.

        "Don't worry if it's the wrong thing to do; just document that it wasn't your idea!" And apparently never mind the idea of personal responsibility.

        When there are no negative consequences for doing the right thing, ethics is mostly a curiosity. Ethics exist to guide you when the right path isn't easy. And yes, you are personally responsible for your own ethical behavior, regardless of whether someone with a bigger paycheck -- or even someone who signs your paycheck -- says otherwise.

        Does it mean you have to walk? That depends on your boss. If you do, the best way to preserve your reputation is to avoid mud-slinging. Your current employer might want to try to harm your reputation, but it's extremely unlikely he'll get far (certainly not without exposing himself to legal liability). So just don't shoot yourself in the foot by ranting about the situation in interviews, etc.

        • by AndersOSU ( 873247 ) on Monday October 27, 2008 @10:58AM (#25527417)

          You're absolutely right. The problem is being right, like being ethical, doesn't put food on the table.

          The reason crises happen is threefold: first, people with power see a competitive advantage in acting unethically; second, people in charge of monitoring unethical/illegal behavior aren't up to the task; and third, people tasked to do the work don't raise bloody hell when asked to do anything unethical.

          In order to solve the problem you only need to fix one of those. The problem is, the first two options involve convincing people to act against their personal interests. People contain a remarkable survival mechanism, the ability to justify and rationalize difficult actions. Going after people who stand to gain by acting unethically is the business equivalent of abstinence-only education.

        • Re: (Score:3, Insightful)

          by postbigbang ( 761081 )

          "When there are no negative consequences for doing the right thing, ethics is mostly a curiosity. Ethics exist to guide you when the right path isn't easy. And yes, you are personally responsible for your own ethical behavior, regardless of whether someone with a bigger paycheck -- or even someone who signs your paycheck -- says otherwise."

          No, just because you don't get spanked doesn't mean that an ethical obligation can be ignored. Were that the case, civility would evaporate. The OP is in a tenuous positi

      • by Chapter80 ( 926879 ) on Monday October 27, 2008 @10:45AM (#25527233)
        The proper way to document this in email is something like this:

        Boss-
        I'm able to do the data scraping and should have it up and running by the end of the day.
        - Your faithful employee

        In case you are wondering about the technical details, here they are:

        The scraping is implemented with a perl script which is activated using cron.

        We scrape the site twenty times per minute, which is a violation of their terms of service. By doing this, of course, we risk that they may shut us off at any time, or even provide us with fake data.

        The typical PHB will read the first two lines on his blackberry, and you're golden. Worst case he or she will scroll down - but the managerial brain is set to shut down at the word "perl". The word "cron" is a failsafe - in case the PHB also has ADD.

        Later when s/he comes back and says "why didn't you warn me", you can point to the text "beneath the fold" of your email.

    • Re: (Score:3, Insightful)

      by JosKarith ( 757063 )
      What they're asking you to do is at the least immoral, possibly even illegal. Your employer doesn't have the right to ask you to place yourself in legal jeopardy in this way, and if the sh1t hits the fan, do you really think that someone who came up with this scheme will balk at placing all the blame on you? Someone really needs to have a little chat with your boss about ethics...
      • Re: (Score:3, Insightful)

        The OP isn't in legal jeopardy. The TOS of the site being scraped are at best a contract (if the employer has a paying agreement with them) and are just words otherwise. If the contract is being violated, the employer is completely liable for the acts of the employee.

        I'd just do it. I'd point out to the boss that most sites have logging and other measures in place that may render the work product unreliable, or possibly unobtainable.

        I remember someone at our company writing a scraper for Yahoo some ye

      • Your employer doesn't have the right to ask you to place yourself in legal jeopardy in this way, and if the sh1t hits the fan, do you really think that someone who came up with this scheme will balk at placing all the blame on you?

        Absolutely. That's why you should agree to do the work, but because of the increased risk to yourself, you should ask for a "little something extra" under the table, just between you and him. A wad of hundred dollar bills passed discreetly in a handshake, for example. "I help you, boss, you help me?" is a good phrase to clue him in on the situation and what's required for the project to continue. ...or perhaps he may rethink how he wants his workplace to operate?

    • by Phreakiture ( 547094 ) on Monday October 27, 2008 @10:11AM (#25526747) Homepage

      Tread wicked carefully! [slashdot.org]

      Chip Salzenberg got his ass burned back in 2005 by grumbling about his employer's ethics regarding screen scraping. I heard him speak at YAPC::NA in Toronto that year, and from what he was saying, they were able to take his every legitimate action (e.g. logging in remotely to work from home) and twist it in court into something less than legit (e.g. unauthorized access). It's their word against his, and they hold the access logs. Your best bet, if you want to make a stand about the morals, is get the hell away from there first.

  • by Anonymous Coward on Monday October 27, 2008 @09:16AM (#25525843)

    ...ask a lawyer.

  • by Anonymous Coward on Monday October 27, 2008 @09:16AM (#25525849)
    Did the contractors on the Death Star deserve to die?
    • by Intron ( 870560 ) on Monday October 27, 2008 @09:21AM (#25525937)

      "Did the contractors on the Death Star deserve to die?"

      Depends on whether it was the ones that did the weapons array or the ones that did the low-flush toilets. Oh wait, Halliburton did both.

  • by Anonymous Coward on Monday October 27, 2008 @09:17AM (#25525865)

    ...you build a system that closely relies on this nonstandard (and unsupported) method of getting information, they change it and it breaks.

    Either by accident, or because they spot a load of particular access patterns from your address, figure out what's going on and intentionally break it.

    • by Paeva ( 1176857 ) on Monday October 27, 2008 @09:42AM (#25526277) Homepage

      I would think this would be a good way to address the issue with your boss. He wants to save some money to get, as he thinks, the same thing for free. But in fact, there are potential downsides to playing that game. He may be disregarding potential legal issues, but he should be less willing to disregard practical issues. If this other company discovers what you're doing, they could make it a little harder to access, or they could ban your company's entire subnet and send a letter indicating that if you'd like to get access again, then you'll have to start paying them for the service you've been stealing.

      The key is that, in the meantime, your boss' plan will seem like a dramatic failure that should have been foreseen.

    • Re: (Score:3, Funny)

      by Sockatume ( 732728 )
      Workaround: write a program which generates random data using a small amount of harvested data as a guideline. If the boss is too lazy to generate the data he's meant to be generating, then he's probably going to be too lazy to check that the data you're "harvesting" is actually accurate.
  • by argent ( 18001 ) <peter@slashdot . ... t a r o nga.com> on Monday October 27, 2008 @09:18AM (#25525889) Homepage Journal

    If your boss asks you to do something illegal, don't. If he doesn't agree, you should probably be looking for a new job, already. If he's willing to play these kinds of games with another company, what makes you think he won't do the same to you?

  • Uh... (Score:5, Insightful)

    by Anonymous Coward on Monday October 27, 2008 @09:19AM (#25525903)

    No. By your own admission you think it's wrong. Next?

  • Sigh (Score:5, Insightful)

    by MyLongNickName ( 822545 ) on Monday October 27, 2008 @09:20AM (#25525913) Journal

    Okay, this one is simple. You know what is right and what is wrong. The reality is that 99% of the folks will do what the boss asks without even raising a fuss. The reality is that you will be damaging your career if you don't go ahead.

    Now, the other reality is that shit flows downhill. That is, if this project gets questioned, the boss will claim ignorance, and put the blame on you. Your job is to cover your ass.

    Email is a good documentation tool. "Clarify" the request, asking if this is what he intends for you to do. Remove the emotion. Put in only facts. Put in a piece about your not being sure, but this may be a violation of terms of service. Ask if he wants you to proceed. Forward your sent email to a personal account.

    By the book. This one is so simple that it should be in the FAQ.

    • Re:Sigh (Score:5, Interesting)

      by sydb ( 176695 ) <[michael] [at] [wd21.co.uk]> on Monday October 27, 2008 @09:45AM (#25526341)

      It's only any good if the other party co-operates. The boss can easily phone you or walk up to you and say "Yes I want you to do it." and you have no record, and for many people this is their default mode of operation because that way no-one can pin anything on them. Unless they're singing their own praises, when everyone gets cc'd in.

      I used to find it infuriating but fury gets you nowhere in the workplace.

      • Re:Sigh (Score:5, Insightful)

        by MyLongNickName ( 822545 ) on Monday October 27, 2008 @09:54AM (#25526477) Journal

        You bring up a good point which leads to lesson #2: Written trumps verbal. If shit hits the fan, you have your email. If your boss then says that he verbally told you not to proceed, you only have to say that you have no recollection of any such conversation. He is on the defensive as he has nothing to back it up. If he was "appalled" at the thought of breaking the TOS, then he would have written back and clarified.

        Now, if you want to double cover your ass, give him status reports via email. Ask questions. You are covered.

        Now to answer some other questions about whether to quit or not. You have to make that decision on your own. For screen scraping, I wouldn't quit over something so mundane. Sorry. Especially if you are a grunt. You voice your concerns, and go on. The reality is that 4 times out of 5, if you voice your concerns like this in writing, the boss will back down. I have faced it twice in a grunt position with two different managers, and both times I got thanked for bringing it to their attention. It is all in how you deliver it. If it comes across as "I am ethical and you are a piece of shit", then your career is hurt. If it comes across as "I am trying to look out for your well being and that of the company", it can be a positive. Wording is everything.

  • by Ohmaar ( 997049 ) on Monday October 27, 2008 @09:23AM (#25525961)
    I work in health care, so maybe it's different in your industry, but every hospital I've worked for has had a compliance officer with an anonymous 800-number for compliance questions. This is DEFINITELY the kind of stuff they want to know about.
    • by EWAdams ( 953502 ) on Monday October 27, 2008 @09:46AM (#25526357) Homepage

      "Compliance officer" in an IT business... you crack me up. You should take your show on the road.

      Hospitals have compliance officers because a) they're regulated, inspected, etc. and b) people can die and they can be sued to Kingdom Come.

      The IT business is about as regulated as Somalia.

  • by GreyyGuy ( 91753 ) on Monday October 27, 2008 @09:26AM (#25526007)

    Fix it. He wants to do something on the cheap and look good. But the way he wants to do it is going to fail spectacularly. And when it fails, so will you. If this puts any amount of load on the services it is using, it will get picked up by the service provider. Maybe not today, but it will. And then the accounts will get turned off and possibly your IP addresses blacklisted, and then it all goes away. So give him a better solution. If he is balking at the $2k/month find a cheaper service. There is almost always one. Compare the cheaper solution to the time spent fixing it when the free service cuts you off. Provide examples of free service cutting people off.

    And unless you are looking for some very specific information, I would expect someone to provide an RSS feed with something similar that is supposed to be used for this sort of thing.
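    For what it's worth, consuming a feed is usually far less fragile than scraping HTML. Here is a minimal sketch in Python using only the standard library. The feed content, the URLs and the `parse_items` helper are made up for illustration; a real feed would be fetched from whatever URL the provider publishes.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document standing in for a provider's published feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example listings</title>
    <item><title>First item</title><link>https://example.com/1</link></item>
    <item><title>Second item</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


items = parse_items(SAMPLE_FEED)
for title, link in items:
    print(title, link)
```

    Because the feed is a format the provider intends machines to read, a page redesign on their side should not break this the way it breaks a scraper.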

  • by elrous0 ( 869638 ) * on Monday October 27, 2008 @09:29AM (#25526053)
    Only YOU can decide how far you're willing to go for your job. You're essentially asking us what your own ethical limits are.
  • Business sense (Score:5, Insightful)

    by ThePyro ( 645161 ) on Monday October 27, 2008 @09:29AM (#25526055)

    Even if your boss doesn't care about the ethics of this scheme, he probably does care about ramifications to the business. What happens when you get caught? All your development work will have been wasted because they'll shut you down at the very least. There's potential for a lawsuit, which is an expensive proposition even if you win. Damage to your company's reputation may make it harder to do business. And as another poster already mentioned, this isn't exactly a gem of a project to put on your resume.

  • A character check? (Score:5, Insightful)

    by juuri ( 7678 ) on Monday October 27, 2008 @09:31AM (#25526093) Homepage

    Having been put in a position once before where an employer asked me to do something I found to be frankly quite lacking in a moral nature, here's what I ultimately decided to do.

    After considering the work for a while, both why I didn't feel like performing the work personally and why the company desired this functionality, I finally decided to do the work, but to inform my boss and his boss beforehand that I was uncomfortable creating this, giving them clear notice of the whys.

    Firstly I did the work because it was simply my job and I had signed onto the job. It's something a *lot* of people might not have given a second thought to creating, obviously as they both had no problems with the work since they asked me to continue even after raising my concerns. Secondly because it wasn't really "that bad" and having steady income of cash dolladolla bills allows me to have nice things like somewhere to live and food I wanted to see if it was something I was over-reacting to.

    After completion? Yep, I still felt like shit. So I gave them my notice and told them in my resignation letter why I was leaving and referred them to the early notification of my objections. So, for me, it was a good learning experience about myself and having done it in this manner I have no problem explaining it to future employers as my reason for leaving this particular job.

  • Who cares? (Score:5, Insightful)

    by 1u3hr ( 530656 ) on Monday October 27, 2008 @09:31AM (#25526101)
    would require me to break multiple web sites' Terms of Service (TOS).

    A website's "terms of service" are not the Ten Commandments. They're not laws, or even moral rules. They're just what one company wants you to do. You don't work for them, why do you care? If they notice and complain, it's your boss's problem, legally; and morally, I wouldn't lose any sleep.

    Only thing to do is cover your ass and get your boss to put his instructions in a memo so he can't blame you should problems arise.

    Really "scraping a website" is not a moral question on the scale of collaborating with Nazis. It's a business. Other businesses are your rivals, not your friends. They'd fuck you over in a minute.

    • Re: (Score:3, Interesting)

      by MightyYar ( 622222 )

      I wrote a little script to search multiple cities in Craigslist, simply because they don't offer the function at any price. People can say I'm a jerk, but I really don't care because it saves me a lot of time.

    • Re: (Score:3, Insightful)

      by Courageous ( 228506 )

      Planning ahead of time to breach a contract, with malice aforethought, may not be as free of moral constraint as you're letting on.

      C//

      • Re: (Score:3, Interesting)

        by 1u3hr ( 530656 )
        Planning ahead of time to breach a contract,

        What "contract"? No contract was mentioned.

  • by jollyreaper ( 513215 ) on Monday October 27, 2008 @09:32AM (#25526111)

    I told you to scrape Slashdot, not read it. Now get back to work!

  • one approach (Score:5, Insightful)

    by buddyglass ( 925859 ) on Monday October 27, 2008 @09:33AM (#25526121)
    1. Tell your boss it's a bad idea to break these websites' terms of service. He'll probably override you and tell you to do the project anyway.
    2. Code up the project just like he asks. Demonstrate that it works.
    3. Shortly afterwards, email the sites in question from a non-work friend's account and let them know (with specific information) the accounts and IP addresses that are violating their terms of service. Hopefully the accounts will be disabled, and/or your employer's IP range will be blocked.
    4. Throw up your hands and tell your boss, "Well, I guess they figured out what we were doing!"
    • Re: (Score:3, Insightful)

      The boss will just say "You're a smart guy. Find a way to get around their protections."
      • Re: (Score:3, Insightful)

        by DingerX ( 847589 )
        No need to rat them out. Just give the boss the 411 on the "Hidden Costs" of doing things that way. Ethical arguments are well and good, but when you're asked to do something like this, it's clear that the ethical arguments mean nothing compared to economic ones. Guaranteed system-wide outages and worse catastrophic failures (=poisoned data) are going to cost a lot. Since you cannot predict when they will happen (only that they will happen), or what they will look like, you can't give an estimate for the do
  • CYA by Asking! (Score:3, Insightful)

    by cliffiecee ( 136220 ) on Monday October 27, 2008 @09:37AM (#25526195) Homepage Journal

    The whole idea sounds pretty scummy, based on your description. Multiple free accounts? yeesh.

    So why don't you just ask the webmasters of the sites you're about to scrape? I'd bet the site owners would settle for a few hundred per month to provide you with data in whatever form you require. And it's cheaper than the $2000/mo. for a server, etc. (If these sites are "bigger" than what a few hundred a month would buy, then you damn well better ask; see below.)

    Ask your Legal department about this as well. They can be extremely helpful in stopping hare-brained ideas like this. If the websites in question are big enough to take action against this, YOU'RE the one left holding the bag, not Mr. Bright Idea Guy.

    WARNING: All of this assumes your boss is partially sane and reasonable!! If he's a jerk, you are hosed. I'm sorry.

  • by jimicus ( 737525 ) on Monday October 27, 2008 @09:38AM (#25526217)

    If you even need to ask, you've already demonstrated a trace of ethics.

    Now, sometimes having such ethics will mean you have to make difficult choices. And nobody else can make those choices for you.

    While ethics won't pay the mortgage, "Reason for leaving the previous job: I was asked to do something illegal and, when I queried this, was given the ultimatum to do it or get out. I got out." is probably a heck of a lot better than "The company had to sack me after it transpired I'd done something illegal" (emails to CYA notwithstanding).

    Because, make no mistake, the fact that your company has done this will get out.

  • by mark-t ( 151149 ) <markt.nerdflat@com> on Monday October 27, 2008 @10:06AM (#25526681) Journal
    Tell him that the very next-to-best case scenario for him (the "best case" scenario being that they never notice what you are doing) is that they notice what you are doing and blacklist you from connecting to it ever again. If at all possible, give him an estimate on the likelihood of that occurring. Point out to him very plainly that if or when this outcome occurs, then what he is asking you to do now will be all for nothing. If the chance of legal ramifications is not negligible, you should also mention that as well. Document everything. If he still wants you to proceed, then polish your resume and find another job because if he's too cheap to pay 2k a month for a service he thinks he can scam off of for free, he's probably too cheap to want to continue to pay you in a few months time, after he figures he's got what he needs from you.
  • Leading question (Score:3, Insightful)

    by bWareiWare.co.uk ( 660144 ) on Monday October 27, 2008 @10:13AM (#25526783) Homepage

    You asked the question in a leading manner and have got odd responses as a result:

    'Scraping' pages is exactly what the Internet Archive or Google do; this is common and generally accepted practice (look at the amount spent on SEO). It is also assumed that these operate without human supervision and do not need to read or comply with the human TOS of your site. Critically, spiders should comply with the 'robots.txt'. If you do this you have the moral high ground. If you don't, then it can be interpreted as criminal under laws such as the Computer Misuse Act.

    Similarly no one suggests that everyone using Gmail is a parasite. Most 'free' services come with a very explicit contract detailing their allowed uses. If you comply with the contract you are fine; if not, you are again breaking the law.

    Probably more importantly, this is almost certainly a bad business decision:

    Given that you as an employee have judged it as ethically questionable you can be fairly sure a significant proportion of your clients are likely to feel similarly.

    Even if you are complying with the contract from your free service you are almost certainly not getting an SLA in return. If the supplier decides your business is dodgy, or you are putting too much burden on their system, they will shut down all of your accounts without warning or reprieve. Constantly battling this is likely to cost you more than the hosting in the long run.

    Page scraping is very unreliable. Even when the source site is cooperating they invariably break it on every edit. What will happen to your business when the source site detects your scraping and decides to serve goatse to your spider, and hence your clients?
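    To make the robots.txt point above concrete, here is a rough sketch of how a well-behaved spider might check a URL before fetching it, using Python's standard `urllib.robotparser`. The robots.txt content, the `examplebot` user agent and the URLs are all invented for illustration, and the rules are parsed from a string here, where a real spider would fetch them from the site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real spider would download this
# from the site's /robots.txt instead of hard-coding it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def make_parser(robots_txt):
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# Ask before fetching: is this user agent allowed to request this URL?
allowed_public = parser.can_fetch("examplebot", "https://example.com/listings/1")
allowed_private = parser.can_fetch("examplebot", "https://example.com/private/data")

# Seconds to wait between requests, if the site specifies a delay.
delay = parser.crawl_delay("examplebot")

print(allowed_public, allowed_private, delay)
```

    Note that this only covers robots.txt; it says nothing about whether the site's TOS or a free account's contract permits the use.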

  • Terrible engineering (Score:3, Informative)

    by Have Blue ( 616 ) on Monday October 27, 2008 @10:23AM (#25526931) Homepage
    Even if you don't want to tangle with the ethical issues, ask your boss how he feels about the app constantly going down and losing data because the "parasited" service deleted all your free accounts.
  • Comment removed (Score:3, Insightful)

    by account_deleted ( 4530225 ) on Monday October 27, 2008 @10:51AM (#25527321)
    Comment removed based on user account deletion

"If it ain't broke, don't fix it." - Bert Lantz
