The thing about bots, or "web crawlers," is that they're fully automated, which means a bot can be designed to input or search for anything, depending on what it's programmed to do. A bot designed for web indexing can monitor a whole website and scrape information, or it can simply be used to search for keywords. There's no way around it; the best option is to just make that money.
Razor is absolutely correct about scrapers. If the scraper is currently only looking for links, then disabling them would work for a short time, until the maintainer updates it to grab anything containing "mturk.com" or "preview?". We could try using URL shortening services. It's only a stopgap, though, and we'd constantly have to stay on top of the bot in case it was updated to catch them.
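To show why un-linking the URLs only buys a little time, here's a rough sketch of the kind of text match I mean. This is just my guess at how such a scraper could work, not the actual bot's code; the pattern, the function name `find_hit_references`, and the `short.example` URL are all made up for illustration.

```python
import re
from typing import List

# Hypothetical pattern built from the two substrings mentioned above.
# It grabs any whitespace-delimited token containing "mturk.com" or "preview?",
# whether or not the forum rendered it as a clickable link.
HIT_PATTERN = re.compile(r"\S*(?:mturk\.com|preview\?)\S*", re.IGNORECASE)

def find_hit_references(post_text: str) -> List[str]:
    """Return every token in the post that looks like a HIT link."""
    return HIT_PATTERN.findall(post_text)

# Plain-text (non-clickable) URL: still caught by the text match.
plain = "Link: https://www.mturk.com/mturk/preview?groupId=EXAMPLE (not clickable)"
# Shortened URL (made-up domain): slips past until the pattern is updated.
shortened = "Link: https://short.example/abc123"

print(find_hit_references(plain))
print(find_hit_references(shortened))
```

The shortened link gets through only because the pattern doesn't know about the shortener yet, which is why we'd have to keep watching for updates.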
The bot probably has a proxy list that it cycles through. I wouldn't be surprised if it is scraping proxies as it goes along. Now I am wondering what level the proxies are. If the bot maintainer is smart, he'd only be using elite proxies. If not, we may be able to track him given enough time.
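For anyone wondering what "level" means here: the usual informal split is transparent (passes the original address along in a header), anonymous (hides the address but still announces itself as a proxy), and elite (adds nothing). Below is a minimal sketch of how a server could sort incoming requests that way. The header lists and the name `classify_request` are my own assumptions and not exhaustive, and a request with no proxy headers could just as easily be a direct connection.

```python
from typing import Dict

# Assumed header names; real proxies vary, so treat these sets as illustrative.
LEAKS_ADDRESS = {"x-forwarded-for", "x-real-ip", "forwarded", "client-ip"}
REVEALS_PROXY = {"via", "proxy-connection"}

def classify_request(headers: Dict[str, str]) -> str:
    """Guess the proxy level from the request headers alone."""
    names = {name.lower() for name in headers}
    if names & LEAKS_ADDRESS:
        # The original client address is being forwarded: trackable.
        return "transparent"
    if names & REVEALS_PROXY:
        # Proxy use is visible, but the original address is not.
        return "anonymous"
    # Nothing gives it away: an elite proxy or no proxy at all.
    return "elite-or-direct"

print(classify_request({"X-Forwarded-For": "203.0.113.7", "Via": "1.1 proxy"}))
print(classify_request({"Via": "1.1 proxy"}))
print(classify_request({"User-Agent": "Mozilla/5.0"}))
```

Only the first case gives us an address to follow, which is why tracking only works if he's being sloppy.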
Yep, someone beat me to it. Blocking an IP address is pretty much obsolete these days because there are too many ways of getting around it.
Great TO, must live in Mass. to do it though (possibly a repost)
Title: 10-Minute Study about daily experiences and attitudes - Only people who live in MASSACHUSETTS wanted!
Requester: Duke Center for Behavioral Economics [A14RDIKJG0V5TD] (TO)
Description: Only people who live in MASSACHUSETTS wanted! It is a sensitive survey. Please take it if you are willing to attend to it carefully. We'd really appreciate your response and will compensate you well!
Reward: $0.60
Qualifications: HIT approval rate (%) is not less than 95, Location is US
Link: https://www.mturk.com/mturk/preview?groupId=2R6UCWY2AOO2YOO20NFGWSCSFJC22U
[size=-2]Powered by non-amazonian script monkeys[/size]
Hmm, that's true. I wonder how much the creator cares about avoiding blacklists, though. From the original post it seems like he just finished this recently, so idk how well equipped it is.
Well, depending on the VPN and how they handle information, that wouldn't be that huge of a deal. There are a lot of VPNs that say in their TOS--which no one seems to read--that they will share your information if you are doing something that violates the TOS. Scraping is considered a violation by a majority of them. So a newbie with a VPN is not that scary these days, but I digress, as I believe you're correct in your assumption that he is at least a semi-intelligent scraper and probably only uses elite proxies or private VPNs.
Good TO, took me 3 minutes
Title: Take a Short Survey (5 minutes)
Requester: Vita Info Systems [A3FUL1A5P551RW] (TO)
Description: Short survey for a research project. There are no wrong answers!
Reward: $0.41
Qualifications: Location is US
Link: https://www.mturk.com/mturk/preview?groupId=2NPDAT2D5NQ131YYG3AGFTYNQKY73T
[size=-2]Powered by non-amazonian script monkeys[/size]
(we ARE still posting hits, right? lol)
I think banning my IP is a conspiracy against me so that I'd be forced to use the bot instead of this forum. Oh cruel irony.
Title: Find Locations in the Text
Requester: Parvin Sadat Feizabadi [A3PD8J74N1VLPL] (TO)
Description: In this task, you read a text and fill in some blanks about the text.
Reward: $0.15
Qualifications: HIT approval rate (%) is not less than 95, Location is US
Link: Redacted
[size=-2]Powered by non-amazonian script monkeys[/size]