Hello everyone,

My team has a large series of HITs that we plan to post on Amazon Mechanical Turk. Before we do, we hope to get feedback from potential turkers on this forum about our process and the clarity of our instructions.

We are creating a corpus of post-1990 American English that carries many kinds of annotation. Some annotations were added by linguistics students, some by automatic processing, and so on. The corpus is used by researchers all over the world who study natural language processing, linguistics, and other topics. We want to run a large experiment in which turkers add word meanings: given a word, a sentence from the corpus where the word is used, and a list of meanings, the turker selects the meaning that fits. We're almost ready to go, but we'd like feedback on the process we propose and on our instructions before we launch.

First, the process. The name of the HITs will be "Do you know what this word means? (For American English Word Mavens)". We will first do a trial run of a few words, 100 sentences each, where the title will indicate that it is a trial run, as follows: "Do you know what this word means? (Trial Run; For American English Word Mavens)". The trial run is to make sure that we can process the HITs quickly and accurately and that everything is working before we launch the big series of HITs. Does this seem like a good idea?

Second, the instructions describe the task, a bonus schedule we came up with to encourage the same turkers to do many HITs, and briefly what the data will be used for. Each HIT has only 10 sample sentences to judge, but in total we have 1000 sentences for each word (100 HITs). By making the HITs small, we let turkers stop whenever they want while we still get a lot of data. By offering bonuses, we want to reward turkers who do many HITs for the same word, because more data from the same person is good for the corpus.
Does this seem like a sensible strategy?

Third, there will be a delay of no more than a week after the HITs expire while we do quality checking and compute bonuses; this is explained in the instructions. The quality checking lets us refuse work from obvious spammers, and the bonus computation lets us properly reward people who do a lot of HITs. Does that seem reasonable?

Below are the instructions. We welcome comments, questions, feedback, etc.

masc-word-sense team

-------------------------------------------------------------------------------

Title: Do you know what this word means? (Trial Run; For American English Word Mavens)

For the 10 sentences in this HIT, select the best meaning of the word in boldface. Each sentence is followed by the same list of meanings to choose from. There are *100 HITs* for this *same word, same list of meanings*.

The data collected from these HITs will be used for research on how word meaning varies with context, and will become part of an open resource for linguistic research. There are a total of 45 words.

Because we want you to do as many HITs per word as possible, we will give bonuses for larger quantities of HITs we approve:

1) if you complete 10 HITs for a word, you get a $0.01/HIT bonus for HITs #10-#39;
2) if you complete 40 HITs for a word, you get a $0.02/HIT bonus for HITs #40-#100.

WARNING: Before we approve any HITs for a given word, we will do quality checking, and we will not approve poor-quality HITs. We apologize in advance for the delay in payment while we check quality and compute the bonuses. We will take no longer than a week from the time that we let the HITs expire, or from the time that all HITs for all words have been completed.
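
-------------------------------------------------------------------------------

A note outside the instructions themselves: for anyone who wants to check the bonus arithmetic, here is a rough sketch of how we read the schedule. It is not our actual payment script. The function name `bonus_cents` is made up for this post, and it assumes that bonuses are counted per turker per word, on approved HITs only, and that the ranges #10-#39 and #40-#100 are inclusive. Amounts are kept in whole cents to avoid rounding surprises.

```python
def bonus_cents(approved_hits: int) -> int:
    """Total bonus in cents for one turker on one word.

    Assumed reading of the schedule (hypothetical helper, not the team's script):
      - approved HITs #10 through #39 earn 1 cent each
      - approved HITs #40 through #100 earn 2 cents each
    """
    if approved_hits < 0:
        raise ValueError("approved_hits must be non-negative")

    n = min(approved_hits, 100)            # there are at most 100 HITs per word
    tier1 = max(0, min(n, 39) - 9)         # number of HITs falling in #10-#39
    tier2 = max(0, n - 39)                 # number of HITs falling in #40-#100
    return tier1 * 1 + tier2 * 2


if __name__ == "__main__":
    for n in (9, 10, 39, 40, 100):
        print(f"{n} approved HITs -> {bonus_cents(n)} cents bonus")
```

Under this reading, a turker who completes all 100 HITs for a word would earn 152 cents in bonuses for that word, on top of the base pay per HIT. If you read the schedule differently, that is exactly the kind of feedback we are hoping for.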