Could anyone help me with a script/extension/program?

Discussion in 'Scripts' started by Dan, Nov 25, 2012.

  1. Dan

    Dan Member

    Joined:
    Sep 12, 2012
    Messages:
    19
    Likes Received:
    0
    I'm looking for a tool that can automatically highlight all the repeating words (strings of characters, including numbers, without spaces) on a webpage. The mturk page contains frames and refreshes every time I submit a hit. I'm doing batches of hits where I compare different things, so this could help me a lot, as my eyes could find the relevant information much more easily.

    I've tried some Chrome extensions, but none of them worked correctly or did exactly what I needed. Does any of you know of something that could help? I have no coding experience, though I imagine it wouldn't be too hard to code such a script/browser extension. :(

    Thanks.
     
  2. Cowfin

    Cowfin Community Manager
    Staff Member

    Joined:
    Sep 12, 2012
    Messages:
    3,737
    Likes Received:
    0
    Have you tried using the browser's built-in search function, Ctrl-F? It highlights every occurrence of the word or group of words you search for.

    EDIT: It works for more than words; it also works for letters, numbers, characters, practically anything.
     
  3. Dan

    I know, but it takes too much time. I must double-click on a character string, then press Ctrl-C, Ctrl-F and Ctrl-V. And there could be many other strings on a page, so I must repeat this sequence several times. I'm doing lots of Product Listings, so if any of you know them, you'll know what I'm talking about and why I need such a program.

    The auto highlighter would highlight any word (or number) that repeats, including common words like "the", but that doesn't matter; it would still help me a lot.
     
    #3 Dan, Nov 25, 2012
    Last edited by a moderator: Nov 25, 2012
  4. Cowfin

    My bad, I was thinking of highlighting as visual only, rather than for the purpose of selecting text to copy and paste.
     
  5. sasquatch

    sasquatch User

    Joined:
    Oct 19, 2012
    Messages:
    4,446
    Likes Received:
    0
    what hits are these? i'd have to see one to be able to tell how to go about doing this...
     
  6. Dan

    I'll post a screenshot when they're back with new hits.
     
  7. Dan

    Here are two products, left and right, with many characters (including numbers) repeating: http://i.imgur.com/fbtam.jpg This is one of the easiest, but others are more subtle, with serial numbers in the middle of the product description, colors named in different places, etc.

    This one shows the Chrome browser highlighting one of the repeating blocks of characters that I chose to search for: http://i.imgur.com/mfeZ5.jpg As you can see in this picture, that's not the only part of the text that's repeating, but the others aren't highlighted because I didn't search for them.

    I want something that automatically highlights all the repeating blocks of text (and/or numbers), preferably in a different random color for each character block that appears more than once.

    One hit consists of 14 pages in that frame, each page with 2 products, like in the pictures above.
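
    As a rough illustration of the request above, here is a minimal TypeScript sketch that finds the character blocks shared by the left and right product texts and gives each one its own color. The function names and the product strings are made up for the example, and evenly spaced hues are swapped in for the "random" colors so the result is repeatable. It splits on whitespace only, so punctuation stays attached to a block.

```typescript
// Sketch: find character blocks that appear in both product descriptions and
// give each one its own highlight color. All names here are hypothetical.

// Split text into blocks of non-space characters, lowercased for comparison.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
}

// Return a map from each block found in BOTH texts to an HSL color string.
function colorSharedBlocks(left: string, right: string): Map<string, string> {
  const rightBlocks = new Set(tokenize(right));
  const shared: string[] = [];
  for (const block of tokenize(left)) {
    // Keep each shared block once, in the order it appears on the left.
    if (rightBlocks.has(block) && shared.indexOf(block) === -1) {
      shared.push(block);
    }
  }
  const colors = new Map<string, string>();
  shared.forEach((block, i) => {
    // Spread hues evenly around the color wheel so the colors stay distinct.
    const hue = Math.round((360 * i) / shared.length);
    colors.set(block, `hsl(${hue}, 90%, 75%)`);
  });
  return colors;
}

// Made-up product strings, only to show the shape of the result.
const colors = colorSharedBlocks(
  "HP 564XL Black Ink Cartridge CB321WN",
  "Compatible CB321WN ink for HP printers"
);
colors.forEach((color, block) => console.log(block, color));
```

    In a real script each entry of the map would become a background color wrapped around the matching text on the page.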
     
    #7 Dan, Nov 27, 2012
    Last edited by a moderator: Nov 27, 2012
  8. sasquatch

    i think this would be rather impractical... you would have to have a script parse the entire page, divide it all up into discrete chunks of text, then compare each chunk to every other chunk. given that most browser extensions and all greasemonkey scripts use javascript, this would likely delay the page loading enough to erase any advantages it could provide...

    EDIT: also, you would have the annoyance of having EVERYTHING that repeats highlighted, like words like 'the', 'a', etc. you could build in a filter to prevent these words from being highlighted, but then that would make the operation take longer because each 'word' would have to be compared not only to all the other words, but also to the list of disallowed words...
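
    For what it's worth, the comparison step does not have to check each chunk against every other chunk. A minimal TypeScript sketch, assuming the page text is already available as one string: count each block once in a hash map, and keep the disallowed words in a set so the filter is a single lookup per block. STOP_WORDS and findRepeatedBlocks are hypothetical names, and the word list is only a sample.

```typescript
// Sketch: count every block in one pass instead of comparing all pairs.
// STOP_WORDS and findRepeatedBlocks are hypothetical names for illustration.
const STOP_WORDS = new Set<string>(["the", "a", "an", "and", "of", "for", "with"]);

function findRepeatedBlocks(text: string, stopWords: Set<string>): string[] {
  const counts = new Map<string, number>();
  for (const raw of text.split(/\s+/)) {
    const block = raw.toLowerCase();
    // A Set lookup takes constant time on average, so the filter adds
    // very little work per block.
    if (block.length === 0 || stopWords.has(block)) continue;
    counts.set(block, (counts.get(block) || 0) + 1);
  }
  // Keep only the blocks seen more than once, in first-seen order.
  const repeated: string[] = [];
  counts.forEach((count, block) => {
    if (count > 1) repeated.push(block);
  });
  return repeated;
}

// Logs the two repeated blocks, "red" and "lamp"; "the" is filtered out.
console.log(
  findRepeatedBlocks("the red lamp and the red shade for the lamp", STOP_WORDS)
);
```

    The work grows roughly in proportion to the number of blocks on the page, and passing an empty set simply turns the filter off.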
     
    #8 sasquatch, Nov 27, 2012
    Last edited by a moderator: Nov 27, 2012
  9. Dan

    Can't this be done after the page has finished loading? The mturk page only refreshes after submitting the hit, but the 14 pages composing 1 hit are refreshed separately, in a web frame (I think that's what it's called; it's common with mturk hits).

    It takes me about 5 seconds to compare two products, sometimes more, sometimes less. So usually more than a minute per hit. The pages load instantly, most of the time.

    EDIT: common words being highlighted (like "the", "and", etc.) wouldn't bother me that much. I've done these hits for a long time, and my brain has learned to ignore them. :D
     
    #9 Dan, Nov 27, 2012
    Last edited by a moderator: Nov 27, 2012
  10. bayonjoset

    bayonjoset User

    Joined:
    Nov 18, 2012
    Messages:
    162
    Likes Received:
    0
    Sorry to ask, friend, but how can we get this qual?
     
  11. sasquatch

    technically speaking it would always have to be done after the page (or page within the iframe) finished loading, if javascript is executed before the page loads it won't necessarily have the whole content of the page to work on. my point is that, given the efficiency of javascript (or, rather, lack thereof), there's a good chance that you would have already manually identified the necessary data by the time the script was finished parsing the text. i wouldn't personally be inclined to work on a script like this for that reason, and also because i don't qualify for these hits, so the script would be useless to me and probably the majority of others here, and i wouldn't be able to even test it properly.
     
  12. Dan

    You can't, not anymore, as far as I know. Only 263 people work on them (at least that's the number of quals given) and that number hasn't changed in two years, I think. They were given randomly (I presume) to people who worked for them when they were available to everybody.

    Another option for a script would be to have a list of allowed words, where I could place some keywords that appear on many products, like color (black, white, orange, etc.), finish (aged bronze, etc.), "compatible" or "OEM" (for ink cartridges). This could be combined with the ability to highlight only repeating numbers (without spaces or letters). For example, given 120OE-453365 and 453365OB, it would highlight only "453365". This could be useful for the many serial numbers I see all over the place.

    There are a couple of Chrome extensions that do just that (except for the numbers thing), but they don't work in iframes, for some reason.
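
    The allowed-words idea plus the repeating-numbers idea could look roughly like this in TypeScript. This is only a sketch: ALLOWED and findHighlightTargets are hypothetical names, keywords are matched as plain substrings regardless of case, and a number counts as repeating only when the same full run of digits shows up more than once.

```typescript
// Sketch: highlight only allow-listed keywords and digit runs that repeat.
// ALLOWED and findHighlightTargets are hypothetical names for illustration.
const ALLOWED = ["black", "white", "orange", "aged bronze", "compatible", "oem"];

function findHighlightTargets(text: string, allowed: string[]): string[] {
  const lower = text.toLowerCase();
  // Keep the allow-listed keywords that occur anywhere in the text.
  const targets = allowed.filter(
    (word) => lower.indexOf(word.toLowerCase()) !== -1
  );

  // Count every maximal run of digits, ignoring the letters around it.
  const counts = new Map<string, number>();
  for (const run of text.match(/\d+/g) || []) {
    counts.set(run, (counts.get(run) || 0) + 1);
  }
  // Add only the digit runs that were seen more than once.
  counts.forEach((count, run) => {
    if (count > 1) targets.push(run);
  });
  return targets;
}

// Logs only the repeated run 453365; 120 appears once and is left out.
console.log(findHighlightTargets("120OE-453365 and 453365OB", ALLOWED));
```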
     
    #12 Dan, Nov 27, 2012
    Last edited by a moderator: Nov 27, 2012
  13. sasquatch

    the existing scripts would have to be targeted for iframe content, which probably wouldn't take much tweaking. but since i can't access the hits, i really can't help out here...
     
  14. Dan

    I could give you the frame source of the hit (in Chrome I have the "view frame source" option when I right-click on a frame), if that helps you.
     
  15. sasquatch

    like i said before, i'm not really interested in working on this, for a number of reasons. first, the time it would take versus the amount of help it would provide the community isn't worth it to me. i don't want to sound like a jerk or anything, but you yourself mentioned how few people have the qual to do these hits.

    plus, the fact that you said you get through each set in about 5 seconds means that there's very little time that's going to be saved overall by doing this. i know every little bit helps, but it should take at least a second or two to look at any text, highlighted or otherwise, and another second or two to make your decision and click the mouse, leaving little room for improvement.

    i also don't have any experience developing or modifying google chrome extensions, i work almost exclusively in firefox. many browser extensions use javascript, which many people (myself included) are familiar with, but these extensions are packaged and deployed in different ways, so learning how to deal with chrome extensions would take even more time.

    and on top of all of that, i wouldn't develop any script or extension i couldn't personally test. i've had to do projects like this for work, and it's an extremely frustrating and annoying process if i have to go back and forth with someone else every time i change a single character of code, and makes the whole task take an exponentially longer period of time.

    but in the interest of what you're trying to accomplish, yes, it should be possible to adapt an existing extension to target a document located in an iframe. if it's a user script, rather than an extension, then it should be much easier, because there is no layer on top of the javascript to deal with.
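
    To give an idea of what targeting a document in an iframe can involve, here is a minimal TypeScript sketch that walks from the top document into every frame it is allowed to read, so that a highlighter could then be run on each document. FrameLike, DocLike and collectDocuments are hypothetical names; the two interfaces are cut-down stand-ins for the browser's iframe element and document so the sketch also runs outside a browser. Browsers return null from contentDocument when a frame comes from another origin, so if the hit frame is cross-origin this approach finds nothing and the script would have to run inside the frame itself.

```typescript
// Hypothetical stand-ins for the browser's HTMLIFrameElement and Document,
// reduced to the members this sketch uses.
interface FrameLike {
  contentDocument: DocLike | null; // null when the frame is cross-origin
}
interface DocLike {
  title: string;
  querySelectorAll(selector: string): ArrayLike<FrameLike>;
}

// Collect the top document plus every nested frame document that is readable.
function collectDocuments(root: DocLike): DocLike[] {
  const found: DocLike[] = [root];
  const frames = root.querySelectorAll("iframe");
  for (let i = 0; i < frames.length; i++) {
    const inner = frames[i].contentDocument;
    if (inner !== null) {
      // Recurse, because frames can contain further frames.
      found.push(...collectDocuments(inner));
    }
  }
  return found;
}

// Fake documents for a dry run: a top page holding one readable frame and
// one cross-origin frame.
const hitFrame: DocLike = { title: "hit frame", querySelectorAll: () => [] };
const topPage: DocLike = {
  title: "top page",
  querySelectorAll: () => [{ contentDocument: hitFrame }, { contentDocument: null }],
};
console.log(collectDocuments(topPage).map((doc) => doc.title));
```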
     
  16. Dan

    OK, I understand. Thanks anyway. :)
     
