Thursday, September 03, 2009

Mechanical Turk Revival

Moderating results from Turk today proved more inspiring than the last bout, which covered the HITs requiring people to write things.  I thought a lot about coding against Amazon's MTurk API, but decided the cost in time (and money, if I hired someone other than myself) was not worth it.  Instead I altered the way I request the HITs, and also the instructions.

Rather than 4 HITs for 4 different pictures, each sent to 3 unique workers (which sometimes gives me great results), I now have only 1 HIT asking for 6 pictures, which gets sent to 2 unique workers.  I also lowered the pay-out for that one, rather than raising it, because I want to do more of these and have less to moderate.  It doesn't sound like it makes a lot of sense (or cents), but it does in my head so I'm going to try it.  Since I'm having to go back to these and scrub/fix data anyway, my aim is to get their cost as low as possible (about a buck per site instead of five).
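To make the trade-off concrete, here is a rough sketch of the arithmetic in Python. It assumes each of the old HITs asked for a single picture, and the `HitLayout` class and its field names are made up for illustration; nothing here touches the MTurk API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HitLayout:
    """Hypothetical description of how the picture HITs are laid out per site."""
    hits_per_site: int      # how many separate HITs get published
    pictures_per_hit: int   # pictures requested inside one HIT (assumed 1 for the old layout)
    workers_per_hit: int    # unique workers each HIT is sent to

    def submissions(self) -> int:
        # Every worker on every HIT produces one submission to moderate.
        return self.hits_per_site * self.workers_per_hit

    def pictures(self) -> int:
        # Total pictures returned across all submissions.
        return self.submissions() * self.pictures_per_hit


old_layout = HitLayout(hits_per_site=4, pictures_per_hit=1, workers_per_hit=3)
new_layout = HitLayout(hits_per_site=1, pictures_per_hit=6, workers_per_hit=2)

for name, layout in (("old", old_layout), ("new", new_layout)):
    print(name, "submissions:", layout.submissions(), "pictures:", layout.pictures())
```

Under that assumption both layouts bring back the same number of pictures, but the new one leaves far fewer submissions to moderate.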

Part of the issue was also having to wait between publishing the pic HITs.  I found that putting out all of them at once tended to degrade the quality: all the pics from one person would be the same for all 4 HITs.  So I had 4x the work and 4x the cost, with the same output.  Additionally, pacing myself is not fun and causes a lot more repetitive busy work.  I hope that this new structure allows me to blast out all the HITs at once and parse all their data the following day.
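The "same pics on every HIT" problem is the kind of thing a small script could flag during moderation. The sketch below is only an illustration: it assumes the results have already been exported into `(worker_id, hit_id, picture_url)` tuples, and the function name, IDs, and URLs are all invented.

```python
from collections import defaultdict


def repeated_pictures(submissions):
    """Map each worker to the pictures they submitted on more than one HIT.

    `submissions` is an iterable of (worker_id, hit_id, picture_url) tuples,
    a hypothetical export format rather than anything MTurk produces as-is.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for worker_id, hit_id, picture_url in submissions:
        seen[worker_id][picture_url].append(hit_id)

    repeats = {}
    for worker_id, by_url in seen.items():
        # Keep only pictures that show up on two or more HITs for this worker.
        dupes = {url: hits for url, hits in by_url.items() if len(hits) > 1}
        if dupes:
            repeats[worker_id] = dupes
    return repeats


# Made-up sample data: worker-a reuses one picture, worker-b does not.
sample = [
    ("worker-a", "hit-1", "http://example.com/one.jpg"),
    ("worker-a", "hit-2", "http://example.com/one.jpg"),
    ("worker-b", "hit-1", "http://example.com/two.jpg"),
    ("worker-b", "hit-2", "http://example.com/three.jpg"),
]

print(repeated_pictures(sample))
```

This only catches exact URL matches; the same image hosted at two addresses would slip through.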

Lastly, I hope to genericize the pics HIT to use for games, movies, and television shows in addition to people.

2 comments:

Fiznicker said...

Sorry for not digging into this myself via your blog, but a quick peruse didn't give me the answer: would you mind elaborating on what you're using MT for?

Neil C. Obremski said...

In the context of this blog, I'm using Mechanical Turk for blurbs on celebrities and groups (bands, comedy troupes, etc.). These are going to be seeds which I hope fans of those things will come in and expand. Currently, the blog that was meant to parallel this has taken on a life of its own: http://fansiter.com/