CinefileTestProcess
From MWCSWiki
Stages
- User decides on a category (do we give them one, or do they pick their own?)
- User produces canonical list
- Category builder iterates through canonical keywords (note: don't evaluate significance at all until we have at least one counterexample)
- Category builder iterates through inclusive keywords (freeze, unless that turns out to not have enough)
- User is presented with "test" movies. (How do we choose these? For now, maybe use naive Bayes to pick the films whose predicted probability is closest to 50%. If there's a "real" prior from the 33%, use it; otherwise, pick your poison. 50%? But the cool Jonathan idea is to choose the movies with the greatest variance across raters: (a) find each rater's variance across all films, (b) normalize each rater's ratings by that variance, (c) find the variance across raters for each film, then sort by decreasing variance and choose the films with the highest variance.)
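
A rough sketch of the two test-movie selection ideas from the last stage, in case it helps pin down the details. Everything here is an assumption rather than settled design: the function names, the dict-based data layout, and the use of population variance are all made up for illustration. In particular, step (b) is read as dividing each rater's ratings by that rater's standard deviation (the square root of the variance); dividing by the variance itself would be the literal alternative.

```python
from statistics import pvariance


def pick_most_uncertain_films(probabilities, n, target=0.5):
    """Pick the n films whose predicted probability is closest to `target`.

    probabilities: dict of film -> probability (hypothetical output of
    whatever naive Bayes classifier ends up being used).
    """
    ranked = sorted(probabilities,
                    key=lambda film: (abs(probabilities[film] - target), film))
    return ranked[:n]


def pick_high_variance_films(ratings, n):
    """Pick the n films that raters disagree on the most.

    ratings: dict of rater -> dict of film -> numeric rating
    (hypothetical layout; raters need not have rated every film).
    """
    # (a) variance across all films, for each rater
    rater_var = {rater: pvariance(list(films.values()))
                 for rater, films in ratings.items()}

    # (b) normalize each rater's ratings.
    # Assumption: scale by the standard deviation, not the raw variance.
    normalized = {}
    for rater, films in ratings.items():
        scale = rater_var[rater] ** 0.5
        if scale == 0:
            continue  # rater gave every film the same rating; skip them
        normalized[rater] = {film: value / scale
                             for film, value in films.items()}

    # (c) variance across raters, for each film
    all_films = set().union(*(films.keys() for films in normalized.values()))
    film_var = {}
    for film in all_films:
        values = [films[film] for films in normalized.values() if film in films]
        if len(values) >= 2:  # need at least two raters to measure disagreement
            film_var[film] = pvariance(values)

    # sort by decreasing variance (ties broken by film name), keep the top n
    ranked = sorted(film_var, key=lambda film: (-film_var[film], film))
    return ranked[:n]


example_ratings = {
    "alice": {"A": 1, "B": 5, "C": 3},
    "bob":   {"A": 5, "B": 5, "C": 1},
    "carol": {"A": 1, "B": 5, "C": 5},
}
top_two = pick_high_variance_films(example_ratings, 2)
```

In the toy data every rater gives film B the same score, so it ranks last, while A and C split the raters. Note that this sketch does not subtract each rater's mean before scaling; whether to do that is an open question the notes above don't settle.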
Parameters
- How many categories do we want each user to go through?
- Do we want the same user to do both keyword TLG and freeword TLG, or will we divide our testers between them?
- What is the threshold for keyword significance?

