The best set of clusters for the query is chosen based on the scores across multiple worker responses. These workers are not the raters who use Google's Quality Rater Guidelines to rate search results. I've never seen those human evaluators given tasks like rating clusters of search results.

Refinement of search result clustering

In addition to rating clusters of search results, the patent tells us that these crowdsourced workers can also suggest changes. During a crowdsourced evaluation, workers may suggest changes through a series of refinement tasks.

Refinement tasks can include:

- Merging two clusters that are too similar
- Deleting a cluster that doesn't seem to fit with the other clusters
- Removing an entity or topic from a cluster
- Removing a specific search term from a cluster
- Moving an entity or search term from one cluster to another

We are also told:

If the proposed refinement meets a consensus threshold for the task, the system may automatically perform the refinement by changing the cluster definition and/or may report the refinement to an expert.

Cluster set test

Each cluster set can represent a different clustering algorithm.
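To make the consensus threshold for refinements a little more concrete, here is a minimal sketch of how such a check could work. The patent does not give a formula or a threshold value, so the function name `refinements_meeting_consensus`, the tuple format for a refinement, and the 0.6 threshold are all assumptions for illustration only.

```python
from collections import Counter


def refinements_meeting_consensus(proposals, num_workers, threshold=0.6):
    """Return refinements proposed by at least `threshold` of the workers.

    `proposals` holds one collection of refinement tuples per worker.
    Each worker is counted at most once per refinement.
    The 0.6 default is an assumed value, not one taken from the patent.
    """
    counts = Counter()
    for worker_proposals in proposals:
        counts.update(set(worker_proposals))
    return sorted(r for r, n in counts.items() if n / num_workers >= threshold)


# Hypothetical responses from four workers on one refinement task.
proposals = [
    {("merge", "cluster_a", "cluster_b")},
    {("merge", "cluster_a", "cluster_b"), ("delete", "cluster_c")},
    {("merge", "cluster_a", "cluster_b")},
    set(),  # this worker suggested no changes
]

accepted = refinements_meeting_consensus(proposals, num_workers=len(proposals))
print(accepted)
```

In this made-up example, three of four workers propose the merge, so it clears the assumed threshold, while the single vote to delete a cluster does not. A real system could then apply the accepted refinement automatically or route it to an expert, as the patent describes.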

- This provides a better user experience for users viewing the results.
- Evaluation and rating are scalable (e.g., they can handle hundreds or thousands of queries) because they rely on crowdsourced tasks rather than experts.
- The system maximizes quality by giving less weight to ratings from crowdsourced workers who do not spend enough time on tasks and/or do not have enough expertise (e.g., familiarity with the queries and search items).
- The system also maximizes quality by randomly presenting different sets of clusters to different workers, to avoid the bias of workers spending more time on the first set presented.
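The two quality safeguards above (giving less weight to rushed or unfamiliar workers, and randomizing which cluster set a worker sees first) can be sketched roughly as follows. This is only an illustration under stated assumptions: the patent does not publish weights or time limits, so the 30-second minimum, the 0.5 penalties, the dictionary fields, and all function names here are hypothetical.

```python
import random
from collections import defaultdict


def response_weight(seconds_spent, familiar, min_seconds=30.0):
    """Give less weight to rushed responses and unfamiliar workers."""
    weight = 1.0
    if seconds_spent < min_seconds:
        weight *= 0.5  # assumed penalty for too little time on the task
    if not familiar:
        weight *= 0.5  # assumed penalty for lacking familiarity with the query
    return weight


def best_cluster_set(responses):
    """Pick the cluster set with the highest weighted mean rating."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for r in responses:
        w = response_weight(r["seconds"], r["familiar"])
        totals[r["cluster_set"]] += w * r["rating"]
        weights[r["cluster_set"]] += w
    return max(totals, key=lambda name: totals[name] / weights[name])


def presentation_order(cluster_sets, rng):
    """Shuffle the cluster sets per worker so none is always shown first."""
    order = list(cluster_sets)
    rng.shuffle(order)
    return order


# Hypothetical worker responses for two competing cluster sets.
responses = [
    {"cluster_set": "A", "rating": 5, "seconds": 5, "familiar": False},
    {"cluster_set": "A", "rating": 3, "seconds": 60, "familiar": True},
    {"cluster_set": "B", "rating": 4, "seconds": 45, "familiar": True},
    {"cluster_set": "B", "rating": 3, "seconds": 50, "familiar": True},
]

best = best_cluster_set(responses)
print(best)
print(presentation_order(["A", "B"], random.Random()))
```

With these invented numbers, set A has the higher plain average (4.0 versus 3.5), but its top rating came from a rushed, unfamiliar worker. Once that response is down-weighted, A drops to 3.4 and set B comes out ahead, which is the kind of correction the patent appears to be aiming for.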

