|Addresses: www mail|
Computer generation of cloze tasks still falls short of full automation; most current systems are used by teachers as authoring aids. Improved methods to estimate cloze quality are needed for full automation. We investigated lexical reading difficulty as a novel automatic estimator of cloze quality, to which cooccurrence frequency of words was compared as an alternate estimator. Rather than relying on expert evaluation of cloze quality, we submitted open cloze tasks to workers on Amazon Mechanical Turk (AMT) and discuss ways to measure of the results of these tasks. Results show one statistically significant correlation between the above measures and estimators, which was lexical co-occurrence and Cloze Easiness. Reading difficulty was not found to correlate significantly. We gave subsets of cloze sentences to an English teacher as a gold standard. Sentences selected by co-occurrence and Cloze Easiness were ranked most highly, corroborating the evidence from AMT.