It is a well-documented fact that, for human readers, familiar text is more legible than unfamiliar text. Current-generation computer vision systems are also able to exploit some kinds of prior knowledge of linguistic context: for example, many OCR systems can use known lexica (word lists, such as lists of commonly occurring English words) to disambiguate interpretations. Interestingly, human readers can exploit various degrees of familiarity, including that of character strings which, while not found in dictionaries, resemble correctly spelled words: for example, "pronounceable" strings, or strings made up of frequently occurring character n-grams. In contrast, computer vision technologies for exploiting such poorly characterized constraints (absent an explicit, complete lexicon) are not yet well developed. This gap in ability may allow us to design stronger CAPTCHAs. We measure the familiarity of challenge strings generated by four methods (described by Bentley and Mallows) a...