The present paper considers the effects of introducing inaccuracies into a learner's environment in Gold's model of identification in the limit. Three kinds of inaccuracies are considered: the presence of spurious data is modeled as learning from a noisy environment, missing data is modeled as learning from an incomplete environment, and a mixture of both spurious and missing data is modeled as learning from an imperfect environment. Two learning domains are considered, namely, identification of programs from graphs of computable functions and identification of grammars from positive data for recursively enumerable languages. Numerous hierarchies and tradeoffs are derived from the interplay between the number of errors allowed in the final hypotheses, the number of inaccuracies in the data, the type of inaccuracies, and the success criterion. An interesting result is that in the context of function learning, incomplete data is strictly worse for ...
Mark A. Fulk, Sanjay Jain
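
For orientation, a minimal sketch of how bounded inaccuracies of the three kinds are usually formalized in the language-learning setting; the notation $\mathrm{content}(T)$ and the exact form of the definitions are assumed here (following common inductive-inference conventions) rather than quoted from the paper, and the function-learning case is analogous with $L$ replaced by the graph of a computable function:

\[
\begin{aligned}
&\text{$T$ is an $n$-noisy text for $L$:} &&
  \mathrm{content}(T) \supseteq L \ \text{and}\ \lvert \mathrm{content}(T) \setminus L \rvert \le n,\\
&\text{$T$ is an $n$-incomplete text for $L$:} &&
  \mathrm{content}(T) \subseteq L \ \text{and}\ \lvert L \setminus \mathrm{content}(T) \rvert \le n,\\
&\text{$T$ is an $n$-imperfect text for $L$:} &&
  \lvert \mathrm{content}(T) \,\triangle\, L \rvert \le n.
\end{aligned}
\]

Under these (assumed) definitions, spurious data enlarges $\mathrm{content}(T)$ beyond $L$, missing data shrinks it, and the imperfect case bounds the total symmetric difference.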