We discuss important factors in the design of evaluation studies for systems that generate animations of American Sign Language (ASL) sentences. In particular, we outline how some cultural and linguistic characteristics of members of the American Deaf community must be taken into account so as to ensure the accuracy of evaluations involving these users. Finally, we describe our implementation and user-based evaluation (by native ASL signers) of a prototype ASL generator to produce sentences containing classifier predicates, frequent and complex spatial phenomena that previous ASL generators have not produced.

Categories and Subject Descriptors
I.2.7 [Artificial Intelligence]: Natural Language Processing
Matt Huenerfauth, Liming Zhao, Erdan Gu, Jan M. Al