Research in natural language generation promises significant advances in how the contents of underlying information sources can be made available. Most work in the field relies on carefully constructed artificial intelligence knowledge bases; in reality, however, most information currently stored on computers is not represented in this form. In this paper, we describe work in progress in which we attempt to generate large numbers of texts automatically from existing databases. We focus in particular on the automatic generation of descriptions of objects stored in a museum database, highlighting the difficulties that arise in using a real data source and pointing to some possible solutions.
Robert Dale, Stephen J. Green, Maria Milosavljevic