Our aim is to develop new database technologies for the approximate matching of unstructured string data using indexes. We explore the potential of the suffix tree data structure i...
The discipline of narratology has long recognized the need to classify documents as instances of different text types. We have discovered that classification is as applicable to h...
This paper describes an application of IR and text categorization methods to a highly practical problem in biomedicine, specifically, Gene Ontology (GO) annotation. GO annotation...
The amount of available Thai broadcast news transcribed text for training a language model is still very limited compared to other major languages. Since the construction of a b...
One of the challenges of music information retrieval is the automatic extraction of effective content descriptors of music documents, which can be used at indexing and at retrieva...