In this paper we present two experiments conducted to compare different language identification algorithms. Short-word-, frequent-word-, and n-gram-based approaches are co...
Lena Grothe, Ernesto William De Luca, Andreas Nü...
This paper describes our novel retrieval model that is based on contexts of query terms in documents (i.e., document contexts). Our model is novel because it explicitly takes into...
Ho Chung Wu, Robert W. P. Luk, Kam-Fai Wong, K. L....
Compound document images contain graphic or textual content along with pictures. They are a very common form of document, found in magazines, brochures, websites, etc. We focus ...
This paper studies five strategies for storing XML documents, including one that leaves documents in the file system, three that use a relational database system, and one that uses...
Feng Tian, David J. DeWitt, Jianjun Chen, Chun Zha...
This paper discusses aspects of the redocumentation of legacy systems and proposes a model-oriented approach to generating documentation, which is to produce models from existing ...