. When publishing documents on the web, the user needs to describe and classify her documents for the benefit of later retrieval and use. This paper presents an approach to semanti...
Structured document content reuse is the problem of restructuring and translating data structured under a source schema into an instance of a target schema. A notion closely tied ...
Developer documentation helps developers learn frameworks and libraries. To better understand how documentation in open source projects is created and maintained, we performed a q...
In order to reduce the rejection rate of our automatic reading system, we propose to pre-classify the business documents by introducing an Automatic Recognition of Documents stage...
Clustering separates unrelated documents and groups related documents, and is useful for discrimination, disambiguation, summarization, organization, and navigation of unstructure...