No integrated technology yet exists for making effective use of the vast, heterogeneous body of multilingual and multimedia digital content. The need is being express...
Information extraction systems are increasingly being used to mine structured information from unstructured text documents. A commonly used unsupervised technique is to build iter...
A bibliography is traditionally characterized by the judgments, bounded by explicit selection criteria, made by a single compiler. Because these criteria concern the attributes as...
David G. Hendry, J. R. Jenkins, Joseph F. McCarthy
Ubimedia is a concept where media files are embedded in everyday objects and the environment. We propose an approach where the user can read and write these files with his/her pe...
Duplicate URLs cause serious trouble across the whole search-engine pipeline, from crawling and indexing to result serving. URL normalization aims to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
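To make the idea of URL normalization in the abstract above concrete, here is a minimal sketch of purely syntactic normalization using Python's standard library. It is not the method of the paper, whose abstract is truncated here; it only illustrates how several duplicate URLs can be mapped to one canonical form. The function name `normalize_url` and the `IGNORED_PARAMS` set are hypothetical choices made for this illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Default ports that can be dropped without changing the resource.
DEFAULT_PORTS = {"http": 80, "https": 443}

# Hypothetical query parameters assumed not to affect page content.
# A real system would have to learn or configure these per site.
IGNORED_PARAMS = frozenset({"sessionid", "utm_source"})


def normalize_url(url: str) -> str:
    """Map syntactically different duplicates to one canonical URL.

    Simplifications: userinfo is dropped and IPv6 literals are not handled.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port

    # Keep the port only when it is not the default for the scheme.
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"

    # Drop ignored parameters and sort the rest so order does not matter.
    query_pairs = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )

    path = parts.path or "/"
    # The fragment is discarded because it is never sent to the server.
    return urlunsplit((scheme, netloc, path, urlencode(query_pairs), ""))


if __name__ == "__main__":
    print(normalize_url("HTTP://Example.com:80/a?b=2&a=1#frag"))
    print(normalize_url("http://example.com/a?a=1&b=2&sessionid=xyz"))
```

Both calls in the usage example yield the same string, so a crawler or indexer keyed on the normalized form would treat them as one page. Hand-written rules like these do not generalize across sites, which is why learning such rules from data is an attractive alternative.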