Although documents have hundreds of thousands of unique words, only a small number of words are significantly useful for intelligent services. For this reason, feature extraction ...
Document-centric XML is a mixture of text and structure. With the increased availability of document-centric XML content comes a need for query facilities in which both structural...
Jaap Kamps, Maarten Marx, Maarten de Rijke, Bö...
This paper addresses the alignment issue in the framework of exploitation of large bi-/multilingual corpora for translation purposes. A generic alignment scheme is proposed that can...
An increasing amount of heterogeneous information about scientific research is becoming available on-line. This potentially allows users to explore the information from multiple p...
Social network content is not limited to text but also includes multimedia. Dailymotion, YouTube, and MySpace are examples of successful sites which allow users to share videos among the...
Janice Kwan-Wai Leung, Chun Hung Li, Ting Keung Ip