Sciweavers

IRFC
2011
Springer
13 years 2 months ago
Multilingual Document Clustering Using Wikipedia as External Knowledge
This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been highly exploited in many monolingual...
N. Kiran Kumar, G. S. K. Santosh, Vasudeva Varma
IPM
2008
13 years 11 months ago
Effectiveness of additional representations for the search result presentation on the web
The presentation of search results on the web has been dominated by the textual form of document representation. On the other hand, the document's visual aspects such as the ...
Hideo Joho, Joemon M. Jose
ECIR
2003
Springer
14 years 25 days ago
Taming Wild Phrases
In this paper the suitability of different document representations for automatic document classification is compared, investigating a whole range of representations be...
Cornelis H. A. Koster, Marc Seutter
RIAO
2007
14 years 26 days ago
Effectiveness of Rich Document Representation in XML Retrieval
Information Retrieval (IR) systems are built with different goals in mind. Some IR systems target high precision, that is, to have more relevant documents on the first page of their...
Fahimeh Raja, Mostafa Keikha, Maseud Rahgozar, Far...
CIKM
2006
Springer
14 years 3 months ago
Representing documents with named entities for story link detection (SLD)
Several information organization, access, and filtering systems can benefit from different kinds of document representations than those used in traditional Information Retrieval (I...
Chirag Shah, W. Bruce Croft, David Jensen
ICDAR
2003
IEEE
14 years 4 months ago
Classification of Web Documents Using a Graph Model
In this paper we describe work relating to classification of web documents using a graph-based model instead of the traditional vector-based model for document representation. We ...
Adam Schenker, Mark Last, Horst Bunke, Abraham Kan...
SIGIR
2004
ACM
14 years 4 months ago
Locality preserving indexing for document representation
Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Index...
Xiaofei He, Deng Cai, Haifeng Liu, Wei-Ying Ma
CIS
2005
Springer
14 years 5 months ago
Concept Chain Based Text Clustering
Different from familiar clustering objects, text documents have sparse data spaces. A common way of representing a document is as a bag of its component words, but the semantic re...
Shaoxu Song, Jian Zhang, Chunping Li
KDD
2009
ACM
14 years 12 months ago
Exploiting Wikipedia as external knowledge for document clustering
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...