Sciweavers

106 search results - page 21 / 22
» Document Representation and Dimension Reduction for Text Clu...
SOCIALCOM
2010
13 years 5 months ago
Opinion Summarization in Bengali: A Theme Network Model
Theme network is a semantic network of document specific themes. So far Natural Language Processing (NLP) research patronized much of topic based summarizer system, unable to captu...
Amitava Das, Sivaji Bandyopadhyay
ICDE
2010
IEEE
14 years 7 months ago
WikiAnalytics: Ad-hoc Querying of Highly Heterogeneous Structured Data
Searching and extracting meaningful information out of highly heterogeneous datasets is a hot topic that received a lot of attention. However, the existing solutions are based on e...
Andrey Balmin, Emiran Curtmola
WSDM
2010
ACM
14 years 2 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
NAACL
2003
13 years 9 months ago
Monolingual and Bilingual Concept Visualization from Corpora
e by placing terms in an abstract ‘information space’ based on their occurrences in text corpora, and then allowing a user to visualize local regions of this information space....
Dominic Widdows, Scott Cederberg
KDD
2005
ACM
14 years 8 months ago
Web object indexing using domain knowledge
Web object is defined to represent any meaningful object embedded in web pages (e.g. images, music) or pointed to by hyperlinks (e.g. downloadable files). Users usually search for...
Muyuan Wang, Zhiwei Li, Lie Lu, Wei-Ying Ma, Naiya...