Search Sciweavers | Sciweavers

» Document clustering using word clusters via the information ...

IJCAI
2007

Artificial Intelligence

Semantic Smoothing of Document Models for Agglomerative Clustering

13 years 9 months ago

Download www.ischool.drexel.edu

In this paper, we argue that the agglomerative clustering with vector cosine similarity measure performs poorly due to two reasons. First, the nearest neighbors of a document belo...

Xiaohua Zhou, Xiaodan Zhang, Xiaohua Hu

ICDE
2007
IEEE

Database

Document Representation and Dimension Reduction for Text Clustering

14 years 1 month ago

Download torch.cs.dal.ca

Increasingly large text datasets and the high dimensionality associated with natural language create a great challenge in text mining. In this research, a systematic study is cond...

M. Mahdi Shafiei, Singer Wang, Roger Zhang, Evange...

CLEF
2011
Springer

Information Technology

A Language-Independent Approach to Identify the Named Entities in Under-Resourced Languages and Clustering Multilingual Documents

12 years 7 months ago

Download web2py.iiit.ac.in

Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, organiza...

N. Kiran Kumar, G. S. K. Santosh, Vasudeva Varma

ISI
2007
Springer

Security Privacy

DOTS: Detection of Off-Topic Search via Result Clustering

14 years 1 month ago

Download www.ir.iit.edu

Often document dissemination is limited to a “need to know” basis so as to better maintain organizational trade secrets. Retrieving documents that are off-topic to a user...

Nazli Goharian, Alana Platt

WWW
2010
ACM

Internet Technology

CETR: content extraction via tag ratios

14 years 2 months ago

Download www.cs.illinois.edu

We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...

Tim Weninger, William H. Hsu, Jiawei Han
