Large document collections

COLING 2000, Computational Linguistics, 88 views

Experiments in Automated Lexicon Building for Text Searching

15 years 7 months ago

This paper describes experiments in the automatic construction of lexicons that would be useful in searching large document collections for text fragments that address a specific ...

Barry Schiffman, Kathleen McKeown


ACL 2008, Computational Linguistics, 153 views

Pairwise Document Similarity in Large Collections with MapReduce

15 years 7 months ago

This paper presents a MapReduce algorithm for computing pairwise document similarity in large document collections. MapReduce is an attractive framework because it allows us to de...

Tamer Elsayed, Jimmy J. Lin, Douglas W. Oard


CIKM 1997 (Springer), Information Technology, 133 views

The Need for Metrics in Visual Information Analysis

15 years 10 months ago

This paper explores several methods for visualizing the thematic content of large document collections. As opposed to traditional query-driven document retrieval, these methods ...

Nancy Miller, Elizabeth G. Hetzler, Grant Nakamura...


CIKM 2000 (Springer), Information Technology, 109 views

Scalable association-based text classification

15 years 10 months ago

The Naïve Bayes (NB) classifier has long been considered a core methodology in text classification, mainly due to its simplicity and computational efficiency. There is an increasing n...

Dimitris Meretakis, Dimitris Fragoudis, Hongjun Lu...


HICSS 2006 (IEEE), Biometrics, 133 views

Being Literate with Large Document Collections: Observational Studies and Cost Structure Tradeoffs

16 years 2 days ago

How do people work with large document collections? We studied the effects of different kinds of analysis tools on the behavior of people doing rapid large-volume data assessment,...

Daniel M. Russell, Malcolm Slaney, Yan Qu, Mave Ho...


AI 2007 (Springer), Artificial Intelligence, 172 views

Fuzzy Clustering for Topic Analysis and Summarization of Document Collections

16 years 6 days ago

Large document collections, such as those delivered by Internet search engines, are difficult and time-consuming for users to read and analyse. The detection of common an...

René Witte, Sabine Bergler
