Sciweavers

» Entity categorization over large document collections
INFOVIS
2000
IEEE
13 years 12 months ago
ThemeRiver: Visualizing Theme Changes over Time
ThemeRiver™ is a prototype system that visualizes thematic variations over time within a large collection of documents. The “river” flows from left to right through time, ch...
Susan Havre, Elizabeth G. Hetzler, Lucy T. Nowell
ACL
2009
13 years 5 months ago
Profile Based Cross-Document Coreference Using Kernelized Fuzzy Relational Clustering
Coreferencing entities across documents in a large corpus enables advanced document understanding tasks such as question answering. This paper presents a novel cross document core...
Jian Huang 0002, Sarah M. Taylor, Jonathan L. Smit...
CIKM
2010
Springer
13 years 6 months ago
Improved index compression techniques for versioned document collections
Current Information Retrieval systems use inverted index structures for efficient query processing. Due to the extremely large size of many data sets, these index structures are u...
Jinru He, Junyuan Zeng, Torsten Suel
VISSOFT
2005
IEEE
14 years 1 month ago
Fractal Figures: Visualizing Development Effort for CVS Entities
Versioning systems such as CVS or Subversion exhibit a large potential to investigate the evolution of software systems. They are used to record the development steps of software ...
Marco D'Ambros, Michele Lanza, Harald Gall
ICML
2004
IEEE
14 years 8 months ago
Text categorization with many redundant features: using aggressive feature selection to make SVMs competitive with C4.5
Text categorization algorithms usually represent documents as bags of words and consequently have to deal with huge numbers of features. Most previous studies found that the major...
Evgeniy Gabrilovich, Shaul Markovitch