Sciweavers

ECAI
2000
Springer
14 years 3 months ago
Background Knowledge, Indexing and Matching Interdependencies of Document Management and Ontology-Maintenance
This position paper presents an algorithm that determines similarities between text documents. These text documents are indexed with keywords and further background knowledge-ter...
Andreas Faatz, Thomas Kamps, Ralf Steinmetz
HT
2003
ACM
14 years 4 months ago
Untangling compound documents on the web
Most text analysis is designed to deal with the concept of a “document”, namely a cohesive presentation of thought on a unifying subject. By contrast, individual nodes on the ...
Nadav Eiron, Kevin S. McCurley
SPIRE
2004
Springer
14 years 4 months ago
Indexing Text Documents Based on Topic Identification
This work provides algorithms and heuristics to index text documents by determining important topics in the documents. To index text documents, the work provides algorithms to gene...
Manonton Butarbutar, Susan McRoy
ISI
2004
Springer
14 years 4 months ago
Generating Concept Hierarchies from Text for Intelligence Analysis
It is important to automatically extract key information from sensitive text documents for intelligence analysis. Text documents are usually unstructured and information extraction...
Jenq-Haur Wang, Chien-Chung Huang, Jei-Wen Teng, L...
CIKM
2005
ACM
14 years 5 months ago
Inferring document similarity from hyperlinks
Assessing semantic similarity between text documents is a crucial aspect in Information Retrieval systems. In this work, we propose to use hyperlink information to derive a simila...
David Grangier, Samy Bengio
ICMCS
2005
IEEE
14 years 5 months ago
Protocols for data-hiding based text document security and automatic processing
Text documents, in electronic and hardcopy forms, are and will probably remain the most widely used kind of content in our digital age. The goal of this paper is to overview proto...
Frédéric Deguillaume, Yuriy Rytsar, ...
ICDAR
2005
IEEE
14 years 5 months ago
A Model for Detecting and Merging Vertically Spanned Table Cells in Plain Text Documents
A spanned cell in a table is a single, complete unit that physically occupies multiple columns and/or multiple rows. Spanned cells are common in tables, and they are a significan...
Vanessa Long, Robert Dale, Steve Cassidy
AH
2008
Springer
14 years 5 months ago
Collection Browsing through Automatic Hierarchical Tagging
In order to navigate huge document collections efficiently, tagged hierarchical structures can be used. For users, it is important to correctly interpret tag combinations. In this ...
Korinna Bade, Marcel Hermkes
KDD
2007
ACM
14 years 12 months ago
Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases
We now have incrementally-grown databases of text documents ranging back for over a decade in areas ranging from personal email, to news-articles and conference proceedings. While...
Benyah Shaparenko, Thorsten Joachims
ICML
2005
ACM
15 years 7 days ago
Hierarchical Dirichlet model for document classification
The proliferation of text documents on the web as well as within institutions necessitates their convenient organization to enable efficient retrieval of information. Although tex...
Sriharsha Veeramachaneni, Diego Sona, Paolo Avesan...