Sciweavers

832 search results - page 106 / 167
» Document clustering with committees
DEXAW
2008
IEEE
14 years 3 months ago
Text Extraction from the Web via Text-to-Tag Ratio
We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...
Tim Weninger, William H. Hsu
IRAL
2003
ACM
14 years 2 months ago
A practical text summarizer by paragraph extraction for Thai
In this paper, we propose a practical approach for extracting the most relevant paragraphs from the original document to form a summary for Thai text. The idea of our approach is ...
Chuleerat Jaruskulchai, Canasai Kruengkrai
NLDB
2010
Springer
14 years 1 month ago
Semantic Content Access Using Domain-Independent NLP Ontologies
We present a lightweight, user-centred approach for document navigation and analysis that is based on an ontology of text mining results. This allows us to bring the result of exis...
René Witte, Ralf Krestel
KI
2008
Springer
13 years 9 months ago
Interactive Dynamic Information Extraction
The IDEX system is a prototype of an interactive dynamic Information Extraction (IE) system. A user of the system expresses an information request for a topic description which is ...
Kathrin Eichler, Holmer Hemsen, Markus Löckel...
CLEF
2010
Springer
13 years 10 months ago
MapReduce for Information Retrieval Evaluation: "Let's Quickly Test This on 12 TB of Data"
We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use ...
Djoerd Hiemstra, Claudia Hauff