Sciweavers

» Scalable Attribute-Value Extraction from Semi-structured Tex...
DL
2000
Springer
Snowball: extracting relations from large plain-text collections
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use...
Eugene Agichtein, Luis Gravano
EDBT
2009
ACM
High-performance information extraction with AliBaba
A wealth of information is available only in web pages, patents, publications etc. Extracting information from such sources is challenging, both due to the typically complex langu...
Peter Palaga, Long Nguyen, Ulf Leser, Jörg Ha...
PAKDD
2009
ACM
Scalable Web Mining with Newistic
Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Ovidiu Dan, Horatiu Mocian
DL
2000
Springer
Scalable browsing for large collections: a case study
Phrase browsing techniques use phrases extracted automatically from a large information collection as a basis for browsing and accessing it. This paper describes a case study that...
Gordon W. Paynter, Ian H. Witten, Sally Jo Cunning...
CLEF
2010
Springer
A Textual-Based Similarity Approach for Efficient and Scalable External Plagiarism Analysis - Lab Report for PAN at CLEF 2010
In this paper we present an approach to detect external plagiarism based on textual similarity. This is an efficient and precise method that can be applied over large sets of docum...
Daniel Micol, Óscar Ferrández, Ferna...