Sciweavers

1670 search results - page 276 / 334
» Using Information Filtering in Web Data Mining Process
KDD
2002
ACM
14 years 9 months ago
Learning domain-independent string transformation weights for high accuracy object identification
The task of object identification occurs when integrating information from multiple websites. The same data objects can exist in inconsistent text formats across sites, making it ...
Sheila Tejada, Craig A. Knoblock, Steven Minton
KDD
2007
ACM
14 years 9 months ago
Joint optimization of wrapper generation and template detection
Many websites have large collections of pages generated dynamically from an underlying structured source like a database. The data of a category are typically encoded into similar...
Shuyi Zheng, Ruihua Song, Ji-Rong Wen, Di Wu
SIGIR
2012
ACM
11 years 11 months ago
Predicting quality flaws in user-generated content: the case of Wikipedia
The detection and improvement of low-quality information is a key concern in Web applications that are based on user-generated content; a popular example is the online encyclopedi...
Maik Anderka, Benno Stein, Nedim Lipka
ADC
2005
Springer
14 years 2 months ago
A Path-based Relational RDF Database
We propose a path-based scheme for storage and retrieval of RDF data using a relational database. The Semantic Web is much anticipated as the next-generation web where high-level p...
Akiyoshi Matono, Toshiyuki Amagasa, Masatoshi Yosh...
DOCENG
2009
ACM
14 years 3 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan