Sciweavers

910 search results - page 78 / 182
» Testbed for information extraction from deep web
Sort
View
CIKM
2006
Springer
13 years 12 months ago
A fast and robust method for web page template detection and removal
The widespread use of templates on the Web is considered harmful for two main reasons. Not only do they compromise the relevance judgment of many web IR and web mining methods suc...
Karane Vieira, Altigran Soares da Silva, Nick Pint...
SIGIR
2002
ACM
13 years 7 months ago
Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering
A novel method for simultaneous keyphrase extraction and generic text summarization is proposed by modeling text documents as weighted undirected and weighted bipartite graphs. Sp...
Hongyuan Zha
AUSAI
2003
Springer
14 years 1 months ago
Semi-Automatic Construction of Metadata from a Series of Web Documents
Metadata plays an important role in discovering, collecting, extracting and aggregating Web data. This paper proposes a method of constructing metadata for a specific topic. The m...
Sachio Hirokawa, Eisuke Itoh, Tetsuhiro Miyahara
IJCAI
2003
13 years 9 months ago
Trainability: Developing a responsive learning system
In this paper, we describe the lessons we learned in developing AgentBuilder, a commercial system for rapidly creating agents that extract information from web sites. AgentBuilder...
Steven Minton, Sorinel I. Ticrea, Jennifer Beach
AAAI
2000
13 years 9 months ago
Learning the Common Structure of Data
The proliferation of online information sources has accentuated the need for tools that automatically validate and recognize data. We present an efficient algorithm that learns st...
Kristina Lerman, Steven Minton