Sciweavers

368 search results - page 6 / 74
» Template-Based Information Mining from HTML Documents
Sort
View
ISMIS
2003
Springer
14 years 16 days ago
MetaNews: An Information Agent for Gathering News Articles on the Web
This paper presents MetaNews, an information gathering agent for news articles on the Web. MetaNews reads HTML documents from online news sites and extracts article information fro...
Dae-Ki Kang, Joongmin Choi
WEBI
2005
Springer
14 years 24 days ago
Automated Metadata and Instance Extraction from News Web Sites
In this paper, we present automated techniques for extracting metadata instance information by organizing and mining a set of news Web sites. We develop algorithms that detect and...
Srinivas Vadrevu, Saravanakumar Nagarajan, Fatih G...
WWW
2004
ACM
14 years 8 months ago
OntoMiner: bootstrapping ontologies from overlapping domain specific web sites
In this paper, we present automated techniques for bootstrapping and populating specialized domain ontologies by organizing and mining a set of relevant overlapping Web sites prov...
Hasan Davulcu, Srinivas Vadrevu, Saravanakumar Nag...
ICML
2002
IEEE
14 years 8 months ago
Kernels for Semi-Structured Data
Semi-structured data such as XML and HTML is attracting considerable attention. It is important to develop various kinds of data mining techniques that can handle semistructured d...
Hisashi Kashima, Teruo Koyanagi
ICDM
2002
IEEE
162views Data Mining» more  ICDM 2002»
14 years 8 days ago
Recognition of Common Areas in a Web Page Using Visual Information: a possible application in a page classification
Extracting and processing information from web pages is an important task in many areas like constructing search engines, information retrieval, and data mining from the Web. Comm...
Milos Kovacevic, Michelangelo Diligenti, Marco Gor...