Sciweavers

» Extraction of Structural Information from the Web
WWW
2010
ACM
15 years 11 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
SIGMOD
2006
ACM
16 years 4 months ago
Avatar semantic search: a database approach to information retrieval
We present Avatar Semantic Search, a prototype search engine that exploits annotations in the context of classical keyword search. The process of annotations is accomplished offli...
Eser Kandogan, Rajasekar Krishnamurthy, Sriram Rag...
PRIS
2004
15 years 5 months ago
Learning Text Extraction Rules, without Ignoring Stop Words
Information Extraction (IE) from text/web documents has become an important application area of AI. As the number of web sites and documents has grown dramatically, the users need...
João Cordeiro, Pavel Brazdil
WISE
2002
Springer
15 years 9 months ago
Querying Web Data - The WebQA Approach
The common paradigm of searching and retrieving information on the Web is based on keyword-based search using one or more search engines, and then browsing through the large numbe...
Sunny K. S. Lam, M. Tamer Özsu
EMNLP
2007
15 years 6 months ago
Large-Scale Named Entity Disambiguation Based on Wikipedia Data
This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and ...
Silviu Cucerzan