Sciweavers

1328 search results - page 127 / 266
» Stacked Generalization for Information Extraction
WWW
2005
ACM
14 years 8 months ago
Thresher: automating the unwrapping of semantic content from the World Wide Web
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Andrew Hogue, David R. Karger
WWW
2005
ACM
14 years 8 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
DOLAP
2000
ACM
14 years 13 days ago
Comparing Nested GPSJ Queries in Multidimensional Databases
A multidimensional database can be seen as a collection of multidimensional cubes, from which information is usually extracted by aggregation; aggregated data can be calculated ei...
Matteo Golfarelli, Stefano Rizzi
CLEF
2008
Springer
13 years 10 months ago
Assessing the Impact of Thesaurus-Based Expansion Techniques in QA-Centric IR
In this paper, we assess the impact of using thesaurus-based query expansion methods, at the Information Retrieval (IR) stage of a Question Answering (QA) system. We focus on expan...
Luís Sarmento, Jorge Teixeira, Eugén...
HIS
2003
13 years 9 months ago
Data Mining of Web Access Logs From an Academic Web Site
We have used a general purpose data mining tool to determine whether we can find any ‘golden nuggets’ in the web access logs of a large academic web site. Our goal was to use...
Victor Ciesielski, A. Lalani