Sciweavers

232 search results - page 41 / 47
» Query-related data extraction of hidden web documents
ICDE
2003
IEEE
14 years 9 months ago
Super-Fast XML Wrapper Generation in DB2: A Demonstration
The XML Wrapper is a new feature of the federated database capabilities of DB2/UDB v8. It enables users and applications to issue SQL queries against XML data from a variety of so...
Vanja Josifovski, Sabine Massmann, Felix Naumann
WWW
2005
ACM
14 years 8 months ago
A search engine for natural language applications
Many modern natural language-processing applications utilize search engines to locate large numbers of Web documents or to compute statistics over the Web corpus. Yet Web search e...
Michael J. Cafarella, Oren Etzioni
WSDM
2010
ACM
14 years 5 months ago
Boilerplate Detection using Shallow Text Features
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...
Christian Kohlschütter, Peter Fankhauser, Wol...
COLING
2010
13 years 2 months ago
Efficient Statement Identification for Automatic Market Forecasting
Strategic business decision making involves the analysis of market forecasts. Today, the identification and aggregation of relevant market statements is done by human experts, oft...
Henning Wachsmuth, Peter Prettenhofer, Benno Stein
MKM
2004
Springer
14 years 29 days ago
A Graph-Based Approach Towards Discerning Inherent Structures in a Digital Library of Formal Mathematics
As the amount of online formal mathematical content grows, for example through active efforts such as the Mathweb [21], MOWGLI [4], Formal Digital Library, or FDL [1], and others, ...
Lori Lorigo, Jon M. Kleinberg, Richard Eaton, Robe...