Sciweavers

543 search results - page 22 / 109
» Exploiting content redundancy for web information extraction
ECIR
2009
Springer
13 years 6 months ago
PathRank: Web Page Retrieval with Navigation Path
Abstract. This paper describes a path-based method to use the multi-step navigation information discovered from website structures for web page ranking. Use of hyperlinks to enhanc...
Jianqiang Li, Yu Zhao 0002
ECAI
2006
Springer
14 years 13 days ago
Identifying Inter-Domain Similarities Through Content-Based Analysis of Hierarchical Web-Directories
Providing accurate personalized information services to the users requires knowing their interests and needs, as defined by their User Models (UMs). Since the quality of the person...
Shlomo Berkovsky, Dan Goldwasser, Tsvi Kuflik, Fra...
AUSAI
2003
Springer
14 years 2 months ago
Information Extraction via Path Merging
Abstract. In this paper, we describe a new approach to information extraction that neatly integrates top-down, hypothesis-driven information with bottom-up, data-driven information. ...
Robert Dale, Cécile Paris, Marc Tilbrook
WSDM
2012
ACM
12 years 4 months ago
Selecting actions for resource-bounded information extraction using reinforcement learning
Given a database with missing or uncertain content, our goal is to correct and fill the database by extracting specific information from a large corpus such as the Web, and to d...
Pallika H. Kanani, Andrew K. McCallum
WWW
2007
ACM
14 years 9 months ago
Organizing and searching the world wide web of facts -- step two: harnessing the wisdom of the crowds
As part of a large effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly sup...
Marius Pasca