Sciweavers

DEEC
2006
IEEE
14 years 1 month ago
Maintaining Web Navigation Flows for Wrappers
A substantial subset of the web data follows some kind of underlying structure. In order to let software programs gain full benefit from these “semistructured” web sources, wra...
Juan Raposo, Manuel Álvarez, José Lo...
AIPRF
2007
13 years 9 months ago
Evaluation of Different Approaches to Training a Genre Classifier
This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected ba...
Vedrana Vidulin, Mitja Lustrek, Matjaz Gams
AGENTS
1997
Springer
13 years 11 months ago
A Scalable Comparison-Shopping Agent for the World-Wide Web
The World-Wide-Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics...
Robert B. Doorenbos, Oren Etzioni, Daniel S. Weld
WWW
2008
ACM
14 years 8 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
IJDMMM
2008
13 years 7 months ago
Completing missing views for multiple sources of web media
Combining multiple data sources, each with its own features, to achieve optimal inference has received a lot of attention in recent years. In inference from multiple data sources...
Shankara B. Subramanya, Zheshen Wang, Baoxin Li, H...