Sciweavers

684 search results - page 25 / 137
» Extracting semantic structure of web documents using content...
SIGIR
2003
ACM
14 years 1 month ago
Text categorization by boosting automatically extracted concepts
Term-based representations of documents have found widespread use in information retrieval. However, one of the main shortcomings of such methods is that they largely disregard le...
Lijuan Cai, Thomas Hofmann
AINA
2009
IEEE
14 years 3 months ago
Learning to Extract Content from News Webpages
We consider the problem of content extraction from online news webpages. To explore to what extent the syntactic markup and the visual structure of a webpage facilitate the extrac...
Alex Spengler, Patrick Gallinari
SIGIR
1993
ACM
14 years 23 days ago
A Model of Information Retrieval Based on a Terminological Logic
According to the logical model of Information Retrieval (IR), the task of IR can be described as the extraction, from a given document base, of those documents d that, given a que...
Carlo Meghini, Fabrizio Sebastiani, Umberto Stracc...
EPIA
2003
Springer
14 years 1 month ago
A Methodology to Create Ontology-Based Information Retrieval Systems
Modern information retrieval systems need the capability to reason about the knowledge conveyed by text bases. In this paper a methodology to automatically create ontologies and cl...
José Saias, Paulo Quaresma
EP
1998
Springer
14 years 27 days ago
Measuring Structural Similarity Among Web Documents: Preliminary Results
When we describe a Web page informally, we often use phrases like "it looks like a newspaper site", "there are several unordered lists" or "it's just a collection of li...
Isabel F. Cruz, Slava Borisov, Michael A. Marks, T...