Sciweavers

» Extracting Partial Structures from HTML Documents
IPM
2007
13 years 9 months ago
Using structural contexts to compress semistructured text collections
We describe a compression model for semistructured documents, called Structural Contexts Model (SCM), which takes advantage of the context information usually implicit in the stru...
Joaquín Adiego, Gonzalo Navarro, Pablo de l...
PAMI
2006
13 years 9 months ago
Table Detection in Online Ink Notes
In documents, tables are important structured objects that present statistical and relational information. In this paper, we present a robust system which is capable of detecting t...
Zhouchen Lin, Junfeng He, Zhicheng Zhong, Rongrong...
AGENTS
1997
Springer
14 years 1 month ago
A Scalable Comparison-Shopping Agent for the World-Wide Web
The World-Wide-Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics...
Robert B. Doorenbos, Oren Etzioni, Daniel S. Weld
SIGIR
1993
ACM
14 years 1 month ago
A Model of Information Retrieval Based on a Terminological Logic
According to the logical model of Information Retrieval (IR), the task of IR can be described as the extraction, from a given document base, of those documents d that, given a que...
Carlo Meghini, Fabrizio Sebastiani, Umberto Stracc...
ICPR
2010
IEEE
13 years 10 months ago
Images in News
A system called NewsStand is introduced that automatically extracts images from news articles. The system takes RSS feeds of news articles and applies an online clustering algori...
Jagan Sankaranarayanan, Hanan Samet