Sciweavers

1127 search results - page 45 / 226
» Web-scale extraction of structured data
ACL
2012
11 years 11 months ago
Movie-DiC: a Movie Dialogue Corpus for Research and Development
This paper describes Movie-DiC, a Movie Dialogue Corpus recently collected for research and development purposes. The collected dataset comprises 132,229 dialogues containing a tot...
Rafael E. Banchs
VLDB
2007
ACM
14 years 2 months ago
Building Structured Web Community Portals: A Top-Down, Compositional, and Incremental Approach
Structured community portals extract and integrate information from raw Web pages to present a unified view of entities and relationships in the community. In this paper we argue...
Pedro DeRose, Warren Shen, Fei Chen 0002, AnHai Do...
ICDAR
2003
IEEE
14 years 2 months ago
Document Transformation System from Papers to XML Data Based on Pivot XML Document Method
This paper proposes a new method for document transformation using OCR to generate various XML documents from printed documents. The proposed method adopts a hierarchical transfor...
Yasuto Ishitani
ICCV
2011
IEEE
12 years 8 months ago
Latent Low-Rank Representation for Subspace Segmentation and Feature Extraction
Low-Rank Representation (LRR) [16, 17] is an effective method for exploring the multiple subspace structures of data. Usually, the observed data matrix itself is chosen as the dic...
Guangcan Liu, Shuicheng Yan
MKM
2004
Springer
14 years 2 months ago
A Graph-Based Approach Towards Discerning Inherent Structures in a Digital Library of Formal Mathematics
As the amount of online formal mathematical content grows, for example through active efforts such as the Mathweb [21], MOWGLI [4], Formal Digital Library, or FDL [1], and others, ...
Lori Lorigo, Jon M. Kleinberg, Richard Eaton, Robe...