Sciweavers

1127 search results - page 45 / 226
» Web-scale extraction of structured data
ACL
2012
13 years 5 months ago
Movie-DiC: a Movie Dialogue Corpus for Research and Development
This paper describes Movie-DiC, a Movie Dialogue Corpus recently collected for research and development purposes. The collected dataset comprises 132,229 dialogues containing a tot...
Rafael E. Banchs
VLDB
2007
ACM
15 years 8 months ago
Building Structured Web Community Portals: A Top-Down, Compositional, and Incremental Approach
Structured community portals extract and integrate information from raw Web pages to present a unified view of entities and relationships in the community. In this paper we argue...
Pedro DeRose, Warren Shen, Fei Chen 0002, AnHai Do...
ICDAR
2003
IEEE
15 years 7 months ago
Document Transformation System from Papers to XML Data Based on Pivot XML Document Method
This paper proposes a new method for document transformation using OCR to generate various XML documents from printed documents. The proposed method adopts a hierarchical transfor...
Yasuto Ishitani
ICCV
2011
IEEE
14 years 2 months ago
Latent Low-Rank Representation for Subspace Segmentation and Feature Extraction
Low-Rank Representation (LRR) [16, 17] is an effective method for exploring the multiple subspace structures of data. Usually, the observed data matrix itself is chosen as the dic...
Guangcan Liu, Shuicheng Yan
MKM
2004
Springer
15 years 8 months ago
A Graph-Based Approach Towards Discerning Inherent Structures in a Digital Library of Formal Mathematics
As the amount of online formal mathematical content grows, for example through active efforts such as the Mathweb [21], MOWGLI [4], Formal Digital Library, or FDL [1], and others, ...
Lori Lorigo, Jon M. Kleinberg, Richard Eaton, Robe...