Sciweavers

468 search results - page 87 / 94
» Automatic Data Extraction from Data-Rich Web Pages
IUI
2009
ACM
14 years 4 months ago
Using salience to segment desktop activity into projects
Knowledge workers must manage large numbers of simultaneous, ongoing projects that collectively involve huge numbers of resources (documents, emails, web pages, calendar items, et...
Daniel Lowd, Nicholas Kushmerick
AND
2009
13 years 5 months ago
Digital weight watching: reconstruction of scanned documents
A web-portal providing access to over 250,000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...
Tim Gielissen, Maarten Marx
VLDB
2005
ACM
14 years 1 month ago
Discovering Large Dense Subgraphs in Massive Graphs
We present a new algorithm for finding large, dense subgraphs in massive graphs. Our algorithm is based on a recursive application of fingerprinting via shingles, and is extreme...
David Gibson, Ravi Kumar, Andrew Tomkins
TREC
2008
13 years 9 months ago
UTDallas at TREC 2008 Blog Track
This paper describes our participation in the 2008 TREC Blog track. Our system consists of 3 components: data preprocessing, topic retrieval, and opinion finding. In the topic ret...
Bin Li, Feifan Liu, Yang Liu
ICDE
2008
IEEE
14 years 9 months ago
Building Community Wikipedias: A Machine-Human Partnership Approach
The rapid growth of Web communities has motivated many solutions for building community data portals. These solutions follow roughly two approaches. The first approach (...
Pedro DeRose, Xiaoyong Chai, Byron J. Gao, Warren ...