Sciweavers

» Mining Very Large Databases
KDD
2012
ACM
Searching and mining trillions of time series subsequences under dynamic time warping
Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series d...
Thanawin Rakthanmanon, Bilson J. L. Campana, Abdul...
ICDE
2009
IEEE
Large-Scale Deduplication with Constraints Using Dedupalog
We present a declarative framework for collective deduplication of entity references in the presence of constraints. Constraints occur naturally in many data cleaning domains and c...
Arvind Arasu, Christopher Ré, Dan Suciu
EUSFLAT
2007
SPoID: Do Not Throw Meaningful Incomplete Sequences Away!
Industrial databases often contain a large amount of unfilled information. During the knowledge discovery process one processing step is often necessary in order to remove these ...
Céline Fiot, Anne Laurent, Maguelonne Teiss...
BIRTHDAY
2005
Springer
Toward Automated Large-Scale Information Integration and Discovery
The high cost of data consolidation is the key market inhibitor to the adoption of traditional information integration and data warehousing solutions. In this paper, we outline a n...
Paul Brown, Peter J. Haas, Jussi Myllymaki, Hamid ...
VLDB
2007
ACM
Periscope/SQ: Interactive Exploration of Biological Sequence Databases
Life science laboratories today have to rely on procedural techniques to store and manage large sequence datasets. Procedural techniques are cumbersome to use and are often very i...
Sandeep Tata, Willis Lang, Jignesh M. Patel