Recent work in data integration has shown the importance of statistical information about the coverage and overlap of data sources for efficient query processing. Gathering and s...
Query optimization in data integration requires source coverage and overlap statistics. Gathering and storing the required statistics presents many challenges, not the least of wh...
One of the major problems in biological data integration is that many data sources are stored as flat files with a variety of different layouts. Integrating data from such sour...
Kaushik Sinha, Xuan Zhang, Ruoming Jin, Gagan Agra...
A major obstacle to fully integrated deployment of many data mining algorithms is the assumption that data sits in a single table, even though most real-world databases have compl...
Alexandrin Popescul, Lyle H. Ungar, Steve Lawrence...
Background: Expressed sequence tag (EST) collections are composed of a large number of single-pass, redundant, partial sequences, which need to be processed, clustered, and annotat...