Efficient Metadata Generation to Enable Interactive Data Discovery over Large-Scale Scientific Data Collections

Download granules.cs.colostate.edu

Discovering the right dataset efficiently is critical for computations and effective simulations in scientific experiments. Unlike web documents on the Internet, massive binary datasets are difficult to browse or search. Users must first select a reliable data publisher from the large collection of data services available over the Internet. Once a publisher is selected, the user must then find the dataset that matches the computation's needs among tens of thousands of large data packages. Some data hosting services provide advanced search interfaces, but their search scope is often limited to local datasets. Because scientific datasets are often encoded in binary formats, querying a file of hundreds of megabytes, or checking it for missing data, involves a compute-intensive decoding process. We have developed a system, GLEAN, that provides an efficient data discovery environment for users in scientific computing. Fine-gr...
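
The cost argument in the abstract (decoding a large binary file just to answer a question about missing data) can be illustrated with a minimal Python sketch. This is not GLEAN's implementation: the little-endian float32 layout, the `MISSING` sentinel value, and the `summarize` and `has_missing` helpers are all assumptions made for illustration. The idea shown is only that a file is decoded once to produce a small metadata record, and later queries read that record instead of the binary data.

```python
import struct

# Hypothetical sentinel marking a missing observation (an assumption,
# not a value taken from the paper).
MISSING = -9999.0


def decode_values(blob: bytes) -> list:
    """Decode a blob of little-endian float32 values (assumed layout)."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob[: count * 4]))


def summarize(blob: bytes) -> dict:
    """Decode once and keep only a small metadata record."""
    values = decode_values(blob)
    present = [v for v in values if v != MISSING]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def has_missing(catalog: dict, name: str) -> bool:
    """Answer a missing-data query from metadata alone, without decoding."""
    return catalog[name]["missing"] > 0


# Build a tiny in-memory "dataset" and index it once.
blob = struct.pack("<5f", 1.5, MISSING, 3.0, -2.0, MISSING)
catalog = {"sample.bin": summarize(blob)}
```

With the catalog built, `has_missing(catalog, "sample.bin")` is a dictionary lookup, so its cost does not depend on the size of the underlying file.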

Sangmi Lee Pallickara, Shrideep Pallickara, Milija

Atmospheric Sciences | CLOUDCOM 2010 | Distributed And Parallel Computing | Efficient Data | Massive Binary Datasets |

» Browsing large scale cheminformatics data with dimension reduction

» LabKey Server: An open source platform for scientific data integration, analysis and collabo...

» Line graph explorer: scalable display of line graphs using Focus+Context

» Self-Organizing Data Mining

Post Info

Added	10 Feb 2011
Updated	10 Feb 2011
Type	Journal
Year	2010
Where	CLOUDCOM
Authors	Sangmi Lee Pallickara, Shrideep Pallickara, Milija Zupanski, Stephen Sullivan