Schema Extraction and Structural Outlier Detection for JSON-based NoSQL Data Stores

Although most NoSQL data stores are schema-less, information on the structural properties of the persisted data is nevertheless essential during application development. Without it, accessing the data becomes impractical. In this paper, we introduce an algorithm for schema extraction that operates outside of the NoSQL data store. Our method is specifically targeted at semi-structured data persisted in NoSQL stores, e.g., in JSON format. Rather than designing the schema up front, extracting a schema in hindsight can be seen as a reverse-engineering step. Based on the extracted schema information, we propose a set of similarity measures that capture the degree of heterogeneity of JSON data and reveal structural outliers in the data. We evaluate our implementation on two real-life datasets: a database from the Wendelstein 7-X project and Web Performance Data.
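The general idea of extracting a schema in hindsight and flagging structural outliers can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes a much simpler approach: collect every (property path, type) pair, count how many documents contain it, and treat pairs present in only a small fraction of documents as outlier candidates. The function names, the path notation, and the threshold parameter are all hypothetical.

```python
from collections import Counter


def type_name(value):
    """Map a parsed JSON scalar to a JSON type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def paths(value, prefix=""):
    """Yield (path, type) pairs for a parsed JSON value and all nested values."""
    here = prefix or "/"
    if isinstance(value, dict):
        yield here, "object"
        for key, child in value.items():
            yield from paths(child, f"{prefix}/{key}")
    elif isinstance(value, list):
        yield here, "array"
        for child in value:
            yield from paths(child, f"{prefix}/[]")
    else:
        yield here, type_name(value)


def extract_schema(documents):
    """Count, for each (path, type) pair, the number of documents containing it."""
    counts = Counter()
    for doc in documents:
        counts.update(set(paths(doc)))  # set: count each pair once per document
    return counts


def structural_outliers(counts, n_docs, threshold=0.1):
    """Return (path, type) pairs found in fewer than `threshold` of all documents."""
    return sorted(pair for pair, n in counts.items() if n / n_docs < threshold)


# Hypothetical sample data: the last document has a mistyped key and a string id.
docs = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "c"},
    {"id": "4", "nmae": "d"},
]
schema = extract_schema(docs)
outliers = structural_outliers(schema, len(docs), threshold=0.5)
```

On this sample, the pairs reported as outlier candidates are the string-typed id and the misspelled property, since each occurs in only one of four documents. Like the approach described in the abstract, the sketch runs outside the data store, on documents that have already been read and parsed.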

Meike Klettke, Uta Störl, Stefanie Scherzinger

BTW 2015 | Database |

Post Info

Added	17 Apr 2016
Updated	17 Apr 2016
Type	Journal
Year	2015
Where	BTW
Authors	Meike Klettke, Uta Störl, Stefanie Scherzinger
