Published In
Proceedings of the VLDB Endowment
Document Type
Article
Publication Date
10-2016
Subjects
Indexing--Data processing, Database management
Abstract
Sensing devices generate tremendous amounts of data each day, which include large quantities of multi-dimensional measurements. These data are expected to be immediately available for real-time analytics as they are streamed into storage. Such scenarios pose challenges to state-of-the-art indexing methods, as they must not only support efficient queries but also frequent updates. We propose here a novel indexing method that ingests multi-dimensional observational data in real time. This method primarily guarantees extremely high throughput for data ingestion, while it can be continuously refined in the background to improve query efficiency. Instead of representing collections of points using Minimal Bounding Boxes as in conventional indexes, we model sets of successive points as line segments in hyperspaces, by exploiting the intrinsic value continuity in observational data. This representation reduces the number of index entries and drastically reduces "over-coverage" by entries. Experimental results show that our approach handles real-world workloads gracefully, providing both low-overhead indexing and excellent query efficiency.
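The contrast between a bounding box and a line segment can be illustrated with a small sketch. This is not the authors' index structure. It is a two-dimensional toy example that assumes an index entry for a run of successive points is either an axis-aligned bounding box or a segment joining the first and last point plus a tolerance `eps`. All function names (`bounding_box`, `segment_entry`, and so on) are hypothetical and chosen only for illustration.

```python
from math import hypot, pi

def bounding_box(points):
    """Axis-aligned minimal bounding box of 2-D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys)), (max(xs), max(ys))

def box_area(box):
    (x0, y0), (x1, y1) = box
    return (x1 - x0) * (y1 - y0)

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))

def segment_entry(points):
    """Assumed entry format: first point, last point, and the smallest
    tolerance eps such that every point lies within eps of the segment."""
    a, b = points[0], points[-1]
    eps = max(point_segment_distance(p, a, b) for p in points)
    return a, b, eps

def segment_area(entry):
    """Area of the region within eps of the segment (a stadium shape)."""
    a, b, eps = entry
    length = hypot(b[0] - a[0], b[1] - a[1])
    return 2 * eps * length + pi * eps * eps

# Successive observations that drift almost linearly (made-up values).
points = [(0, 0.0), (1, 1.1), (2, 1.9), (3, 3.0)]
box = bounding_box(points)
entry = segment_entry(points)
print("box area:    ", box_area(box))
print("segment area:", segment_area(entry))
```

For points that change nearly linearly, the region covered by the segment entry is much smaller than the bounding box, which is the intuition behind reduced "over-coverage". How the method in the paper chooses segments and tolerances is not described in this record.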
DOI
10.14778/3007328.3007334
Persistent Identifier
http://archives.pdx.edu/ds/psu/19396
Citation Details
Wang, S., Maier, D., & Ooi, B. C. (2016). Fast and adaptive indexing of multi-dimensional observational data. Proceedings of the VLDB Endowment, 9(14), 1683-1694.
Description
Copyright 2016 VLDB Endowment. This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.