Sponsor
Portland State University. Department of Computer Science
First Advisor
David Maier
Date of Publication
1-1-2011
Document Type
Thesis
Degree Name
Master of Science (M.S.) in Computer Science
Department
Computer Science
Language
English
Subjects
Database management -- Technological innovations, Streaming technology (Telecommunications)
DOI
10.15760/etd.161
Physical Description
1 online resource (viii, 78 p.) : ill. (some col.)
Abstract
Obtaining low-latency results from window-aggregate queries can be critical to certain data-stream processing applications. Because a data-stream management system (DSMS) lacks control over incoming data (typically because of delays and bursts in data arrival), it cannot deliver timely results for a window-aggregate query over a data stream while also guaranteeing the results' accuracy. In this thesis, I propose a technique, which I term prodding, to obtain early result estimates for window-aggregate queries over data streams. The early estimates are obtained in addition to the regular query results. The proposed technique aims to maximize the contribution to a result-estimate computation from all the stateful operators across a multi-level query plan. I evaluate the benefits of prodding using real-world and generated data streams with different patterns in data arrival and data values. I conclude that, in various DSMS applications, prodding can generate low-latency estimates of window-aggregate query results. The main factors affecting the degree of inaccuracy in such estimates are the aggregate function used in a query, the patterns in arrivals and values of stream data, and the aggressiveness with which the estimates are demanded. The utility of the estimates obtained using prodding should be optimized by tuning the aggressiveness of result-estimate demands to the specific latency and accuracy needs of a business, considering any available knowledge about patterns in the incoming data.
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
Persistent Identifier
http://archives.pdx.edu/ds/psu/6942
Recommended Citation
Bhat, Amit, "Low-latency Estimates for Window-Aggregate Queries over Data Streams" (2011). Dissertations and Theses. Paper 161.
https://doi.org/10.15760/etd.161
Comments
Portland State University. Dept. of Computer Science