Advisor

David Maier

Date of Award

1-1-2011

Document Type

Thesis

Degree Name

Master of Science (M.S.) in Computer Science

Department

Computer Science

Physical Description

1 online resource (viii, 78 p.) : ill. (some col.)

Subjects

Database management -- Technological innovations, Streaming technology (Telecommunications)

DOI

10.15760/etd.161

Abstract

Obtaining low-latency results from window-aggregate queries can be critical to certain data-stream processing applications. Because a data-stream management system (DSMS) lacks control over incoming data (typically because of delays and bursts in data arrival), timely results for a window-aggregate query over a data stream cannot be obtained with guarantees about the results' accuracy. In this thesis, I propose a technique, which I term prodding, to obtain early result estimates for window-aggregate queries over data streams. The early estimates are obtained in addition to the regular query results. The proposed technique aims to maximize the contribution to a result-estimate computation from all the stateful operators across a multi-level query plan. I evaluate the benefits of prodding using real-world and generated data streams having different patterns in data arrival and data values. I conclude that, in various DSMS applications, prodding can generate low-latency estimates of window-aggregate query results. The main factors affecting the degree of inaccuracy in such estimates are the aggregate function used in a query, the patterns in arrivals and values of stream data, and the aggressiveness of demanding the estimates. The utility of the estimates obtained using prodding should be optimized by tuning the aggressiveness in result-estimate demands to the specific latency and accuracy needs of a business, considering any available knowledge about patterns in the incoming data.

Description

Portland State University. Dept. of Computer Science

Persistent Identifier

http://archives.pdx.edu/ds/psu/6942
