Advisor

David Maier

Date of Award

10-2008

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.) in Computer Science

Department

Computer Science

Physical Description

1 online resource (vi, 191 pages)

Subjects

Querying (Computer science), Streaming technology (Telecommunications), Electronic data processing

DOI

10.15760/etd.2668

Abstract

Evaluating queries over data streams has become an appealing way to support various stream-processing applications. Window queries are commonly used in many stream applications. In a window query, certain query operators, especially blocking operators and stateful operators, appear in their windowed versions. Previous research on evaluating window queries typically requires ordered streams, and this order requirement limits the implementations of window operators and also carries performance penalties. This thesis presents efficient and flexible algorithms for evaluating window queries. We first present a new data model for streams, progressing streams, that separates stream progress from physical-arrival order. Then, we present our window semantic definitions for the most commonly used window operators: window aggregation and window join. Unlike previous research that often requires ordered streams when describing window semantics, our window semantic definitions do not rely on physical-stream arrival properties. Based on the window semantic definitions, we present new implementations of window aggregation and window join, WID and OA-Join. Compared to the existing implementations of stream query operators, our implementations do not require special stream-arrival properties, particularly stream order. In addition, for window aggregation, we present two other implementations extended from WID, Paned-WID and AdaptWID, to improve execution time by sharing sub-aggregates and to improve memory usage for input with data distribution skew, respectively. Leveraging our order-insensitive implementations of window operators, we present a new architecture for stream systems, OOP (Out-of-Order Processing). Instead of relying on ordered streams to indicate stream progress, OOP explicitly communicates stream progress to query operators, and thus is more flexible than the previous in-order processing (IOP) approach, which requires maintaining stream order.
We implemented our order-insensitive window query operators and the OOP architecture in NiagaraST and Gigascope. Our performance study in both systems confirms the benefits of our window operator implementations and the OOP architecture compared to the commonly used approaches in terms of memory usage, execution time and latency.
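
The idea of separating stream progress from arrival order can be illustrated with a small sketch. The abstract does not spell out how WID works, so the code below is not the dissertation's implementation; it is a minimal illustration, assuming integer timestamps, sliding windows defined by a range and a slide, a sum aggregate, and an explicit progress marker meaning "no later tuple will have a timestamp below this value". The names `window_ids`, `WindowSum`, `on_tuple`, and `on_progress` are hypothetical.

```python
from collections import defaultdict


def window_ids(ts, range_, slide):
    """Ids of all windows containing timestamp ts.

    Assumption: window w covers the interval [w * slide, w * slide + range_).
    """
    last = ts // slide
    first = max(0, (ts - range_) // slide + 1)
    return range(first, last + 1)


class WindowSum:
    """Order-insensitive windowed sum driven by explicit progress markers."""

    def __init__(self, range_, slide):
        self.range_ = range_
        self.slide = slide
        self.partial = defaultdict(int)  # window id -> running sum

    def on_tuple(self, ts, value):
        # Tuples may arrive in any order: each one is routed to its
        # windows by timestamp alone, so no sorting or buffering is needed.
        for w in window_ids(ts, self.range_, self.slide):
            self.partial[w] += value

    def on_progress(self, ts):
        # Progress marker: no future tuple has a timestamp below ts.
        # Every window ending at or before ts is complete and can be
        # emitted, which also frees its state.
        closed = sorted(
            w for w in self.partial if w * self.slide + self.range_ <= ts
        )
        return [(w, self.partial.pop(w)) for w in closed]
```

In this sketch a window's result is released by the progress marker rather than by the arrival of a tuple from a later window, so out-of-order input affects neither correctness nor the amount of state held beyond the open windows.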

Description

If you are the rightful copyright holder of this dissertation or thesis and wish to have it removed from the Open Access Collection, please submit a request to pdxscholar@pdx.edu and include clear identification of the work, preferably with a URL.

Persistent Identifier

http://archives.pdx.edu/ds/psu/16545
