Document Type

Technical Report

Publication Date



Data compression (Computer science), Big data -- Management, Execution traces (Computer program testing), Application software -- Performance -- Analysis


Event traces are required to correctly diagnose a number of performance problems that arise on today’s highly parallel systems. Unfortunately, the collection of event traces can produce a large volume of data that is difficult, or even impossible, to store and analyze. One approach for compressing a trace is to identify repeating trace patterns and retain only one representative of each pattern. However, determining the similarity of sections of traces, i.e., identifying patterns, is not straightforward. In this paper, we investigate pattern-based methods for reducing traces that will be used for performance analysis. We evaluate the different methods against several criteria, including size reduction, introduced error, and retention of performance trends, using both benchmarks with carefully chosen performance behaviors, and a real application.


Portland State University Computer Science Department Technical Report #09-03, 2009.

Persistent Identifier