First Advisor

Fei Xie

Term of Graduation

Fall 2024

Date of Publication

11-12-2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.) in Computer Science

Department

Computer Science

Language

English

Subjects

Compiler, Formal verification, Large language model, Translation validation

Physical Description

1 online resource (x, 98 pages)

Abstract

The ever-growing complexity of software and its target hardware makes it increasingly challenging to develop reliable compilers that preserve the semantics of source code during compilation. Traditional testing methods often lack sufficient test coverage and fail to identify subtle errors and undefined behaviors that can lead to compiler malfunctions. Furthermore, while formal compiler certification ensures semantic preservation through theorem proving, the inherent complexity of this process makes re-certification after each compiler revision substantially labor-intensive. This often significantly hinders the improvement of compilers.

This research aims to bridge the gap between increasing compiler complexity and the limited scalability of formal verification techniques. We present a comprehensive approach for automatic verification of compiler transformations that are the core components of any compiler design. Our approach particularly targets the foundations of state-of-the-art compilers: their intermediate representations (IRs). It provides automated verification algorithms that formally compare the IRs before and after a compiler transformation and systematically enumerate the execution paths of such transformation.

First, we have developed an automated equivalence checking framework for hardware designs represented in HalideIR, the foundation of several well-known hardware design compilers. This framework conducts scalable equivalence checking by analyzing structural similarities in HalideIR across various design iterations. It identifies minimal design constructs essential for verification and performs efficient localized checks. Its application in deep learning accelerator designs has proven its capability to detect inconsistencies that manual testing fails to identify, thereby bolstering hardware design reliability.

Second, we have developed an automated verification framework for HalideIR-based compiler transformations using symbolic execution. This method systematically enumerates and verifies execution paths of compiler transformations to ensure that the output program for each path is equivalent to its source. The application of this framework in common HalideIR-based compilers has uncovered previously undetected transformation bugs, highlighting its effectiveness.

Third, we have pioneered a novel translation validation framework for MLIR-based compilers. This framework overcomes the complexity arising from the numerous MLIR dialects and employs the Z3 SMT solver to validate the refinement of target programs with respect to source programs in MLIR. Its evaluations on diverse MLIR-based compilers have not only showcased its versatility but also its crucial role in identifying bugs and undefined behaviors.

Finally, we have also explored integrating Large Language Models (LLMs) into translation validation to target compiler transformations where formal verification tools fall short. Our framework utilizes existing formal verification tools for translation validation (in this work, we use Alive2, a well-known tool in LLVM IR compiler verification, as an example). When formal verification tools are unable to confirm a transformation’s soundness, our framework employs fine-tuned LLMs for prediction. It then applies fuzzing to transformations predicted as potentially unsound by the LLMs due to return values or memory inconsistencies, aiming to find counterexamples. If the transformations are deemed sound, unsound for other reasons, or if no counterexamples emerge, the framework directly reports these outcomes without further fuzzing. This framework has shown effectiveness in complex compiler transformations such as those with unbounded loops, where traditional formal verification tools struggle.

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier

https://archives.pdx.edu/ds/psu/42894

Share

COinS