First Advisor

Wu-chi Feng

Term of Graduation

Spring 2024

Date of Publication

4-25-2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.) in Computer Science

Department

Computer Science

Language

English

Physical Description

1 online resource (viii, 154 pages)

Abstract

Advances in machine learning algorithms and models have catalyzed numerous image-centric applications. While these systems demonstrate the efficacy of machine learning models, challenges persist, including machine learning system design and the security vulnerabilities inherent in deep neural networks. Deploying deep neural network models also remains a significant hurdle. This dissertation introduces a multimedia prototyping framework tailored for visual analytical applications that improves the reusability of video analysis software tools with minimal performance overhead. Furthermore, we present novel image-processing techniques that bolster the robustness of deep neural networks, and we propose a compression technique to address deployment challenges.

First, we propose a new software prototyping framework called Video as Text (vText) that analyzes and manipulates video data as easily as text data is handled in most Unix and Linux systems, addressing the reusability issues in existing video analysis tools. The vText paradigm seeks to mimic such small, composable text-processing programs. We describe the design and implementation of vText, which links video codecs with computer vision and image-processing algorithms; our performance evaluation shows that the vText framework achieves comparable running time and simplifies the prototyping of visual analytical programs.
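The Unix analogy above can be illustrated with a minimal sketch: a video is treated as a stream of frame records piped through small, composable filters, just as Unix tools pipe lines of text. All names here (frame_source, grep_frames, head) are illustrative stand-ins, not part of the actual vText framework.

```python
# Hypothetical sketch of the vText idea: a video as a stream of frame
# records that flows through composable filters, the way `cat | grep | head`
# pipes lines of text. Frame contents are synthetic placeholders.
from typing import Iterable, Iterator

Frame = dict  # stand-in for a decoded frame record


def frame_source(n: int) -> Iterator[Frame]:
    """Mimic a decoder emitting one record per frame (like `cat`)."""
    for i in range(n):
        yield {"index": i, "mean_brightness": (i * 37) % 256}


def grep_frames(frames: Iterable[Frame], threshold: int) -> Iterator[Frame]:
    """Keep frames matching a predicate (like `grep`)."""
    for f in frames:
        if f["mean_brightness"] > threshold:
            yield f


def head(frames: Iterable[Frame], k: int) -> Iterator[Frame]:
    """Take the first k frames (like `head`)."""
    for i, f in enumerate(frames):
        if i >= k:
            break
        yield f


# Compose filters like a shell pipeline: source | grep | head
pipeline = head(grep_frames(frame_source(100), threshold=128), k=3)
selected = [f["index"] for f in pipeline]
print(selected)  # → [4, 5, 6]
```

Because each stage is a generator, frames are pulled lazily through the pipeline, mirroring how Unix pipes process text line by line without buffering the whole file.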

Second, to reduce the vulnerability of deep neural networks to adversaries, we propose three color-reduction image-processing approaches: Gaussian smoothing plus PNM color reduction (GPCR), Gaussian smoothing plus K-means (GK-means), and fast GK-means. These approaches make deep convolutional neural networks more robust to adversarial perturbations. We evaluate them on a subset of the ImageNet dataset, and our evaluation reveals that the GK-means-based algorithms achieve the best top-1 classification accuracy.
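The GK-means idea, Gaussian smoothing followed by K-means color quantization, can be sketched in a toy form. This version operates on a 1-D row of grayscale pixels with a deterministic center initialization; the dissertation's method works on full color images, so the details here are assumptions for illustration only.

```python
# Toy sketch of the GK-means pipeline: smooth first, then quantize the
# palette with K-means. Not the dissertation's implementation.

def gaussian_smooth(pixels, kernel=(0.25, 0.5, 0.25)):
    """Smooth with a small 3-tap Gaussian kernel, replicating edge pixels."""
    padded = [pixels[0]] + list(pixels) + [pixels[-1]]
    return [
        kernel[0] * padded[i] + kernel[1] * padded[i + 1] + kernel[2] * padded[i + 2]
        for i in range(len(pixels))
    ]


def kmeans_quantize(pixels, k=2, iters=10):
    """Reduce the palette to k gray levels with plain Lloyd-style K-means."""
    lo, hi = min(pixels), max(pixels)
    # Deterministic init: spread centers evenly between min and max.
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    # Map every pixel to its nearest cluster center.
    return [min(centers, key=lambda c: abs(p - c)) for p in pixels]


row = [10, 12, 11, 200, 198, 202, 9, 201]
quantized = kmeans_quantize(gaussian_smooth(row), k=2)
print(sorted(set(quantized)))  # at most k distinct gray levels remain
```

The intuition is that smoothing removes high-frequency adversarial noise, and color reduction collapses the remaining small perturbations onto a small shared palette.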

The final contribution of the dissertation is a novel deep neural network compression framework for class-specialization problems, addressing the limited utilization of deep neural network-based functionality. We propose a knowledge distillation framework with two new losses, Renormalized Knowledge Distillation (RKD) and Intra-Class Variance (ICV), to produce computationally efficient, specialized neural network models. Our quantitative empirical evaluation demonstrates that the proposed framework achieves significant classification accuracy improvements on tasks where the number of subclasses or instances in the dataset is relatively small.
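For context, the framework builds on knowledge distillation, in which a small student network is trained to match a larger teacher's softened output distribution. The sketch below computes only the standard temperature-scaled distillation loss; the dissertation's RKD and ICV losses are refinements of this idea and are not reproduced here.

```python
# Standard knowledge distillation loss (softened teacher targets vs. the
# student's predictions), shown as background for the RKD/ICV losses.
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher T yields softer distributions."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions."""
    p = softmax(teacher_logits, temperature)  # soft targets
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


teacher = [8.0, 2.0, 1.0]  # hypothetical teacher logits for 3 classes
student = [5.0, 3.0, 2.0]  # hypothetical student logits
loss = distillation_loss(student, teacher)
print(loss > 0)  # nonzero while the student disagrees with the teacher
```

Minimizing this loss pulls the compact student's output distribution toward the teacher's, which is what makes small specialized models feasible.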

Rights

© 2024 Li-Yun Wang

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier

https://archives.pdx.edu/ds/psu/42211
