Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective
Published In
ACM Transactions on Knowledge Discovery from Data
Document Type
Citation
Publication Date
5-1-2024
Abstract
Feature transformation aims to reconstruct an effective representation space by mathematically refining the existing features. It serves as a pivotal approach to combat the curse of dimensionality, enhance model generalization, mitigate data sparsity, and extend the applicability of classical models. Existing research predominantly focuses on domain knowledge-based feature engineering or on learning latent representations. These methods, while insightful, are not fully automated and fail to yield a traceable and optimal representation space. An indispensable question arises: can we address these limitations concurrently when reconstructing a feature space for a machine learning task? Our initial work took a first step towards this challenge by introducing a novel self-optimizing framework. This framework uses three cascading reinforced agents to automatically select candidate features and operations for generating improved feature transformation combinations. Despite this progress, there was room to enhance its effectiveness and generalization capability. In this extended journal version, we advance our initial work from two distinct yet interconnected perspectives: (1) We refine the original framework by integrating a graph-based state representation method that captures feature interactions more effectively, and by developing different Q-learning strategies that further alleviate Q-value overestimation. (2) We use a new optimization technique (actor-critic) to train the entire self-optimizing framework in order to accelerate model convergence and improve feature transformation performance. Finally, to validate the improved effectiveness and generalization capability of our framework, we perform extensive experiments and conduct comprehensive analyses. These provide empirical evidence of the improvements made in this journal version over the initial work, solidifying our framework’s standing as a substantial contribution to the field of automated feature transformation. To improve reproducibility, we have released the associated code and data on GitHub at https://github.com/coco11563/TKDD2023_code.
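As a rough illustration of what one traceable, group-wise transformation step might look like, the sketch below applies a chosen operation to a chosen group of features and records each new feature under a name that spells out how it was derived. It is not the authors' implementation. The operation set, the function name `transform_step`, and the hand-picked choices are all assumptions made for illustration; in the framework described above, the feature groups and the operation would be selected by the three cascading agents.

```python
import math

# Hypothetical operation sets; the abstract does not list the actual operations.
UNARY_OPS = {
    "sqrt_abs": lambda x: math.sqrt(abs(x)),
    "square": lambda x: x * x,
}
BINARY_OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def transform_step(features, head, op, tail=None):
    """Apply one group-wise transformation and return the new named features.

    features: dict mapping a feature name to a list of values.
    head, tail: lists of feature names (the two selected groups).
    op: name of a unary or binary operation from the sets above.
    """
    new = {}
    if op in UNARY_OPS:
        # A unary operation is applied to every feature in the head group.
        for h in head:
            new[f"{op}({h})"] = [UNARY_OPS[op](v) for v in features[h]]
    elif op in BINARY_OPS:
        # A binary operation crosses every head feature with every tail feature.
        for h in head:
            for t in tail or []:
                new[f"{op}({h},{t})"] = [
                    BINARY_OPS[op](a, b)
                    for a, b in zip(features[h], features[t])
                ]
    else:
        raise ValueError(f"unknown operation: {op}")
    return new


features = {"f1": [1.0, 2.0, 3.0], "f2": [4.0, 5.0, 6.0]}

# Stand-in for the agents' decisions: head group, operation, tail group.
generated = transform_step(features, head=["f1"], op="mul", tail=["f1", "f2"])
features.update(generated)

for name in sorted(features):
    print(name, features[name])
```

Because every generated column carries its derivation in its name, such as `mul(f1,f2)`, the resulting feature space stays traceable back to the original features, which is the property the abstract contrasts with latent representation learning.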
Rights
Copyright © 2024 ACM, Inc.
Locate the Document
DOI
10.1145/3638059
Persistent Identifier
https://archives.pdx.edu/ds/psu/41683
Citation Details
Xiao, M., Wang, D., Wu, M., Liu, K., Xiong, H., Zhou, Y., & Fu, Y. (2024). Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective. ACM Transactions on Knowledge Discovery from Data, 18(4), 1–22.