Learning with Binary-Valued Utility Using Derivative Adaptive Critic Methods

Published In

IEEE International Conference on Neural Networks - Conference Proceedings

Document Type

Citation

Publication Date

12-1-2004

Abstract

Adaptive critic methods for reinforcement learning are known to provide consistent solutions to optimal control problems, and are also considered plausible models for cognitive learning processes. This work discusses binary reinforcement in the context of three adaptive critic methods: heuristic dynamic programming (HDP), dual heuristic programming (DHP), and globalized dual heuristic programming (GDHP). Binary reinforcement arises when the qualitative measure of success is simply "pass" or "fail". We implement binary reinforcement with adaptive critic methods for the pole-cart benchmark problem. Results demonstrate two qualitatively dissimilar classes of controllers: those that replicate the system stabilization achieved with quadratic utility, and those that merely succeed at not dropping the pole. The GDHP method is found to be effective for learning an approximately optimal solution, with results comparable to those obtained via DHP using a more informative quadratic utility function.
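
To make the contrast between the two reward signals concrete, here is a minimal Python sketch (not taken from the paper) of a quadratic utility and a binary pass/fail utility for a pole-cart task. The state variables, failure bounds, and unit weighting are illustrative assumptions, not the authors' actual settings.

    import numpy as np

    # Hypothetical failure bounds (illustrative, not from the paper): the
    # track half-length and the pole-angle limit that end a trial.
    X_LIMIT = 2.4                      # cart position limit, metres
    THETA_LIMIT = np.deg2rad(12.0)     # pole angle limit, radians

    def quadratic_utility(x, x_dot, theta, theta_dot):
        """Smooth, informative utility: penalizes any deviation from the
        centered, upright equilibrium at every time step."""
        return -(x**2 + x_dot**2 + theta**2 + theta_dot**2)

    def binary_utility(x, theta):
        """Pass/fail utility: 0 while the pole stays up and the cart stays
        on the track, -1 the instant either bound is violated."""
        failed = abs(x) > X_LIMIT or abs(theta) > THETA_LIMIT
        return -1.0 if failed else 0.0

    # Example: a slightly tilted, near-centered state
    print(quadratic_utility(0.1, 0.0, 0.05, 0.0))  # small graded penalty
    print(binary_utility(0.1, 0.05))               # 0.0: still a "pass"

The quadratic signal grades every state, so a critic can learn how close the controller is to optimal behavior; the binary signal only distinguishes success from failure, which is what makes it the less informative utility discussed in the abstract.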

Locate the Document

https://doi.org/10.1109/IJCNN.2004.1380882

DOI

10.1109/IJCNN.2004.1380882

Persistent Identifier

https://archives.pdx.edu/ds/psu/37312
