Humans are able to draw on experience when performing system identification and selecting control actions for changing situations. Current technological implementations slow down as more knowledge is stored; human processing, in contrast, becomes faster and more effective as more experience is gained. An emerging experience-based (“higher-level”) approach promises to endow our technology with enhanced efficiency and effectiveness.
The notions of context and context discernment are important to understanding this human ability. Both are defined in terms appropriate to controls and system identification. Some general background will be provided on controls, Dynamic Programming, and the Adaptive Critic, leading to Adaptive Dynamic Programming (ADP).
The higher-level application of ADP is described, wherein ADP is employed to develop on-line algorithms that respond to changes in context by efficiently and effectively selecting designs from a repository of existing controller solutions. This contrasts with the usual application of ADP, which focuses on designing controllers directly. In this way, ADP is said to be applied one level up from its typical application.
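As a rough illustration of selecting from a repository rather than designing a controller directly, the following minimal Python sketch may help. It is not the method presented in the seminar: it substitutes a simple tabular cost estimate with epsilon-greedy exploration for a full ADP critic, and it assumes the context label is already known (context discernment is not modeled). The plant model, the gains in `CONTROLLER_GAINS`, the context names, and all function names are hypothetical.

```python
import random

# Hypothetical repository of pre-designed controllers (proportional gains).
CONTROLLER_GAINS = [0.2, 1.0, 5.0]

# Hypothetical contexts: each sets the input gain b of a toy scalar plant
# x_next = x + b * u. Assumed values, chosen only for illustration.
CONTEXTS = {"light": 1.0, "heavy": 0.2}


def episode_cost(gain, b, x0=1.0, steps=20):
    """Sum of squared state error when u = -gain * x regulates the toy plant."""
    x, cost = x0, 0.0
    for _ in range(steps):
        x = x + b * (-gain * x)
        cost += x * x
    return cost


def learn_selector(trials=200, alpha=0.5, epsilon=0.2, seed=0):
    """Learn a table q[context][k] of estimated cost for controller k."""
    rng = random.Random(seed)
    n = len(CONTROLLER_GAINS)
    q = {c: [0.0] * n for c in CONTEXTS}
    for _ in range(trials):
        context = rng.choice(sorted(CONTEXTS))  # the context changes over time
        if rng.random() < epsilon:
            k = rng.randrange(n)  # explore another stored design
        else:
            k = min(range(n), key=lambda i: q[context][i])  # lowest estimate
        cost = episode_cost(CONTROLLER_GAINS[k], CONTEXTS[context])
        q[context][k] += alpha * (cost - q[context][k])  # move toward observed cost
    return q


def select(q, context):
    """Return the index of the stored controller with the lowest estimated cost."""
    return min(range(len(CONTROLLER_GAINS)), key=lambda i: q[context][i])


if __name__ == "__main__":
    q = learn_selector()
    for context in sorted(CONTEXTS):
        print(context, "->", CONTROLLER_GAINS[select(q, context)])
```

With these assumed numbers, the learned table prefers the gain-1.0 design in the "light" context and the gain-5.0 design in the "heavy" context. The learner never adjusts a gain; it only learns which stored design to invoke, which is the sense in which the learning operates one level up.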
Key components of the approach include the notions of context, context discernment, and experience. These apply to applications in control and also to system identification.
Details of the approach and its rationale will be described, including examples and recent developments of the underlying ideas.
George G. Lendaris is Professor of Systems Science and Electrical & Computer Engineering at Portland State University.
Dynamic programming, Reinforcement learning, Adaptive control systems -- Mathematical models, Computational intelligence, System identification
Computer Sciences | Theory and Algorithms
© Copyright the author(s)
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
Lendaris, George G., "Higher-level Application of Adaptive Dynamic Programming/reinforcement Learning – A Next phase for Controls and System Identification?" (2011). Systems Science Friday Noon Seminar Series. 53.