First Advisor

Banafsheh Rekabdar

Term of Graduation

Fall 2025

Date of Publication

12-9-2025

Document Type

Thesis

Degree Name

Master of Science (M.S.) in Computer Science

Department

Computer Science

Language

English

Subjects

atari, mamba, mujoco, online fine-tuning, reinforcement learning, sequence modeling

Physical Description

1 online resource (v, 40 pages)

Abstract

Online in-context reinforcement learning enhances offline-trained policies through online fine-tuning. We introduce Online Decision Mamba (ODM), an architecture that replaces the attention mechanism in Online Decision Transformers (ODT) with the Mamba module to improve long-context sequence modeling and overall RL performance. We performed in-depth evaluations on MuJoCo (OpenAI Gym) and Atari benchmarks, comparing ODM against state-of-the-art offline and online baselines, including Decision Mamba (DM) and ODT. Our results show that ODM achieves competitive or superior performance, with particularly robust gains when initial datasets lack expert demonstrations. In the Qbert Atari environment, ODM shows context-length sensitivity similar to offline DM; however, we demonstrate that adjusting the Mamba delta-parameter initialization range effectively mitigates the resulting performance degradation. Further experiments explored the effects of frame stacking, action-embedding dimensionality, exploration strategies, multinomial sampling temperature, pretraining iterations, and replay-buffer size. These findings confirm that ODM is a flexible, high-performance framework for online in-context reinforcement learning, adaptable to diverse tasks and dataset characteristics.

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier

https://archives.pdx.edu/ds/psu/44398

Recommended Citation

Ruf, Trenton W., "Online Decision Mamba" (2025). Dissertations and Theses. Paper 6983.

Included in

Computer Sciences Commons
