First Advisor

Christof Teuscher

Term of Graduation

Summer 2021

Date of Publication

8-4-2021

Document Type

Thesis

Degree Name

Master of Science (M.S.) in Electrical and Computer Engineering

Department

Electrical and Computer Engineering

Language

English

Subjects

Neural networks (Computer science), Mathematical optimization, Nuclear physics, Radiation

DOI

10.15760/etd.7657

Physical Description

1 online resource (xvi, 86 pages)

Abstract

Rapid search for and localization of lost nuclear sources in a given area of interest is an important task for the safety of society and the reduction of human harm. Detection, localization, and identification are based on the gamma radiation spectrum measured by a radiation detector. The nonlinear relationship of electromagnetic wave propagation, paired with the probabilistic nature of gamma ray emission and background radiation from the environment, leads to ambiguity in the estimation of a source's location. In the case of a single mobile detector, there are numerous challenges to overcome, such as weak source activity, multiple sources, or the presence of obstructions, i.e., a non-convex environment. Detectors deployed on smaller autonomous systems such as drones or robots have smaller surface area and volume, resulting in worse counting statistics per dwell time. Additionally, search algorithms need to be efficient and generalizable to operate across a variety of scenarios.
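The ambiguity described above can be illustrated with a minimal sketch of a detector measurement. It assumes a common simplification that is not taken from the thesis itself: the mean count rate falls off with the inverse square of the distance to the source, a constant background is added, and the observed counts are Poisson distributed. The function names, source intensity, background rate, and positions are all hypothetical.

```python
import math
import random


def expected_counts(detector, source, intensity, background, dwell=1.0):
    """Mean counts in one dwell: inverse-square source term plus background.

    All parameter values used below are illustrative assumptions,
    not values from the thesis.
    """
    dx = detector[0] - source[0]
    dy = detector[1] - source[1]
    dist_sq = max(dx * dx + dy * dy, 1e-6)  # avoid division by zero at the source
    return dwell * (intensity / dist_sq + background)


def sample_poisson(mean, rng):
    """Draw one Poisson sample with Knuth's method (suitable for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(0)
source = (15.0, 15.0)  # hypothetical source position in metres
for detector in [(14.0, 15.0), (5.0, 5.0)]:
    mean = expected_counts(detector, source, intensity=50.0, background=5.0)
    counts = [sample_poisson(mean, rng) for _ in range(5)]
    print(detector, round(mean, 2), counts)
```

Far from the source, the mean is dominated by the background term, so individual readings at different positions overlap heavily. This is one way to see why a short dwell time or a weak source makes the location hard to estimate from a few measurements.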

The motivation of this work is to investigate the sequential decision-making capability of deep reinforcement learning (DRL) in the nuclear source search context. We focus on a branch of DRL known as stochastic, model-free, on-policy gradients that learns strictly through interaction with an environment to develop a useful policy for a specified goal. A novel neural network architecture (RAD-A2C) based on the actor-critic (A2C) framework is proposed; it uses a gated recurrent unit (GRU) for action selection and a particle filter gated recurrent unit (PFGRU) for localization.

Performance is studied in randomized 22 x 22 m convex and non-convex simulated environments across a range of signal-to-noise ratios (SNRs) for a single detector and single source. The RAD-A2C performance is compared to both an information-driven controller that uses a bootstrap particle filter (BPF) and to a gradient search (GS) algorithm. We find that the RAD-A2C has comparable performance to the information-driven controller across SNR in a convex environment and at lower computational complexity per action. The RAD-A2C far outperforms the GS algorithm in the non-convex environment with greater than 95% median completion rate.
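As a rough illustration of what a gradient-following baseline does, the sketch below implements an idealized, noise-free greedy ascent on the mean count rate in an obstacle-free 22 x 22 m area. It is not the GS algorithm evaluated in the thesis: the rate model, step size, stopping radius, and all names are assumptions made for this example, and obstructions are not modeled.

```python
import math

AREA = 22.0  # side length in metres, matching the 22 x 22 m area in the abstract
STEP = 1.0   # hypothetical step size in metres
# Eight candidate moves: horizontal, vertical, and diagonal.
MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def mean_rate(pos, source, intensity=50.0, background=5.0):
    """Assumed noise-free rate: inverse-square source term plus background."""
    dist_sq = max((pos[0] - source[0]) ** 2 + (pos[1] - source[1]) ** 2, 1e-6)
    return intensity / dist_sq + background


def greedy_search(start, source, max_steps=100, stop_radius=1.0):
    """Repeatedly step to the neighbouring position with the highest mean rate."""
    pos = start
    path = [pos]
    for _ in range(max_steps):
        if math.dist(pos, source) <= stop_radius:
            break
        candidates = [
            (min(max(pos[0] + STEP * dx, 0.0), AREA),
             min(max(pos[1] + STEP * dy, 0.0), AREA))
            for dx, dy in MOVES
        ]
        pos = max(candidates, key=lambda p: mean_rate(p, source))
        path.append(pos)
    return path


path = greedy_search(start=(2.0, 3.0), source=(15.0, 15.0))
print(len(path) - 1, "steps, final position", path[-1])
```

In this idealized setting the rate increases monotonically toward the source, so the greedy rule heads straight for it. A real detector sees only noisy counts at its own position, and an obstruction between detector and source can break that monotonic structure, which is the kind of scenario the non-convex comparison in the abstract addresses.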

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier

https://archives.pdx.edu/ds/psu/36383

Recommended Citation

Proctor, Philippe Erol, "Proximal Policy Optimization for Radiation Source Search" (2021). Dissertations and Theses. Paper 5786.
https://doi.org/10.15760/etd.7657

Included in

Computer Sciences Commons, Nuclear Commons

Dissertations and Theses

Proximal Policy Optimization for Radiation Source Search
