First Advisor

Melanie Mitchell

Term of Graduation

Fall 2021

Date of Publication

11-22-2021

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.) in Computer Science

Department

Computer Science

Language

English

Subjects

Pattern recognition systems, Computer vision, Deep learning (Machine learning)

DOI

10.15760/etd.7720

Physical Description

1 online resource (vi, 140 pages)

Abstract

Computer vision and machine learning systems have improved significantly in recent years, largely based on the development of deep learning systems, leading to impressive performance on object detection tasks. Understanding the content of images is considerably more difficult. Even simple situations, such as "a handshake", "walking the dog", "a game of ping-pong", or "people waiting for a bus", present significant challenges. Each consists of common objects but is not reliably detectable as a single entity or through the simple co-occurrence of its parts.

In this dissertation, toward the goal of developing machine learning systems that demonstrate properties associated with understanding, I will describe a novel system for performing visual situation recognition. Given a description of a situation and a small labeled training set, the system, called Situate, learns object appearance models as well as a probabilistic model capturing the situation's expected spatial relationships. Given a new image, Situate uses its learned models and an array of agents to engage in an active search of its input to find the most consistent correspondence between the model of the situation and the content of the image. Each agent develops a possible correspondence between the model and the input, while Situate allocates computational resources to the agents such that promising solutions are developed early, but alternative correspondences are not ignored.

I will compare Situate to a more traditional computer vision approach that relies on the detection of constituent objects of a situation, as well as to a related image-retrieval system based on "scene graphs". I will evaluate each method on the situation recognition task and in the context of image retrieval. The results demonstrate the value of a feedback system between image content and a model of that content.
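As an illustration only, the resource-allocation idea in the abstract (develop promising correspondences early without ignoring alternatives) can be sketched as score-weighted sampling over agents. This is not the dissertation's implementation: the `Agent` class, the softmax weighting, the `temperature` parameter, and the `toy_refine` stand-in are all assumptions introduced here for the example.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical agent holding one candidate model-to-image correspondence."""
    name: str
    score: float = 0.0  # assumed consistency estimate in [0, 1]
    steps: int = 0      # number of refinement steps this agent has received


def pick_agent(agents, temperature, rng):
    """Sample an agent with probability proportional to exp(score / temperature).

    Higher-scoring agents are chosen more often, but every agent keeps a
    nonzero chance, so alternative correspondences are still explored.
    """
    weights = [math.exp(a.score / temperature) for a in agents]
    return rng.choices(agents, weights=weights, k=1)[0]


def toy_refine(agent, rng):
    """Stand-in for one step of active search: nudge the score upward a little."""
    return min(1.0, agent.score + 0.05)


def run_search(agents, refine, budget, temperature=0.2, seed=0):
    """Spend `budget` refinement steps across agents and return the best one."""
    rng = random.Random(seed)
    for _ in range(budget):
        agent = pick_agent(agents, temperature, rng)
        agent.score = refine(agent, rng)
        agent.steps += 1
    return max(agents, key=lambda a: a.score)
```

In this sketch a lower `temperature` concentrates the budget on the current leader, while a higher value spreads it more evenly across agents. The value 0.2 is arbitrary.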

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier

https://archives.pdx.edu/ds/psu/36915

Recommended Citation

Quinn, Max Henry, "Situate: An Agent-Based System for Situation Recognition" (2021). Dissertations and Theses. Paper 5849.
https://doi.org/10.15760/etd.7720

Included in

Artificial Intelligence and Robotics Commons
