Presentation Type

Poster

Start Date

5-8-2024 11:00 AM

End Date

5-8-2024 1:00 PM

Subjects

Physics, Artificial intelligence

Advisor

Ralf Widenhorn

Student Level

Undergraduate

Abstract

In our research on large language models (LLMs), we examined ChatGPT 4 using a physics problem involving an object descending an inclined plane. By varying terminology such as "rolling," "sliding," "solid sphere," "hollow sphere," "wooden ramp," "no-slip ramp," and more, we evaluated the LLM's responses across different scenarios. Our analysis aimed to discern whether the LLM's answers reflected expertise in physics. The experiment sheds light on the LLM's ability to give accurate and precise physics answers, as well as on how its responses vary with nuanced changes in problem formulation, providing valuable insight into its proficiency and potential for educational applications. We tested 34 variations of the problem, collecting 5 responses for each, manipulating the object type, action verb, and incline property to observe how the LLM responded to different prompts. The objectives of this study were to assess the LLM's ability to address the same physics problem under varying conditions and to determine whether its responses were consistent with the assumptions a physics expert would make. Through a comprehensive analysis, we gained insight into ChatGPT's performance on diverse problem formulations, highlighting its potential as an educational tool for physics and related disciplines.
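As a point of reference for the scenarios described above, the standard textbook result an expert would expect is a minimal sketch like the following (the specific angle and labels are illustrative, not taken from the study): an object descending an incline of angle θ accelerates at a = g·sin(θ)/(1 + c), where c = I/(mr²) is 0 for frictionless sliding, 2/5 for a solid sphere rolling without slipping, and 2/3 for a hollow sphere.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def incline_acceleration(theta_deg: float, c: float) -> float:
    """Acceleration (m/s^2) down an incline of angle theta_deg for an
    object with moment-of-inertia factor c = I/(m r^2).
    Rolling without slipping: a = g*sin(theta) / (1 + c)."""
    return G * math.sin(math.radians(theta_deg)) / (1 + c)

# Illustrative cases matching the study's terminology (angle is assumed):
cases = {
    "sliding (frictionless)": 0.0,    # c = 0
    "rolling solid sphere":   2 / 5,  # I = (2/5) m r^2
    "rolling hollow sphere":  2 / 3,  # I = (2/3) m r^2
}

for name, c in cases.items():
    print(f"{name}: a = {incline_acceleration(30, c):.2f} m/s^2")
```

On a 30° incline this gives about 4.91, 3.50, and 2.94 m/s², respectively, which is the kind of distinction between "rolling" and "sliding" phrasing the prompt variations were designed to probe.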

Persistent Identifier

https://archives.pdx.edu/ds/psu/41890

Included in

Physics Commons

Title

Going Down an Incline with ChatGPT
