First Advisor
Ameeta Agrawal
Term of Graduation
January 2026
Date of Publication
6-1-2026
Document Type
Dissertation
Language
English
Subjects
Contextual Grounding, Decoding, Large Language Models
Physical Description
1 online resource ( pages)
Abstract
Large language models (LLMs) are now expected to adapt to context in ways that are both socially appropriate and informationally faithful. This dissertation argues that contextual adaptation and contextual grounding are the unifying problems behind that expectation: models must adjust to interactional settings such as role-play while also grounding their outputs in the information supplied at inference time. Across four studies, the dissertation examines how LLMs respond to social cues, how they balance contextual and parametric knowledge, how multilingual long-context settings reveal grounding failures, and how decode-time control can improve reliability.
The first study analyzes the ChatGPT Role-play Dataset (CRD) to characterize conversational adaptability in role-based interactions. Role-play settings improve persona alignment and conversational naturalness relative to unconstrained chat, but model responses remain systematically more verbose than human turns and only partially responsive to user expectations. These findings establish social interaction as one form of context to which LLMs must adapt.
The second study investigates contextual versus parametric knowledge in open-ended question answering. Even in knowledge-consistent settings, LLMs do not simply reproduce the provided evidence: they mix grounded content with parametric supplementation, fail to use all available context, and reduce hallucination only gradually as contextual evidence increases. This reveals a persistent informational grounding problem even when context and pretrained knowledge are not in direct conflict.
The third study extends this analysis to the Lost-in-the-Later phenomenon and to multilingual contextual grounding. It shows that long-context failures depend not only on whether relevant information is present, but also on where that information appears and in which language it is expressed. Grounding becomes less stable when relevant evidence occurs later in the context or must be tracked across languages, exposing positional and linguistic biases in contextual recall.
The fourth study addresses these failures with No-Worse Context-Aware Decoding (NWCAD), a decode-time control framework for improving faithfulness to context without degrading baseline generation quality. With this study, the dissertation moves from diagnosing failures of contextual adaptation to proposing a concrete intervention for more dependable grounded generation.
Taken together, the dissertation advances a unified account of contextual adaptation in LLMs. Conversational behavior, contextual grounding, multilingual recall, and reliability control are treated not as isolated problems, but as linked manifestations of the same central challenge: making language models respond to context in ways that are adaptive, faithful, and dependable.
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
Recommended Citation
Tao, Yufei, "Toward Contextual Adaptation and Knowledge Grounding in Large Language Models" (2026). Dissertations and Theses. Paper 7119.