Published In
2025 IEEE International Conference on Data Mining Workshops (ICDMW)
Document Type
Pre-Print
Publication Date
2025
Abstract
Large language models (LLMs) can leverage both contextual and parametric knowledge, but how they prioritize and integrate these sources remains underexplored. We introduce CoPE, a novel framework for systematically quantifying contextual grounding in LLMs. CoPE distinguishes between contextual knowledge (CK) and parametric knowledge (PK), enabling fine-grained attribution across languages and tasks. Using our newly created MultiWikiAtomic dataset in English, Spanish, and Danish, we analyze how LLMs integrate context, prioritize information, and incorporate PK in open-ended question answering. We find that across models and languages, only around 50 to 76 percent of outputs are grounded in the given context, even in knowledge-consistent settings. Grounding drops further in counterfactual scenarios, though not drastically. Our analysis uncovers a phenomenon we call lost-in-the-later, in which LLMs tend to overlook information that appears later in a given context, revealing a strong positional bias that affects contextual grounding. We further find that reasoning models, as well as non-reasoning models prompted with Chain-of-Thought (CoT), use context even less than non-reasoning models without CoT and fail to mitigate the lost-in-the-later effect. CoT prompting, in particular, results in lower context recall and shorter responses, leading to degraded contextual grounding. Based on these insights, we design training-free prompt-based methods that effectively leverage input context, consistently improving contextual grounding by an average of 12 percent across all three languages.
Rights
Copyright (c) 2026 The Authors
This work is licensed under a Creative Commons Attribution 4.0 International License.
DOI
10.1109/ICDMW69685.2025.00204
Persistent Identifier
https://archives.pdx.edu/ds/psu/44564
Publisher
IEEE
Citation Details
Published as: Tao, Y., Hiatt, A., Seetharaman, R., & Agrawal, A. (2025). “Lost-In-The-Later”: Framework for Quantifying Contextual Grounding in Large Language Models. 2025 IEEE International Conference on Data Mining Workshops (ICDMW), 1703–1712. https://doi.org/10.1109/icdmw69685.2025.00204

Description
This is the author’s version of a work that was accepted for publication. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published as: (2025). “Lost-In-The-Later”: Framework for Quantifying Contextual Grounding in Large Language Models. 2025 IEEE International Conference on Data Mining Workshops (ICDMW), 1703–1712. https://doi.org/10.1109/icdmw69685.2025.00204