Published In
2025 IEEE International Conference on Data Mining Workshops (ICDMW)
Document Type
Pre-Print
Publication Date
2025
Abstract
Large language models (LLMs) can leverage both contextual and parametric knowledge, but how they prioritize and integrate these sources remains underexplored. We introduce CoPE, a novel framework for systematically quantifying contextual grounding in LLMs. CoPE distinguishes between contextual knowledge (CK) and parametric knowledge (PK), enabling fine-grained attribution across languages and tasks. Using our newly created MultiWikiAtomic dataset in English, Spanish, and Danish, we analyze how LLMs integrate context, prioritize information, and incorporate PK in open-ended question answering. We find that across models and languages, only around 50 to 76 percent of outputs are grounded in the given context, even in knowledge-consistent settings. Grounding drops further in counterfactual scenarios, though not drastically. Our analysis uncovers a phenomenon we call lost-in-the-later, in which LLMs tend to overlook information that appears later in a given context, revealing a strong positional bias that affects contextual grounding. We further find that reasoning models, as well as non-reasoning models prompted with Chain-of-Thought (CoT), use context even less than non-reasoning models without CoT and fail to mitigate the lost-in-the-later effect. CoT prompting, in particular, results in lower context recall and shorter responses, leading to degraded contextual grounding. Based on these insights, we design training-free prompt-based methods that effectively leverage input context, consistently improving contextual grounding by an average of 12 percent across all three languages.
Rights
Copyright (c) 2026 The Authors
This work is licensed under a Creative Commons Attribution 4.0 International License.
DOI
10.1109/ICDMW69685.2025.00204
Persistent Identifier
https://archives.pdx.edu/ds/psu/44564
Publisher
IEEE
Citation Details
Published as: Tao, Y., Hiatt, A., Seetharaman, R., & Agrawal, A. (2025). “Lost-In-The-Later”: Framework for Quantifying Contextual Grounding in Large Language Models. 2025 IEEE International Conference on Data Mining Workshops (ICDMW), 1703–1712. https://doi.org/10.1109/icdmw69685.2025.00204

Description
This is the author’s version of a work that was accepted for publication. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published as: (2025). “Lost-In-The-Later”: Framework for Quantifying Contextual Grounding in Large Language Models. 2025 IEEE International Conference on Data Mining Workshops (ICDMW), 1703–1712. https://doi.org/10.1109/icdmw69685.2025.00204