Published In

2025 IEEE International Conference on Data Mining Workshops (ICDMW)

Document Type

Pre-Print

Publication Date

2025

Abstract

Large language models (LLMs) are capable of leveraging both contextual and parametric knowledge, but how they prioritize and integrate these sources remains underexplored. We introduce CoPE, a novel framework for systematically quantifying contextual grounding in LLMs. CoPE distinguishes between contextual knowledge (CK) and parametric knowledge (PK), enabling fine-grained attribution across languages and tasks. Using our newly created MultiWikiAtomic dataset in English, Spanish, and Danish, we analyze how LLMs integrate context, prioritize information, and incorporate PK in open-ended question answering. We find that across models and languages, only around 50 to 76 percent of outputs are grounded in the given context, even in knowledge-consistent settings. Grounding drops further in counterfactual scenarios, but not drastically. Our analysis uncovers a phenomenon we call lost-in-the-later, where LLMs tend to overlook information that appears later in a given context, revealing a strong positional bias that affects contextual grounding. We further find that reasoning models, as well as non-reasoning models prompted with Chain-of-Thought (CoT), use context even less than non-reasoning models without CoT and fail to mitigate the lost-in-the-later effect. CoT prompting, in particular, results in lower context recall and shorter responses, leading to degraded contextual grounding. Based on these insights, we design training-free prompt-based methods to effectively leverage input context, consistently improving contextual grounding by 12% on average across all three languages.

Rights

Copyright (c) 2026 The Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Description

This is the author’s version of a work that was accepted for publication. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published as: (2025). “Lost-In-The-Later”: Framework for Quantifying Contextual Grounding in Large Language Models. 2025 IEEE International Conference on Data Mining Workshops (ICDMW), 1703–1712. https://doi.org/10.1109/icdmw69685.2025.00204

DOI

10.1109/ICDMW69685.2025.00204

Persistent Identifier

https://archives.pdx.edu/ds/psu/44564

Publisher

IEEE
