Published In

FIRE 2025: Proceedings of the 17th Annual Meeting of the Forum for Information Retrieval Evaluation

Document Type

Conference Proceeding

Publication Date

1-12-2026

Subjects

Code-mixed Language Identification, Dravidian languages

Abstract

Code-mixing is a linguistic phenomenon in which several languages are combined within a single text. It has become very common in multilingual societies, especially in digital communication. Word-Level Identification of Languages in Dravidian Languages (WILD), a Code-mixed Language Identification (CoLI) shared task for Dravidian languages organized as part of the Forum for Information Retrieval Evaluation (FIRE) 2025, asked researchers to develop models capable of classifying words in code-mixed texts in which Dravidian languages (Tamil, Telugu, Malayalam, Kannada, and Tulu) are interwoven with English. The task poses significant challenges due to the complexity of linguistic structures, mixed-language tokens, and dialectal variations in low-resource languages such as those of the Dravidian family. The participating teams used a range of methodologies, from traditional Machine Learning (ML) to Deep Learning and transformer-based models, to address these challenges. This paper presents the key findings of the task and an overview of the submitted methodologies.

Rights

Copyright (c) 2025 The Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

DOI

10.1145/3777867.3778258

Persistent Identifier

https://archives.pdx.edu/ds/psu/44470
