All Translation Tools Are Not Equal: Investigating the Quality of Language Translation for Forced Migration
Sponsor
We would like to acknowledge the McCourt Institute, the Massive Data Institute, and the Institute for the Study of International Migration at Georgetown University for funding this study. This research was partially supported by NSF award 2246174.
Published In
2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA)
Document Type
Citation
Publication Date
2023
Abstract
As the volume and complexity of forced movement continue to grow, there is an urgent need to use new data sources to better understand emerging crises. Organic sources, like social media and newspapers, can offer insights in near real time when administrative data are unavailable for timely and detailed analysis. However, in order to switch flexibly between contexts, we need the ability to contextualize the drivers of movement for different locations and languages. Recent advances in natural language processing, and specifically neural machine translation, have shown impressive results on standard benchmark datasets for well-studied language pairs. However, the effectiveness of these models in real-world scenarios remains less well understood. To advance our understanding of real-world, contextual translation, we systematically study the performance of multiple widely used off-the-shelf machine translation tools using words associated with drivers of forced movement in both high- and low-resource languages. Our empirical results suggest significant variation in the performance of these machine translation tools in terms of accuracy and efficiency, highlighting a problem that must be faced by those conducting migration research in multilingual contexts. We conclude by suggesting strategies for obtaining reasonable translations from off-the-shelf language tools.
Rights
© IEEE
Locate the Document
DOI
10.1109/DSAA60987.2023.10302481
Persistent Identifier
https://archives.pdx.edu/ds/psu/41274
Publisher
IEEE
Citation Details
Agrawal, A., Singh, L., Jacobs, E., Liu, Y., Dunlevy, G., Pokharel, R., & Uppala, V. (2023, October 9). All Translation Tools Are Not Equal: Investigating the Quality of Language Translation for Forced Migration. 2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA). https://doi.org/10.1109/dsaa60987.2023.10302481