Published In

IEEE Transactions on Affective Computing

Document Type

Pre-Print

Publication Date

2023

Subjects

Word Representation, Language Representation, Pretrained Language Models, Affective Tasks, Text Preprocessing, Word Embeddings, Emotion Classification, Sentiment Analysis, Sarcasm Detection

Abstract

Affective tasks, including sentiment analysis, emotion classification, and sarcasm detection, have drawn a lot of attention in recent years due to a broad range of useful applications in various domains. The main goal of affect detection tasks is to recognize states such as mood, sentiment, and emotions from textual data (e.g., news articles or product reviews). Despite the importance of utilizing preprocessing steps in different stages (i.e., word representation learning and building a classification model) of affect detection tasks, this topic has not been studied well. To that end, we explore whether applying various preprocessing methods (stemming, lemmatization, stopword removal, punctuation removal, and so on) and their combinations in different stages of the affect detection pipeline can improve the model performance. There are many preprocessing approaches that can be utilized in affect detection tasks. However, their influence on the final performance depends on the type of preprocessing and the stages at which it is applied. Moreover, the impact of preprocessing varies across different affective tasks. Our analysis provides thorough insights into how preprocessing steps can be applied in building an affect detection pipeline and their respective influence on performance.

Rights

© Copyright the author(s) (2023)

Description

This article has been accepted for publication in IEEE Transactions on Affective Computing. This is the author's version which has not been fully edited and content may change prior to final publication.

DOI

10.1109/TAFFC.2023.3270115

Persistent Identifier

https://archives.pdx.edu/ds/psu/39824

Share

COinS