First Advisor

Dr. Anthony Rhodes

Date of Award

Summer 9-9-2020

Document Type

Thesis

Degree Name

Bachelor of Science (B.S.) in Computer Science and University Honors

Department

Computer Science

Language

English

Subjects

NLP, Machine Learning, CBOW, Skip Gram

Abstract

CBOW and Skip-Gram are two NLP techniques for producing word embedding models that are both accurate and performant. They were introduced in the seminal paper by T. Mikolov et al. and have since been refined with optimizations such as negative sampling and subsampling. This paper implements a fully optimized version of these models in PyTorch and evaluates them on a toy sentiment/subject analysis task. It is weakly observed that the corpus type skews the resulting word embeddings: fictional corpora appear better suited for sentiment analysis, and non-fictional corpora for subject analysis.
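
For illustration, below is a minimal PyTorch sketch of Skip-Gram with negative sampling, the general technique named in the abstract. This is not the thesis's actual code; the class, parameter names, and toy data are assumptions made for the example.

```python
import torch
import torch.nn as nn

class SkipGramNS(nn.Module):
    """Skip-Gram with negative sampling (illustrative sketch, not the thesis code)."""
    def __init__(self, vocab_size, embed_dim):
        super().__init__()
        self.in_embed = nn.Embedding(vocab_size, embed_dim)   # center-word vectors
        self.out_embed = nn.Embedding(vocab_size, embed_dim)  # context-word vectors

    def forward(self, center, context, negatives):
        # center: (B,), context: (B,), negatives: (B, K) word indices
        v = self.in_embed(center)            # (B, D)
        u_pos = self.out_embed(context)      # (B, D)
        u_neg = self.out_embed(negatives)    # (B, K, D)

        # Positive pair: maximize log sigmoid(u_pos . v)
        pos_loss = nn.functional.logsigmoid((v * u_pos).sum(dim=1))

        # Negative samples: maximize log sigmoid(-u_neg . v)
        neg_score = torch.bmm(u_neg, v.unsqueeze(2)).squeeze(2)   # (B, K)
        neg_loss = nn.functional.logsigmoid(-neg_score).sum(dim=1)

        return -(pos_loss + neg_loss).mean()

# Toy usage with random indices standing in for real (center, context) pairs.
model = SkipGramNS(vocab_size=10000, embed_dim=100)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
center = torch.randint(0, 10000, (32,))
context = torch.randint(0, 10000, (32,))
negatives = torch.randint(0, 10000, (32, 5))   # 5 negative samples per pair
loss = model(center, context, negatives)
loss.backward()
opt.step()
```

In practice, negative samples are drawn from a unigram distribution raised to the 3/4 power, and frequent words are discarded via subsampling before training; those steps are omitted here for brevity.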
