Workshop: Using Natural Language Processing (NLP) to Compare Popular Authors
In this workshop, participants will learn common Natural Language Processing (NLP) techniques using a dataset of popular works from Project Gutenberg. We will review basic NLP methods in Python, including data cleaning (removing stop and filler words, lemmatization, tokenization), extracting lexical features (word frequency, most common n-grams, etc.), topic modeling, and sentiment analysis. Participants can choose which author to collect these features for, and by the end of the workshop we will compare the writing styles of various authors in the dataset. This is an introductory-level workshop. No prior coding experience is required: come ready to experiment and have fun!
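For a rough idea of what these steps look like, here is a minimal sketch of tokenization, stop-word removal, word frequency, and n-gram counting. It uses only the Python standard library; the tiny stop-word list, the helper names, and the sample sentence (the opening of Dickens's A Tale of Two Cities) are assumptions chosen for illustration, not the workshop's actual materials.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"it", "was", "the", "of", "a", "an", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sample = "It was the best of times, it was the worst of times."

tokens = tokenize(sample)
content_words = remove_stop_words(tokens)

word_freq = Counter(content_words)        # word frequency after cleaning
bigram_freq = Counter(ngrams(tokens, 2))  # bigrams counted on the raw tokens

print(word_freq.most_common(3))
print(bigram_freq.most_common(2))
```

Running the same functions on the full text of two different authors and comparing the resulting counts is one simple way to start contrasting writing styles.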
Taught by CSC Computing Fellows Anushka Kulkarni (CS, BC ’23) and Dipashreya Sur (CS with a minor in Arch, BC ’23).
This workshop is planned to take place in person (516 Milstein) and online. A link to join via Zoom will be sent to registrants shortly before the event.
We look forward to seeing you there!