Workshop "Data Structures and Cleaning Textual Data"
Yann Audin (PhD candidate in digital humanities at the Université de Montréal and project leader at the Canada Research Chair in Digital Textualities) will lead a series of three workshops on natural language processing.
The third workshop, "Data Structures and Cleaning Textual Data", is aimed at people with some knowledge of Python who want to learn how to clean textual data and work with the JSON, CSV and XML data formats. Participants will analyze a literary text of their choice using the Python libraries spaCy and NLTK. They will also learn how to transform a text into textual data according to their research interests.
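To give a sense of what cleaning a text and storing it as structured data can look like, here is a minimal sketch. It is not taken from the workshop material: the sample sentence, the function names `clean_text` and `word_counts`, and the choice of cleaning steps are all assumptions made for illustration. It uses only the Python standard library, so neither spaCy nor NLTK is needed to run it, and it leaves XML aside.

```python
import csv
import io
import json
import re
from collections import Counter


def clean_text(raw: str) -> str:
    """Lowercase the text, replace punctuation with spaces, collapse whitespace."""
    lowered = raw.lower()
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    return re.sub(r"\s+", " ", no_punct).strip()


def word_counts(cleaned: str) -> list[dict]:
    """Turn a cleaned text into rows of {"word": ..., "count": ...}."""
    counts = Counter(cleaned.split())
    return [{"word": word, "count": n} for word, n in counts.most_common()]


# Hypothetical sample text, chosen only for illustration.
sample = "The cat sat.  The CAT ran!"

cleaned = clean_text(sample)
rows = word_counts(cleaned)

# Serialize the same rows as JSON...
as_json = json.dumps(rows, ensure_ascii=False, indent=2)

# ...and as CSV (written to an in-memory buffer instead of a file).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["word", "count"])
writer.writeheader()
writer.writerows(rows)
as_csv = buffer.getvalue()

print(cleaned)
print(as_json)
print(as_csv)
```

Which cleaning steps are appropriate depends on the research question: lowercasing and removing punctuation suit simple word counts, but would discard information needed for, say, a study of capitalization or sentence structure.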
Python is used in natural language processing, programming education, artificial intelligence, scientific programming, web development and many other fields. As a high-level language, it is particularly readable, which contributes to its popularity. Python is distributed under a very permissive license and is supported by a large, active community of practice that develops libraries for almost any situation.
This workshop will take place on November 11, 2024, from 10:30 a.m. to noon at CRIHN, room C-8132, 3150 rue Jean Brillant, Université de Montréal.
Downloading a recent version of Anaconda is recommended but not required.