GU4315: Multilingual Technologies and Language Development | S. Muresan, I. Zaugg

Comparative Literature and Society
Undergraduate and Graduate Seminar
F 10:10AM-12PM

Innovations in digital technologies have shown their potential to be at times breathtakingly beneficial, and at others divisive or troubling. With regard to digital technologies’ impact on the ecosystem of language diversity, evidence suggests that new technologies are one contributor to the decline and predicted extinction of 50-90% of the world's languages this century. Yet digital innovations supporting a growing number of languages also have the potential to bolster language diversity in ways unimaginable a few years ago. Will innovations in multilingual natural language processing bring about a renaissance of language diversity, as users no longer need to rely on English and other dominant languages? To address this question, this course will introduce a dual view on language diversity:

  1. A typology of language vitality and endangerment
  2. A resource-centric typology (low-resource vs. high-resource) regarding the availability of data resources to develop computational models for language analysis

This course will address the challenge of scaling natural language processing technologies developed mostly for English to the rich diversity of human languages.

The resource-centric typology will also contribute to the dialogue about what “Data Science” is. Much research has been dedicated to the “Big Data” scenario; however, “Small Data” poses equally challenging problems, which this course will highlight. This course brings data and computational literacy about multilingual technologies to humanities students, while also exposing computer science and data science students to ethical, cultural, business, and policy issues within the context of multilingual technologies.
