Challenges and considerations with code-mixed NLP for multilingual societies

Srivastava, Vivek Singh, Mayank 2012-09-26T07:22:34Z 2012-09-26T07:22:34Z 2021-06
dc.identifier.citation Srivastava, Vivek and Singh, Mayank, "Challenges and considerations with code-mixed NLP for multilingual societies", arXiv, Cornell University Library, DOI: arXiv:2106.07823, Jun. 2021. en_US
dc.description.abstract Multilingualism refers to a high degree of proficiency in two or more languages in the written and oral communication modes. It often results in language mixing, a.k.a. code-mixing, when a multilingual speaker switches between multiple languages in a single utterance of text or speech. This paper discusses the current state of NLP research, its limitations, and foreseeable pitfalls in addressing five real-world applications for social good in multilingual societies: crisis management, healthcare, political campaigning, fake news, and hate speech. We also propose futuristic datasets, models, and tools that can significantly advance the current research in multilingual NLP applications for societal good. As a representative example, we consider English-Hindi code-mixing but draw similar inferences for other language pairs.
dc.description.statementofresponsibility by Vivek Srivastava and Mayank Singh
dc.language.iso en_US en_US
dc.publisher Cornell University Library en_US
dc.title Challenges and considerations with code-mixed NLP for multilingual societies en_US
dc.type Pre-Print en_US
dc.relation.journal arXiv

