Document Type

Conference Proceeding

Publication Date

7-2023

Department

College of Computing

Abstract

The Ojibwe language has several dialects that vary to some degree in both spoken and written form. We present a method that uses support vector machines to classify two dialects (Eastern and Southwestern Ojibwe) from a very small corpus of text. Sentence-level classification accuracy is 90% under five-fold cross-validation, and 72% when the sentence-trained model is applied to a data set of individual words. Our code and the word-level data set are released openly at https://github.com/evanperson/OjibweDialect.
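
The evaluation protocol described in the abstract (train a support vector machine on sentences, score it with five-fold cross-validation, then apply a sentence-trained model to single words) can be sketched as follows. This is a minimal illustration and not the authors' released code: the character-bigram features, the Pegasos-style linear SVM trainer, every function name, and the placeholder strings are assumptions made for the sketch. The placeholder strings are synthetic and are not Ojibwe text.

```python
import random
from collections import Counter


def char_ngrams(text, n=2):
    """Count character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def score(weights, features):
    """Dot product between a sparse weight dict and a feature Counter."""
    return sum(weights.get(f, 0.0) * v for f, v in features.items())


def train_linear_svm(samples, labels, epochs=20, lam=0.01, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    weights = {}
    order = list(range(len(samples)))
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)           # decaying learning rate
            margin = labels[i] * score(weights, samples[i])
            shrink = 1.0 - 1.0 / step          # same as 1 - eta * lam
            for f in weights:                  # L2 regularization step
                weights[f] *= shrink
            if margin < 1.0:                   # hinge loss is active
                for f, v in samples[i].items():
                    weights[f] = weights.get(f, 0.0) + eta * labels[i] * v
    return weights


def predict(weights, text):
    """Return +1 or -1 for a raw string (ties go to +1)."""
    return 1 if score(weights, char_ngrams(text)) >= 0 else -1


def cross_validate(texts, labels, k=5, seed=0):
    """Mean held-out accuracy over k folds."""
    idx = list(range(len(texts)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for held_out in folds:
        train = [i for i in idx if i not in held_out]
        w = train_linear_svm([char_ngrams(texts[i]) for i in train],
                             [labels[i] for i in train])
        correct = sum(predict(w, texts[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


# Synthetic placeholder "sentences" (NOT Ojibwe): class +1 uses a/b,
# class -1 is the same set with a->x and b->y substituted.
TOY_A = ["abab baba", "ab ba ab", "baab abba", "abba ab", "ab abab",
         "bab aba ab", "aab abb", "ab bb aa", "abab ab ba", "ba ab ba"]
TOY_B = [s.translate(str.maketrans("ab", "xy")) for s in TOY_A]
TEXTS = TOY_A + TOY_B
LABELS = [1] * len(TOY_A) + [-1] * len(TOY_B)

# Sentence-level five-fold cross-validation.
cv_accuracy = cross_validate(TEXTS, LABELS)

# Sentence-trained model applied to individual "words".
sentence_model = train_linear_svm([char_ngrams(t) for t in TEXTS], LABELS)
word_predictions = [predict(sentence_model, w) for w in ["abab", "xyxy"]]
```

The two toy classes share no characters, so they separate trivially; real dialect data overlap heavily, which is consistent with the lower word-level accuracy reported above. Keeping the feature extractor identical for sentences and words is what allows a sentence-trained model to be reused on single words without retraining.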

Publisher's Statement

© 2023. Publisher’s version of record: https://aclanthology.org/2023.americasnlp-1.8.pdf

Publication Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

ISBN

9781959429913

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Version

Publisher's PDF
