Document Type

Conference Proceeding

Publication Date

7-2023

Department

College of Computing

Abstract

The Ojibwe language has several dialects that vary to some degree in both spoken and written form. We present a method that uses support vector machines to classify two dialects (Eastern and Southwestern Ojibwe) from a very small corpus of text. Classification accuracy at the sentence level is 90% across five-fold cross-validation, and 72% when the sentence-trained model is applied to a data set of individual words. Our code and the word-level data set are released openly at https://github.com/evanperson/OjibweDialect.
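
The evaluation protocol named in the abstract (a linear classifier scored with five-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the features, kernel, or library, so the character trigram features, the Pegasos-style hinge-loss trainer, and all function names here are hypothetical. The toy strings are synthetic placeholders, not Ojibwe text; the released repository remains the authoritative code.

```python
import random
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams of a space-padded, lowercased string (assumed features)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def score(weights, features):
    """Dot product between a sparse weight dict and a sparse feature Counter."""
    return sum(weights.get(f, 0.0) * v for f, v in features.items())


def train_linear_svm(samples, labels, epochs=20, lam=0.01, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss; labels are +1 or -1."""
    rng = random.Random(seed)
    weights, step = {}, 0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)              # decaying learning rate
            x, y = samples[i], labels[i]
            violated = y * score(weights, x) < 1  # margin check with current weights
            shrink = 1.0 - 1.0 / step             # equals 1 - eta * lam (L2 regularization)
            for f in weights:
                weights[f] *= shrink
            if violated:
                for f, v in x.items():
                    weights[f] = weights.get(f, 0.0) + eta * y * v
    return weights


def k_fold_accuracy(texts, labels, k=5, seed=0):
    """Return the held-out accuracy of each of k folds."""
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)
    features = [char_ngrams(t) for t in texts]
    accuracies = []
    for fold in range(k):
        test_idx = indices[fold::k]
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        weights = train_linear_svm([features[i] for i in train_idx],
                                   [labels[i] for i in train_idx])
        correct = sum((1 if score(weights, features[i]) >= 0 else -1) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Synthetic placeholder "sentences" for two classes; NOT real Ojibwe data.
TOY_TEXTS = ["aba ab", "ba aba", "aba bb", "ab aba a", "aba",
             "xyx xy", "yx xyx", "xyx yy", "xy xyx x", "xyx"]
TOY_LABELS = [1] * 5 + [-1] * 5

if __name__ == "__main__":
    fold_scores = k_fold_accuracy(TOY_TEXTS, TOY_LABELS)
    print("mean accuracy:", sum(fold_scores) / len(fold_scores))
```

Character n-grams are used here only because they are a common choice for separating closely related written varieties when the corpus is small; any other feature set could be substituted in `char_ngrams` without changing the cross-validation loop.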

Publisher's Statement

© 2023. Publisher’s version of record: https://aclanthology.org/2023.americasnlp-1.8.pdf

Publication Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

ISBN

9781959429913

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Version

Publisher's PDF
