Document Type
Conference Proceeding
Publication Date
7-2023
Department
College of Computing
Abstract
The Ojibwe language has several dialects that vary to some degree in both spoken and written form. We present a method that uses support vector machines to classify two dialects (Eastern and Southwestern Ojibwe) from a very small corpus of text. Classification accuracy at the sentence level is 90% under five-fold cross-validation, and 72% when the sentence-trained model is applied to a data set of individual words. Our code and the word-level data set are released openly at https://github.com/evanperson/OjibweDialect.
Publication Title
Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISBN
9781959429913
Recommended Citation
Hartwig, K., Lucas, E., & Havens, T. C. (2023). Identification of Dialect for Eastern and Southwestern Ojibwe Words Using a Small Corpus. Proceedings of the Annual Meeting of the Association for Computational Linguistics, 58-66.
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/288
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Version
Publisher's PDF
Publisher's Statement
© 2023. Publisher’s version of record: https://aclanthology.org/2023.americasnlp-1.8.pdf