Michigan Tech Publications, Part 2

Cluster Validity for Fuzzy Text Segmentation

Evan Lucas, Michigan Technological University
Timothy C. Havens, Michigan Technological University

Document Type

Conference Proceeding

Publication Date

11-9-2023

Department

College of Computing; Department of Computer Science

Abstract

Topical text segmentation is the unsupervised learning task of separating documents, transcripts, and other text streams into segments (i.e., clusters), where the text in each segment is considered topically similar and distinct from the text in other segments. In this paper, we consider the task of fuzzy text segmentation, where words or utterances have shared membership in all segments. This is especially relevant for text sources like transcripts, where multiple topics are often discussed simultaneously, e.g., cost and deliverables in a sales meeting. One challenge in segmentation and clustering is how to choose the hyperparameters of the algorithm, e.g., the number of clusters. Hence, we propose a fuzzy cluster validity metric, a modified Davies-Bouldin index, and demonstrate how this index can be used to tune a fuzzy text segmentation algorithm. We also demonstrate how fuzzy clustering can be used as a form of text segmentation and show some applications on benchmark data.
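
The abstract does not spell out the modified index, so the following is only a rough sketch of the general idea and not the authors' formulation. It assumes one plausible variant: the classical Davies-Bouldin index with centroids and scatters weighted by fuzzy memberships raised to a fuzzifier `m`. The function name `fuzzy_davies_bouldin`, the parameter `m`, and the use of Euclidean distance are all illustrative assumptions.

```python
import math


def fuzzy_davies_bouldin(points, memberships, m=2.0):
    """Membership-weighted Davies-Bouldin index (illustrative variant only).

    points:      list of n feature vectors (e.g., utterance embeddings).
    memberships: list of c rows, each holding n membership degrees.
    m:           fuzzifier applied to the memberships (an assumption here).
    Needs at least two clusters with distinct centroids. Lower is better.
    """
    dim = len(points[0])
    weights = [[u ** m for u in row] for row in memberships]

    centroids, scatters = [], []
    for w in weights:
        total = sum(w)
        # Weighted centroid of the cluster.
        centroid = [
            sum(wk * p[d] for wk, p in zip(w, points)) / total
            for d in range(dim)
        ]
        # Weighted mean distance of the points to that centroid.
        scatter = sum(
            wk * math.dist(p, centroid) for wk, p in zip(w, points)
        ) / total
        centroids.append(centroid)
        scatters.append(scatter)

    n_clusters = len(centroids)
    worst_ratios = []
    for i in range(n_clusters):
        # Worst (largest) scatter-to-separation ratio against any other cluster.
        ratios = [
            (scatters[i] + scatters[j]) / math.dist(centroids[i], centroids[j])
            for j in range(n_clusters)
            if j != i
        ]
        worst_ratios.append(max(ratios))
    return sum(worst_ratios) / n_clusters


# Toy data: two well-separated groups of two points each.
X = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
U_good = [[0.9, 0.9, 0.1, 0.1], [0.1, 0.1, 0.9, 0.9]]  # follows the groups
U_bad = [[0.9, 0.1, 0.9, 0.1], [0.1, 0.9, 0.1, 0.9]]   # cuts across them
print(fuzzy_davies_bouldin(X, U_good) < fuzzy_davies_bouldin(X, U_bad))
```

With crisp 0/1 memberships this sketch reduces to the classical Davies-Bouldin index. To tune a hyperparameter such as the number of clusters, one would run the segmentation for each candidate value and keep the one with the lowest index.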

Publication Title

IEEE International Conference on Fuzzy Systems

ISBN

9798350332285

Recommended Citation

Lucas, E., & Havens, T. C. (2023). Cluster Validity for Fuzzy Text Segmentation. IEEE International Conference on Fuzzy Systems. http://doi.org/10.1109/FUZZ52849.2023.10309734
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/358
