Assessing Clustering Methods to Establish Reliability and Consensus in Card Sorting Tasks

Document Type

Conference Proceeding

Publication Date



Department of Cognitive and Learning Sciences


Human factors researchers often collect qualitative data consisting of statements about a system or tool. Establishing consistent patterns in such data is important for drawing conclusions from it. When no theoretically motivated coding scheme has been established, one can use card sorting techniques in which independent raters generate a similarity space, from which a bottom-up taxonomy is created. In this paper, we explore how clustering and scaling techniques can be used to derive a common taxonomy from multiple card sorting results and to judge how consistent the groupings are. We examine this process on two datasets: one containing qualitative data from an interview study with physicians, and another containing data on a website design collected for usability purposes. Results showed that the different clustering methods produced very similar high-level results, and these showed high within-group similarity across groups, suggesting inter-rater reliability. Finally, we discuss the benefits of different algorithms for clustering and scaling and propose measures to assess strong consensus and inter-group sorting reliability.
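The abstract does not name the specific algorithms or measures used, so the following is only a minimal sketch of the general idea, under stated assumptions: similarity between two cards is taken as the fraction of raters who placed them in the same pile, the consensus grouping comes from average-linkage agglomerative clustering, and agreement between two raters is measured with the Rand index. The card names, the three example sorts, and all function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def cooccurrence(sorts, cards):
    """Fraction of raters who put each pair of cards in the same pile."""
    counts = {pair: 0 for pair in combinations(cards, 2)}
    for piles in sorts:
        for pile in piles:
            # Order cards as in `cards` so the pair matches a key in `counts`.
            for pair in combinations(sorted(pile, key=cards.index), 2):
                counts[pair] += 1
    return {pair: n / len(sorts) for pair, n in counts.items()}

def average_linkage(sim, cards, k):
    """Merge the most similar clusters (average linkage) until k remain."""
    def s(a, b):
        return sim[(a, b)] if (a, b) in sim else sim[(b, a)]
    def link(x, y):
        return sum(s(a, b) for a in x for b in y) / (len(x) * len(y))
    clusters = [[c] for c in cards]
    while len(clusters) > k:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters

def rand_index(piles_a, piles_b, cards):
    """Share of card pairs on which two sorts agree (together vs. apart)."""
    def labels(piles):
        return {c: n for n, pile in enumerate(piles) for c in pile}
    la, lb = labels(piles_a), labels(piles_b)
    pairs = list(combinations(cards, 2))
    agree = sum((la[a] == la[b]) == (lb[a] == lb[b]) for a, b in pairs)
    return agree / len(pairs)

# Hypothetical data: six cards sorted by three raters.
cards = ["login", "password", "search", "filter", "invoice", "refund"]
sorts = [
    [["login", "password"], ["search", "filter"], ["invoice", "refund"]],
    [["login", "password"], ["search", "filter", "invoice"], ["refund"]],
    [["login", "password", "search"], ["filter"], ["invoice", "refund"]],
]

sim = cooccurrence(sorts, cards)
consensus = average_linkage(sim, cards, k=3)
pairwise_agreement = [rand_index(a, b, cards) for a, b in combinations(sorts, 2)]
```

Here `consensus` holds the derived common grouping and `pairwise_agreement` gives one agreement score per pair of raters. Other linkage rules, scaling methods, or chance-corrected agreement measures could be substituted without changing the overall structure.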

Publication Title

Lecture Notes in Networks and Systems