Michigan Tech Publications, Part 2

“Minimum Necessary Rigor” in empirically evaluating human–AI work systems

Gary Klein, LLC
Robert R. Hoffman, Florida Institute for Human & Machine Cognition
William J. Clancey, Florida Institute for Human & Machine Cognition
Shane Mueller, Michigan Technological University
Florian Jentsch, University of Central Florida
Mohammadreza Jalaeian, The Ohio State University

Document Type

Article

Publication Date

Fall 2023

Department

Department of Cognitive and Learning Sciences

Abstract

The development of AI systems represents a significant investment of funds and time. Assessment is necessary in order to determine whether that investment has paid off. Empirical evaluation of systems in which humans and AI systems act interdependently to accomplish tasks must provide convincing empirical evidence that the work system is learnable and that the technology is usable and useful. We argue that the assessment of human–AI (HAI) systems must be effective but must also be efficient. Bench testing of a prototype of an HAI system cannot require extensive series of large-scale experiments with complex designs. Some of the constraints that are imposed in traditional laboratory research just are not appropriate for the empirical evaluation of HAI systems. We present requirements for avoiding “unnecessary rigor.” They cover study design, research methods, statistical analyses, and online experimentation. These should be applicable to all research intended to evaluate the effectiveness of HAI systems.

Publisher's Statement

© 2023 The Authors. AI Magazine published by Wiley Periodicals LLC on behalf of the Association for the Advancement of Artificial Intelligence. Publisher’s version of record: https://doi.org/10.1002/aaai.12108

Publication Title

AI Magazine

Recommended Citation

Klein, G., Hoffman, R., Clancey, W., Mueller, S., Jentsch, F., & Jalaeian, M. (2023). “Minimum Necessary Rigor” in empirically evaluating human–AI work systems. AI Magazine, 44(3), 274-281. http://doi.org/10.1002/aaai.12108
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/467

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Version

Publisher's PDF

Included in

Cognitive Science Commons