Predicting Human Psychometric Properties Using Computational Language Models

Journal article

A. Laverghetta, Animesh Nighojkar, Jamshidbek Mirzakhalov, John Licato
arXiv, 2022

Cite

APA
Laverghetta, A., Nighojkar, A., Mirzakhalov, J., & Licato, J. (2022). Predicting Human Psychometric Properties Using Computational Language Models. ArXiv.

Chicago/Turabian
Laverghetta, A., Animesh Nighojkar, Jamshidbek Mirzakhalov, and John Licato. “Predicting Human Psychometric Properties Using Computational Language Models.” arXiv (2022).

MLA
Laverghetta, A., et al. “Predicting Human Psychometric Properties Using Computational Language Models.” ArXiv, 2022.

BibTeX

@article{a2022a,
  title = {Predicting Human Psychometric Properties Using Computational Language Models},
  year = {2022},
  journal = {arXiv},
  author = {Laverghetta, A. and Nighojkar, Animesh and Mirzakhalov, Jamshidbek and Licato, John}
}

Abstract

Transformer-based language models (LMs) continue to achieve state-of-the-art performance on natural language processing (NLP) benchmarks, including tasks designed to mimic human-inspired “commonsense” competencies. To better understand the degree to which LMs can be said to have certain linguistic reasoning skills, researchers are beginning to adapt the tools and concepts of psychometrics. But to what extent can benefits flow in the other direction? In other words, can LMs be of use in predicting the psychometric properties of test items when those items are given to human participants? If so, the benefit for psychometric practitioners is enormous, as it can reduce the need for multiple rounds of empirical testing. We gather responses from numerous human participants and LMs (transformer- and non-transformer-based) on a broad diagnostic test of linguistic competencies. We then calculate standard psychometric properties of the items in the diagnostic test, using the human responses and the LM responses separately, and determine how well the two sets of estimates correlate. We find that transformer-based LMs predict the human psychometric data consistently well across most categories, suggesting that they can be used to gather human-like psychometric data without the need for extensive human trials.
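
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say which psychometric properties or which correlation measure were used, so the sketch assumes classical item difficulty (the proportion of respondents answering an item correctly) and a Pearson correlation. The function names and the toy response matrices are hypothetical.

```python
from statistics import mean


def item_difficulty(responses):
    """Proportion of respondents answering each item correctly.

    `responses` holds one row per respondent (a human participant or an LM),
    each row a list of 0/1 scores with one entry per test item.
    Assumption: classical proportion-correct difficulty stands in for the
    "standard psychometric properties" mentioned in the abstract.
    """
    n_items = len(responses[0])
    return [mean(row[i] for row in responses) for i in range(n_items)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Toy data (made up for illustration): 4 respondents x 4 items per population.
human = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
]
lm = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 1],
]

# Estimate the item property separately for each population ...
human_difficulty = item_difficulty(human)
lm_difficulty = item_difficulty(lm)

# ... then check how well the two sets of estimates agree.
r = pearson(human_difficulty, lm_difficulty)
print(f"correlation between human and LM item difficulty: {r:.3f}")
```

A correlation near 1 on real data would mean the LM population ranks items by difficulty much as the human participants do, which is the kind of agreement the abstract reports for transformer-based LMs.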