TY - JOUR
T1 - Automated Assessment of Creativity in Multilingual Narratives
AU - Luchini, Simone A.
AU - Moosa, Ibraheem Muhammad
AU - Patterson, John D.
AU - Johnson, Dan
AU - Baas, Matthijs
AU - Barbot, Baptiste
AU - Bashmakova, Iana
AU - Benedek, Mathias
AU - Chen, Qunlin
AU - Corazza, Giovanni E.
AU - Forthmann, Boris
AU - Goecke, Benjamin
AU - Said-Metwaly, Sameh
AU - Karwowski, Maciej
AU - Kenett, Yoed N.
AU - Lebuda, Izabela
AU - Lubart, Todd
AU - Miroshnik, Kirill G.
AU - Obialo, Felix Kingsley
AU - Ovando-Tellez, Marcela
AU - Primi, Ricardo
AU - Puente-Díaz, Rogelio
AU - Stevenson, Claire
AU - Volle, Emmanuelle
AU - Zielińska, Aleksandra
AU - van Hell, Janet G.
AU - Yin, Wenpeng
AU - Beaty, Roger E.
N1 - Publisher Copyright: © 2025 American Psychological Association
PY - 2025
Y1 - 2025
AB - Researchers and educators interested in creative writing need a reliable and efficient tool to score the creativity of narratives, such as short stories. Typically, human raters manually assess narrative creativity, but such subjective scoring is limited by labor costs and rater disagreement. Large language models (LLMs) have shown remarkable success on creativity tasks, yet they have not been applied to scoring narratives, including multilingual stories. In the present study, we aimed to test whether narrative originality, a component of creativity, could be automatically scored by LLMs, further evaluating whether a single LLM could predict human originality ratings across multiple languages. We trained three different LLMs to predict the originality of short stories written in 11 languages. Our first monolingual model, trained only on English stories, robustly predicted human originality ratings (r = .81). This same model, trained and tested on multilingual stories translated into English, strongly predicted originality ratings of multilingual narratives (r ≥ .73). Finally, a multilingual model trained on the same stories, in their original language, reliably predicted human originality scores across all languages (r ≥ .72). We thus demonstrate that LLMs can successfully score narrative creativity in 11 different languages, surpassing the performance of the best previous automated scoring techniques (e.g., semantic distance). This work represents the first effective, accessible, and reliable solution for the automated scoring of creativity in multilingual narratives.
KW - automated scoring
KW - creative writing
KW - creativity assessment
KW - large language models
KW - narratives
UR - http://www.scopus.com/inward/record.url?scp=105001194377&partnerID=8YFLogxK
U2 - 10.1037/aca0000725
DO - 10.1037/aca0000725
M3 - Article
SN - 1931-3896
JO - Psychology of Aesthetics, Creativity, and the Arts
JF - Psychology of Aesthetics, Creativity, and the Arts
ER -