Uncovering the Semantics of Concepts Using GPT-4 and Other Recent Large Language Models

Open Access

Authors: Gaël Le Mens, Balázs Kovács, Michael T. Hannan and Guillem Pros

PNAS, Vol. 120, No. 49, November 2023

We use GPT-4 to create “typicality measures” that quantitatively assess how closely text documents align with a specific concept or category. Unlike previous methods that required extensive training on large text datasets, the GPT-4-based measures achieve state-of-the-art correlation with human judgments without such training. Because no training data is needed, the data requirements for obtaining high-performing model-based typicality measures are dramatically reduced. Our analysis spans two domains: judging the typicality of books in literary genres and the typicality of tweets in the Democratic and Republican parties. Our results demonstrate that modern large language models (LLMs) can be used for text analysis in the social sciences beyond simple classification or labelling.

DOI: 10.1073/pnas.2309350120

This paper originally appeared as Barcelona School of Economics Working Paper 1394.

