Who is more creative on "divergent thinking" tests: humans or large language models?

A study in Scientific Reports examined idea generation and originality, raising the practical question of when such models expand human creativity and when they narrow it.

Artificial intelligence and creativity. Illustration: depositphotos.com

A new study published in Scientific Reports systematically examines the performance of large language models (LLMs) on creative tasks compared to humans, focusing on divergent thinking tests, a common psychological tool for measuring creative ability. The findings raise important questions about the role these models play in creative processes: whether they enhance or replace human creativity.

Divergent thinking as a measure of creativity

Creativity, despite being an elusive concept, is measured in psychology through standardized tests. One of the key concepts is divergent thinking: the ability to propose a wide range of different ideas in response to an open-ended problem, producing solutions that are not just variations on the same basic idea.

Typical tests include tasks such as "How many different uses can be suggested for a simple object" or "How many possible solutions exist for a given situation." Performance on these tests is usually measured along three main dimensions: fluency (the number of ideas generated), flexibility (the number of different categories of ideas), and originality (how rare the idea is in relation to the pool of answers).
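To make the three dimensions concrete, here is a minimal sketch of how they might be scored automatically. The responses, category labels, and answer pool are hypothetical, and real studies typically rely on trained raters or normed scoring manuals rather than this kind of toy computation.

```python
from collections import Counter

# Hypothetical "uses for a brick" responses from one participant.
responses = ["build a wall", "doorstop", "paperweight", "crush garlic", "garden border"]

# Hypothetical category assignment for the flexibility score.
categories = {
    "build a wall": "construction",
    "garden border": "construction",
    "doorstop": "weight",
    "paperweight": "weight",
    "crush garlic": "tool",
}

# Hypothetical pool of all answers collected across participants,
# used to estimate how rare each idea is.
pool = Counter({
    "build a wall": 40, "doorstop": 25, "paperweight": 20,
    "garden border": 5, "crush garlic": 1,
})
total = sum(pool.values())

fluency = len(responses)                               # number of ideas
flexibility = len({categories[r] for r in responses})  # number of distinct categories
# Originality: mean rarity, where rarity = 1 - relative frequency in the pool.
originality = sum(1 - pool[r] / total for r in responses) / len(responses)

print(f"fluency={fluency}, flexibility={flexibility}, originality={originality:.2f}")
```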

The methodological challenge: measurement without bias

With the advent of large language models, an interesting research tension has arisen. On the one hand, these models excel at generating lists and ideas very quickly. On the other hand, there is an argument that they are prone to statistical averaging, clichés, and patterns that have emerged in their training data.

The main challenge in the study is to prevent the measurement from mainly reflecting "typing speed" rather than true creativity. Models can easily win on the fluency measure simply because of their ability to produce text quickly. Therefore, according to the research announcement on EurekAlert!, it is essential to evaluate the models against metrics that penalize repetition and require them to avoid predictable responses.
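As a rough illustration of what such a metric could look like, the sketch below collapses near-duplicate answers before counting them. The string-similarity function and the threshold are assumptions chosen for illustration, not the measure used in the study.

```python
from difflib import SequenceMatcher

def effective_fluency(ideas: list[str], threshold: float = 0.8) -> int:
    """Count ideas, skipping any that are too similar to an earlier one."""
    kept: list[str] = []
    for idea in ideas:
        if all(SequenceMatcher(None, idea.lower(), k.lower()).ratio() < threshold
               for k in kept):
            kept.append(idea)
    return len(kept)

# Fast but repetitive answers collapse to fewer distinct ideas.
ideas = ["use it as a paperweight", "use it as a paper weight",
         "use it to hold paper down", "doorstop", "use as a doorstop"]
print(effective_fluency(ideas))  # fewer than len(ideas)
```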

Statistical originality is not the same as creative value

Even if a language model scores high on originality according to statistical algorithms, this does not necessarily indicate creativity of human value. An idea can be statistically rare, but lack practical utility or relevance.

In addition, several important methodological issues were discovered:

Dependence on the wording of the prompt: A small change in wording may cause a dramatic change in the result, which raises the question of whether the measure reflects the model's ability or the skill of the prompt writer.

The anchoring effect: Models sometimes tend to create sequences influenced by the first examples they produce, similar to humans getting stuck in one line of thought. The interesting question is whether the models produce a truly broad space of ideas, or just variations around one central pattern.
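One simple way to probe this question is to measure how similar a set of answers is to itself. The sketch below uses mean pairwise string similarity as a rough stand-in for semantic similarity; this is an assumption for illustration rather than the study's method.

```python
from difflib import SequenceMatcher
from itertools import combinations

def mean_pairwise_similarity(ideas: list[str]) -> float:
    """Average similarity over all pairs of ideas (higher = less diverse)."""
    pairs = list(combinations(ideas, 2))
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

anchored = ["a hat for a cat", "a hat for a dog", "a hat for a bird"]
broad = ["doorstop", "grind into pigment", "exercise weight"]

# Higher similarity suggests convergence around the first idea produced.
print(mean_pairwise_similarity(anchored))  # high, around 0.8
print(mean_pairwise_similarity(broad))     # low
```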

Practical meaning: complementary or substitute tool

The productive approach to this research is not “man versus machine,” but understanding the conditions under which language models contribute to human creativity. If a model excels at idea flow, it can serve as a “sketch engine”—producing a wide initial variety from which humans select, combine, and filter.

However, if a model fails to be truly flexible, it can create the illusion of creativity: a lot of text with few real breakthroughs. The practical benefit lies in understanding the types of open-ended tasks to which the model contributes most, versus those to which it induces early convergence on similar ideas.

Research limitations and broader discussion

Findings should be considered in the appropriate methodological context. The study is limited by several factors: the nature of the population being tested, the language in which the tests were administered, and the specific model being tested. Moreover, “language model” is a broad category—different models behave differently, and even the same model may change with technological updates.

Therefore, it is appropriate to consider these results as a methodological demonstration rather than a final decision on the question of the creative capacity of language models. The study is part of a broader discussion on how artificial intelligence technologies affect creative processes in education, work, and other fields.


3 Comments

  1. Certainly humans: the comprehensive theories of the exact sciences, geometry, and physics grew out of the innate knowledge of the nervous system, and artificial intelligence has no innate knowledge.

  2. Artificial intelligence is not creative; all it does is recite the opinions of the many.
    And so artificial intelligence inhibits creativity.

    A. Asbar
