August 2025

Prof. Adrien Doerig (BCCN & Freie Universität Berlin), together with collaborators from the Universities of Osnabrück, Minnesota, and Montréal, led a study showing that modern artificial intelligence (AI) language models, when given natural descriptions of scenes, can predict how the human brain responds to visual input.
 
When we view the world, our brain builds rich internal representations—not just identifying objects like a “tree” or a “person,” but grasping context, meaning, and relationships. Yet until recently, scientists lacked the tools to capture and quantitatively study this high-level visual understanding.
 
In a new study published in Nature Machine Intelligence, the research team used large language models (LLMs), the same type of model behind ChatGPT, to extract “semantic fingerprints” from scene descriptions. These fingerprints turned out to align closely with patterns of brain activity recorded via fMRI as participants viewed the same scenes, and they allowed the team to decode textual descriptions of what people were seeing from the neuroimaging measurements alone.
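
In practical terms, this amounts to embedding each scene description with a language model and fitting a linear "encoding model" from those embeddings to the measured voxel responses. The sketch below illustrates the general idea only and is not the authors' pipeline: it assumes a generic sentence-embedding model (all-MiniLM-L6-v2) as a stand-in for the LLM and uses random arrays in place of real fMRI data.

# Minimal sketch (assumptions: a generic sentence-embedding model stands in
# for the LLM, and random arrays stand in for real fMRI recordings).
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import Ridge

# Hypothetical scene descriptions (in the study these were natural-language
# captions of the images shown to participants in the scanner).
captions = [
    "A person walking a dog along a tree-lined street.",
    "Two children playing football on a sandy beach.",
    "A crowded market stall selling fruit and vegetables.",
    "A cat sleeping on a windowsill in the afternoon sun.",
    "Hikers crossing a wooden bridge over a mountain stream.",
    "A chef plating dessert in a busy restaurant kitchen.",
]

# 1) "Semantic fingerprints": embed each description with a language model.
embedder = SentenceTransformer("all-MiniLM-L6-v2")   # placeholder model
fingerprints = embedder.encode(captions)             # shape: (n_scenes, 384)

# 2) Stand-in fMRI data: one vector of voxel responses per viewed scene.
n_voxels = 2000
rng = np.random.default_rng(0)
fmri = rng.standard_normal((len(captions), n_voxels))

# 3) Encoding model: linearly map the fingerprints to voxel responses.
encoder = Ridge(alpha=1.0).fit(fingerprints, fmri)

# 4) Check the fit per voxel (with real data this is done on held-out scenes).
pred = encoder.predict(fingerprints)
r = [np.corrcoef(pred[:, v], fmri[:, v])[0, 1] for v in range(n_voxels)]
print("mean voxelwise correlation:", float(np.nanmean(r)))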
 
The team also trained computer vision models to predict these LLM-based semantic fingerprints directly from images. These models, guided by linguistic representations, predicted brain activity more accurately than many of today’s best image recognition systems.
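
Conceptually, such a vision model is trained so that, for each image, it outputs the same semantic fingerprint that the language model assigns to that image's caption. The toy sketch below shows one way this could look; the backbone (ResNet-18), the cosine objective, and the random stand-in data are illustrative assumptions, not the study's actual setup.

# Minimal sketch of training an image model to output caption embeddings
# (assumptions: ResNet-18 backbone and a cosine-similarity objective).
import torch
import torch.nn as nn
from torchvision import models

embedding_dim = 384                    # must match the LLM fingerprint size
backbone = models.resnet18(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, embedding_dim)

optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)
loss_fn = nn.CosineEmbeddingLoss()

# Stand-in batch: images and the LLM fingerprints of their captions.
images = torch.randn(8, 3, 224, 224)
targets = torch.randn(8, embedding_dim)

for step in range(10):                 # toy training loop
    optimizer.zero_grad()
    predicted = backbone(images)
    loss = loss_fn(predicted, targets, torch.ones(len(images)))
    loss.backward()
    optimizer.step()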

The results suggest that human visual representations may be organized in a way that mirrors how modern language models represent meaning—opening new doors for both neuroscience and artificial intelligence.
 
