The PRH posits that as AI models—particularly deep neural networks—grow in scale and are trained on diverse tasks and modalities, their internal representations begin to converge. This convergence suggests that these models are approximating a shared, abstract structure of reality, akin to Plato's concept of ideal forms. In this view, the varied data inputs (text, images, audio) are mere "shadows" of a deeper, more fundamental representation that models are collectively uncovering.

representational convergence ≈ convergence toward a shared representation of reality

By "representation of reality," the paper means a representation of the joint distribution over events in the world that generate the data we observe.
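To make the "joint distribution" framing concrete, here is a toy sketch (not the paper's code; the co-occurrence counts are made up) of the kind of kernel such a representation induces: if two events co-occur more often than chance, the pointwise mutual information between them is positive, and a representation that models the world's joint distribution should assign them high similarity.

```python
import numpy as np

# Illustrative, symmetric co-occurrence counts for 3 hypothetical "events"
# in the world (rows/columns index events; values are made-up counts).
counts = np.array([
    [8.0, 2.0, 1.0],
    [2.0, 6.0, 1.0],
    [1.0, 1.0, 4.0],
])

joint = counts / counts.sum()        # P(a, b): empirical joint distribution
marginal = joint.sum(axis=1)         # P(a): marginal over events

# Pointwise mutual information kernel: K(a, b) = log P(a, b) / (P(a) P(b)).
# Events that co-occur more than chance get positive similarity.
pmi = np.log(joint / np.outer(marginal, marginal))

print(np.round(pmi, 3))
```

The diagonal of this kernel is positive (each event co-occurs with itself more than chance), and the kernel is symmetric because co-occurrence is.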

We call this converged hypothetical representation the "platonic representation" in reference to Plato's Allegory of the Cave (Plato, c. 375 BC), and his idea of an ideal reality that underlies our sensations. The training data for our algorithms are shadows on the cave wall, yet, we hypothesize, models are recovering ever better representations of the actual world outside the cave.

Despite being trained on different datasets and even different modalities, the models were often found to embed data in remarkably similar ways.

Neural networks also show substantial alignment with biological representations in the brain (Yamins et al., 2014). This commonality may be due to similarities in the task and data constraints both systems are confronted with. Even though the mediums may differ – silicon transistors versus biological neurons – the fundamental problem faced by brains and machines is the same: efficiently extracting and understanding the underlying structure in images, text, sounds, etc.

What "similarity" means here: two representations are considered aligned when they induce similar similarity structures (kernels) over the same data — for example, the paper measures alignment via the overlap of nearest-neighbor sets across the two feature spaces.
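A minimal sketch of this kind of metric — mutual k-nearest-neighbor alignment — is below. The function name and details are illustrative, not the authors' exact implementation: for each sample we find its k nearest neighbors in each feature space and score the average overlap of the two neighbor sets.

```python
import numpy as np

def mutual_knn_alignment(feats_a, feats_b, k=3):
    """Score how similarly two representations arrange the same samples.

    For every sample, find its k nearest neighbors in each feature space
    and return the mean overlap of the two neighbor sets
    (0 = disjoint neighborhoods, 1 = identical neighborhoods).
    Illustrative sketch, not the paper's reference implementation.
    """
    def knn_sets(feats):
        # Pairwise Euclidean distances; exclude self-matches via +inf diagonal.
        d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        return np.argsort(d, axis=1)[:, :k]

    nn_a, nn_b = knn_sets(feats_a), knn_sets(feats_b)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))

# Sanity check: a representation is perfectly aligned with a rotated copy
# of itself, because rotation preserves all pairwise distances.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 5))
q, _ = np.linalg.qr(rng.normal(size=(5, 5)))   # random orthogonal matrix
print(mutual_knn_alignment(x, x @ q))          # → 1.0
```

Metrics of this family are invariant to rotations of the feature space, which is exactly what one wants when comparing models whose coordinate axes carry no shared meaning.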
