Knowledge collapse

A long but worthwhile paper by Andrew J. Peterson on AI and the Problem of Knowledge Collapse:

We identify conditions under which AI, by reducing the cost of access to certain modes of knowledge, can paradoxically harm public understanding. While large language models are trained on vast amounts of diverse data, they naturally generate output towards the ‘center’ of the distribution. This is generally useful, but widespread reliance on recursive AI systems could lead to a process we define as “knowledge collapse”, and argue this could harm innovation and the richness of human understanding and culture. However, unlike AI models that cannot choose what data they are trained on, humans may strategically seek out diverse forms of knowledge if they perceive them to be worthwhile. 

The author analyzes how dependence on AI could limit our knowledge to a narrow subset of views (the views the AI was trained on), until the “long-tail” ideas are eventually forgotten. We see a version of this right now when we ask DeepSeek, the Chinese AI model, about Tiananmen Square or Taiwan.

With increasing integration of LLM-based systems, certain popular sources or beliefs which were common in the training data may come to be reinforced in the public mindset (and within the training data), while other “long-tail” ideas are neglected and eventually forgotten. 

Such a process might be reinforced by an ‘echo chamber’ or information cascade effect, in which repeated exposure to this restricted set of information leads individuals to believe that the neglected, unobserved tails of knowledge are of little value.
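To make that dynamic concrete, here is a minimal toy simulation (my own sketch under simplified assumptions, not the formal model from the paper). Each “generation” resamples the knowledge pool while strongly preferring values near the current center, a stand-in for a model trained on the previous generation’s output; the tails thin out round after round.

```python
import random
import statistics

def generate(pool, n=1000, center_bias=0.8):
    """One 'generation': resample the pool, preferring values near the
    current center, a stand-in for a model trained on the previous
    generation's output and generating toward the middle."""
    mean = statistics.mean(pool)
    stdev = statistics.stdev(pool)
    out = []
    while len(out) < n:
        x = random.choice(pool)
        # Central samples are always kept; tail samples survive only
        # with probability (1 - center_bias).
        if abs(x - mean) <= stdev or random.random() > center_bias:
            out.append(x)
    return out

random.seed(0)
pool = [random.gauss(0, 1) for _ in range(1000)]  # diverse initial "knowledge"
for gen in range(6):
    print(f"gen {gen}: stdev={statistics.stdev(pool):.3f}, "
          f"max |x|={max(abs(v) for v in pool):.2f}")
    pool = generate(pool)
```

Running it, both the standard deviation and the most extreme surviving value shrink with each generation: the long-tail knowledge disappears first.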

We have all seen the echo chamber effect on social media. An echo chamber built on AI responses would only further entrench people’s beliefs, whether right or wrong.

But knowledge collapse has been happening throughout human history.

…traditional hunter-gatherers could identify thousands of different plants and knew their medicinal usages, whereas most humans today only know a few dozen plants and whether they can be purchased in a grocery store. This could be seen as a more efficient form of specialization of information across individuals, but it might also impact our beliefs about the value of those species or of a walk through a forest, or influence scientific or policy-relevant judgements.

This will be a long-term effect of AI, with implications that may become visible only after a decade or more.
