Frédéric Clavert, Assistant Professor in Contemporary European History and Head of the Contemporary European History research group at the C²DH, is using AI to study collective memory.
Collective memory is a social phenomenon, shared among individuals and passed from one generation or group to the next. It both shapes a group’s identity and influences the way in which that group understands or interprets the past.
Clavert is interested in this topic and how it intersects with AI, specifically chatbots. He is currently analysing prompts used to generate images via chatbots. “In a given prompt, there are references to the past that are linked to the user’s specific views, and the result is generated by a system that was trained on datasets in which similar or different views of the past are embedded,” Clavert explains. “Plus there are layers of software that are added to the model to create the chatbot, where you can have filters that also influence the view of history embedded in the answer.”
If users don’t get the images they had hoped for, they can rephrase their prompts, a process Clavert calls “negotiation with the bots”. This process of negotiation, together with the confrontation between the views of the past that the user brings and those embedded within the AI platform, is of particular interest for his work.
Popular themes
Depending on the AI source, obtaining data can be a challenge. Clavert’s research has therefore focused on Stable Diffusion, an open-source text-to-image platform initially released in 2022. Given that Stable Diffusion is regularly used to train other systems, the platform provides open access to databases of user-generated prompts and their related images.
For Clavert, this meant narrowing down a corpus of some 10 million prompts and associated images. One challenge was to find a way of identifying all the prompts that reference the past, both implicitly and explicitly. Working with a smaller sample of these prompts, Clavert then used the APIs of Claude AI and ChatGPT to narrow this subset down further, before performing quantitative and qualitative analysis with topic modelling.
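The article does not say how this filtering was implemented. Purely as an illustration, a first pass over such a corpus might flag explicit references to the past with a simple rule-based check, as in the minimal Python sketch below. The keyword list, the year pattern and the function names `mentions_past` and `filter_prompts` are all assumptions made for this example and are not taken from Clavert's work.

```python
import re

# Hypothetical keyword list; illustrative only, not taken from Clavert's work.
PAST_KEYWORDS = ("medieval", "napoleon", "soviet", "world war", "king arthur")

# Four-digit years from 1000 to 1999, matched as whole tokens.
YEAR_PATTERN = re.compile(r"\b1\d{3}\b")


def mentions_past(prompt: str) -> bool:
    """Return True if the prompt contains an explicit marker of the past."""
    text = prompt.lower()
    if YEAR_PATTERN.search(text):
        return True
    return any(keyword in text for keyword in PAST_KEYWORDS)


def filter_prompts(prompts):
    """Keep only the prompts that explicitly reference the past."""
    return [p for p in prompts if mentions_past(p)]


# Made-up sample prompts for demonstration.
sample = [
    "soviet propaganda poster of a cat",
    "european union army tanks entering budapest",
    "a cozy cabin in the woods, 4k",
    "street scene in paris, 1920, oil painting",
]
print(filter_prompts(sample))
```

A rule of this kind only catches explicit markers. In the made-up sample, the prompt about tanks entering Budapest is not flagged, which suggests why implicit references would call for a different approach, such as the language-model APIs mentioned above.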
One striking example is a prompt describing European Union army tanks entering Budapest. Of course, the European Union doesn’t have an army, so there’s no reference to the actual past. But the prompt was written a few years ago, during a time of tough negotiations between Hungary and the European Commission when EU funding to Hungary had been blocked over the country’s human rights violations. “Theoretically, it’s a prompt about the present, but if you put it into a search engine it generates images from the Hungarian Revolution of 1956 and the Soviet tanks sent to Budapest at that moment, so it’s an implicit reference to the past,” Clavert explains.
Clavert has also focused on other prompts related to the EU to try to identify which concepts, people and moments from the past are associated with it. Interestingly, he has discovered that the EU tends to be associated with the Middle Ages – or at least a modern interpretation of the Middle Ages – and also empire and war. “There’s always a part of the prompt that’s more about the aesthetics of the image that the user is searching for,” he adds. “One thing that is very popular is Soviet propaganda-style posters. Also since the data is from 2022 and 2023, the aggression against Ukraine features strongly.” Other popular themes among the prompts he has reviewed are the World Wars, Napoleon Bonaparte and Arthurian legends.
Narratives and other considerations
Given that certain layers of the software are “black boxes”, it is not always clear why a chatbot provides a particular answer. In addition to filters, chatbots are also aligned with societal values. As Clavert explains, in authoritarian states, it can be easier to understand the filters on certain topics. On a Russian image generation platform, for instance, there is clear censorship on Ukraine. Meanwhile, the Chinese DeepSeek chatbot does not discuss certain political topics, such as Tiananmen Square.
Clavert is aware that “there could be a bias, depending on the corpus I’m using,” and that his corpus may be more European. Additionally, the corpus he is using is mainly in English. “I’m sure that if I used a corpus in Chinese, for instance, the themes would be quite different.”
If he had to provide a wish list for expanding his future research, Clavert would be interested in a partnership with one of the platforms and a bigger budget so he can research more prompts. He would also be keen to interview some users so he could glean more insights about their intentions when writing their prompts.