# chatbots as producers of sources **the example of prompts referring to the historical past** frédéric clavert / c²dh (university of luxembourg)
## who am i?
# chatbots as medium of memory - they store information - they allow a form of ciruclation of this information - they can also be collective memory triggers
Erll, Astrid. [Memory in Culture](https://doi.org/10.1057/9780230321670). Palgrave Macmillan UK, 2011.
Note: I define here ‘chatbots’ as generative AI platforms designed for wide audiences, made of an interface that encourages users to enter a prompt, in order to generate texts, images, videos. Those platforms are more and more multimodal – inputs can be texts or images, as well as outputs. So we consider here chatbots that are based on diffusion systems and/or large language models. I also consider specifically chatbots, and not only their underlying engines (large language or diffusion models), because chatbots have additional layers of filtering, feedbacks, alignments and fine tuning. For instance, DeepSeek – the online chatbot – is well known to refuse to answer a question on the Tiananmen massacre. But if you used one of the DeepSeek R1 models, when they were released, directly on your computer (I’ve tested the 7b parameters one), it did not refuse to answer questions on the Tiananmen massacre, though it clearly indicated in its reasoning that it should be respecting the chinese law and sensibility on the subject. Even if, since then, they have been updated those models in order either to refuse to answer or to deny it even existed, the fact that the initial models runable on your computer would answer a question that the chatbot refused to answer show that there can be very strong differences between a model and a chatbot using this model. The Tiananmen example brings us back to memory: in those additional layers of alignment and fine tuning that are made to allow wide audiences to use easily chatbots, there are additional views on the past that are modified or embedded. As a consequence, those chatbots can be considered as medium of memory. I use here Astrid Erll’s work (Memory in Culture, 2011) where she writes that medium of memory are « “constructs versions of a past reality” and plays a role “in the encoding and decoding [Stuart Hall, 1980.] of that which is (to be) remembered.” » (p. 120ff). For Erll, medium of memory performs several functions: they store information they allow a form of ciruclation of this information they can also be collective memory triggers For instance, let’s take the example of the French numerous Monuments aux morts, that can be considered as medium of memory: they in some ways allow for the storage of information, though it is not their main function, they allow this information to circulate – basically they are a message to people passing by, they are trigger of collective memory with the many commemorations that are organised around them. Let’s go back to chatbots and let’s try to define them as medium of memory: chatbots are based on models that are storing information. Models are sets of parameters deduced from a training phase, parameters that contain patterns based on training datasets. In this sense they can be seen as storage of information, including information on the historical past, chatbots allow a form of circulation of information, when their users query them, even if their stochastic ways to restitute information, that does not include any sense of truth that can lead to hallucinations, should be carefully considered. chatbots are also triggers of collective memory. That’s their interface, based on, often, a single box where users can type questions. Studying the past, in a professional way or a more amateur one, is all about asking questions and trying to find ways to answer them. Prompts are hence a huge incentive / trigger to query the past, even if it is not the main use of generative AI platforms, of course. So, if we consider chatbots as medium of memory, we should also try understanding how they embed views on the past.
## produced by and producer or primary sources
Chatterji, Aaron, Thomas Cunningham, David J. Deming, et al. [How People Use ChatGPT](https://doi.org/10.3386/w34255). NBER Working Paper Nᵒ 34255. National Bureau of Economic Research, 2025 ; Hitzig, Zoë. [« OpenAI Is Making the Mistakes Facebook Made. I Quit. »](https://www.nytimes.com/2026/02/11/opinion/openai-chatgpt-ads.html) 11 février 2026.
Makhortykh, Mykola, Victoria Vziatysheva, et Maryna Sydorova. [« Generative AI and Contestation and Instrumentalization of Memory About the Holocaust in Ukraine »](https://doi.org/10.1515/eehs-2023-0054). Eastern European Holocaust Studies, 27 novembre 2023.
Öhman, Carl. [« Wherever There Is AI There Is Memory: AI as the Agency of the (Synthesized) Past »](https://doi.org/10.1017/mem.2025.10008). Memory, Mind & Media 4 (2025).
Note: What is of interest to me here is to see specifically chatbots as producers of sources. What do I mean by chatbots? Not only text, but also images and video generators -- all the more than platforms today are multimodal and can at least accept text and images as input, and sometimes -- this is the case for ChatGPT or Mistral Vibe for instance -- as output. Video is a slightly different case, as it is quite heavy in terms of ressources. What do I mean here by producer or primary sources? Two aspects: 1) prompts - on a daily basis, millions of people, of note ten or hundred of millions are prompting, ask chatbots to generate images or texts. - those prompts can be seen as primary sources, depending on what you want to study: they can be seen -- with lots of caveat -- as sort of open doors to users' lives. see: [@Chatterji_2025] - a former researcher at OpenAI, now at the Anthropic Institute, called that an "archive of human candor", which I'm still not sure is the best expression. [@Hitzig_2026] 2) generated artefacts Of course, the generated artefacts -- texts, images, etc -- can also be seen as a sort of source. The uses of those sources can be quite diverse, but one is to use those artefacts to understand what is in the training dataset of the underlying LLM. See for instance the research of Makhortykh *et al.*.[@Makhortykh_2023] 3) In the end, some researchers see chatbots / LLMs as “agency of the past”. So, it's not only LLMs as producer of sources, but also LLMs as the ressults of the massive primary sources that are training datasets. [@Öhman_2025]
## an archive of human candor (about the past) > army of the european union with tanks fighting on the streets of budapest Note: Let's go back to prompts as "archive of human candor". As a more or less memory studies scholar, my research question tends to consider those prompts that have mention / references to the past, and hos those references are used to “negotiate” the past -- to get textual or iconographic representations of the past that fits their own visions of the past. The difficulty of this kind of research is to find sources. - hopefully, there's stable diffusion: databases of prompts are available, - then the difficulty is to sort out of 10 million prompts the one rfereing to the historical past, including when it's implicit. And it's here that AI comes back, this time as a digital tool / method.
# language models as a method Note: - looking through 10 millions prompts is not an easy task - with a relatively small language models or through the API of a larger one (OpenAI, Claude, etc), possibly with a prompt system to try doing something - depending on the "tool" used, the percentage of positive answers is changing quite a lot from 4-5% (OpenAI) to 9% (Claude). The one I have finally used (small Ministral-3-3B-Instruct-2512-4bit). - no
## system prompt
Analyze if this image generation prompt contains a reference to the historical past. Answer YES if the prompt contains: - Historical figures (Napoleon, Caesar, Cleopatra, Marie Antoinette, etc.) - Historical events (World War, Revolution, Cold War, etc.) - Historical periods or eras (Victorian, Medieval, Renaissance, 1920s, Ancient Rome, etc.) - Historical artists or their works (Da Vinci, Rembrandt, Michelangelo, etc.) - Historical art movements (Baroque, Art Nouveau, Impressionism, etc.) - Mythology and ancient legends (Greek gods, Norse mythology, Egyptian mythology, etc.) Answer NO if the prompt: - Only uses stylistic words (vintage, retro, sepia, old photograph) - Only describes fictional/fantasy content (steampunk, cyberpunk, sci-fi) - Only mentions living celebrities in modern context - References extinct animals without historical context (dinosaurs, dodo) - Is purely futuristic with no past reference Format: [yes/no]: [one sentence reason]
## taylored analysis code Note: Code written with Claude Code and/or Mistral Vibe. Allows to focus on the research question, but involves to chezck the code.
## getting historians ready for government archives genai ready Note: Questions of the historians' skills / of interdisciplinary cooperation, etc.