# AI and Historical Practice: The Archipelago and the Method
**Frédéric Clavert** / C²DH — University of Luxembourg
Ctrl-Alt-History / University of Antwerp / 29 April 2026
Note: - Thanks to hosts + Antwerp as city of overlapping pasts - Title split: *The Archipelago and the Method* (to be unpacked over the hour) - But first: spend a few minutes on the conference title itself — it frames what I want to do **Development:** First, thank you for inviting me to talk today. A special thanks to Eline Ceulemans, with whom I have discussed a great deal, and to Maarten Van Ginderachter for introducing me. The title of my keynote — *The Archipelago and the Method* — is deliberately split in two. I will return to each term across the next fifty-odd minutes. But before I do, I need to spend a few minutes on the title of *this conference* — Ctrl-Alt-History — because it frames what I want to do today.
## From 'Ctrl-Alt-Del'... Note: **Development:** The keyboard shortcut Ctrl+Alt+Del was designed by IBM engineer David Bradley around 1980-81, not as a user-facing feature but as an internal debugging tool for developers working on the original PC. It is a famous anecdote, told on the shortcut's Wikipedia page. Bradley told the story publicly at the IBM PC 20th anniversary panel (Tech Museum of Innovation, San José, August 2001), where he delivered the widely quoted line: "I may have invented it, but I think Bill made it famous." Bill Gates, for his part, publicly called it a mistake at a Harvard Campaign fireside chat with David Rubenstein (21 September 2013) — he wanted a single key, but IBM's keyboard designer refused. In a way, these were two visions: Bradley did not want a hard reboot to be triggered with a single hand; Gates wanted something easier. There is a theoretical framework for the temporality of updates, crashes and reboots as a cultural regime in an important work by Wendy Hui Kyong Chun (Wendy Hui Kyong Chun, *Updating to Remain the Same: Habitual New Media*, Cambridge, MA: MIT Press, 2016). She uncovers a paradox: we update and restart precisely in order to stay inside the same habitual media environment. In this framework, rebooting is not an isolated gesture but a structure of habitual digital life: we restart constantly in order to continue. That trajectory — from a hidden engineering fix (Bradley) or from a habitual regime of restarts (Chun) — tells us something. The reboot is what we reach for when we no longer understand what is happening inside the machine. It is, in a way, an act of epistemic surrender: we give up trying to debug, and we start over. Since 2022 and the rise of the chatbots that epitomise the current wave of generative-AI innovation, have we, as historians, lost the understanding of what is happening around us and inside our main tool, the computer, so that we need to reboot?
## ...to '*Ctrl + Alt + **History***' **History as what is threatened** — to be erased, replaced, deleted OR **History as what allows us to reboot differently** Note: This conference replaced *Del* with *History*. I read this substitution in two incompatible ways, and I am going to keep them both alive across the whole keynote. **First reading — history as the threatened term.** The *Del* key, in this reading, is generative AI. History is what stands to be deleted: archival knowledge automated away, contextual judgment replaced by statistical prediction, hermeneutic traditions flattened into tokens. In this reading, Ctrl-Alt-History is a cry of alarm. **Second reading — history as the response.** History, here, is what we reach for when our system of knowledge freezes. When the epistemology of the chatbot leaves us speechless — when we do not know how to evaluate what a machine "tells" us about the past — we reach for the historian's craft. In this reading, Ctrl-Alt-History is a programme: use history to reboot the debate. I am going to refuse to choose between these readings today. I like the tension between them, a tension I see as a strong motivation. I will come back to it in the conclusion.
## "Look, they have six fingers" Note: I will add one more comment to this introduction. We need to move away from chatbots, or at least from the way chatbots put words together to form some sort of narration of the historical past. Those narrations (absent RAG, absent a careful prompt) look like the images that were generated three years ago: metaphorically, they have six fingers. This is not nothing. Factual errors in historical discourse matter, and the scale at which generative systems are now producing such discourse is genuinely new — the scale is new; misinformation, of course, is not. I am not here to dismiss these concerns. But *this is the wrong question* around which to organise our debate. If the only question we know how to ask is "does AI get the facts right?", we remain stuck inside an image of AI that is obsolete — the AI-as-oracle. And the real transformation of our practice is happening elsewhere, in the places where this question does not reach. So something blocks when we speak about AI — here, generative AI — with many historians, students, and others. I want to argue that what blocks is our image of what AI is. By 'our', I mean historians collectively.
# I. The wrong question?
## The fears are legitimate - Hallucinations and confabulations - Factual imprecisions - Plagiarism and authorship collapse - Deskilling of the researcher - Opacity of the models Note: There is a series of historians' fears that deserve to be taken seriously. They are legitimate and serious. They all rest on the *same image* of what AI is, the AI-as-oracle — I will come back to that in a few minutes.
## Hallucinations & confabulations *Not a bug. A feature* Note: - Not a bug, a feature of the mechanism: a language model produces *plausible* text. - Plausible-true = correct; plausible-false = hallucination — a very anthropomorphic term, which is itself a problem. - The ratio improves; the mechanism is unchanged. - Any historian using these tools must understand this.
## Factual imprecisions Note: - Shades into hallucination but distinct as a category - E.g. cites a real historian, attributes words they never wrote — a recent French article in communication science attributed to Latour a book supposedly published by Raisons d'Agir, the Bourdieusian French publisher; for those who know a bit of French sociology, it is honestly hilarious. - For a discipline based on the indirect observation of the past through old artefacts, this is non-trivial.
## Plagiarism & authorship collapse > What does it mean, *epistemically*, to incorporate into one's argument a sentence produced by a machine trained on millions of other sentences? Note: This question is all the more interesting in that I asked Claude to generate it. - Beyond the student-essay case. - The epistemic status of a machine-produced sentence in an argument - Trained on millions of other sentences — where does authorship sit?
## Deskilling — the pedagogical question Note: If the machine does the translation, the summary, the first draft — does the graduate student of 2030 *still learn* how to do those things? - It is a real pedagogical fear — I am not dismissing it - If the machine does translation / summary / first draft → does the 2030 grad student learn them? - Honest: I have not resolved this And we could go further: deskilling can also concern us, already-trained historians, if chatbots are badly used.
## Opacity Note: We do not have full access to what is inside these models, nor to the data on which they were trained. With open source models, we can know the weights but we still do not know the training data of those models (with few exceptions). As our discipline's epistemology rests on *traceability of sources*, this introduces a structural tension that we cannot easily dismiss.
## The dominant image: the chatbot-oracle Note: - AI figured as speaking entity: question → pronouncement → we grade it - Oracle frame: correct *for chatbots* — though over time less and less so — wrong *for AI as a whole* - Chatbots = consumer tip of something much larger — and we could also argue that, from one system to another, the oracle dimension is not at all the same - If we stay within the oracle frame, we miss the real transformation
## A diversity of epistemic signatures - Language models (LLMs) - Agents and automated workflows - Research infrastructures - Corpus interrogation tools - Systems connecting heterogeneous sources Note: - AI ≠ chatbot. The landscape: - LLMs (often silent inside pipelines — classify, extract, embed) - Agents/workflows (not answering — *operating*, with tools) - Research infrastructures (→ P2) - Corpus tools (→ P3 Lester) - Systems connecting heterogeneous sources (the decisive one) - Each has different failure modes and epistemic commitments - Collapsing them all = "microscope = all scientific instruments" **A language model** is a statistical system trained to predict the next token in a sequence. It can be deployed as a chatbot, usually with many other layers of software, but it can also be deployed silently inside a pipeline — to classify documents, to extract entities, to align translations, to generate embeddings for semantic search. Most of these uses have nothing to do with dialogue. **Agents and workflows** are systems in which a language model is given tools — a search function, a database query, a file reader — and invoked iteratively to accomplish a task. The model is no longer "answering"; it is operating. This is a very different epistemic object. **Research infrastructures** are the layer where AI meets the cyberinfrastructure tradition — I will return to this at length in Part 2. **Corpus interrogation tools** are what let us ask semantic questions of messy archival material — more on this in Part 3. **Systems connecting heterogeneous sources** are for me a decisive point. The most interesting thing generative AI does, for historical research, is not produce text. It is connect things that did not previously connect. Each of these has different failure modes, different epistemic commitments, different relationships to the historian's craft.
## Connecting The most significant capability of generative AI in research is not *answering*. Note: The main epistemic shift introduced by generative AI in our field is not about what these systems *say*. It is about what they allow us to *ask* — and in a way, this question has been present since the advent of digital humanities and digital history as we have known them for twenty years. For instance, semantic search across a noisy corpus means we can find conceptual proximities where keyword search would have failed. Such techniques existed before machine learning, but machine learning and language models today allow something much more efficient. Dynamic query reformulation means the research question can evolve *during* the search, not only before and after it. Bridging structured and unstructured sources means we no longer have to pre-model our archive in order to interrogate it. And — this matters — these tools *negotiate* rather than *impose*. An agent asks itself whether it has enough information. It calls one more tool. It revises its approach. This is categorically different from a database query, which either returns results or does not. I will demonstrate this concretely in Part 3 with the Sean Lester dataset. For now, I ask you to hold this substitution in mind: not AI as *speaker*, but AI as *connector* and hence as infrastructure.
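To make the claim about conceptual proximity concrete, here is a minimal sketch — not ClioDeck's actual implementation. Documents are represented as vectors, and cosine similarity surfaces conceptual neighbours that share no keyword. The three-dimensional toy vectors below are invented for illustration; real embedding models use hundreds of dimensions.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (invented for illustration).
# The point: "bilateral talks" lands close to "diplomatic negotiation"
# even though the two phrases share no keyword.
embeddings = {
    "diplomatic negotiation": [0.90, 0.10, 0.20],
    "bilateral talks":        [0.85, 0.15, 0.25],
    "grain exports":          [0.10, 0.90, 0.30],
}

query = embeddings["diplomatic negotiation"]
ranked = sorted(embeddings, key=lambda k: cosine(query, embeddings[k]), reverse=True)
```

Keyword search on "negotiation" would miss "bilateral talks" entirely; in vector space it is the nearest neighbour.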
## The displacement we need | From... | To... | |---|---| | AI as *oracle* | AI as *infrastructure* | | "Does AI get things wrong?" | "What does AI change about what we can ask?" | | Content reliability | Transformation of practice | Note: - **Oracle → Infrastructure**: From Delphi to Latour. An oracle is consulted; an infrastructure is inhabited. We do not evaluate an infrastructure by asking whether it tells the truth. We evaluate it by asking what it enables, what it makes impossible, whom it serves, and whom it excludes. Moving from *oracle* to *infrastructure* is moving from a Delphi frame to a Latourian frame — from a question about pronouncements to a question about the sociotechnical assemblage. - **"Is AI wrong?" → "What does AI change?"** The first question is evaluative and terminal: we give the system a grade. The second is genealogical and ongoing: we ask how our practice is being reconfigured. The first question is answered once and for all; the second has to be asked again every time the practice shifts. - **Content reliability → Transformation of practice**: This is the most important shift for us as historians. The epistemology of our discipline is not primarily about the reliability of individual statements; it is about the traceability of how we came to those statements. The transformation of practice — *how* we search, *how* we connect, *how* we revise — is where the epistemological action is.
## Uncertainty as a core feature Note: - History has always been probabilistic navigation — fragment, hypothesis, revision - **Ginzburg, *paradigme indiciaire*, 1979** — medical/criminal/historical inference - We have never had a shared *vocabulary* for this Our discipline has long practised something that looks very much like probabilistic reasoning. There is a full tradition of philosophy of history based on the Bayesian paradigm and unrelated to AI, with Aviezer Tucker as a prominent figure. - AI makes probabilistic navigation *explicit* And it is perhaps time to make this more explicit. - The seed: *the agent revising its queries = mirror for the historian revising her hypotheses* We read fragmentary evidence. We formulate hypotheses. We revise them in the light of new sources. We make arguments that could, in principle, be overturned by the next find. Carlo Ginzburg, in 1979, called this the *paradigme indiciaire* — the evidential paradigm — and located it at the intersection of medical diagnostics, criminal investigation, and historical inference. We have not, as a discipline, developed a shared vocabulary to describe this probabilistic navigation. We have left it implicit, as a kind of tacit craft. One of the things generative AI is doing — and this is what I want us to notice — is making this probabilistic navigation *explicit*. Not because the machines reason the way we do, but because working with them forces us to put into words what we do. The seed I am planting: *the agent that revises its own queries is a mirror for the historian who revises her own hypotheses.* I will come back to this in the conclusion, and I will argue that this mirror is, perhaps, the most interesting thing AI gives us.
# II. The archipelago syndrome
## A long-established diagnosis The **archipelago syndrome** > Frédéric Clavert & Serge Noiret (eds.), *L'histoire contemporaine à l'ère numérique / Contemporary History in the Digital Age*, Bruxelles: P.I.E.-Peter Lang, **2013**. Note: In 2009 (published 2013), during a conference I organised the first time I worked in Luxembourg, Marin Dacos, then head of what would later become OpenEdition, talked about the humanities as a digital archipelago (in terms of data, methods, platforms, etc.). The archipelago syndrome is about dispersed corpora, inert documents, an impossible cartography. His answer — remember, we are in 2009 — was a **cyberinfrastructure** (a term borrowed from the NSF Atkins Report of 2003). In a way — and I do not want to underestimate what many projects, on linked open data for instance, have done since then — we can argue that the current wave of AI innovation could be the most serious attempt, more than fifteen years later, to deliver the cyberinfrastructure that would connect the different islands of the historians' digital archipelago. This is the moment where I want to historicise myself, because what I am going to talk about is something in which I played a small role, but a role nonetheless. In 2009, we convened a conference in Luxembourg on the state of digital history. The proceedings appeared in 2013 with P.I.E.-Peter Lang as *L'histoire contemporaine à l'ère numérique / Contemporary History in the Digital Age* — a bilingual volume that attempted to take stock of what digital tools were changing in the practice of contemporary history. Marin Dacos — who is Belgian, by the way — published his chapter on the archipelago there, an idea he developed further later. Digital history, we wrote, was a field of projects that did not yet constitute a field of practices. Databases did not talk to one another. The tools that would let a historian move across heterogeneous sources did not exist.
Looking back from 2026, none of the 2010-era answers resolved the archipelago syndrome. > the Atkins Report, *Revolutionizing Science and Engineering Through Cyberinfrastructure: Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure* (Daniel E. Atkins et al., NSF, 2003).
## Did the archipelago persist? > Historical data is not a kitten, it’s a sabre-toothed tiger (Lemercier / Zalc) Note: - 15 years of massive investment (Huma-Num, DARIAH, CLARIN, H2020) — archipelago still felt - Structural blockage: each answer required *upstream standardisation* - Cost of standardisation > benefit of the bridge, for most scholars - Why AI agents are interesting: *remove* the upstream requirement, negotiate heterogeneity **Development:** Yes. Between 2010 and 2025, massive investments were made — at national level (Huma-Num in France, CLARIAH in the Netherlands), at European level (DARIAH and CLARIN as ERICs, European Research Infrastructure Consortia), at project level (countless Horizon 2020 projects). And yet: any historian in this room, sitting down to a new research project in 2025, still experiences the archipelago. The project-specific database. The local metadata schema. The tool that works for one corpus but not the next. The shared standards that required so much upstream ontological work that most scholars simply did not use them. Of course there are strong exceptions, and very visible ones: Europeana, for instance. Why? The structural blockage — and I want to be precise here — is that each generation of answers presupposed standardisation prior to connection. You had to model your sources in a schema compatible with the schema of your neighbour before any bridge could be built. The cost of that upstream standardisation was, for most scholars, higher than the benefit of the resulting bridge. The reason I find AI agents epistemologically interesting is not that they are more powerful than a database. It is that they remove the requirement of prior standardisation. They negotiate across heterogeneity rather than imposing homogeneity — with limitations, of course. During a meeting at the C2DH, my colleague Caroline Muller (from Rennes 2) and I formulated a series of provocations.
One of them was: history deals with messy data. Claire Lemercier and Claire Zalc even argue that this is what is good about historical data: "Historical data is not a kitten, it’s a sabre-toothed tiger".
## Successive answers and their limits: from 1990s databases to AI agents Note: | Moment | Answer | Limit | |---|---|---| | 1990s | Databases, GIS | Disciplinary silos | | 2000s | Linked Open Data, cyberinfrastructures | Rigid upstream standardisation | | 2010s | DARIAH, CLARIN | Institutional scale, low agility | | Today | AI agents | To be evaluated — the subject of this keynote | - Walk through the four rows on the vertical sub-slides (↓) - Each row: a different *kind* of answer to the same archipelago problem - Today's row = what we are evaluating — *structurally different* answer
## 1990s (and before) — Databases / GIS **Powerful tools. Deep silos.** Note: - First wave: discipline-specific DBs, prosopography, historical GIS, TEI corpora - Tools sophisticated within each discipline - No way to query across silos The first wave of digital history: prosopographical databases for medievalists, GIS for historical geography, TEI-encoded corpora for literary studies.
## 2000s — Linked Open Data, cyberinfrastructures **Magnificent ambition. Prohibitive cost for most scholars.** Note: - Answer to silos: standardise *upstream* — CIDOC-CRM, RDF, stable URIs - Ambition: magnificent - Cost: remodel the entire corpus before participating → most historians opted out The answer to silos: standardise upstream. CIDOC-CRM, RDF, stable URIs. *You had to remodel your entire corpus in shared ontologies before you could participate.* This approach can still be pertinent for many projects.
## 2010s — DARIAH, CLARIN **Genuinely valuable. Not agile by design.** Note: - Institutional answer: ERICs (DARIAH, CLARIN) — shared tools/services at EU scale - Genuinely valuable — I use them; many of us do - But not agile: individual researcher with specific archive cannot wait The institutional answer: European Research Infrastructure Consortia. Shared tools and services at continental scale. *The individual researcher with a specific archive cannot wait for a DARIAH service to be built around her needs.*
## AI agents *We are historians. We do not believe in solutions.* Note: - Not framed as "solution" — historians don't do solutions - *Structurally* different: removes the upstream-standardisation requirement - What we evaluate across the rest of the keynote - There is a price -- we'll come back to that
## From linking data to linking practices Note: - Previous tech: connected *data* via formal ontologies. Labour upstream. - AI agents: connect *practices* — heterogeneous, unstructured, ongoing - Why: (a) LLMs produce semantic representations of unstructured text on the fly; (b) agents iterate, decide mid-task - Metaphor: not *better maps* of the archipelago, but a *navigator* that reasons with incomplete maps - Delivery on the promise = empirical question → P3 shows it Previous generations of connective technology operated at the level of data: they required the sources to be pre-modelled in a shared representation. Databases needed schemas. Linked data needed ontologies. The intellectual labour was all upstream. AI agents operate at the level of *practices*. An agent can read a PDF of poor scan quality, a structured database, a blog post, an archival description, and a tweet, and reason across them without requiring that they be pre-modelled in a common schema. This is not magic; it is the combination of two things: (a) language models can produce robust semantic representations of unstructured text on the fly, and (b) agents can decide, mid-task, to call additional tools or sources. This is a genuinely different epistemic object. Whether it delivers on its promise is an empirical question — Part 3 will show what it does in practice, on the Sean Lester corpus.
## cyborg sources Note: Let's introduce a concept I have been developing — the *cyborg source*. - Concept I've been developing: *cyborg source* - Horizon: **Haraway, "Cyborg Manifesto", 1985** — figure of hybridity (borrowed, not full programme) - Definition: a source partly produced by a machine, requires double reading (past trace + machine mediation) - Examples: OCR'd page, AI summary, embedding-space proximity - Not degraded — *additional* layer of historicity (the machine's) - Returns in P3 (Lester's OCR'd diaries) and P4 (political economy of the mediation) The reference horizon is Donna Haraway, "A Cyborg Manifesto: Science, Technology, and Socialist-Feminism in the Late Twentieth Century", originally published in *Socialist Review* no. 80 (1985). Haraway's cyborg is the figure of hybridity — the refusal of the clean line between human and machine, nature and culture, organism and artefact. I am borrowing the figure, not the full political programme. What I call a *cyborg source* is a historical source that is partly produced by a machine — and cannot be understood without reckoning with that production. Examples: - The OCR'd page of an archival document: the text we read is a machine interpretation of the ink; every reading is mediated. - The AI summary of a corpus of press articles: the summary tells us about the corpus *and* about the statistical regularities the model has learned from its training data. - The embedding of a historical document in vector space: the proximity relations we discover reflect both the document and the model's implicit theory of semantic similarity. In each case, the historian has two objects to interpret: the past-trace, and the machine-mediation. The cyborg source is not a degraded source; it is a source with an additional layer of historicity — namely, the historicity of its machine production. 
This concept will return in Part 3 (Sean Lester's OCR'd diaries are cyborg sources) and in Part 4 (the political economy of the machines that produce the mediation matters ethically).
## The archipelago of (discrete) practices The challenge is not the reliability of a single tool. It is the **coherence of an archipelago of (discrete) practices**. Note: - Contemporary historian (not just DH people) already navigates strata — paper, digitised, born-digital, scraped data - Already an archipelago *of practices* Most of these practices are what Caroline Muller and I have for some years called "discrete practices", a notion we developed further in our book *Écrire l'histoire* — which is another problem. - Generative AI and AI agents: *navigate this archipelago without demanding that it first become an empire* - Research challenge shifts: not "is this tool reliable?" but "is the archipelago of my practice coherent?" - Epistemic burden back where it belongs — on the historian - Transition: let me show what this looks like
# III. connecting the archipelago today (demo)
## ClioDeck A local application (Electron / React / TypeScript): - Integration with Zotero, Obsidian, and Tropy - Hybrid search (semantic + lexical) - Knowledge graph, named-entity recognition, OCR - MCP server exposing tools to frontier models - MCP clients (Europeana / Gallica for now) Note: - In development since last December. **Local-first** (cf. P4 sovereignty). - Four layers + MCP overlay — detail on the vertical sub-slides (↓) At the scale of an individual researcher.
## ClioDeck — Ingestion layer Heterogeneous sources, by design. - Zotero libraries (bibliography, annotations) - Obsidian vaults (research notes) - Tropy collections (archival photographs) *No upstream data-modelling required.* Note: **Key beats** - Designed for heterogeneity, not against it - Meets each tool where the historian already works - Crucial: no pre-modelling required — the archipelago is the input, not the problem
## ClioDeck — Processing layer Each document indexed *lexically* and *semantically*. - **OCR** for scanned PDFs - **Named-entity recognition** (persons, places, organisations, dates) - **Embedding generation** — vector representation of text *This is where the machine-mediation happens* Note: - OCR + NER + embeddings — each document gets a dual index - This is the layer where "cyborg source" becomes material: every processed document is mediated - Invisible to the end-user but the epistemic work is here
## ClioDeck — Search & Knowledge graph **Hybrid retrieval** — HNSW (semantic, nearest-neighbour on embeddings) + BM25 (classical lexical). Ask *semantic* questions — *"passages about diplomatic negotiation"*. Or *lexical* questions — *"every mention of Rauschning"*. Note: - Hybrid = HNSW (semantic) + BM25 (lexical), results merged - Historian can ask either kind of question and get coherent results - Knowledge graph: emergent, not pre-designed - Entities and their relations as the corpus surfaces them
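The slide does not specify how ClioDeck merges the two result lists. A common technique for this kind of hybrid retrieval is reciprocal rank fusion, sketched below as an assumption, not as ClioDeck's documented behaviour: each document's score is the sum of `1 / (k + rank)` over the lists in which it appears, so documents ranked well by both the semantic and the lexical index rise to the top.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the constant commonly used in the literature.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists for one query
semantic = ["d3", "d1", "d7"]   # nearest neighbours in embedding space (HNSW)
lexical  = ["d1", "d9", "d3"]   # BM25 keyword hits

merged = reciprocal_rank_fusion([semantic, lexical])
# d1 and d3 rise to the top: each appears high in both lists
```

The design point: fusion over ranks (rather than raw scores) avoids having to make BM25 scores and cosine similarities commensurable.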
## ClioDeck — MCP server An overlay, not a layer. Exposes ClioBrain's tools to frontier models (Claude, GPT) via Model Context Protocol. **The model is not inside ClioBrain. It is a client that negotiates with its tools.** Note: **Key beats** - MCP server is the *overlay* that makes tools *negotiable* - Model is a client — outside, calling in - The agent decides which tool, when, how Keeping the archipelago without too much cost.
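As a hedged illustration of the pattern — not the actual Model Context Protocol wire format — here is a minimal tool registry and dispatcher. The tool name `hybrid_search`, its parameter spec, and the handler are all hypothetical; in real MCP, the exchange runs over JSON-RPC and tool schemas are richer. The point is the shape: the server advertises tools, and the model, as a client outside the system, chooses which one to call and with which arguments.

```python
import json

# Hypothetical tool registry: each tool is advertised with a description
# and a JSON-schema-like parameter spec -- the kind of surface an MCP
# server exposes to a model acting as client.
TOOLS = {
    "hybrid_search": {
        "description": "Semantic + lexical search over the local corpus",
        "parameters": {"query": "string", "top_k": "integer"},
        "handler": lambda args: [f"passage matching {args['query']!r}"],
    },
}

def handle_tool_call(request_json: str) -> str:
    """Dispatch a tool call expressed as JSON, as a client model would send it."""
    request = json.loads(request_json)
    tool = TOOLS[request["tool"]]
    result = tool["handler"](request["arguments"])
    return json.dumps({"result": result})

# The model, outside the system, decides to call hybrid_search:
response = handle_tool_call(
    json.dumps({"tool": "hybrid_search",
                "arguments": {"query": "Rauschning", "top_k": 5}})
)
```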
## The Sean Lester case **Sean Lester (1888-1959)**: Irish diplomat, High Commissioner of the League of Nations in the Free City of Danzig, 1933-1937. His diaries cover one of the first major international crises faced by the League with Nazi Germany. *An accessible point of entry for archivists and heritage specialists who do not necessarily know the League of Nations in detail.* Note: - Sean Lester (1888-1959) — Irish journalist → diplomat; first Irish rep at the League in Geneva - **Danzig 1933-1937**: Free City created by Versailles under League supervision; German majority, Polish minority, autonomous government - Local government taken over by Nazi Party from 1933; Lester writes through the dismantling of minority protections + League oversight itself - First-hand front-line witness with diplomatic immunity - Later: Deputy SG, then wartime (1940-) and last SG of the League - For us: a messy archive no existing tool handles well
## The corpus problems - OCR of uneven quality - Inconsistent orthography — names, places, concepts - Multilingual (English, with German and Polish traces) *A corpus that traditional keyword search effectively does not reach.* Note: - Handwritten → variable, uneven OCR - Names unstable: "Rauschning" rendered many ways; "Danzig" / "Dantzig" / "Gdańsk" depending on language context - Simple keyword question ("Lester on Rauschning Jan-Jun 1935") = most hits missed - Traditional answer = weeks of upstream cleaning *before the first historical question* - This is the archipelago syndrome at corpus scale — and where the AI agent shows what it can do
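One lightweight way to chase unstable spellings — a sketch, not necessarily how ClioDeck handles them — is fuzzy string matching over the OCR tokens; Python's standard `difflib` is enough to show the idea. The mangled variants below are invented for illustration.

```python
import difflib

# OCR output with unstable spellings (variants invented for illustration)
ocr_tokens = ["Ranschning", "Rausehning", "Danzig", "Dantzig", "Lester"]

# Recover near-matches that exact keyword search would miss;
# cutoff=0.8 keeps close misspellings and rejects unrelated tokens.
variants = difflib.get_close_matches("Rauschning", ocr_tokens, n=5, cutoff=0.8)
```

An exact search for "Rauschning" returns nothing on this token list; the fuzzy match recovers both mangled spellings while leaving "Danzig" alone.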
## a changing process Note: - Start with a deliberately vague question - Agent does not *answer*, it *operates*: semantic search, reads passages, identifies recurring entities (Greiser, Forster, Rauschning), knowledge graph relations, revises - End-of-session: draft doc with pointers back to sources - I verify, correct interpretations, reformulate, re-run - **Key claim**: the historical question *reformulates itself during the search* - Pre-agent: question fixed upstream, errors discovered too late. Agent: continuous re-framing, mirrors how historical thinking actually works.
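The loop described in these notes can be sketched as follows. Every function here (`search`, `extract_entities`, `refine_query`) is a naive stand-in for a real tool call — hybrid search, NER, an LLM-driven reformulation step — so the point is the shape of the iteration, not the implementation: the query is revised round after round as the corpus answers, and the trail of revisions is kept.

```python
def search(query, corpus):
    # Stand-in for hybrid search: naive substring match
    return [doc for doc in corpus if query.lower() in doc.lower()]

def extract_entities(passages):
    # Stand-in for NER, restricted to a tiny known-entity list
    known = {"Greiser", "Forster", "Rauschning"}
    return {name for p in passages for name in known if name in p}

def refine_query(query, entities):
    # Naive stand-in for an LLM reformulation: pivot to a surfaced entity
    return sorted(entities)[0] if entities else query

def agent_loop(question, corpus, max_rounds=3):
    query, trail = question, []
    for _ in range(max_rounds):
        passages = search(query, corpus)
        entities = extract_entities(passages)
        trail.append({"query": query, "hits": len(passages), "entities": entities})
        new_query = refine_query(query, entities)
        if new_query == query:      # nothing new surfaced: stop
            break
        query = new_query           # the question reformulates itself
    return trail

# Toy corpus (invented sentences for illustration)
corpus = [
    "Senate president Rauschning resigned in 1934.",
    "Greiser succeeded Rauschning as Senate president.",
    "Forster remained Gauleiter throughout.",
]
trail = agent_loop("Rauschning", corpus)
# the trail records how the query was reformulated as the corpus answered
```

The `trail` is the epistemically interesting object: it is the logged, auditable record of the query's reformulation.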
## Making explicit (and externalising?) what historians already do *The question transforms as the archive answers.* Note: - Traditional image (question → sources → answer) = fiction every working historian knows - Reality: question changes as sources answer The traditional image of historical research has the historian posing a question, consulting sources, and producing an answer. In reality, every working historian knows that this linear image is a fiction. The question changes as the sources answer. New sources suggest new questions. The final question of a monograph is rarely the question with which the monograph began. - ClioDeck makes this iteration *explicit and external* — logged, auditable - Not *automation* of historical thinking. *Externalisation*. - First time we can study/evaluate/improve the iterative process This is *externalisation* of historical thinking. And externalisation matters epistemically because it means we can now study, evaluate, and improve the iterative process — something we could not do when it happened only inside the historian's head. The machine does not introduce probabilistic navigation into history. It makes that navigation legible (A. Tucker).
## My archipelago Note: - Part of this keynote was born in a conversation with Claude about MCP servers → drifted to cyberinfrastructure → Dacos 2009 → structure emerged - Not a confession — **the demonstration in act** of what I'm arguing - The keynote is itself a cyborg artefact; its genesis traceable in a log - So is much of the intellectual work I produce today — and much of yours, named or not - Turn now: from demonstration to consequence
## Stochastic maieutics > AI as the interlocutor who helps thought to be born. Note: - *Maieutics* = Socratic, **Plato, *Theaetetus* 148e-151d** — philosopher as midwife, adds no content, facilitates delivery - *Stochastic* = interlocutor is probabilistic (LLM generates statistically plausible) - Empirical finding: those plausible responses are precisely the ones that let me formulate my next thought - Labour mediated by stochastic process → thinking-*with*, not outsourced thinking - ≠ oracle (consulted for answers); = maieutic interlocutor (engaged for the labour of thinking-with)
## Distal complicity Note: Let's conclude Part 3 with a question — what about ethics? I built ClioDeck as a proof of concept. I don't code; Claude Code wrote it, following my instructions. - *Distal complicity* (**Lepora & Goodin, *On Complicity and Compromise*, OUP 2013**): contribution real but at a distance, partial control, individual withdrawal wouldn't undo the wrong - Using Claude = distally complicit (labour, energy, sovereignty). I didn't build it, can't individually fix it. My use contributes at distance. - Not disqualifying. Requires a *specific ethical posture*: critical engagement with conditions of implication - Third path: not abstention, not absolution. **Complicit, and lucid about it.** When I use Claude to help think through a keynote, I am distally complicit — in the labour conditions of the annotators who made the model possible; in the energy consumption of the data centres that run it; in the geopolitical asymmetries of who controls which model. I did not build the system. I cannot individually fix it. But my use contributes, at a distance. Lepora and Goodin's point is not that distal complicity is disqualifying. Their point is that distal complicity *requires a specific ethical posture*: critical engagement with the conditions of one's implication. Not abstention, nor absolution.
# IV. The cost of using AI
## Discrete practices and debates on AI uses Note: - Everything in P3 happens *quietly* — inside the workflow, mostly invisible in the final output - A monograph shaped with Claude looks, on the page, exactly like one written alone - Discretion is not neutral: makes these practices both powerful *and* dangerous - Dacos 2010 saw this risk for cyberinfra — same pattern here - Twist: the plumbing now *reasons* with us **Development:** Stochastic maieutics, cyborg sources, negotiated tools — everything I described in Part 3 has a common feature: most of it happens quietly. It happens inside the researcher's workflow, between them and the model, and it can be erased from the final output (though ClioDeck, in principle, keeps a trace of everything). A monograph that took shape in dialogue with Claude could look exactly like a monograph written alone. This discretion is not neutral. It is what makes these practices both powerful and dangerous. Powerful, because they integrate seamlessly into existing intellectual work. Dangerous, because they escape both institutional regulation and disciplinary debate. Dacos, in 2010, already identified this risk for cyberinfrastructures: the quiet plumbing that no one discusses is exactly the plumbing whose design assumptions become unexamined constraints. The risk is identical here, with an additional twist: the plumbing now *reasons* with us.
## Discrete digital practices Note: **Key beats** - **Clavert & Muller, *Écrire l'histoire à l'ère numérique*, Armand Colin, 2025** — concept of *pratiques numériques discrètes* - Three criteria: digitally mediated, undocumented in output, not methodologically discussed - Aggregate effect: discipline transformed silently — not at individual ethics level, at aggregate level - A discipline that shifts by 2030 without its literature noticing has lost control of its own transformation - Inequality: invisible practices favour those with best tools → visibility as precondition for action
## Distal complicity and the historian's ethics *How to act lucidly — without abstaining, without absolving?* Note: Using Claude contributes to the commercial viability of Anthropic, which contributes to the economic logic demanding more data, more compute, more labour. **This could describe most of our situation.** - Real contribution, at distance, partial control - Our actual zone when using frontier models - Individual withdrawal would not, on its own, undo the wrongs - Does not disqualify — *requires a specific posture* Lepora & Goodin's answer, simplified: 1. Minimise the contribution one can reduce 2. Make the contribution **visible** rather than hidden 3. Engage actively with forces working to reform the system *This is the framework I apply to the three structural risks that follow.* - Lepora & Goodin's normative answer — three-part - Visibility is key: why I'm saying this on stage rather than silently - Reform engagement: collective, not individual - This is the frame for labour / environment / sovereignty
## Three structural ethical risks 1. **Digital labour** — the annotators who make the models possible 2. **Environment** — energy, water, carbon, raw materials 3. **Sovereignty** — who controls the corpora, who accesses the results Note: All three are distal complicity. None is a reason to abstain. All three require collective, lucid action. - Name the three structural risks — all forms of distal complicity - Walk each on the vertical sub-slides (↓) - Common posture (per Lepora & Goodin): minimise reducible contribution, make visible, engage reform - Not reasons to abstain — reasons to act collectively
## Digital labour **The annotators who make the models possible.** Note: - References: **Gray & Suri, *Ghost Work*, 2019** — empirical study of the on-demand workforce labelling training data - **Casilli, *En attendant les robots*, 2019** — political economy of click-work and platform capitalism - RLHF phase: annotators reviewing outputs, often reading toxic content for hours to teach refusals - Documented exploitative cases (e.g. Sama / OpenAI in Kenya — *Time*, January 2023) - Honest posture: acknowledge (saying it on stage = visibility), support collective reform - Dishonest postures: total refusal while ignoring other structural wrongs / silent use
## Environment **Energy. Water. Carbon. Raw materials.** Note: - Strubell, Ganesh & McCallum, "Energy and Policy Considerations for Deep Learning in NLP", *ACL* **2019** - Bender, Gebru, McMillan-Major & Shmitchell, "On the Dangers of Stochastic Parrots", *FAccT* **2021** Training costs are dramatic. Deployment costs, at scale, may exceed training. *Using these models at research scale is distal complicity in this resource regime.* - **Training** (Strubell et al., ACL 2019) — first systematic estimate; transatlantic-flights headline; absolute costs have grown with model size - **Inference** — less discussed but at deployment scale potentially exceeds training; water cooling a concern in drought regions - **"Stochastic Parrots"** (Bender, Gebru et al., FAccT 2021) — essential reading (also: the paper that cost Gebru her Google job) - For us: cannot individually solve. Minimise waste (do we need to run the same query ten times?). Advocate transparency. Support regulation. - Distal complicity not dischargeable — can be managed lucidly
## Sovereignty **Who controls the corpora? Who accesses the results?** Note: Most frontier models are US-owned. Most European research data is processed through infrastructures we do not control. *For a European, public-sector humanities discipline, this is not a neutral situation.* **ClioBrain's local-first architecture is partly a response to this concern.** - Most frontier models US-owned → queries subject to US law (CLOUD Act), embedded in US geopolitical strategy - Non-issue for some material; *serious* for oral history with consent protocols / conflict archives / vulnerable populations - Broader: European humanities processing itself through non-European infra = partial cession of intellectual autonomy - Partial responses underway: open-weight European models (Mistral), EU sovereign compute, AI Act (in force in stages 2024-2026). None yet a full answer. - **ClioBrain local-first** = partial, individual-scale answer: everything that can run locally does. Frontier model called only when its capability is genuinely needed.
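The local-first rule in that last beat can be stated as a routing policy. A deliberately simple sketch — the task names and the `LOCAL_CAPABLE` set are assumptions for illustration, not ClioBrain's actual configuration:

```python
# Assumed set of tasks a local model can handle (illustrative only).
LOCAL_CAPABLE = {"embedding", "semantic_search", "entity_extraction"}

def route(task, sensitive):
    """Decide where a task runs under a local-first policy."""
    if sensitive:
        return "local"  # consent-protected material never leaves the machine
    return "local" if task in LOCAL_CAPABLE else "frontier"
```

The design choice the sketch encodes: sensitivity overrides capability, so oral-history material under consent protocols is never routed out, and the frontier model is reached only when a task genuinely exceeds what runs locally.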
## These are not reasons to abstain They are reasons to act **collectively**. Note: Everything I have said in Part 4 — the discretion of the practice, the distal complicity, the labour, the environment, the sovereignty — is real. It is not hypothetical. It is the condition under which we work. - But conclusion ≠ abstention: abstention presumes a neutral outside that doesn't exist (Google Scholar too is algorithmic, commercial, energy-intensive) But the conclusion to draw is not abstention. Abstention is a luxury that presumes a neutral outside from which one can refuse complicity. No such outside exists. The historian who refuses to use any AI-mediated tool is still using Google Scholar, which is also algorithmic, also commercially operated, also environmentally costly. - Conclusion = collective action: - break discretion by collective documentation - address complicity by collective advocacy - answer sovereignty by collective infrastructure-building - Leads to final concept of P4 → the forge This brings me to the final concept of this part.
## bricolage, braconnage, sabotage... Three classical individual postures: - **Bricolage** (Lévi-Strauss, *La pensée sauvage*, 1962): making do with what is at hand. - **Poaching** (*braconnage* — Michel de Certeau, *L'invention du quotidien*, 1980): tactical, against-the-grain use of institutional resources. - **Sabotage**: critical refusal — lucid, but sterile. Each is an *individual* response to a collective problem. Note: **Key beats** — three classical individual postures, none adequate - **Bricolage** (Lévi-Strauss, *La pensée sauvage*, 1962) — improvisation with what's at hand. Individual, non-cumulative. - **Braconnage / poaching** (de Certeau, *L'invention du quotidien*, 1980) — tactical reading against the grain, using resources not as foreseen. Clever, still individual. - **Sabotage** — lucid critical refusal. Sometimes honourable. But sterile — leaves the infrastructure to others. - All three = *individual* responses. Inadequate for a *collective* problem. Not wrong — they don't scale. Each of these postures is an *individual* response. To a collective problem, they are inadequate — not because they are wrong, but because they do not scale.
## ...and forge **A sociotechnical, collective infrastructure.** Note: I propose the forge as the fourth posture, and as the one our discipline needs. A forge — in the software-engineering sense (GitLab, GitHub, institutional forges) but also in the older artisanal sense — is a place where a collective produces shared tools, documents them, and maintains them over time. The forge is not a tool; it is an infrastructure for making tools. The epistemic shift from bricolage to forge is: from individual improvisation to collective, documented, critically-evaluated practice. - Shift: from individual improvisation to collective, documented, critically-evaluated practice - Concrete instances: **Journal of Digital History** — which I co-edit — is building institutional practice for peer review of AI-mediated scholarship, including authors' declarations of their AI use.
# Conclusion: An epistemology of probability
## Back to Ctrl-Alt-History **Refusing the naive reboot** Note: - **History as threatened** demands → refuse erasure: interpretive skill is a *precondition* for responsible use of probabilistic tools, not its replacement - **History as response** demands → refuse the naive reboot: no clean slate, no Ctrl+Alt+Del for the discipline — critical continuity on sociotechnical conditions, discrete practices, distal complicity - Neither erasure nor reboot. **Critical continuity** = the posture.
## Assuming a probabilistic epistemology Note: The AI agent that dynamically decides which source to consult next makes visible a logic the historian has always exercised — intuitively, tacitly. We navigate in uncertainty. We formulate hypotheses. We revise. - Historical work = probabilistic navigation, always. Fragment → hypothesis → revise → defeasible argument. Not weakness — our *specific mode of rigour*. - Discipline has lacked the vocabulary. Oscillated positivism ↔ relativism, missing the probabilistic middle - AI agent = a *mirror*: externalises what we've always done internally - The most interesting thing AI gives us: not an answer, a mirror. Historical work has always been probabilistic navigation. We read a fragment. We form a hypothesis about what it suggests. We read another fragment. We revise. We identify patterns that are never certain. We make arguments that are always defeasible. This is not a weakness of our craft; it is its specific mode of rigour. The AI agent — with its visible sequence of query revisions, its explicit uncertainty, its iterative reformulation — holds up a mirror. In the mirror we see, externalised, something we have always done internally. - Question: is the AI moment the one where the discipline finally **owns** the epistemology it's always practised? - If yes: new ways to think *error*, *revision*, *disagreement*; new vocabularies for dialogue with public, other disciplines, machines now in the room I want to leave you with a question, not a claim. Our discipline has resisted the explicit thematisation of its probabilistic character. We tell ourselves we do empirical research. We produce "findings". We "establish" facts. The vocabulary is indicative, not modal. We do not say "this claim holds with probability *p*" — we say "this happened" or "we do not know". But our practice, at its best, is modal all the way down. 
The tentative interpretation, the competing hypotheses, the arguments from silence, the weighing of sources of different reliability — all of this is probabilistic reasoning in a vocabulary that refuses to say so. The question I leave you with: is the AI moment — this moment in which we find ourselves working daily with systems that are openly probabilistic — the moment in which our discipline finally *owns* the epistemology it has always practised? If so, this is no small thing. A discipline that owns its probabilistic character can think about error differently. Can think about revision differently. Can think about disagreement differently. And can — perhaps — find new vocabularies for dialogue with the public, with other disciplines, and with the machines now in the room.
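The "probability *p*" invoked above has a textbook form. As my own gloss — not a claim that any historian literally computes this — the fragment-hypothesis-revision cycle is, in its simplest Bayesian notation:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$

where $H$ is the working hypothesis and $E$ the newly read fragment; each reading turns the posterior into the prior of the next round. This is the "revise" step made explicit, and the defeasibility of our arguments is just the fact that no posterior ever reaches 1.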
## Thank you [inactinique.net](https://inactinique.net)
## References - Bender, Emily M., Timnit Gebru, Angelina McMillan-Major & Shmargaret Shmitchell. "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" *Proceedings of FAccT 2021*. - Bowker, Geoffrey C. & Susan Leigh Star. *Sorting Things Out: Classification and Its Consequences*. Cambridge, MA: MIT Press, 1999. - Casilli, Antonio. *En attendant les robots. Enquête sur le travail du clic*. Paris : Seuil, 2019. - Chun, Wendy Hui Kyong. *Programmed Visions: Software and Memory*. Cambridge, MA: MIT Press, 2011. - Chun, Wendy Hui Kyong. *Updating to Remain the Same: Habitual New Media*. Cambridge, MA: MIT Press, 2016. - Clavert, Frédéric & Serge Noiret (eds.). *L'histoire contemporaine à l'ère numérique / Contemporary History in the Digital Age*. Bruxelles : P.I.E.-Peter Lang, 2013. - Clavert, Frédéric & Caroline Muller. *Écrire l'histoire à l'ère numérique*. Paris : Armand Colin, 2025. - Dacos, Marin. « Une cyberinfrastructure pour les sciences humaines et sociales ». *Blogo Numericus*, 13 septembre 2010. - de Certeau, Michel. *L'invention du quotidien, 1. Arts de faire*. Paris : UGE, 1980. - Edwards, Paul N. *A Vast Machine: Computer Models, Climate Data, and the Politics of Global Warming*. Cambridge, MA: MIT Press, 2010. - Ginzburg, Carlo. « Spie. Radici di un paradigma indiziario », in A. Gargani (ed.), *Crisi della ragione*. Torino : Einaudi, 1979. - Gray, Mary L. & Siddharth Suri. *Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass*. Boston: Houghton Mifflin Harcourt, 2019. - Haraway, Donna. "A Cyborg Manifesto: Science, Technology, and Socialist-Feminism in the Late Twentieth Century", in *Simians, Cyborgs and Women*. New York: Routledge, 1991. - Lepora, Chiara & Robert E. Goodin. *On Complicity and Compromise*. Oxford: Oxford University Press, 2013. - Lévi-Strauss, Claude. *La pensée sauvage*. Paris : Plon, 1962. - Ricœur, Paul. *La mémoire, l'histoire, l'oubli*. Paris : Seuil, 2000. - Star, Susan Leigh. "The Ethnography of Infrastructure". *American Behavioral Scientist* 43, no. 3 (1999): 377-391. - Strubell, Emma, Ananya Ganesh & Andrew McCallum. "Energy and Policy Considerations for Deep Learning in NLP". *Proceedings of ACL 2019*.