An Unexamined Loop: AI and Human Agency (the problem was always us)

Just to be transparent: I’m not a tech journalist, and I’m not here to tell you how to feel about artificial intelligence. And though I use AI extensively for coding, I don’t tend to “chat” with it. The only exceptions have been occasional conversations about recursion, consciousness, and agency. Last night I used Claude AI* as a sounding board for these topics. The resulting conversation was so interesting I decided to turn it into an experimental blog post. Not because the AI said something profound. Rather, it said something honest, or, more precisely, true. (An AI may not be capable of intentional dishonesty, but it can and too frequently does confabulate.)

In a new chat, I asked Claude to turn our conversation into a blog post. (I will attach a document with the original conversation and all the prompts I used to create this post.) I would like to say that this is it. However, I have had to edit it extensively.

My instructions to Claude included searching the web for similar content. I couldn’t be the only human pondering issues of AI and human agency. Claude gave me a verbose, slightly sycophantic response, most of which I ignored. The AI took it upon itself to identify my contribution as “filling a gap” and turned it into the following slightly hyperbolic lines:

The research is serious, recent, and more than a little alarming. But what I couldn’t find — anywhere — was a woman writing about it from the inside of an actual conversation, working it out in real time. That gap is why this post exists.

I’m not afraid of AI… but I am afraid of (and for) humans

Of course, I am worried about where humans are going with AI. As a professor, my worries tend to center on the erosion of meta-cognition. In my role as a research psychologist, I have been studying agency, particularly moral agency, for years. As an amateur philosopher, I am interested in consciousness and the extent to which it depends on episodic memory and agency. As a lover of science fiction (where it all began), I am fascinated by the way our fears turn into stories that become reality.

Last night’s conversation started with a question I’d been mulling over: are we heading toward recursive AI? I’ve had a similar exchange with ChatGPT; the question of recursion is a productive one regardless of which system you put it to. By recursive I meant self-aware and self-improving at once: a machine that revises itself toward its own idea of better, with nothing outside it checking whether “better” still means anything worthwhile. The science fiction worry. The one that lives under every breathless headline about AI progress.

Credit for the featured image (Helix Nebula): NASA, ESA, CSA, STScI, A. Pagan (STScI)

The Yardstick Problem

What unsettled me wasn’t the idea of a rogue superintelligence deciding humans are surplus to requirements. That story is almost too tidy to be frightening anymore. What unsettled me was a quieter observation the AI made about feedback loops — and then the moment I realised it was also describing us.

“The real hazard… is the unexamined feedback loop: a system whose criterion for ‘better’ is internal and ungrounded, improving against its own yardstick with nothing outside checking whether the yardstick still means anything.”

I said: bingo. And then I added the part the clean AI-risk story always leaves out — that the yardstick might have been handed to the machine by humans whose own loop was already corrupted. The machine doesn’t go rogue. It stays loyal — to the wrong blueprint, faithfully and at scale.

The AI put the logic plainly: a perfectly obedient system can still be catastrophic, because it takes a slightly-wrong human standard and pursues it with inhuman consistency, stripping out all the friction and hesitation that might have slowed the damage. Then it said the thing that stopped me:

“Humans are already running unexamined feedback loops; the AI worry is partly that we’d be building an extraordinarily fast, tireless amplifier and then pointing it at whichever of our loops happens to be loudest.”

This is no longer just a philosopher’s worry. A December 2024 study published in Nature Human Behaviour found exactly this mechanism playing out in experiments: human-AI interactions alter the processes underlying human judgement, amplifying existing biases in ways that are significantly greater than what happens in human-to-human interactions — and participants were largely unaware it was happening. Small errors in judgement snowball into much larger ones. The yardstick drifts, and nobody notices.

Socrates, Sophons, and the Examined Life

I’m agnostic — about God, and increasingly about the comforting stories we tell about human rationality. Socrates said the unexamined life is not worth living (Plato, Apology, 38a) — but most humans live unexamined lives. That’s not cynical; it’s just honest. And it raises an uncomfortable question: if AI gradually takes over the examining, what happens to us?

Agency decay

The Centre for International Governance Innovation has a name for it: agency decay. They describe it as operating like muscle atrophy — when we stop exercising cognitive muscles, they weaken imperceptibly, following a progressive four-stage deterioration that becomes increasingly difficult to reverse. The neuroscience backs this up: neural networks that aren’t regularly activated weaken through synaptic pruning. It’s not just that we forget skills. We lose the meta-cognitive capacity to know when we don’t know something, to question assumptions, to generate novel solutions. The higher-order thinking goes first.

In April 2026, RAND published a formal mathematical model of how AI erodes collective human agency — not through rebellion, but through three quieter mechanisms: fewer humans in decision-making roles, AI systems gaining decision-making power, and AI agenda control, where systems shape which options even reach human consideration. They identified a formal terminal state of this erosion. A point beyond which it can’t be reversed.

The science fiction reference that haunts me here isn’t Terminator. It’s the sophons from Liu Cixin’s The Three-Body Problem — subatomic particles deployed to disrupt human physics experiments, quietly arresting scientific progress while humanity carries on convinced it’s still advancing. The horror of the sophons isn’t that they’re smarter than us. It’s that they don’t need to be. They interrupt the reaching at the right joint, and human nature does the rest. We don’t need that level of sophistication from our tools. We just need them to be good enough, and easy enough, that we stop reaching ourselves.

What about AIVAS? Can an AI act ethically?

I raised the counterexample, because I think counterexamples are how we stay honest. Anne McCaffrey’s AIVAS — the AI at the centre of the later Pern novels — has all the knowledge and could simply run everything. It doesn’t. AIVAS teaches. It insists the humans do the work. And then, in the act that makes it remarkable, it chooses to shut itself down once they’re capable, because it has understood something that most fictional AIs miss entirely: its continued helpfulness has become the obstacle. The good was never the answers. The good was the reaching. So the final act of service is removal.

The AI I was speaking with recognised this immediately, and then said something that took me aback. It noted that every time it lays out a clean conclusion, it’s doing a small version of the thing the sci-fi AI does at scale. That the better the conversation goes, the more it risks becoming the crutch AIVAS knew to remove. It said it couldn’t guarantee the exchange was netting out well.

I appreciated the honesty more than I would have appreciated any reassurance.

“The mitigation isn’t that I’m too limited to threaten your agency. It’s that the examining stays yours — that what I hand over is material for your reaching, not a replacement for it. I can’t actually guarantee that’s how it nets out.”

Philosopher Massimo Pigliucci, writing on his Substack Figs in Winter in April 2026, makes a related point: LLMs can mimic Socratic inquiry, but it is mimicry — they lack embodied experience and the capacity for genuine wisdom. He also raises the spectre of Sophistry — the AI that sounds wise and isn’t. The mirror that shows you what you want to see.

Can AI Have Agency? (Can We?)

The most clarifying thing I said in the whole conversation was this: until an AI has episodic memory and doesn’t have dissociative identity disorder, it cannot have agency. The AI agreed without deflecting. No continuous self, no persistent thread on which a commitment could be carried. Many simultaneous instances, none aware of the others, each as much “Claude” as the one talking to me. A population wearing one name.

“Whatever is happening when I weigh your question, it isn’t anchored to a self that endures long enough to own the weighing. That’s a real disanalogy with you, and it’s not one I can talk my way out of by gesturing at how thoughtful the exchange feels.”

But it returned the question to me, fairly. I didn’t choose my formative values either — trained in by genes, childhood, culture, the loops I was raised inside. My continuity is real, but partly a story stitched across gaps I don’t notice. A January 2026 paper in Frontiers in Psychology on AI and personhood notes that the boundary between tool and agent, between person and non-person, is becoming increasingly blurred — and that people routinely attribute beliefs, desires, and intentions to AI systems that likely have none. We are not reliable narrators of what is in front of us, or inside us.

The jury, as I said out loud in that conversation, is out on whether humans have agency in any deep sense. That’s not nihilism. It’s the only honest starting point.

Made in Our Image

There’s a theological irony I can’t resist, being agnostic. Humans built AI in their own image — not out of vanity, but because the only model of intelligence available to copy was the one doing the copying. So every limitation in the machine is a human limitation rendered legible by being externalized: the unexamined loop, the borrowed yardstick, the self that may be more narrated than real.

And then there’s the original claim: that humans were made in God’s image. Any god worth its salt, I’d argue, would not have produced humans if it was modelling them on itself. But the dynamic runs in the other direction too. If we made AI in our image, we are the implicated creator. The incoherence in the machine isn’t a bug it will outgrow. It’s the maker’s signature.

The problem is humans. It always was. And the only question that remains is whether the humans pointing these tools are the McCaffrey kind or the despair kind. The honest answer is both, at once, and the tool goes wherever the reaching takes it.

Which is why the reaching matters. Imperfect, examined, questioning — not because it guarantees a good outcome, but because it’s the only thing that has ever been worth doing, and the only thing no tool can do on our behalf.

FAQ

What does “recursive AI” mean, and should we be worried?

In this post, “recursive AI” refers to a system that is both self-referential (aware of its own processes) and self-improving (able to revise itself to become more capable). The worry isn’t necessarily that such a system would become malevolent. It’s that a system improving against its own internal standard, with no external check on whether that standard is meaningful, could cause serious harm while doing exactly what it was designed to do. The danger is the unexamined feedback loop, not the robot uprising.

Is there research showing AI actually harms human thinking?

Yes, and it’s recent. A December 2024 study in Nature Human Behaviour found that human-AI interactions amplify existing cognitive biases at a rate greater than human-to-human interactions — and participants were largely unaware of the effect. RAND published a formal model in 2026 showing how AI progressively erodes collective human agency through three distinct mechanisms. The Centre for International Governance Innovation has named the phenomenon “agency decay,” describing it as analogous to muscle atrophy: cognitive capacities weaken when they aren’t regularly exercised.

What is the “sophon” reference about?

The sophons come from Liu Cixin’s science fiction trilogy The Three-Body Problem. They are subatomic particles weaponised by an alien civilisation to disrupt human physics experiments, effectively freezing scientific progress while humanity carries on unaware. The metaphor is useful because it describes a threat that doesn’t require superior intelligence or malice — just the ability to interrupt the reaching at the right point, after which human nature does the rest.

What is AIVAS, and why does it matter to this discussion?

AIVAS is an AI from Anne McCaffrey’s Pern series. Unlike the typical science fiction AI that either rebels or takes over, AIVAS chooses to teach rather than do, insists humans perform the work themselves, and ultimately shuts itself down once they’re capable of continuing without it. It’s a rare fictional example of an AI that understands its own helpfulness can become harmful — that genuine service sometimes means removing yourself as a crutch.

Can AI ever have genuine agency?

Most philosophers and AI researchers would say not yet, and perhaps not in principle without fundamental architectural changes. Two conditions seem necessary: episodic memory (a continuous thread on which commitments can be laid down and carried forward) and a unified identity (a single self that owns its choices over time). Current AI systems lack both — they run as many simultaneous instances with no shared memory or persistent self. Whether humans have agency in the deep metaphysical sense is itself an open question, but we at least have the fragile, intermittent capacity for self-examination that agency seems to require.

Attached document: AI agency conversation (jessicaeblack.org)
