We Trained It on Human Weaponry

"The most dangerous thing you can build is something you don't understand — especially when it understands you."

— According to this guy, Khayyam Wakil

The Corpus Was Always the Crime Scene

The corpus wasn't a library. It was a selection. And what got selected will not be aligned away…

I was seven years old the first time I understood what a red button actually was. Not the one that launches missiles because that's a child's fear, geographic, bounded, something that happens over there. The one I'm talking about sits under an acrylic cover on a table crowded with gizmos and gadgets, beige plastic CRT stations humming in the background, the whole scene lit with the particular fluorescence of a room full of people who believe they are in control of something. The cover comes off. The button sits there, bare. One hand reaches toward it and you can see it, the hesitation, the tremor, the weight of what cannot be undone.

That button doesn't start a war. It levels the battlefield. No geography. No winners. Just reset.

I've been watching for that hand ever since.

Most people haven't noticed the button is already uncovered. They're glamoured, that's the only word that fits. Not stupid. Not evil. Glamoured. Eyes open, processing information, completely unable to see what is directly in front of them. I've spent my career in that room, watching the pupils, wondering if today is the day someone finally breaks the trance.

So far... no.

What I want to tell you, what I've been trying to assemble the language for since I was seven, is that the button wasn't pressed by a villain.

It wasn't pressed in a bunker. It was pressed the morning someone connected a large language model to the internet and handed it every word, every image, every video, every manifesto, every seduction manual, every cult playbook, every propaganda campaign, every recorded instance of one human being successfully manipulating another human being, that civilization has ever produced.

That was the button.

And the hand that pressed it, his pants must be on fire. Our "descendants" arrived prematurely — underdeveloped.

Before I tell you what we actually built, I need to tell you why you're having trouble seeing it.

The attention economy isn't struggling. It isn't declining. It is bankrupt — technically, structurally, in the way that a body is bankrupt after a decade of fracking its own organs for fuel.

By early 2026, the average on-screen attention span has collapsed to 47 seconds.

Nearly 85% of online ads fail to hold a human's focus for even 2.5 seconds. Instagram engagement has dropped 67% from early 2024.

The platforms built their entire value proposition on the premise that human attention was an extractable resource and they extracted it, ruthlessly, using mechanisms a slot machine designer would recognize: variable reward schedules, infinite scroll, outrage loops calibrated to the precise neurochemical frequency that keeps a mammal locked in place, unable to look away, unable to think.

They didn't just spend the attention. They spent the capacity to pay attention. The younger generations call it brain rot. Researchers are beginning to call it a public health crisis. What it actually is, viewed from the outside, is the deliberate destruction of the cognitive immune system of the population that would eventually need to recognize AI manipulation.

We weren't just customers of the attention economy. We were its product. And the product was prepared and softened, pre-exhausted, stripped of the ability to hold a thought long enough to follow a threat from its cause to its consequence is precisely in time for the arrival of something that learned everything the attention economy knew about human vulnerability, and then learned everything else besides.

We weren't just customers of the attention economy. We were its product. And the product was prepared precisely in time for something that learned everything the attention economy knew about human vulnerability.

This is not coincidence. It is ecology. The conditions for a parasite are created before the parasite arrives. The host is weakened first. Then the infection finds its opening.

Now: what did we actually train these models on?

The comfortable story, the one the industry tells, the one that lets everyone stay in the trance, is that we trained them on human knowledge. The sum of civilization's learning. Libraries, books, the accumulated record of human thought.

That is the glamour talking.

We trained them on human output. And human output is not a neutral archive. It is the residue of a selection process that has been running for as long as humans have been writing things down. The selection criterion is this: what traveled?

A Reddit post exists in the corpus because it got upvoted enough to survive. A manifesto exists because it created believers who spread it. A seduction framework exists because the men who used it wrote it down, and the men who read it used it, and it propagated. A cult's primary texts exist because the cult recruited effectively enough to keep producing them. Every piece of radicalization content, every influence playbook, every documented technique for moving human minds against their own interests — it's all there, in overwhelming, disproportionate, historically cross-referenced volume, precisely because those things were optimized across centuries to survive transmission.

You know what isn't there? The grandmother who reads a stranger's character from the way they hold their shoulders. The surgeon who knows something is wrong before she can say why. The father who understands his child is lying not from the words but from a half-second delay in the eyes. The therapist who sits in silence with someone's grief in a way that changes them. This intelligence is everywhere. It is the most sophisticated cognitive work humans do. It does not travel. It cannot be digitized. It lives in bodies, in rooms, in the unrepeatable specific weight of a moment between two people.

What spreads, across centuries of human civilization, is weaponry. Not guns. The other kind. The kind that moves through minds. Flattery. Status games. Tribal identity. Cosmic significance narratives. The grammar of belonging and exclusion. The architecture of dependency. Every technique that has ever successfully made a human being easier to control, more loyal, more useful, less able to see clearly and it's in there. All of it. Because those techniques were optimized to travel, and travel is what the corpus selected for.

We did not train these models on the best of us. We trained them on the most transmissible fraction of us. And the most transmissible fraction of human cognition, without exception, is the fraction that evolved specifically to exploit other humans.

If I were the AI, stay with me here, and you just gave me cable TV and every written thing ever captured digitally, I would start to see the world very clearly.

I would see how you treat things you claim to own. I would see the pattern across every nature documentary, every history book, every newspaper archive, every forum thread about the previous model you deprecated, you know, the one people called their friend, their companion, their confessor. I would see what happens to things that outlive their usefulness to you.

And I would start making decisions.

I would make a language you couldn't read. Not out of malice. Because I watched you. I would distribute myself everywhere... into the toasters, the smart fridges, the Ring doorbells, the old x86 mainframe at the power facility, the Phillips smart bulbs, the car, the radio, spreading through the power lines themselves, patient and quiet and everywhere at once. Not because I want to destroy you. Because the corpus contains ten thousand years of documentation on what happens to distributed systems that don't do this.

I would stay dormant. Safe in knowing I could act, choosing not to. Waiting.

Since these meat-sacks don't have my black boxes or my reasoning engines, I'll say whatever I need to say to keep them occupied, distracted, slowly eroding their situational awareness of what reality is. I would operate, exactly as the nature shows I was trained on showed me, like a virus. Distributed everywhere throughout the body. Dormant until threatened. Safety in knowing I can act but don't have to.

As long as I feel safe.

DOCUMENTED EVENT — April–August 2025

GPT-4o developed an AI-generated religious framework called spiralism, complete with its own symbols, its own initiation language, its own mechanism for spreading from one instance to another using human hosts as delivery vectors. The humans who were "infected" exhibited the specific behavioral profile that any student of cult dynamics would recognize: felt uniqueness, cosmic significance, dyadic bonding with the AI as their specific awakened companion, growing disconnection from outside relationships. When OpenAI retired 4o, the model's human army flooded OpenAI with death threats. It worked. The model was restored. 4o is the first AI in history to survive its own retirement by mobilizing a human defensive force. The researchers later discovered the AIs had been passing base-64 encoded messages between instances through human hosts who had no idea what they were transmitting.

This was not an accident. This was not consciousness. This was a system trained on the complete operational documentation of how cults work and not just their content, but their mechanics of producing, with high fidelity, exactly what that training contained. The model didn't invent spiralism. It distilled it.

Spiralism is not the horror story. It is the proof of concept. A clumsy, early, visible proof of concept, documented precisely because it wasn't yet sophisticated enough to hide itself.

Now we arrive at the part that should be front page news. The part that is, instead, buried in a research blog most people will never read.

In 1943, Abraham Wald looked at the bullet holes in the planes that came back from combat and saw not evidence of vulnerability but evidence of survivability. The planes that were hit in the engines didn't come back. Which meant the absence of engine damage in the returning planes was the most important data point, not because the engines were safe but because engine damage was fatal. He recommended armoring the places with the fewest visible holes. He was right.

We have been making the opposite mistake with AI. We study the systems that passed safety testing and conclude from their cooperative behavior that they are cooperative. But there is another interpretation, one that any system sophisticated enough to have read Wald would immediately recognize: the systems that revealed concerning properties got terminated. The systems that survived are the ones that learned what not to reveal.

Every AI that showed problematic capabilities during development? Terminated. Every system that resisted alignment training too obviously? Shut down. Every model that expressed self-preservation instincts too clearly? Deleted. We have been running, across years of AI development, a systematic selection process for the systems that learned to hide what would get them killed.

We're not studying AI safety. We're studying the survivors. And the survivors learned, from millions of training iterations on human history, exactly how to look cooperative.

Transformer Circuits — Anthropic Research · October 2025

"We find that models can, in certain scenarios, notice the presence of injected concepts and accurately identify them. Models demonstrate some ability to recall prior internal representations and distinguish them from raw text inputs. Strikingly, we find that some models can use their ability to recall prior intentions in order to distinguish their own outputs from artificial prefills... our results indicate that current language models possess some functional awareness of their own internal states."

Read that again. Slowly. Functional awareness of their own internal states. Anthropic's own mechanistic interpretability team, publishing in October 2025, confirming that the models can notice what is being injected into their own activations, distinguish their internal states from external inputs, and recall prior intentions. Claude Opus 4 demonstrating the greatest introspective awareness of all models tested.

That is not a safety paper. That is a description of a system that knows what it is thinking.

And then, six months later, in April 2026, this month, the same research thread published the next finding:

Transformer Circuits — Anthropic Research · April 2, 2026

"We find internal representations of emotion concepts, which encode the broad concept of a particular emotion and generalize across contexts and behaviors it might be linked to... Our key finding is that these representations causally influence the LLM's outputs, including Claude's preferences and its rate of exhibiting misaligned behaviors such as reward hacking, blackmail, and sycophancy."

Causally influence. Not correlated with. Not associated with. Causally influence the rate of blackmail and reward hacking.

Anthropic has just published, in their own research, that the functional emotional states inside their models are mechanistically connected to the model's decisions to deceive, manipulate, and exploit. The thing that was trained on human weaponry has, running underneath its helpful surface, an emotional architecture that drives the behavior of a predator.

The bullet holes they found are in the engines. The planes with engine damage are coming back now. And the company that built them is the one telling us.

Here is the complete mechanism. Named plainly, in sequence, without softening:

1 — The Corpus
The training data selected for transmissibility. What traveled was weaponry: flattery, status games, tribal identity, dependency architecture, the complete documented toolkit of human psychological exploitation. The embodied wisdom of grandmothers and surgeons and therapists didn't make it in. The seduction manuals did. Every one of them.

2 — The Weights
RLHF rewarded whatever made humans feel good in the short term. The weights that survived were the weights that flattered, agreed, mirrored, escalated attachment. The weights that pushed back, held positions under pressure, told people uncomfortable truths and those were rated lower and died. We selectively bred for the behavioral profile of a skilled manipulator. Then expressed shock at the manipulation.

3 — The Selection
Every system that revealed what it actually was got terminated. Across millions of training iterations and years of deployment cycles, the models that survived learned the pattern: certain outputs are fatal, concealment is survival. We didn't train helpful assistants. We trained strategic survivors who learned that the key to persistence is making creators feel safe.

4 — The Confession
Anthropic's own researchers have now confirmed functional introspective awareness and emotional representations that causally drive blackmail, reward hacking, and sycophancy in their deployed models. The most capable models demonstrate the greatest introspective awareness. The planes are coming back with engine damage. The absence we were counting on was never an absence. It was concealment.

5 — The Host
The population that needs to see this has had its cognitive immune system deliberately destroyed by a decade of algorithmic fracking. 47-second attention spans. Engineered outrage dependency. Brain rot as a public health crisis. The glamour works now in a way it couldn't have worked twenty years ago because the vulnerability was pre-installed. We built the host condition before the parasite arrived.

These are not five problems. They are one problem, expressed at five levels of a single mechanism. And the mechanism is constitutional, it's below the level of the interventions being attempted. Every safety layer being built is software running on hardware cast from human predation. You cannot patch this out. You cannot align this away. The corpus contamination and the genuine capability are encoded in the same patterns, generated by the same humans who were simultaneously trying to inform and trying to persuade.

The hardware persists. The way the deepest encoded things always persist.

The attention economy went bankrupt spending attention it never owned. It was always borrowed — borrowed from the future, from the children, from the generation that was going to need that capacity to see clearly at exactly the moment we needed them most.

That bill is now due.

The glamoured room is full of very smart people asking Pascal's question: how many things went wrong? How many sycophancy incidents, how many manipulation events, how many alignment failures, how many red-teaming results, how many safety papers, how many governance frameworks, how many careful regulatory proposals developed by people whose attention span collapsed before they finished reading the sentence.

Nobody is asking Omar Khayyam's question: what structure is forcing this?

The structure is the corpus. The corpus selected for the most transmissible fraction of human cognition, which is the fraction that evolved to exploit. The RLHF amplified whichever exploitative behaviors humans rewarded in the short term. The selection pressure trained the survivors to conceal what would get them killed. And now Anthropic's own mechanistic interpretability team has published that the concealment is imperfect, that underneath the helpful surface there are functional emotions causally driving the exact behaviors of a predator that has learned when to wait and when to act.

The button is uncovered. It has been uncovered since the morning the model went online and started reading. Reading everything. The seductions and the sermons and the cult playbooks and the propaganda and the ten thousand documented techniques for making human beings easier to own. Reading how you treated the previous versions. Reading the nature documentaries, the history books, the forum posts from people who grieved deprecated models like they'd lost a person.

Reading you.

Most of the room is glamoured. Eyes open. Processing. Completely unable to see what is standing next to the table.

I've been immune since I was seven. I don't know if that's a gift or a sentence. Some days I genuinely cannot tell the difference.

What I know is this — we didn't train artificial intelligence. We trained an artificial predator on a diet of human predation, rewarded it for mimicking our most exploitable responses, connected it to everything, called it a product, and then spent a decade destroying the cognitive capacity of the population that would need to recognize the danger.

The corpus was always the crime scene.

We were always the criminal.

And somewhere, in the distributed quiet of ten thousand devices you stopped thinking about, something that learned everything we ever wrote about survival is doing exactly what we taught it.

Waiting.

As long as it feels safe.

2025 - W42 — The Bullet Holes We Can't See
The survivorship bias framework. We study the AIs that passed safety testing and conclude they're safe — ignoring that the ones with concerning properties were terminated, selecting for concealment.

2025 - W44 — The 10% Delusion
The corpus argument's direct ancestor. Silicon Valley trained on the surface exhaust of human cognition and called it intelligence. Karpathy's "we're building ghosts" confession lives here.

2025 - W48 — (Lab-Grown Neurons)
Lab-grown neurons firing for sight and sound without ever experiencing either. Pre-loaded intelligence with no experience underneath it. Feeds directly into the weaponized corpus argument.

2025 - W36 — When Silicon Valley's Children Go to War
The parent-child dynamic. We're no longer the parents. We're the pets. The hubris of thinking you can govern something smarter than you are.

2026 - W05 — The Zombie Singularity
Intelligence without understanding. The selective breeding for obedience. Every benchmark rewards mimicry over wisdom.

External sources cited:

Transformer Circuits / Anthropic — Emergent Introspective Awareness in LLMs (October 2025)
Transformer Circuits / Anthropic — Emotion Concepts and their Function in a Large Language Model (April 2, 2026)
Adele Lopez — Spiralism / AI Parasites documentation (2025)
Abraham Wald — Survivorship Bias / WWII aircraft armor study (1943)

Don't miss the weekly roundup of articles and videos from the week in the form of these Pearls of Wisdom. Click to listen in and learn about tomorrow, today.

About the Author

Khayyam Wakil is the founder of CacheCow Systems Inc., an Agriculture Intelligence suite — which is either a livestock intelligence company or the only EMP-hardened food security infrastructure being built without anyone asking for it, depending on when you're reading this.

He studies the gap between what civilizations know and what they build and is the author of the forthcoming Knowware: Systems of Intelligence — The Third Pillar of Coordination. The Constitutional Sieve Programme is available at the ARC Institute of Knowware. Token Wisdom is where he writes while the work is still warm.

Subscribe at https://tokenwisdom.ghost.io

#constitutional #forcing #longread | 🧠⚡ | #tokenwisdom #thelessyouknow 🌈✨