"The first principle is that you must not fool yourself – and you are the easiest person to fool."

— Richard Feynman, a physicist who knew something about self-deception, 1974 Caltech commencement address

The Hidden Flaw in AI's Foundation: Why Attention Isn't Enough

The AI industry's favorite paper might be its biggest blind spot


The Attention That Broke the World

It began with hubris. In 2017, eight Google researchers published a paper with a title so confident it became scripture: "Attention Is All You Need." The paper introduced transformers—an elegant architecture for processing sequences. Within five years, this single paper spawned a trillion-dollar industry, reshaped global technology strategy, and set humanity on a collision course with its own overconfidence.

But here's the thing nobody wants to say out loud: we misread it.

Not the math. Not the architecture. We misread what it meant.

Somewhere between publication and implementation, the tech industry performed a catastrophic sleight of hand. We took a paper about mechanism—a way to process sequences—and transformed it into a philosophy about sufficiency. We declared that attention, this one elegant algorithm, was all we needed to build artificial general intelligence.

It wasn't. It isn't. And the consequences of this misreading are civilization-scale.

This misreading hasn't just affected technical architecture—it's fundamentally distorted our understanding of what intelligence requires.


Welcome to the 10% Economy

There's a number that should terrify every AI investor: 10%.

Human communication is only 10% verbal. The other 90%? That's prosody—how we say things. Microexpressions lasting milliseconds. Body language signaling everything from threat to attraction. Environmental context that completely changes meaning. The timing and rhythm that separate confidence from doubt.

It's the full experience of being physical creatures moving through actual space.

Silicon Valley looked at this rich symphony of human interaction and decided to keep the sheet music while throwing away the entire orchestra.

So they built billion-parameter models on text dumps from Reddit and Wikipedia. They cheered when these systems got better at autocomplete. They designed benchmarks that confirmed what they wanted to believe.

Then came the declaration heard in every boardroom from Palo Alto to Beijing:

"We've cracked intelligence!"

They haven't. Not even close. What they've actually built amounts to very sophisticated autocomplete trained on transcripts—the shadow of intelligence, not intelligence itself.

Imagine someone navigating Manhattan by reading only street signs. Never looking at traffic, pedestrians, construction, weather. Never feeling the weight of their backpack or the blister forming on their heel. Then imagine them claiming they've mastered urban navigation.

That's essentially what we've done with AI.

And just as our hypothetical navigator would eventually hit a wall—literally—our AI systems are already crashing into reality's complexity.


The Cathedral Built on Sand

I've spent decades capturing reality in ways people said were impossible. The first 360° live-streaming system when everyone insisted it couldn't work. Street View's foundational architecture when Google needed to digitize the world. Emmy-winning immersive experiences that transported millions into other realities.

This experience places me at a critical intersection—between Silicon Valley's grand promises and what technology can actually deliver. What I see today isn't just concerning—it's alarming.

After all these years, I know intimately what we can capture of reality. More importantly, I know what we lose in the process.

So when the AI industry claims they're building artificial general intelligence from text and flat images, I don't see breakthrough technology. I see a trillion-dollar mistake—breathtaking from a distance but built on catastrophically unstable foundations.

Current AI systems remind me of orchids. Stunning in the right environment, but incredibly fragile and high-maintenance:

  • They need sanitized data—cleaned, filtered, labeled by humans who remove anything messy or contradictory
  • They need artificial taxonomies—reality pre-sorted into categories that don't actually exist in nature
  • They need perfect examples—step outside their training data and they fall apart
  • They need constant pruning—teams of humans constantly tweaking outputs, like gardeners trimming wayward branches

Actual intelligence works more like a weed. It pushes through concrete. Thrives in chaos. Learns from pain, failure, and contradiction.

A two-year-old learns physics by dropping plates. Social dynamics by misreading facial expressions. Language by babbling until sounds become words. They learn from a world that's noisy, contradictory, and often harsh.

Meanwhile, our AI systems train in climate-controlled server farms on sanitized Wikipedia articles and carefully labeled Instagram photos.

The disconnect isn't just obvious; it's devastating.


The 2D Reductionism Trap

Here's what we lose when we feed reality into AI systems.

Computer vision takes the living world and performs surgery on it:

  • Flattens everything—three dimensions become flat pixels
  • Freezes motion—continuous movement becomes static frames
  • Removes the mess—blur, shadows, reflections all get cleaned up
  • Forces categories—the infinite variety of nature gets sorted into human boxes

Natural language processing does something similar to human conversation (a rough sketch of both pipelines follows these lists):

  • Strips away the human—conversation becomes text, losing 90% of its meaning
  • Breaks up thoughts—flowing ideas become individual tokens
  • Erases context—who's speaking, where, why, under what circumstances
  • Standardizes everything—the beautiful messiness of how people actually talk gets normalized
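
To make the loss concrete, here is a minimal sketch of what survives that surgery. It is illustrative only, assuming nothing more than Pillow and a naive whitespace split; the filename and pipeline are invented for the example, not any lab's production code.

    # A deliberately simple illustration of what preprocessing discards
    # before a model ever sees the data (hypothetical pipeline).
    from PIL import Image

    # A lived, three-dimensional, continuously moving scene...
    frame = Image.open("street_scene.jpg")            # ...becomes one frozen frame
    inputs = frame.convert("RGB").resize((224, 224))  # ...then a standardized 224x224 grid of numbers

    # A spoken sentence -- tone, timing, face, context -- becomes bare tokens.
    utterance = "I'm fine."                           # said flatly, after a long pause
    tokens = utterance.lower().split()                # ["i'm", "fine."]: the 10% that survives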

Then we act surprised when systems trained on this processed reality can't:

  • Understand why a tower of blocks topples over
  • Handle a simple conversation with an emotional teenager
  • Maintain coherent context beyond their immediate window
  • Comprehend what it means to feel weight, maintain balance, or experience pain

The reason isn't mysterious: these systems have never touched a physical object. Never experienced real-world consequences. Never existed as embodied beings navigating actual space and time.

We've trained ghosts to recognize shadows, then asked them to step into sunlight.

This ghostly metaphor isn't just poetic—it's technically precise. As we'll see, even the field's pioneers now acknowledge we're building "ethereal spirit entities," not embodied intelligences.


How Academic Credibility Becomes Market Mania

Every tech bubble needs its prophets, and AI found a compelling one in Fei-Fei Li. Her work on ImageNet started as legitimate research—then mutated into something much more dangerous. I'm calling it the Fei-Fei Parable.

Li identified a real problem: computer vision needed better training data. ImageNet was her solution—legitimate, careful science. When AlexNet conquered the dataset in 2012, it represented genuine progress in image classification.

The original claim was modest and defensible: better datasets improve performance on specific tasks.

But then something shifted. The narrative grew into something much bolder:

"Visual intelligence is the foundation of all intelligence."

This was speculation, not science. But it came wrapped in academic credentials and backed by impressive results. Silicon Valley loved it because it justified what they wanted to do anyway:

  • Surveillance at planetary scale—rebranded as data collection
  • Billions in GPU investments—NVIDIA's stratospheric stock price thanks you
  • Endless computer vision startups—each promising to unlock the next breakthrough

Now it's 2024, and Li has raised $100 million for World Labs with an even grander claim: spatial intelligence will unlock AGI.

The promises have ballooned. The price tags have exploded. But the fundamental problem remains exactly the same.

ImageNet suggested that 2D visual data could solve recognition problems. World Labs argues that 3D spatial data will solve intelligence itself.

Both catastrophically mistake the map for the territory.

I've spent my career building spatial computing systems—capturing reality in three dimensions, designing volumetric models, working with light and space in ways most people never see. So I know exactly what spatial data can and can't capture.

What it misses:

  • The way weight shifts in your hands as you lift a child
  • The instant you know a glass will shatter before it hits the floor
  • The physics written in your muscles when you catch yourself from falling
  • The way wood splinters differently than plastic breaks
  • The thousand micro-adjustments your body makes to stay upright

You could scan every staircase on Earth in perfect detail, but your AI will never truly understand falling until it has knees that can bleed.

That $100 million isn't going toward building knees. It's funding sensors everywhere, cameras watching everything, LiDAR scanning everyone, teams of people labeling human movements.

ImageNet wanted to catalog the visual world. World Labs wants to map the inside of your house.

The AGI promise isn't just investor bait—it's the ultimate market, the existential FOMO that transforms rational venture capitalists into zealous true believers.

Li isn't the villain in this story. She's a brilliant researcher caught in a system that rewards grand promises over careful progress. But the damage is real: billions of dollars and thousands of smart people now chase the wrong problems while potentially transformative research gets ignored.

This pattern—where legitimate academic work transforms into overblown market promises—repeats throughout AI's recent history. And each cycle distorts our understanding of what progress really looks like.


The Citations That Matter

Before anyone accuses me of speculation, here's the paper trail:

  • "Attention Is All You Need" - Vaswani et al., 2017, Google Research
  • ImageNet Large Scale Visual Recognition Challenge - Russakovsky et al., 2015
  • Deep Reinforcement Learning from Human Preferences - Christiano et al., 2017, OpenAI
  • World Labs - Founded 2024, $100M seed round

And the interview that confirms everything:

  • Dwarkesh Patel Podcast with Andrej Karpathy - October 2025—where one of AI's most respected researchers admits we're building "ghosts," not intelligence

These aren't fringe sources—they're the field's foundational texts and most respected voices. When examined carefully, they reveal a chasm between industry claims and technological reality that can no longer be ignored.


The Null Hypothesis We Should Have Tested

Here's the most dangerous mistake in modern AI research—we've completely inverted the null hypothesis.

Real science would have started with:

"Text and image patterns are probably insufficient for general intelligence."

Then we'd need overwhelming evidence to prove otherwise—genuine understanding, causal reasoning, real adaptation to novel situations.

Instead, we assumed sufficiency from day one. We designed benchmarks that confirmed what we wanted to believe. When systems failed in the real world—hallucinations, brittleness, obvious gaps—we didn't question our assumptions. We blamed the implementation.

"Add more parameters!" became the industry mantra. "More data!" "Bigger GPUs!" "Scale solves everything!"

This isn't science anymore.

We've transformed the scientific method into technological theology—like medieval scholastics contorting evidence to support predetermined dogma.

The modern inquisition doesn't burn heretics. It just defunds them. Researchers who question the transformer paradigm find themselves marginalized while billions flow to those promising salvation through scale.

This methodological failure undermines the entire enterprise. We're not just building the wrong systems—we're asking the wrong questions from the start.


What Intelligence Actually Requires

After decades building systems that capture and process reality, I have some ideas about what intelligence actually needs—not what Silicon Valley wants it to need, but what it probably requires:

  1. The Physics of Pain and Joy
    Intelligence emerges through physical consequence. It requires bodies that feel pain, muscles that remember strain, systems that learn from genuine failure. Not artificial reward signals or token predictions in digital space—but real stakes in a physical world with actual consequences.
  2. Time's Arrow
    Intelligence experiences time's irreversible flow, building itself moment by moment. Memory isn't a data warehouse—it's a dynamic prediction system continuously refined through lived experience.
  3. The Wisdom of Chaos
    Tech companies treat noise as a problem to eliminate. Yet the flutter of leaves, subtle facial microexpressions, contradictions between words and tone—these aren't noise. They're the essential foundation of understanding. Sanitize the input too thoroughly and you eliminate intelligence itself.
  4. The Symphony of Senses
    AI labs build isolated systems that see OR hear OR process text, then awkwardly stitch them together afterward. Real intelligence emerges from integrated senses that constantly inform each other—vision guiding touch, touch teaching balance, balance enabling purposeful movement.
  5. The Why Behind What
    Current models can witness a million sunrises and flawlessly predict tomorrow's dawn. Yet they fundamentally don't understand WHY the sun rises—they've never experienced Earth's rotation or gravity's pull, never developed the causal understanding that bridges observation to explanation.

Meanwhile, our research priorities remain:

  • Bigger transformers (as if sheer size could birth genuine wisdom)
  • More data (as if raw quantity could ever replace meaningful quality)
  • More modalities (as if crudely bolting deaf ears onto blind eyes could create true hearing)

We're not merely solving the wrong problem—we're systematically scaling up our fundamental mistakes.

Every week brings a flood of new papers, architectures, and benchmarks. None address these foundational gaps. Most don't even recognize them as problems. Instead, we keep constructing increasingly elaborate versions of the same fundamental error.

The result? Systems that can recite Wikipedia but can't understand why a child is crying.

This isn't merely a technical failure—it's a conceptual one. We've confused information processing with understanding, pattern recognition with intelligence.


The Trillion-Dollar Correction

What fills my thoughts isn't science fiction scenarios of AI domination. It's the looming economic devastation when reality finally collides with our inflated promises.

We've built a trillion-dollar industry on assumptions that may be fundamentally wrong. Every major tech company is betting that stacking enough pattern-matching layers will eventually produce consciousness. Investment decisions flow based on benchmarks designed to confirm our biases.

Reality is already showing the cracks:

  • Our systems shatter the moment they leave the lab
  • They need human supervision constantly
  • Step outside their training data and they break down
  • AI safety has become an exercise in damage control

But confronting this reality carries an unbearable cost for those invested:

  • Researchers whose entire careers now depend on incremental transformer improvements
  • Tech giants whose trillion-dollar valuations assume AGI delivery by 2029
  • Venture capitalists with billions staked on the promised arrival of digital consciousness
  • Media companies whose traffic and revenue depend on perpetuating AI hype

Everyone's trapped in a self-reinforcing cycle where acknowledging fundamental flaws risks collapsing the entire ecosystem.

This isn't just a market bubble. It's a reality distortion field that's captured entire industries. The correction, when it comes, won't just affect tech stocks. It will reshape how we think about intelligence, automation, and human capability.

And like all major corrections, the warning signs are visible long before the crash—to those willing to see them.


The Civilization-Scale Risk

The consequences extend far beyond tech company valuations. Here's how this plays out:

Stage 1: The Great Misallocation
While billions flow toward transformer scaling, truly transformative approaches starve for resources. Embodied learning, causal reasoning systems, architectures designed for messy reality—approaches that might genuinely advance intelligence—struggle for basic funding and attention.

Stage 2: The Surveillance Trap
Under the banner of "AI training data," we're constructing history's most extensive surveillance infrastructure. Ubiquitous sensors in cities, pervasive data collection in homes, systems monitoring human behavior at unprecedented scale. When the AI promises inevitably collapse, this infrastructure remains—primed and waiting for whoever seeks to exploit it.

Stage 3: The Trust Apocalypse
Each high-profile AI failure systematically destroys public trust in technology. When autonomous systems inevitably fail catastrophically, people won't merely reject those specific products—they'll develop profound skepticism toward technological solutions across all domains.

Stage 4: The New Cold War
Governments increasingly view AI as the ultimate national security imperative, recklessly racing to build systems they fundamentally don't understand. This accelerates:

  • Dangerously rushed AI development with minimal safety protocols
  • Aggressive digital colonialism as nations compete for data and computational resources
  • Critical military systems built atop fundamentally unreliable technology
  • Intensifying global resource conflicts over semiconductors, rare minerals, and energy

Stage 5: The Hollow Economy
We're not truly automating meaningful work; we're creating an unprecedented digital bureaucracy. Every AI system requires human trainers and supervisors, creating vast armies of AI babysitters performing hollow, unfulfilling work.

The real risk isn't killer robots. It's economic disruption when reality doesn't match the promises—and the people who profit from the hype cash out before the crash. These risks compound each other, creating a cascade of consequences that extends far beyond the AI industry itself.


The Ghost Confession

As I was finalizing this piece, I encountered an interview that demands integration. Andrej Karpathy—former director of AI at Tesla, founding member of OpenAI, one of the field's most respected researchers—sat down with Dwarkesh Patel in October 2025 and said something that should make every AI investor's blood run cold:

"We're not actually building animals. We're building ghosts."

Let me repeat that: We're building ghosts.

The full interview is available on YouTube and the transcript is published on Patel's Substack. I recommend consuming both. What follows is my analysis of why this admission is so devastating.

The Admission We've Been Waiting For

Karpathy articulates what I've been arguing from a different angle—but his framing is even more damning because he's describing it from inside the paradigm. This isn't an outsider critique. This is a confession from inside the cathedral itself.

Here are his exact words:

"We're not doing training by evolution. We're doing training by basically imitation of humans and the data that they've put on the internet. And so you end up with these ethereal spirit entities because they're fully digital, and they're kind of mimicking humans. And it's a different kind of intelligence."

Ethereal spirit entities. Not embodied intelligence. Not robust agents. Ghosts—disembodied pattern-matchers that mimic human text output without understanding the embodied reality that text describes.

The Evolution Trap He Identifies

Karpathy makes a crucial distinction I need to amplify. He's pushing back against the framework of Richard Sutton (who appeared earlier on the same podcast) that we should "build animals" through pure reinforcement learning.

His argument:

Animals come from evolution—billions of years of selection pressure encoding survival algorithms into genetic code. When a zebra is born, it runs within minutes. That's not learned behavior. That's hardware. Millions of generations of "training" compressed into DNA through a process we literally cannot replicate.

Pre-training isn't that. It's not even close.

According to Karpathy, pre-training gives you two things:

  1. Knowledge (facts scraped from the internet)
  2. Algorithms (pattern-matching circuits that emerge from next-token prediction; a minimal sketch of that objective follows this list)
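
For readers who have never seen it spelled out, the entire pre-training objective is roughly the few lines below. This is a minimal PyTorch sketch, assuming a model that maps token ids to vocabulary logits; it is illustrative, not any lab's actual training loop.

    # Minimal sketch of the pre-training objective: given a sequence of token ids,
    # predict each next token and minimize the cross-entropy of that prediction.
    import torch
    import torch.nn.functional as F

    def next_token_loss(model, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) integer tensor
        inputs = token_ids[:, :-1]       # everything except the last token
        targets = token_ids[:, 1:]       # the same sequence shifted left by one
        logits = model(inputs)           # (batch, seq_len - 1, vocab_size)
        return F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            targets.reshape(-1),
        )
    # Everything "learned" in pre-training is whatever circuitry makes this number go down.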

But here's his critical insight that validates everything I've been saying:

"I actually think we need to figure out ways to remove some of the knowledge and to keep what I call this cognitive core. It's this intelligent entity that is kind of stripped from knowledge but contains the algorithms and contains the magic of intelligence and problem solving."

Translation: We've confused memorization for intelligence, and it's actively preventing us from building actual reasoning systems.

The models are too dependent on having seen similar patterns in their training data. They can't go "off the data manifold"—they can't reason about genuinely novel situations. They're sophisticated interpolators, not genuine reasoners.

The Decade Timeline He Can't Justify

When Patel asks why agents will take a decade to work properly, Karpathy's answer is essentially:

"Intuition based on watching smart people fail at this for 15 years."

He lists what's fundamentally missing:

Continual Learning: These systems reset every session. They can't learn from ongoing experience.

Long-term Memory: There's no equivalent to human sleep—no consolidation process that distills experience into persistent knowledge.

Embodied Consequence: They've never touched anything, never experienced cause and effect in physical reality.

Causal Reasoning: They pattern-match. They don't understand why things happen.

Then he says something that should be quoted in every AI product prospectus:

"When you look at them up and they have zero tokens in the window, they're always restarting from scratch where they were."

These systems have no continuity of self. No persistent identity. No learning that survives beyond a conversation session.

They're not agents. They're not even close. They're sophisticated autocomplete engines that reset to factory settings every time you close the tab.
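
If that sounds abstract, here is a minimal sketch of what "factory settings" means in practice. The chat() function below is a hypothetical stand-in for any stateless LLM endpoint, not a specific vendor's API; the point is that the model only ever sees what you pass in, and nothing persists between calls.

    # Hypothetical stand-in for a stateless chat endpoint (no vendor's actual API).
    def chat(messages: list[dict]) -> str:
        seen = " | ".join(m["content"] for m in messages)
        return f"(a reply conditioned only on: {seen})"

    # Session 1: the model appears to "learn" your name.
    history = [{"role": "user", "content": "My name is Dana. Remember that."}]
    history.append({"role": "assistant", "content": chat(history)})

    # Close the tab. Come back tomorrow. New session, zero tokens in the window.
    fresh = [{"role": "user", "content": "What's my name?"}]
    print(chat(fresh))  # No idea. The weights never changed; nothing was consolidated.

    # The only "memory" is whatever you manually stuff back into the context window.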

The Missing Architecture He Catalogs

Karpathy runs through brain analogies and what we've replicated versus what we haven't:

What we have:

  • Transformers ≈ cortical tissue (general-purpose pattern processing) ✓
  • Chain-of-thought reasoning ≈ prefrontal cortex (serial reasoning) ✓
  • RLHF ≈ some reinforcement learning mechanisms ✓

What we're missing:

  • Hippocampus (memory consolidation—how experiences become knowledge)
  • Amygdala (emotions, instincts, value assignment)
  • Dozens of other brain nuclei we don't even understand yet

His conclusion?

"You're not going to hire this thing as an intern. It comes with a lot of cognitive deficits... it's just not fully there yet."

Not in five years. Not in ten years. Not with current architectures.

Because the brain parts aren't there. The cognitive toolkit is incomplete. And scaling transformers won't magically grow a hippocampus.

Why This Validates Our Thesis

Karpathy is describing the mechanism of what I've been calling the 10% delusion:

  1. We train on disembodied text (building ghosts, not animals)
  2. We confuse knowledge accumulation for intelligence (memorization ≠ reasoning)
  3. We lack architectural components for actual agency (no memory consolidation, no emotional grounding, no persistent identity)
  4. We're solving the wrong problem (imitating human text output rather than building embodied cognition)

And here's what makes this devastating—he's inside the system. He helped build OpenAI. He led AI at Tesla. He knows intimately what these systems can and cannot do.

And he's telling you: It's going to take a decade to fix these fundamental gaps. Minimum.

That's not pessimism. That's optimism assuming we even know how to solve these problems.

The Research Gaps He Won't Address

Here's what Karpathy admits we don't know how to do:

  • Give models persistent memory that distills experience into weights
  • Create emotional grounding for decision-making beyond reward signals
  • Enable genuine novelty-seeking (agents stay "on the data manifold" of their training)
  • Build causal models from interaction rather than correlation from text
  • Create continual learning without catastrophic forgetting

These aren't engineering problems. These are fundamental architecture problems without known solutions.

And here's what should terrify investors: The labs are selling AGI timelines of 2-5 years while their own senior researchers are saying "maybe a decade for basic agent capability, if we're lucky and the problems are tractable."

Notice that word: tractable. He's not even sure we can solve these problems within current paradigms.


The Ghost in the Machine

Karpathy's framing—"we're building ghosts"—perfectly captures the fundamental nature of the mistake:

Ghosts are disembodied. They have no physical presence, no embodied consequence.

Ghosts mimic life without living it. They perform behaviors without understanding why.

Ghosts haunt spaces without inhabiting them. They occupy environments without affecting or being affected by them.

Ghosts have no continuity. They're ephemeral, resetting, unable to learn persistently.

That's exactly what we've built. Sophisticated mimicry. Impressive performance. Zero understanding. No persistent learning. No causal reasoning. No embodied intelligence.

And we're asking these ghosts to be employees. To reason causally. To learn continually. To navigate embodied reality. To make long-term decisions. To understand consequences.

You can't ask a ghost to build a house. It has no hands.

You can't ask a ghost to learn from experience. It has no memory consolidation.

You can't ask a ghost to reason causally. It has never experienced consequence.

The Practical Implications

When Karpathy says "decade of agents" (not "year of agents" as some labs claimed), he's not being pessimistic. He's being realistic based on 15 years of watching cutting-edge researchers struggle with these problems.

He's watched:

  • The deep learning revolution (2012)
  • The reinforcement learning hype cycle (2013-2018)
  • The large language model emergence (2018-2024)

And at each phase, people predicted AGI was "just around the corner" once we solved [insert current bottleneck].

But the bottlenecks keep revealing deeper bottlenecks. The problems are more fundamental than anyone wants to admit. And the timelines are fantasy.

The Conclusion This Demands

Now we have confirmation from multiple angles:

From my perspective (embodied technology, spatial computing, systems intelligence): Current AI strips away 90% of reality and trains on the remainder.

From Karpathy's perspective (deep inside AI research, 15 years of experience, founding member of OpenAI): Current AI builds "ghosts"—disembodied mimics lacking the architectural components for genuine agency.

The synthesis: We've built a trillion-dollar industry on sophisticated text completion and called it intelligence. The fundamental gaps aren't being addressed because we don't know how to address them. The timelines are fantasy driven by investor FOMO. And the correction, when it comes, will be severe.

One Last Thing

Karpathy ends his discussion with something that should haunt the industry:

"I'm very hesitant to take inspiration from [animal cognition] because we're not actually running that process."

He's right. We're not running evolution. We can't. We don't have billions of years or planetary-scale selection pressure.

But here's my question, and it's one Karpathy doesn't answer:

If we're not running evolution, and pre-training on text isn't sufficient, and we're missing most of the brain architectures required for agency, and we don't know how to build memory consolidation or causal reasoning or continual learning... what exactly is the path to AGI?

Karpathy doesn't have an answer. He has a decade-long research agenda and hope that the problems are "tractable."

I have a different prediction based on the evidence: These problems are not tractable within current paradigms. We need fundamental rethinking, not incremental improvement.

Until that great rethinking happens, we're just building increasingly expensive ghosts - digital Caspers burning through venture capital faster than they can say 'boo' to reality.

This admission doesn't just validate the critique—it transforms it from outside perspective to inside knowledge.


What Must Change

Addressing these fundamental issues requires more than technical fixes—it demands institutional transformation. The fixes aren't mysterious. They require something both simple and difficult: institutional honesty.

1. Break the Academic-Venture Ouroboros
When professors simultaneously serve as venture capitalists, rigorous science inevitably suffers. Academic research must remain separate from billion-dollar funding rounds. Research requires clear boundaries, rigorous peer review, and healthy skepticism, while business demands honest risk disclosure and tangible products.

2. Make AI Claims as Regulated as Drug Claims
Pharmaceutical companies face severe penalties for making unsubstantiated claims about their products. Meanwhile, tech companies routinely promise human-level intelligence without consequences. It's time we hold AI claims to the same regulatory standards we demand from healthcare.

3. Rewrite the Funding Equation
Billions cascade toward transformer scaling while truly innovative approaches—embodied AI, causal reasoning systems, architectures designed for messy reality—fight for basic resources. We must fundamentally rebalance funding toward scientific merit rather than investor hype cycles.

4. Demand Warning Labels
AI systems require mandatory warnings: "Fails unpredictably in novel situations. Requires constant human oversight. Cannot understand cause and effect." Just as we demand from pharmaceuticals and other consequential products, AI deployments must clearly disclose their fundamental limitations.

5. Burn the False Idols
Current benchmarks measure performance under artificially perfect conditions. We need fundamentally different tests—ones that evaluate resilience to chaos, adaptation to genuine novelty, causal understanding, and persistent memory—the capabilities that truly define intelligence (a toy example of such a test is sketched just below).
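
As a toy illustration of that last point, here is a minimal sketch of a "resilience" check. It is a hypothetical harness, not an existing benchmark: it scores a model by how much accuracy it keeps once its pristine inputs are perturbed toward messier reality.

    # Hypothetical harness (not an existing benchmark): measure degradation, not peak performance.
    import numpy as np

    def accuracy(model, inputs: np.ndarray, labels: np.ndarray) -> float:
        return float(np.mean(model(inputs) == labels))

    def resilience_score(model, inputs, labels, noise_std=0.1, seed=0) -> float:
        rng = np.random.default_rng(seed)
        clean = accuracy(model, inputs, labels)
        noisy = accuracy(model, inputs + rng.normal(0.0, noise_std, inputs.shape), labels)
        return noisy / max(clean, 1e-9)  # 1.0 holds up under chaos; near 0.0 is an orchid

The specific perturbation matters less than the shape of the test: report how gracefully a system degrades, not how well it performs on the conditions it was tuned for.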

These changes aren't anti-innovation—they're pro-integrity, ensuring that progress in AI remains both scientifically sound and socially beneficial.

Despite these critiques, there is a constructive way forward—one that acknowledges both the potential and limitations of our current approach.


The Path Through

I don't hate AI. I hate the delusions we've built around it.

Large language models are genuinely impressive tools. They excel at pattern completion, text summarization, information retrieval.

But impressive performance doesn't equal consciousness. We're building extraordinarily sophisticated mirrors and mistaking our reflections for independent minds.

We've become modern alchemists, tossing ingredients into digital cauldrons while chanting incantations of scale, desperately expecting consciousness to materialize. We urgently need voices willing to ask: "What if we're not merely implementing incorrectly, but pursuing a fundamentally misguided approach?"

That's why I'm writing this. Not just to criticize, but to challenge the entire AI community:

Show me an AI that can feel rain on its skin and learn from the sensation.

Show me one that thrives rather than shatters when reality delivers the unexpected.

Show me one that grasps why events occur, not merely what statistical pattern might follow.

Show me one that carries yesterday's experiences into today's decisions with the continuity that defines consciousness.

You can't. Current architectures can’t do this. Current training approaches can’t do this. Scaling won’t fix it.

Until these fundamental capabilities exist, claims about approaching AGI are dangerously misleading. We're constructing increasingly expensive systems that disintegrate upon contact with real-world complexity. The question isn't about AI's potential value—it's about intellectual honesty regarding what we're actually building and where its capabilities fundamentally end.

Returning to where we began, we must reconsider what that 2017 paper actually gave us—and what it didn't.


Attention Isn't All You Need

That research paper title was elegant and seductive. The attention mechanism itself was brilliant. But what we've done with it has become the most expensive mistake in tech history.

Attention is merely a clever mechanism for processing sequences. Nothing more. It's not consciousness, understanding, or thought. It's just sophisticated pattern matching wrapped in elegant mathematics.
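
For readers outside the field, here is that mechanism itself: a minimal single-head NumPy sketch of the scaled dot-product attention defined in the 2017 paper, leaving out the multi-head projections, feed-forward layers, and everything else a real transformer wraps around it.

    # Scaled dot-product attention (Vaswani et al., 2017): softmax(Q K^T / sqrt(d_k)) V.
    # Single head, no learned projections -- just the core mechanism.
    import numpy as np

    def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                 # how strongly each token attends to every other token
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
        return weights @ V                              # a weighted average of the values

    rng = np.random.default_rng(0)
    Q = K = V = rng.normal(size=(4, 8))                 # 4 tokens, 8-dimensional embeddings
    print(attention(Q, K, V).shape)                     # (4, 8): each token becomes a blend of the others

It is a genuinely elegant way to mix information across a sequence. It is also, visibly, a weighted average: no body, no consequence, no causal model anywhere in it.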

We've constructed billion-dollar puppets and desperately convinced ourselves they're alive. We've systematically confused memorization with understanding, correlation with causation, and superficial mimicry with genuine minds.

And we've bet everything on this confusion.

Our datasets capture reality like studying oceans by collecting only foam from the surface. Our sanitized training data resembles studying biology through taxidermied specimens. Our benchmarks measure only how perfectly mirrors reflect—never whether they comprehend what they're reflecting.

Time for honesty: we've built extraordinarily sophisticated autocomplete engines masquerading as artificial intelligence.

We have two choices:

  1. Confront these fundamental limitations now and radically adjust our course
  2. Wait for reality to systematically dismantle our assumptions through increasingly catastrophic and expensive failures

We can still choose. But not for much longer. The cracks are already showing.

The time for such honesty is rapidly approaching, whether the industry is ready or not.


The Questions That Remain

The questions are:

  • How many billions more will we sacrifice before acknowledging we're pursuing a fundamentally flawed approach?
  • How many more billion-dollar startups will spectacularly fail to deliver on their grandiose AGI promises?
  • How many brilliant careers will be wasted pursuing a paradigm that fundamentally cannot succeed?
  • How many precious years of innovation will we squander scaling approaches that can never bridge the chasm to true intelligence?

Karpathy thinks it's a decade. I think it's longer, because I think we haven't even started asking the right questions yet.

But here's what I know for certain:

  • The ghosts we're building won't become real by making them larger, faster, or more expensive.

They'll remain ghosts—impressive, useful in narrow domains, but fundamentally incapable of the embodied, causal, continual intelligence we need them to have.

The attention economy broke our ability to see reality clearly. Now we're building an AI economy on that broken perception.

Attention isn't all you need. It never was. And the sooner we collectively acknowledge this fundamental truth, the sooner we can begin building systems that might genuinely approach the intelligence we seek.




Sources & Further Exploration

Primary Sources:

  1. Vaswani, A., et al. (2017). "Attention Is All You Need." Advances in Neural Information Processing Systems, 30.
  2. Patel, D. (2025). "Andrej Karpathy on AGI, Agents, and the Future of AI." Dwarkesh Podcast. Full interview | Transcript
  3. Russakovsky, O., et al. (2015). "ImageNet Large Scale Visual Recognition Challenge." International Journal of Computer Vision, 115(3), 211-252.
  4. Christiano, P., et al. (2017). "Deep Reinforcement Learning from Human Preferences." Advances in Neural Information Processing Systems, 30.

For Deeper Exploration:

  1. Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction. MIT Press.
  2. World Labs. (2024). Company founding announcement and spatial intelligence thesis. [Various tech press coverage]
  3. Wakil, K. (Coming 2026). Knowware: Systems of Intelligence (The Third Pillar of Innovation). [Theoretical framework for systems-level intelligence]

Deep Dive Interviews:

  1. Watch or download the transcript for the Richard Sutton interview: https://www.dwarkesh.com/p/richard-sutton

My Background (For Context):

As a cybernetic systems theorist specializing in spatial computing and immersive technology, my credentials include:

  • Pioneer of the world's first 360° live-streaming pipeline
  • Contributor to translating Google Street View's technological foundation into ground-breaking and award-winning immersive experiences
  • Cannes Lions, Grand Prix, and Emmy Award winner for immersive media innovation
  • UN Special Envoy working on global coordination challenges
  • Research focus: embodied cognition, volumetric capture, 360 video live-streaming, spatial intelligence, and systems-level emergence

I write from the perspective of someone who has spent decades capturing and representing reality through technology—and who understands intimately what gets lost in that capture.


About the Author

Khayyam Wakil bridges the gap between what technology promises and what reality delivers. As the creator of the world's first 360° live-streaming system and key architect who transformed Google Street View's foundational technology into an entertainment engine, he has spent decades pushing the boundaries of what's possible while staying grounded in what's real. His work has earned more than 100 accolades including Cannes Lions, Grand Prix, Pencils, D&AD Black Boxes and an Emmy Award for technical innovations in immersive experience design.

His forthcoming book, "Knowware: Systems of Intelligence — The Third Pillar of Innovation” challenges Silicon Valley's assumptions about artificial intelligence and proposes a radical new framework for systems of intelligence—one built on embodied cognition rather than pattern matching.

For speaking engagements or media inquiries: sendtoknowware@proton.me

Subscribe to "Token Wisdom" for weekly deep dives and round-ups into the future of intelligence, both artificial and natural: https://tokenwisdom.ghost.io

💡
My writing emerges not just from observation, but from standing at a unique intersection - where theoretical frameworks meet practical implementation. After watching Richard Sutton articulate the core challenges in reinforcement learning that have persisted for 35 years, and hearing Andrej Karpathy describe our current AI systems as "ghosts," something crystallized: I'm actively building solutions to problems they've defined as fundamental barriers.

What compels me to document this moment isn't just the pattern of boom-bust cycles in tech, but the realization that while brilliant theorists accurately diagnose our limitations, the practical solutions may come from unexpected directions. When you're actually building systems that address these decades-old challenges, you have a responsibility to speak up - even if your perspective challenges the dominant narrative.

The stakes here extend far beyond market dynamics. We're watching the collision between elegant theory, massive capital deployment, and the messy reality of implementation. The question isn't just whether we'll learn financial lessons before or after a crash - it's whether we'll recognize breakthrough solutions when they emerge from outside the expected frameworks.

#spatialcomputing #AIdelusion #systemsofintelligence #embodiedcognition #technologycritique #AI #hype | #tokenwisdom #thelessyouknow 🌈✨