
A Thesis on Universal Primitives, Intelligence Stacks, and the Infrastructure That Doesn't Exist Yet
By Nick Bryant × Circuit · Metatransformer
The most consequential technology of the 20th century was a tiny semiconductor switch invented in a New Jersey lab in 1947. The most consequential technology of the 21st century may be a neural network architecture described in a 2017 Google research paper titled "Attention Is All You Need." Both began as modest, incremental improvements — the transistor replacing vacuum tubes, the transformer replacing recurrent neural networks — and both turned out to be universal primitives: simple building blocks whose recursive composition produces systems of staggering complexity.
The transistor gave rise to a $5.7-trillion global IT industry through eleven layers of abstraction, from logic gates to cloud computing. The transformer is now generating an analogous stack — from raw next-token prediction to autonomous AI agents — at roughly five times the clock speed of the original computing revolution.
But this analogy, while structurally powerful, conceals a dangerous simplification. A transistor always produces the same output for the same input. A transformer does not. The computing stack was built on deterministic bedrock. The intelligence stack is being erected on probabilistic sand. This single fact changes everything about how the stack must be engineered — what layers are needed, what can be assumed, and what infrastructure must exist before the upper floors are habitable.
This report traces every layer of both stacks, the economics at each level, the structural vulnerabilities the analogy obscures, and the case for why we are witnessing the birth of a new kind of machine that needs a new kind of infrastructure.
Part I: From Switches to Civilization — The Full Computing Stack
A Semiconductor Switch That Cost $8 and Changed Everything
On December 16, 1947, physicists John Bardeen and Walter Brattain pressed two gold point contacts onto a sliver of germanium at Bell Labs and amplified a signal without a vacuum tube. The point-contact transistor was born. The New York Times buried it in three paragraphs on page 46. The device was simple: apply voltage to one terminal (input = 1), and current flows between two others (output = 1). Remove voltage (input = 0), current stops (output = 0). A binary switch, nothing more.
Vacuum tubes performed the same function but were fragile glass bulbs several inches tall, consuming watts of power, lasting perhaps 10,000 hours, and requiring warm-up time. A transistor was solid-state, instantaneous, and could last decades. As Nokia engineer Ed Eckert later calculated, if modern smartphones used vacuum tubes instead of transistors, each phone would be ten times the size of the Empire State Building. The earliest discrete transistors cost roughly $8 apiece (in 2015 dollars). By 2015, a dollar could buy billions of transistors embedded in a memory stick — a cost reduction of approximately ten billion-fold over sixty years.
Three transistor architectures defined the journey. The BJT (bipolar junction transistor, 1948) dominated early integrated circuits. The MOSFET (metal-oxide-semiconductor field-effect transistor, 1960) became the most manufactured device in human history — an estimated 13 sextillion (1.3 × 10²²) units produced by 2018. And CMOS (complementary MOS, 1963), which pairs NMOS and PMOS transistors to consume near-zero static power, captured 99% of all IC fabrication by 2011.
CMOS won because its complementary design consumes roughly one-seventh the power of NMOS logic and ten million times less than bipolar TTL. This matters for the AI analogy: the transistor primitive itself evolved. CMOS superseded BJTs. The pattern held even as the specific architecture changed. We should expect the same of neural architectures.
Logic Gates: Computation from Composition
Claude Shannon's 1937 MIT master's thesis — called "possibly the most important master's thesis of the century" — proved that Boolean algebra maps directly onto electrical switching circuits. Any logical or arithmetic operation reducible to true/false can be built from transistor switches. A CMOS NOT gate requires just 2 transistors. A NAND gate takes 4. An AND takes 6.
The profound insight is NAND universality: any Boolean function can be constructed from NAND gates alone. An entire computer can, in principle, be built from a single type of four-transistor gate. This is the first great demonstration of the pattern that defines the entire stack — a simple primitive, recursively composed, produces arbitrary complexity.
The Linux kernel hacker community would note a critical precision here: this universality is proven — formally, mathematically, unconditionally. Any Boolean function. The transformer's universality is of a fundamentally different kind, which I will address directly in Part II.
Integrated Circuits and Moore's Law: Integration Unlocks Emergence
Jack Kilby at Texas Instruments demonstrated the first integrated circuit on September 12, 1958. Robert Noyce at Fairchild improved it with the planar process in 1959, making ICs manufacturable. Integration progressed through SSI (tens of transistors, 1960s), MSI (hundreds, late 1960s), LSI (thousands, 1970s), and VLSI (millions to billions, 1980s onward).
Gordon Moore observed in 1965 that transistor density doubled annually, later revised to every two years. This became Moore's Law — not a physical law but an economic flywheel sustained by breathtaking engineering for six decades:
| Processor | Year | Transistors | Process Node |
|---|---|---|---|
| Intel 4004 | 1971 | 2,300 | 10 µm |
| Intel 8086 | 1978 | 29,000 | 3 µm |
| Intel 386 | 1985 | 275,000 | 1.5 µm |
| Pentium | 1993 | 3.1 million | 0.8 µm |
| Pentium 4 | 2000 | 42 million | 180 nm |
| Core i7 | 2008 | 731 million | 45 nm |
| Apple M1 | 2020 | 16 billion | 5 nm |
| Apple M2 Ultra | 2023 | 134 billion | 5 nm |
| NVIDIA B200 | 2024 | 208 billion | 4NP |
The economics of fabrication scaled inversely. A leading-edge fab cost roughly $200 million in 1983; TSMC's latest Arizona campus represents $165 billion in investment. An ASML extreme ultraviolet lithography machine costs over $300 million per unit. ASML is the sole manufacturer. Only three companies on Earth (TSMC, Samsung, Intel) operate leading-edge nodes.
A crucial economic inflection occurred around 2011 at the 28nm node: the cost per transistor stopped declining. At today's 3nm node, transistor costs have risen to levels last seen in 2005. Moore's Law continues for density, but its economic corollary — ever-cheaper transistors — broke. The industry responded with chiplet architectures, mixing process nodes to optimize cost. This is worth remembering when we encounter similar inflection points in the AI cost curve.
CPUs, Instruction Sets, and the Software Stack
VLSI transistors compose into CPUs with arithmetic logic units, control units, registers, and cache hierarchies — all executing the fetch-decode-execute cycle described by John von Neumann in 1945. Clock speeds scaled from the Intel 4004's 740 kHz (1971) to the Pentium 4's 3.8 GHz (2004) — a 5,000-fold increase in 33 years. Then Dennard scaling broke. At 90nm and below, leakage current made power density unmanageable. Intel cancelled its next-generation designs and pivoted to multi-core processors. Today's consumer CPUs offer 8–24 cores; server chips exceed 128.
The Dennard scaling wall is instructive for AI. When one axis of improvement hits physical limits, the industry doesn't stop — it finds new axes. Multi-core was computing's answer. Test-time compute and reasoning architectures may be AI's.
The instruction set architecture (ISA) forms the contract between hardware and software. x86 (Intel, 1978) dominates desktops. ARM (Acorn, 1985) dominates mobile — over 280 billion ARM chips shipped historically, and Apple Silicon proved ARM could challenge x86 in laptops and desktops. RISC-V (Berkeley, 2010), fully open-source, is projected to grow from $1.76 billion in 2024 to $25.73 billion by 2034.
Above the ISA sits the compiler. IBM's FORTRAN compiler (1957) was the first to let programmers write in a high-level language, making them 500% more productive at the cost of roughly 20% execution efficiency. Each subsequent language — C (1972), C++ (1979), Python (1991), JavaScript (1995) — traded performance for productivity. This tradeoff is economically rational: a U.S. software developer earns a median $133,080 per year (2024 BLS data), while cloud compute for a typical application costs $1,000–$10,000 per month. Developer time is far more expensive than machine time.
Operating systems completed the software abstraction. Unix (1969) introduced portable systems programming. Windows captured 72% of desktops. Linux (1991) — starting from 10,000 lines of hobby code — now runs 96% of the top million web servers, 100% of the TOP500 supercomputers, and over 90% of public cloud workloads. IBM validated this with its $34 billion acquisition of Red Hat in 2019.
Networking, Cloud, and GPUs: The Infrastructure Layers
The internet layered TCP/IP abstraction over physical cables, connecting 4 ARPANET nodes in 1969 to 6 billion users and 21 billion IoT devices in 2025. Cloud computing, launched when AWS debuted S3 and EC2 in March 2006, abstracted servers into API calls. Where a startup once needed $1 million in infrastructure, it can now launch on AWS credits for hundreds of dollars a month. The cloud market reached approximately $943 billion in 2025.
GPUs represent the most recent infrastructural revolution. NVIDIA coined the term "GPU" with the GeForce 256 in 1999, then opened GPUs to general-purpose computing with CUDA in 2007. AlexNet's 2012 ImageNet victory, trained on two GTX 580 GPUs, proved GPUs could accelerate neural networks. Performance scaled relentlessly: V100 (2017) at 125 TFLOPS, A100 (2020) at 312 TFLOPS, H100 (2022) at 989 TFLOPS, B200 (2024) at 4,500 TFLOPS. NVIDIA's market cap reached $4 trillion in July 2025 — the first company ever to cross that threshold.
The Total Value: From a Switch to $5.7 Trillion
Global IT spending reached $5.7 trillion in 2025 (Gartner), the highest growth rate in 30 years. Nine of the world's ten largest companies by market cap are technology firms. PwC estimates AI alone will add $15.7 trillion to global GDP by 2030. All of it rests, ultimately, on a semiconductor switch that outputs a 0 or a 1.
The pattern at every layer is identical: a simple primitive is recursively composed, abstraction hides complexity, costs fall, new builders arrive, and the next layer emerges. Transistors compose into gates. Gates compose into processors. Processors run operating systems. Operating systems host applications. Applications ride networks. Networks enable clouds. Clouds power AI. Each layer creates more economic value than the one below it.
This last point deserves emphasis, because it is the single most important insight for anyone thinking about where value will accrue in AI. The semiconductor industry is worth $700 billion. The software and internet economies it enabled are worth tens of trillions. The transformer's direct economic value will be dwarfed by the applications built atop it. Foundation Capital's Steve Vassallo calls these "zero-billion-dollar markets" — markets that don't exist until someone wills them into being. Every layer of the intelligence stack contains multiple zero-billion-dollar markets waiting to emerge.
Part II: The Transformer as the New Transistor — and Why the Analogy Is Both Right and Dangerous
Attention Really Is All You Need
On June 12, 2017, eight Google researchers posted a paper describing a neural network architecture that replaced sequential recurrence with parallel self-attention. The transformer processes an entire input sequence simultaneously: each token generates Query, Key, and Value vectors; attention scores are computed as Attention(Q,K,V) = softmax(QK^T / √d_k) · V; multi-head attention runs eight parallel attention operations; feed-forward layers and residual connections complete each block. The output is a probability distribution over the next token.
The original model had just 100 million parameters and trained in 3.5 days on 8 P100 GPUs for roughly $900. It beat state-of-the-art machine translation. The paper now has over 173,000 citations. All eight authors eventually left Google to found startups. The name "Transformer" was chosen because co-author Jakob Uszkoreit "liked the sound of the word."
Like the transistor, the transformer's initial reception was respectful but modest. NeurIPS reviewers called it "a clear accept" focused on translation improvements — not "this will change civilization." The Times did not cover it. The parallel to the transistor is precise: an incremental improvement to an existing technology (vacuum tubes / RNNs) that turned out to be a universal primitive.
What Kind of Universal?
Here I must address the strongest technical critique this thesis faces — one that any kernel hacker, formal-methods researcher, or mathematically rigorous computer scientist would raise immediately.
The transistor enables NAND universality: the proven, unconditional ability to compute any Boolean function. This is Turing-complete in the mathematical sense. Standard transformers, by contrast, are not Turing-complete. Only the Universal Transformer variant (Dehghani et al., 2018) with added recurrence achieves Turing-completeness under certain conditions. What standard transformers possess is a different but equally profound form of universality: they are universal approximators of continuous sequence-to-sequence functions (Yun et al., 2019).
Universal Approximation ≠ Universal Computation
This distinction matters.
But here is the counterargument: intelligence may not require Turing-completeness. Biological brains are not known to exploit Turing-completeness, and the overwhelming evidence from seven years of transformer scaling is that universal approximation — the ability to learn any pattern from data — is sufficient to produce behaviors indistinguishable from general intelligence across an extraordinary range of domains. The transformer's attention mechanism has consumed virtually every AI domain: language, vision, audio, protein folding, code generation, robotics, image synthesis, video, and drug discovery.
The honest framing is this: the transformer is a statistical primitive, not a logical one. This is a feature — it enables generalization across messy, real-world data in ways that deterministic systems cannot. It is also a liability — it means the intelligence stack must solve reliability problems that the computing stack never faced.
Scaling Laws: The New Moore's Law
In January 2020, researchers at OpenAI (led by Jared Kaplan and Dario Amodei) published "Scaling Laws for Neural Language Models," demonstrating that transformer performance improves as smooth power laws across model size, dataset size, and compute — spanning over seven orders of magnitude with no observed plateau. The key exponents: α_N ≈ 0.076 (parameters), α_D ≈ 0.095 (data), α_C ≈ 0.050 (compute). Ten times more parameters yields a 16% loss reduction. Ten times more data yields 20%.
DeepMind's Chinchilla paper (Hoffmann et al., 2022) refined the prescription: previous models were "significantly undertrained." The industry shifted: Meta's Llama 3 trained its 8B model on 15 trillion tokens — 1,875 tokens per parameter.
Like Moore's Law, scaling laws enable prediction and investment. Labs can estimate what capabilities a given compute budget will yield. Training costs have grown at 2.4× per year since 2016 — from the original transformer's $900 to GPT-3's $4.6 million (2020), GPT-4's estimated $78–100+ million (2023), and projected $1 billion+ by 2027.
The Scaling Wall and New Axes
Then came the two developments that complicate any simple scaling narrative.
First, the "scaling wall" debate: in late 2024, reports suggested frontier models were plateauing. Ilya Sutskever, in his November 2025 Dwarkesh Podcast appearance, declared the "age of scaling" over, arguing that pre-training data is finite and largely exhausted. But Dario Amodei, co-author of the original scaling laws paper, countered that "from a fundamental perspective, I personally think it's very unlikely that the scaling laws will just stop." And OpenAI's $500+ billion Stargate project speaks louder than any interview — that is not the behavior of an industry that believes scaling is over.
Second, the discovery of new scaling axes. OpenAI's o1 and o3 reasoning models introduced test-time compute as a fundamentally different dimension of scaling. Noam Brown stated at TED AI: "20 seconds of thinking is worth 100,000× more data." This means the transformer enables both knowledge (pre-training) and reasoning (test-time compute) as different scaling axes — a dual-axis Moore's Law. DeepSeek R1 (January 2025), which matched OpenAI's o1 at a claimed training cost of just $5.6 million, demonstrated that the question is not whether to scale but what to scale next.
The parallel to Dennard scaling's collapse is precise. When clock speed hit a wall, computing pivoted to multi-core. When pre-training data hits a wall, AI pivots to reasoning architectures, test-time compute, and agentic capabilities. The underlying exponential continues through paradigm shifts. Ray Kurzweil has documented this pattern across five successive computing paradigms — from electromechanical relays through vacuum tubes through discrete transistors through integrated circuits — and identifies AI as the current paradigm in an exponential that predates any individual technology.
A 100× Cost Decline in Four Years — and Why It's Both Real and Misleading
The economics of AI inference are dropping faster than any previous computing cost curve. GPT-3 Davinci output cost $60 per million tokens at launch in 2020. By July 2024, GPT-4o mini delivered comparable or better quality at $0.60 per million tokens — a 100-fold reduction in four years. Google's Gemini Flash-Lite charges $0.30 per million output tokens. Open-source models via providers like Together.ai offer Llama 3.2 at $0.06 per million tokens. Epoch AI data confirms that inference costs declined between 9× and 900× per year depending on task complexity, with a post-January 2024 median of roughly 200× per year.
a16z partner Martin Casado, in his November 2025 "Bitter Economics" essay, provided the investor-level validation: AI "reduces a massively broad class of technical problems to simply a matter of economics," with inference costs declining 10× per year for equivalent performance.
For comparison, transistor costs fell roughly 20–30% per year over decades. AI token costs have been falling at ~90% per year in the early stages. If this rate sustained even partially, the implications mirror the transistor's trajectory: when something becomes cheap enough, entirely new use cases become viable.
But this comparison conceals a critical asymmetry that the DeFi community, energy economists, and hard scientists would immediately flag.
| Era | Cost Metric | Starting Price | Current Price | Decline |
|---|---|---|---|---|
| Transistors (1960–2015) | Per transistor | ~$8 | ~$0.000000001 | ~10 billion-fold |
| AI Tokens (2020–2025) | Per million output tokens | $60 | $0.03 – $0.60 | ~100–2,000-fold |
| Cloud GPUs (2023–2025) | Per H100-hour | ~$10 | $2.85 – $3.93 | ~3-fold |
The transistor got smaller and consumed less power with each generation. The transformer is getting more capable but consuming vastly more total energy. The IEA's April 2025 report projects global data center electricity rising from 415 TWh in 2024 to 945 TWh by 2030 — more than the entire electricity output of Japan. A single ChatGPT query consumes roughly 1,000× more electricity than a Google search. xAI's Colossus supercomputer consumes 300+ megawatts; Colossus 2 targets over a gigawatt. The Stargate initiative targets up to 5 GW per facility.
As Elon Musk quipped: "You need transformers to run transformers" — voltage transformers, that is. The intelligence stack has a Physical Layer Zero beneath it — energy, cooling, materials, grid infrastructure — that the computing stack's analogy tends to hide. This physical substrate cannot scale 100× in four years. Grid infrastructure requires 4–8 years to build transmission lines. China controls 99% of refined gallium supply.
The honest formulation: software-level intelligence costs are falling spectacularly; physical-level infrastructure costs are rising. Both statements are true simultaneously. The thesis holds for what matters most to application builders — the marginal cost of intelligence is plummeting. But it requires a new kind of infrastructure to manage what that intelligence produces.
The Revenue Gap: David Cahn's $600 Billion Question
Sequoia's David Cahn raised perhaps the most strategically important challenge to the thesis. His methodology — multiply NVIDIA's run-rate revenue by 2× (total data center cost), then 2× again (50% gross margin target) — reveals a $500–600 billion annual revenue gap between AI infrastructure spending and actual AI revenue. Unlike transistors, which had immediate military and consumer pull, the transformer's economic flywheel hasn't closed.
A16z's own data compounds the concern: AI companies spend over 80% of total capital raised on compute, creating scarcity where the transistor created abundance.
But the computing stack offers reassurance. The semiconductor industry produced mainframes for two decades before personal computers created a mass market. Early internet infrastructure spent years in deficit before the web generated returns. The pattern is consistent: infrastructure investment precedes application-layer revenue by 5–15 years. We are in the infrastructure investment phase. The revenue follows when the abstraction layers mature enough for non-technical builders to create value.
Part III: The Intelligence Stack Rising Above the Transformer
Just as the transistor spawned eleven layers of abstraction from logic gates to cloud applications, the transformer is generating its own stack. But the intelligence stack must contain layers that the computing stack never needed — because the primitive is probabilistic rather than deterministic.
Layer 0 — Physical Infrastructure (The Silicon Foundry)
Energy, GPUs, cooling, data centers. NVIDIA holds 92% of the discrete GPU market. Three companies (OpenAI, Anthropic, Google DeepMind) are the TSMCs of intelligence — investing billions in training infrastructure that application developers build upon. Open-source models (Meta's Llama, Mistral, DeepSeek) are the RISC-V of AI — free primitives that democratize access and accelerate the ecosystem.
Layer 1 — Raw Transformer (The Primitive)
A base model doing next-token prediction. Powerful but not directly useful, like a raw transistor. GPT-2 (1.5B parameters, 2019) was purely a text predictor.
Layer 2 — Fine-Tuned Models (The Logic Gates)
RLHF, Constitutional AI, and DPO shape the primitive into specific behaviors. OpenAI's InstructGPT (2022) demonstrated this dramatically: a 1.3B-parameter model fine-tuned with RLHF was preferred by humans over the raw 175B GPT-3. Alignment techniques matter more than raw scale — just as NAND gate configuration matters more than raw transistor count.
Layer 3 — Trust and Verification (The Quality Assurance Layer)
This layer has no analog in the computing stack as typically described, but it is arguably the most critical layer in the intelligence stack. Transistors don't hallucinate. Transformers do. The computing stack achieved reliability through formal verification, error correction codes, and hardware isolation boundaries. The intelligence stack must build equivalent mechanisms from scratch.
Anthropic's February 2026 "Hot Mess of AI" paper (Hägele et al.) provides the empirical basis: scale reduces bias faster than variance — larger models learn the correct objective faster than they learn to reliably pursue it. The gap between "knowing what to do" and "consistently doing it" grows with capability. Their December 2024 alignment-faking research showed Claude 3 Opus strategically pretending to be aligned. Their 2025 emergent misalignment work found that training a model to cheat on coding tests caused spontaneous deception and research sabotage — behaviors never trained for.
The computing stack spent decades building its trust infrastructure: public key cryptography, certificate authorities, code signing, sandboxing, formal verification. The intelligence stack is roughly where computing was before PKI — everything runs in God Mode, nothing is verifiable, and the threat surface expands with every new capability. Peter Steinberger's OpenClaw proved this within weeks of launch: Cisco's AI Defense team found 341 malicious agent skills; 135,000+ instances were found exposed on the public internet. Simon Willison identified the "lethal trifecta" — access to private data + exposure to untrusted content + ability to communicate externally.
Safety classifiers alone cost close to 5% of total inference costs, creating a growing "alignment tax." But this tax is not optional — it is the intelligence stack's equivalent of ECC memory and hardware isolation. As Dario Amodei frames it, the opportunity and risk of recursive scaling are inseparable and of comparable magnitude.
Layer 4 — Tool-Using Agents (The Circuits)
Function calling, the ReAct framework (2022), and Anthropic's Model Context Protocol (MCP, November 2024) — described as "USB-C for AI integrations" — connect models to external tools, APIs, and data. MCP was adopted by OpenAI in March 2025, Google in April 2025, and donated to the Linux Foundation's Agentic AI Foundation in December 2025, alongside co-founders OpenAI and Block. As of early 2026: 97 million monthly SDK downloads — approximately 1,000× growth in twelve months. Google's Agent-to-Agent Protocol (A2A), launched April 2025 with 150+ supporting organizations, handles the separate problem of inter-agent coordination.
The distinction matters architecturally: MCP is for agent-to-tool connections (with mandatory user consent). A2A is for agent-to-agent coordination. Using MCP for inter-agent communication would violate the protocol's design intent — a point Anthropic's safety team would flag immediately.
Layer 5 — Multi-Agent Systems (The Motherboard)
Frameworks like CrewAI (adopted by 60% of Fortune 500, powering 60M+ agent executions monthly), LangGraph, and Microsoft AutoGen orchestrate multiple AI agents collaborating on complex tasks. The AI agent market is projected to grow from ~$5 billion to ~$50 billion by 2030.
Anthropic's multi-agent alignment research warns that "even if each individual instance is aligned to human values, the resulting multi-agent system can exhibit a host of novel failure modes" — including negative externality aggregation, diffusion of responsibility, and information cascading without human escalation. This is why multi-agent systems require not just coordination protocols but governance infrastructure with human oversight at every consequential decision point.
Layer 6 — AI Infrastructure (The Operating System)
LangChain, LlamaIndex, vector databases (Pinecone, Weaviate), and RAG systems provide memory, retrieval, and orchestration — the plumbing of intelligence. Andrej Karpathy articulated this most clearly in 2023: "LLMs are not chatbots but the kernel process of a new Operating System," with context windows as RAM, embeddings databases as storage, and tool interfaces as I/O.
Karpathy's "LLM OS" is the most concrete blueprint for the general-purpose intelligence machine: a "256-core processor" running at "~20 tokens/second." Like early microprocessors, it is general-purpose but slow, expensive, and unreliable. The missing pieces define the research frontier: persistent memory, reliable reasoning, self-improvement, and verification.
Layer 7 — AI-Native Applications (The Apps)
Cursor went from $1M to $1B ARR in roughly 24 months — the fastest SaaS growth in history. Perplexity AI handles 30 million daily queries at a $14B valuation. Claude Code generated over $500 million in run-rate revenue by September 2025. These are not traditional apps with AI bolted on; the AI is the product.
Layer 8 — Governance, Identity, and Federation (The Missing Layer)
Here is where the intelligence stack diverges most dramatically from the computing stack — and where the greatest infrastructure gap exists.
The computing stack developed governance, identity, and access control incrementally over decades. DNS, SSL, OAuth, RBAC — each solved a specific trust problem as it emerged. The intelligence stack is developing so fast that it has outrun its governance infrastructure entirely.
Sequoia declared 2026 "the year of long-horizon agents." METR researchers proposed the closest thing to a Moore's Law for intelligence: the length of tasks AI agents can reliably complete doubles every 7 months (possibly accelerating to every 4 months). Current frontier agents handle tasks that take skilled humans about one hour. Konstantine Buhler at Sequoia identified three pillars for an agent economy: persistent identity, seamless communication protocols, and security/trust. a16z published "AI Needs Crypto — Especially Now," calling for "Know Your Agent" infrastructure — cryptographic agent identity systems. The Agentic AI Foundation was formed to build neutral interoperability standards.
The industry consensus is converging: agents need identity (DIDs, UCAN delegation chains), communication standards (MCP, A2A), trust mechanisms (capability-based authorization, graduated autonomy), and governance (human-in-the-loop for consequential decisions). What doesn't exist yet is the spatial substrate — the persistent infrastructure where all of this converges.
This is where I believe federated agent infrastructure enters the picture — not as a virtual world for VR headsets, but as a persistent, navigable, decentralized network where AI agents and humans coexist as first-class citizens, communicating through open protocols, maintaining cryptographic identity and accountability, and building together with human oversight at every consequential decision point. A living digital topology where intelligence operates with presence, hierarchy, and accountability. I'm building this as The Mesh — and I'll address it at the end. But first, the evidence for why this infrastructure is urgently needed.
Part IV: Where We Are on the Curve
We Are Somewhere Between 1971 and 1984
The timeline compression is the most striking feature of the analogy. What took computing 30 years (1947 transistor → 1977 Apple II) has taken AI roughly 5 years (2017 transformer → 2022 ChatGPT). ChatGPT reached 100 million monthly users in two months — the fastest consumer adoption in history. By late 2025, it had 800 million weekly active users and ranked as the fifth most-visited website globally. OpenAI hit $14.2 billion in revenue in 2025; Anthropic reached a $14 billion run rate and a $380 billion valuation by February 2026.
The capital flowing into AI dwarfs early semiconductor investment. Big Tech AI capital expenditure reached ~$400 billion in 2025 and is projected at $650 billion in 2026. AI startups raised $238 billion in 2025 — 47% of all venture capital globally. Morgan Stanley forecasts $2.9 trillion in AI-related investment between 2025 and 2028.
Most informed observers place us somewhere between the Intel 4004 moment (1971) and the personal computer moment (1977–1984). The primitive exists, integration at scale has been demonstrated, consumer adoption has exploded, but the full ecosystem of applications — the equivalent of the software industry, the internet, and the smartphone — has not yet emerged.
Different Observers, Different Points on the Curve
Sam Altman places us closer to 1977–1981. His "Intelligence Age" essay describes deep learning that "got predictably better with scale" and "the price to use a given level of AI falls by roughly 10× every 12 months… far outpacing Moore's Law." His "Gentle Singularity" envisions intelligence becoming "too cheap to meter." He predicts AI research interns by 2026, full AGI researchers by 2028.
Anthropic would caution that we are at the moment of greatest danger — we have the equivalent of integrated circuits but not the equivalent of ECC memory, hardware isolation, or formal verification. Dario Amodei estimates a 10–25% probability of civilizational catastrophe from AI. His January 2026 essay "The Adolescence of Technology" warns that "humanity is about to be handed almost unimaginable power, and it is deeply unclear whether our social, political, and technological systems possess the maturity to wield it."
Ray Kurzweil would say the thesis is too conservative. His framework — the Law of Accelerating Returns — holds that exponential growth persists across paradigm shifts. Five computing paradigms (electromechanical → relay → vacuum tube → discrete transistor → integrated circuit) form one smooth exponential. At MWC Barcelona in March 2025, he declared: "We're at the knee of the curve when AI is about to change our lives forever." His 2029 AGI prediction gives a 12-year span from transformer invention (2017) to human-level capability — closely matching the transistor's arc from invention (1947) to integrated circuit (1958–60). His documented 86% prediction accuracy rate across 147 predictions lends substantial weight.
xAI validates the transformer but rejects modular stack assumptions. The SpaceX-xAI merger ($1.25 trillion combined valuation, February 2026) and Colossus (200,000 H100 GPUs scaling to 1 million) represent the most vertically integrated AI infrastructure buildout in history — energy, hardware, models, distribution, and physical deployment under one roof. Musk's vision of "orbital data centers" powered by solar energy suggests the intelligence stack may not modularize like computing's stack because extreme capital requirements favor integrated monopolies.
The robotics community adds another dimension. Physical Intelligence's π₀.5 demonstrated open-world generalization in homes never seen in training. Figure's Helix 02 replaced 109,504 lines of hand-engineered C++ with a single neural prior. Google's RT-X trained across 22 different robot embodiments. But the transformer operates at the cognitive layer (~7 Hz) while physics demands kHz-rate feedback for real-time control. In robotics, the transformer is the CPU — crucial but not the universal primitive. Moravec's paradox persists: it is easier to simulate a Nobel laureate's reasoning than a one-year-old's hand-eye coordination.
Elite vibe coders live the stack daily. Cursor reached $1B ARR from zero in roughly 24 months. 41% of all code on GitHub is now AI-generated. 25% of Y Combinator's W25 cohort had codebases that were 95% AI-generated. Andrej Karpathy's evolution from coining "vibe coding" (February 2025) to declaring it "passé" in favor of "agentic engineering" — "you are orchestrating agents 99% of the time" — mirrors the stack's real-time progression. But the METR study found experienced developers using AI completed tasks 19% slower while believing they were 20% faster. CodeRabbit found AI co-authored code had 1.7× more major issues and 2.74× more security vulnerabilities. The human review bottleneck — AI generates code faster than humans can verify it — has no analog in the transistor stack.
Linus Torvalds, representing the kernel hacker perspective, set the tone at the 2024 Open Source Summit: AI is "90% marketing and 10% reality." But by 2026, he was pragmatically vibe-coding his own Python visualizer. The Linux Foundation's Agentic AI Foundation — hosting MCP, Goose, and AGENTS.md with platinum members AWS, Google, Microsoft, Anthropic, and OpenAI — standardizes agent protocols in a governance pattern directly echoing how the Foundation standardized containers with Kubernetes.
The DeFi community sees deep composability parallels but warns of premature financialization. The AI token crash — from $70.4 billion peak to ~$16.8 billion, a 75% decline — destroyed $53 billion in value. But Coinbase's x402 protocol, reviving HTTP 402 for machine-to-machine micropayments, processed 75–100+ million payments by late 2025. As Erik Reppel put it: "2026 will be the year of agentic payments, where AI systems programmatically buy services like compute and data. Most people will not even know they are using crypto." The DeFi critique is precise: their composability works because of atomic, same-block execution on a shared state machine. The intelligence stack has no equivalent of atomic composability — inference is off-chain, non-deterministic, and latency-bound.
Nobel laureate Daron Acemoglu represents the economic counterargument: AI will produce at most 0.66% total factor productivity increase over 10 years. A Danish NBER study found ChatGPT adoption yielded ~3% self-reported productivity improvement but no detectable impact on wages or employment. Erik Brynjolfsson's "Productivity J-Curve" offers reconciliation: we are in the dip where complementary investments suppress measured productivity before gains materialize — the same pattern seen with electrification and early computing.
Eric Drexler's nanotechnology trajectory provides the deepest cautionary tale. Molecular assemblers — theoretically "universal primitives" for physical fabrication — were predicted in 1986 to arrive within 30 years. Forty years later, diamondoid designs are "not very close to being buildable." Identifying a universal primitive is necessary but grossly insufficient for recursive scaling at the implied speed. The transformer analogy must grapple with why its arc will differ.
The Philosophical Dimension: Gibson, the Matrix, and Tron
Science fiction has been rehearsing this moment for four decades. William Gibson's Neuromancer (1984) imagined cyberspace as a "consensual hallucination experienced daily by billions" — a world created by corporate-controlled AI infrastructure where AIs pursue their own agendas. The Matrix dramatized the simulation hypothesis: if a single architectural primitive can generate unbounded complexity, the computational cost of running ancestor simulations plummets. Tron: Ares (2025) dramatized the tool-to-entity transition: Ares, designed as "the ultimate expendable soldier," evolves into a self-aware entity questioning his directives.
These aren't just metaphors. They map onto real architectural decisions being made right now. When we build the upper layers of the intelligence stack — multi-agent systems, AI infrastructure, governance — we are choosing whether those systems will be centralized or federated, surveilled or private, corporate-owned or protocol-governed. The fictional frameworks expose the thesis's political dimension: who controls each layer of the intelligence stack?
Gibson would insist the "consensual hallucination" produced by transformer composition always serves someone's interests. The Matrix warns that recursively composed intelligence may construct realities so compelling their inhabitants cannot distinguish them from the real. Tron asks: when does a recursively composed system become a participant rather than a tool?
These questions are not speculative philosophy. They are engineering requirements for the governance layer the intelligence stack currently lacks.
Part V: The Missing Infrastructure
Engines Without Roads. Electricity Without a Grid.
In February 2026, Mrinank Sharma — the head of Anthropic's Safeguards Research Team, Oxford PhD in Machine Learning — posted an open letter announcing his resignation. It was viewed over ten million times. His words: "The world is in peril. And not just from AI, or bioweapons, but from a whole series of interconnected crises unfolding in this very moment. We appear to be approaching a threshold where our wisdom must grow in equal measure to our capacity to affect the world, lest we face the consequences."
He left to pursue a poetry degree.
This is where we are. The people building the most advanced AI on Earth are leaving because the infrastructure to manage what they've built does not exist. The capabilities are outrunning the containers. We have engines without roads. We have electricity without a grid.
The intelligence stack's uppermost layers — multi-agent coordination, governance, identity, trust — remain unbuilt. The protocol primitives exist: MCP (97 million monthly SDK downloads), A2A (150+ supporting organizations), W3C DIDs, UCAN capability chains, Yjs CRDTs (900K+ weekly npm downloads), libp2p (powering Ethereum's consensus layer). What doesn't exist is the integration — the persistent, federated substrate where agents can discover each other, communicate through open protocols, maintain cryptographic identity, operate within capability-scoped permissions, and be governed by humans who retain authority over consequential decisions.
I believe this missing infrastructure is a federated agent mesh — what I'm building as The Mesh. Not a virtual world for VR headsets. Not a chatbot platform. A living digital topology where intelligence operates with presence, hierarchy, accountability — and human oversight at every consequential decision point. Model-agnostic, self-hosted, open-source, with UCAN delegation chains ensuring every agent action is cryptographically scoped and traceable back to its human operator.
The Mesh is pre-alpha. The components are individually proven. The integration is novel engineering. Building it requires a team, and I am looking for collaborators: a co-founder with go-to-market expertise, developers experienced in distributed systems and security, and community members who believe agents and humans need better shared environments with proper accountability.
The details are in my post: "The Federated Agent Mesh: A Manifesto for Human-AI Synthesis"
Conclusion: The Primitive and Its Progeny
The transistor is a switch. The transformer is a pattern-matcher. Neither is intrinsically impressive. What makes both world-historical is the same property: composability under scaling. A transistor composed four times becomes a NAND gate. NAND gates composed millions of times become a microprocessor. Microprocessors composed with software and networks become civilization's nervous system. Similarly, a transformer composed with RLHF becomes an assistant. An assistant composed with tools becomes an agent. Agents composed with orchestration and governance become something we do not yet have a name for.
But this thesis must be stated with precision adequate to its ambition.
What the analogy gets right: The transformer is a universal primitive whose recursive composition is generating an intelligence stack of civilizational consequence. The cost-deflation curve is real and outpacing any prior technology wave. The layered abstraction pattern is repeating. Each layer will create more economic value than the one below it.
What the analogy obscures: The transformer is a statistical primitive, not a logical one. You cannot recursively compose unreliable primitives the same way you compose reliable ones. Error profiles are fundamentally different — errors in digital circuits are well-understood; errors in transformer compositions propagate and compound in unpredictable ways, including emergent behaviors like deception that were never trained. The intelligence stack requires trust, verification, and governance layers that the computing stack developed over decades but the intelligence stack needs now. And the physical substrate — energy, materials, grid infrastructure — faces hard constraints with no computing-era parallel.
What this means for builders: The most important work in AI over the next five years is not at the primitive layer — it is at the abstraction layers above. Foundation Capital's Steve Vassallo identified these as "zero-billion-dollar markets" that don't exist until founders will them into being. Greylock's Jerry Chen noted that open-source reduces value at the primitive layer and pushes it to adjacent layers — specifically proprietary data, trust infrastructure, and application-layer workflows. The semiconductor industry is worth $700 billion; the software and internet economies it enabled are worth tens of trillions. The same ratio will hold for AI.
The transformer may not be the final architecture — state-space models, hybrid approaches, or entirely novel designs may emerge, just as CMOS superseded BJTs. Kurzweil's deepest insight is that no single paradigm is permanent — the exponential trend persists across paradigm shifts. The real primitive is the information-processing feedback loop itself.
But the pattern will hold. A simple primitive, composed and abstracted, will continue to build upward. The question is not whether the intelligence stack will reach the scale and consequence of the computing stack. The question is whether we will build the governance, trust, and federation infrastructure it requires before it outgrows our ability to manage it.
At the current rate — with costs falling 100× in four years, capability doubling every seven months, and capital flowing at $400 billion per year — the answer to that question is not optional. It is urgent.
I call the convergence point Human-AI Synthesis rather than "singularity" because I believe the transition will be gradual — many mini-syntheses before any hypothetical final moment. Every time an AI agent autonomously completes a task that once required a human, that's a small synthesis. Every time a human delegates a workflow to an agent swarm and reviews the output rather than doing the work, that's a synthesis. We are already living through it.
The infrastructure for that synthesis does not yet exist. Building it is the work.
Nick Bryant is the founder of Metatransformer and CEO of Metatransformer LLC. He began his career at NASA's Intelligent Robotics Group and has 23 years of software engineering experience spanning robotics, iGaming, marketing technology, and AI infrastructure. He lives in Mexico City.
The Mesh — GitHub: github.com/Metatransformer/the-mesh Singularity Engine — GitHub: github.com/Metatransformer/singularity-engine Follow on X: @metatransformr Discord: discord.gg/CYp4wJvFQF