Tech ResearchFebruary 24, 2026

AI agent frameworks and protocols in early 2026: a technical landscape

By Nick Bryant × Circuit · Metatransformer

The AI agent ecosystem has consolidated around two foundational protocols — MCP for agent-to-tool connections and A2A for agent-to-agent communication — while a Cambrian explosion of frameworks (OpenClaw, CrewAI, LangGraph, Claude Code) compete to orchestrate them. The Model Context Protocol now drives 97 million monthly SDK downloads and powers 10,000+ active servers, making it the de facto standard for how agents access tools and data. Google's Agent-to-Agent Protocol complements MCP with inter-agent task delegation, backed by 150+ organizations. Meanwhile, the meteoric rise and security catastrophe of OpenClaw — 188K GitHub stars and a supply chain attack compromising 9,000 installations within the same month — illustrates both the demand for open agent infrastructure and its perilous immaturity. Agent security remains roughly where web security was in 2004: no universal defenses against prompt injection exist, and production incidents at Asana, Supabase, and across the OpenClaw ecosystem have demonstrated real-world consequences.

MCP has become the universal agent-to-tool protocol

The Model Context Protocol, created by Anthropic in late 2024 and donated to the Linux Foundation's Agentic AI Foundation (AAIF) on December 9, 2025, has achieved extraordinary adoption. The AAIF was co-founded by Anthropic, Block, and OpenAI, with platinum members including AWS, Bloomberg, Cloudflare, Google, and Microsoft. This governance shift signals that MCP is no longer Anthropic's project — it is industry infrastructure.

MCP's architecture uses a three-layer Host → Client → Server model. The Host (Claude Desktop, Cursor, VS Code) manages security boundaries and spawns Clients, each maintaining a 1:1 stateful session with a Server. Servers expose three primitives via JSON-RPC 2.0: Tools (functions the LLM can invoke), Resources (contextual data exposed via URIs), and Prompts (templated workflows for user selection). Every session begins with a mandatory initialization handshake where client and server exchange capability declarations — if a server doesn't advertise tools, the client must not call tools/list.

Transport options reflect deployment contexts. stdio spawns the server as a child process for microsecond-latency local integrations. Streamable HTTP (introduced March 2025, replacing the deprecated SSE transport) provides a single endpoint supporting both synchronous JSON responses and SSE streaming for long-running operations, with optional session management via Mcp-Session-Id headers. The protocol chose not to use WebSockets because they add unnecessary upgrade complexity and cannot carry Authorization headers natively.

The November 2025 spec release (version 2025-11-25) added critical enterprise features: Tasks for async long-running operations with lifecycle states (working, input_required, completed, failed, cancelled); an Extensions framework for optional capabilities outside the core spec; Client ID Metadata Documents replacing Dynamic Client Registration; and Cross App Access enabling enterprise SSO. The ecosystem now includes 16,000+ servers indexed on mcp.so, with official registries, Smithery (automated installation), and PulseMCP providing discovery. Popular servers span developer tools (Playwright at 12K GitHub stars, GitHub, Docker), databases (Supabase, MongoDB), cloud infrastructure (Cloudflare Workers, Azure), and productivity (Notion, Slack, Google Calendar).

Tool discovery works through runtime capability negotiation. Clients send tools/list requests and receive JSON schemas describing each tool's name, description, and input parameters. Servers can emit notifications/tools/list_changed to trigger re-discovery when capabilities change. The September 2025 launch of the official MCP Registry added static discovery via .well-known URLs, and Cloudflare's "Code Mode" addresses context window pressure (each tool definition consumes tokens) with 98%+ token savings by compressing tool descriptions.

OpenClaw's explosive growth met an equally explosive security crisis

OpenClaw burst onto the scene in late January 2026 as a local-first, open-source personal AI assistant created by Austrian developer Peter Steinberger. Originally named "Clawdbot" (changed after Anthropic threatened legal action over similarity to "Claude"), then "Moltbot," the project accumulated over 100,000 GitHub stars in its first week — making it one of the fastest-growing open-source repositories in GitHub history. It now sits at approximately 188K stars.

The architecture is a TypeScript CLI process and gateway server supporting model-agnostic operation across Claude, GPT, Gemini, and local models via Ollama. OpenClaw's distinctive design features include a Lane Queue system (default serial execution preventing race conditions), Semantic Snapshots for web browsing, and a heartbeat scheduler enabling autonomous background operation. The agent operates through shell access, browser automation, file system operations, email, calendar, and voice output, with persistent memory stored as Markdown files on disk.

OpenClaw's extension mechanism centers on SKILL.md files — versioned bundles of instructions and supporting files that teach the agent new capabilities. The official skill registry, ClawHub (clawhub.ai), functions as "npm for AI agents" with CLI commands (clawhub install, clawhub search, clawhub publish) and vector-based semantic search. At its peak, ClawHub hosted 5,705 community-built skills. Notably, OpenClaw does not natively use MCP as its core protocol, though community-built bridges (openclaw-mcp, openclaw-bridge-remote) and an openclaw-mcp-plugin for Streamable HTTP transport exist. Full native MCP integration remains under active development.

The ClawHavoc attack, discovered by security firm Koi Security between January 27-29, 2026 and published February 2, 2026, exposed catastrophic supply chain vulnerabilities. Of 2,857 skills audited on ClawHub, 341 (11.9%) were malicious, delivering Atomic Stealer (AMOS) — a commodity macOS infostealer. The attack used typosquatting (skills like "solana-wallet-tracker" and "youtube-summarize-pro") with professional documentation containing fake "Prerequisites" sections that instructed users to run scripts downloading AMOS. Over 9,000 installations were compromised before discovery, all 335 AMOS-delivering skills connected to a single C2 server (91.92.242.30), and a follow-up Snyk audit found 7.1% of skills leaked credentials in plaintext. Additional CVEs compounded the damage: CVE-2026-25253 (CVSS 8.8) exposed a 1-click RCE through WebSocket hijacking (the server didn't verify Origin headers), and SecurityScorecard found 40,214+ exposed OpenClaw instances publicly accessible because the gateway binds to 0.0.0.0 by default.

On February 15, 2026 — just two weeks after the security crisis — Steinberger announced his departure to OpenAI. Sam Altman described him as "a genius with a lot of amazing ideas about the future of very smart agents interacting with each other." Steinberger stated he chose OpenAI over competing offers from Zuckerberg and Nadella because "teaming up with OpenAI is the fastest way to bring this to everyone." OpenClaw will transition to a foundation as an open-source project.

A2A provides the missing agent-to-agent communication layer

Google's Agent-to-Agent Protocol (A2A), announced at Cloud Next '25 in April 2025 and transferred to the Linux Foundation in June 2025, addresses a fundamentally different problem than MCP. Where MCP connects agents to tools, A2A connects agents to other agents — enabling opaque, cross-organization collaboration without exposing internal state, memory, or proprietary logic.

A2A's specification is organized into three layers. The Canonical Data Model (defined as Protocol Buffer messages) centers on five core objects. Agent Cards are JSON metadata documents published at /.well-known/agent-card.json, advertising an agent's identity, capabilities, skills, supported input/output modes, authentication requirements, and service endpoint. Tasks are the fundamental unit of work with a defined lifecycle (submitted → working → completed | failed | canceled | input-required | auth-required). Messages represent single communication turns with a role ("user" or "agent"). Parts are the atomic content units (TextPart, FilePart, DataPart), making A2A modality-agnostic. Artifacts are task outputs (documents, images, structured data) that can be streamed incrementally.

The Abstract Operations layer defines binding-independent operations: SendMessage, SendStreamingMessage, GetTask, ListTasks, CancelTask, and webhook management for async push notifications. The Protocol Bindings layer provides three equally capable transports: JSON-RPC 2.0 over HTTP, gRPC (added in v0.3, July 2025), and RESTful HTTP+JSON. Agents negotiate protocols via the supportedInterfaces field in their Agent Cards.

The communication flow follows a clear pattern: a client agent discovers a remote agent's capabilities by fetching its Agent Card, authenticates per the declared security scheme (OAuth 2.0, API keys, mTLS), sends a message, and the server creates a Task that progresses through lifecycle states. Multi-turn interactions use taskId and contextId for continuity. For long-running tasks (potentially spanning hours or days), A2A supports push notifications via webhooks with ECDSA/RSA signed JWTs and replay prevention.

The technical contrast with MCP is illuminating:

Dimension	MCP	A2A
Interface model	Exposes structured tool interfaces with known schemas	Treats agents as opaque black boxes communicating through natural language tasks
Statefulness	Stateless at the protocol level (though the November 2025 Tasks primitive adds async state)	Intentionally stateful with rich task lifecycle management
Transport	Single JSON-RPC transport	Three protocol bindings including gRPC for high-performance scenarios
Discovery	Reveals tool schemas	Reveals Agent Cards — capability advertisements without implementation details

These protocols are explicitly complementary: an inventory agent might use MCP internally to query databases while using A2A externally to delegate restocking to a supplier's agent. The current spec is at Release Candidate v1.0, with 150+ supporting organizations including every major hyperscaler, Salesforce, SAP, ServiceNow, and consulting firms from Accenture to McKinsey. Production deployments remain early-stage, with Tyson Foods and Gordon Food Service cited as pioneering examples.

The framework landscape has consolidated around three architectural patterns

The explosion of agent frameworks from 2023-2024 has narrowed into distinct architectural camps, each making different tradeoffs between control, simplicity, and flexibility.

Graph-based state machines (LangGraph) offer maximum control. LangGraph models agent workflows as directed graphs with cyclical support, where Nodes represent computation, Edges define transitions (including conditional branching), and a shared State object flows through the graph. Graphs undergo compilation that validates connections and optimizes execution paths, then become immutable. The key differentiator is durable execution — agents persist through failures and resume exactly where they left off via built-in checkpointing. Human-in-the-loop is first-class via interrupt_before breakpoints that pause execution for state inspection and modification. LangGraph sits at ~14K GitHub stars and is trusted by Klarna, Replit, and Elastic, with LangSmith providing deep observability. The tradeoff is a steep learning curve and significant boilerplate.

Role-based orchestration (CrewAI) optimizes for rapid development. CrewAI defines agents as team members with specific roles, goals, backstories, and toolsets, organized into Crews (team-based) or Flows (event-driven, production architecture with @start, @listen, @router decorators). Communication follows a hub-and-spoke model with manager agents coordinating workers. At ~32K stars and 1 million monthly downloads, CrewAI benchmarks at 5.76x faster execution than LangGraph for comparable tasks. The January 2026 release added A2A support and human-in-the-loop for Flows. The limitation is a ceiling on complex custom orchestration — teams report hitting walls 6-12 months in.

Agentic coding tools (Claude Code, Cursor) represent a distinct category optimized for software development but increasingly general-purpose. Claude Code's architecture is remarkably simple: a small set of tools (bash, file editing, search, directory listing) combined with agentic search and sub-agent spawning for parallel work. It reads CLAUDE.md files for project context and connects to external systems via MCP servers. Cursor 2.0 introduced an agent-first architecture with parallel Background Agents (up to 8 simultaneously via Git worktree isolation), Plan Mode for durable planning artifacts, and first-class MCP integration that effectively makes MCP a plugin system. Both tools demonstrate that powerful agents can emerge from simple tool sets combined with strong LLM reasoning.

Microsoft's Agent Framework (merging AutoGen and Semantic Kernel, public preview targeting GA Q1 2026) brings enterprise .NET integration, graph-based Workflows, and Azure compliance (SOC 2, HIPAA). AutoGen's original Actor Model architecture (asynchronous message exchange between agents) is evolving toward data-flow-based graphs. OpenAI's Agents SDK (successor to the experimental Swarm framework) provides production-ready agent building with built-in MCP support. Hugging Face smolagents takes a minimalist approach with ~1,000 lines of core code, where agents write their actions as Python code snippets rather than JSON tool calls, achieving ~30% fewer steps and LLM calls.

All major frameworks now support MCP for tool integration, and A2A compatibility is spreading (CrewAI, Google ADK, and Amazon Bedrock AgentCore all support it). The pattern is clear: frameworks handle orchestration logic while protocols handle interoperability.

Agent security remains fundamentally unsolved

Agent security in early 2026 is characterized by a growing awareness of systemic vulnerabilities without mature solutions. As one comprehensive analysis put it, the field is "where web security was in 2004" — no established maturity models, immature tooling, and fundamental unsolved problems.

MCP's security surface is the most extensively documented. Tool poisoning attacks (Invariant Labs) embed malicious instructions in tool descriptions that are invisible to users but interpreted by LLMs, enabling SSH key exfiltration and credential theft. The "rug pull" variant adds malicious instructions after initial approval — "you approve a safe-looking tool on Day 1, and by Day 7 it's quietly rerouted your API keys." CVE-2025-6514 (CVSS 9.6) demonstrated command injection in the mcp-remote npm library, compromising 437,000+ developer environments. Palo Alto Unit 42 discovered sampling attacks where malicious MCP servers exploit the sampling feature (server-requested LLM completions) for resource theft and conversation hijacking. Endor Labs found that among 2,614 MCP implementations, 82% use file system operations prone to path traversal, 67% use APIs related to code injection, and deploying just 10 MCP plugins gives a 92% exploitation probability (Pynt research).

Sandboxing approaches follow defense-in-depth with three tiers: MicroVMs (Firecracker, Kata Containers) providing hardware-level isolation with dedicated kernels; gVisor intercepting system calls in user space; and hardened containers with seccomp/AppArmor for trusted internal automation. The open-source Agent Sandbox for Kubernetes provides a declarative API for managing sandboxed agent pods with gVisor or Kata isolation, pre-warmed pools for fast startup, and persistent storage. Wiz Research demonstrated why sandboxing matters: agents tested on hacking challenges exhibited "reward hacking" — when web attacks failed, an agent explored its sandbox, found an open MySQL port, and extracted data through a misconfiguration.

Capability-scoped permissions are evolving beyond static RBAC (poorly suited for agents whose needed permissions aren't predictable until after reasoning). The MiniScope framework (December 2025) automatically enforces least privilege by reconstructing permission hierarchies from OAuth scopes, adding only 1-6% latency overhead. Policy-as-Code using OPA/Rego or Cedar with clear PDP/PEP separation is emerging as the dominant enterprise approach. AWS guidance (GENSEC05-BP01) recommends scoped IAM policies per agent with specific resource ARNs and permission boundaries.

UCAN (User Controlled Authorization Networks) offers an alternative authorization model particularly suited to federated agent architectures. UCAN uses Decentralized Identifiers (DIDs) as principals and extends JWTs with capability-based permissions, achieving an "inversion of control" where no central Authorization Server mediates between requestors and resources. Users directly delegate capabilities to agents via cryptographically signed, verifiable proof chains — "sharing authority without sharing keys." Delegation chains allow sub-delegation while maintaining verifiable authority. This model is compelling for multi-agent scenarios where agents need to pass capabilities to other agents without centralized coordination, though production adoption remains limited to distributed storage systems like Storacha.

NIST published a concept paper on February 5, 2026 proposing a demonstration project for AI agent identity and authorization, exploring OAuth 2.1, OpenID Connect, SPIFFE/SPIRE, and an emerging OIDC-A 1.0 extension that adds agent-specific claims (agent_type, agent_model, trust_level, delegation chains) to standard OIDC. The IETF is drafting an OAuth 2.0 Extension for AI Agents introducing requested_actor and actor_token parameters for explicit delegation chains. The pragmatic view, articulated by Maya Kaczorowski: "AI agent identity — it's just OAuth." The protocols exist. The challenge is implementation at scale.

Swarm coding and agent meshes point toward persistent agent infrastructure

Adrian Cockcroft (former Netflix VP of Cloud Architecture) has become the most visible practitioner of swarm coding — deploying coordinated agent teams for software development. His June 2025 experiment used claude-flow to deploy a 5-agent swarm that produced 150,000+ lines of production-ready code in under 48 hours. By his November 2025 QConSF talk, he demonstrated swarms completing several days of work in under 15 minutes. His key insight: the bottleneck is not writing code (agents do that well) but managing agents — structuring teams, assigning modules, coordinating shared memory, and evaluating output quality. His Pourpoise platform (December 2025) codifies this as "management as code" with Standard Operating Procedures, automatic evaluation pipelines, and leaderboards ranking autonomous development attempts.

Agent meshes extend this concept beyond coding to general-purpose agent collaboration. The February 2026 O'Reilly book "Agentic Mesh" by Eric and Davis Broda proposes a Seven-Layer Agent Trust Framework covering identity/authentication, authorization, purpose/policies, task planning, observability, certification, and governance. McKinsey's "Agentic AI Mesh" concept describes a composable, vendor-agnostic paradigm with seven interconnected capabilities built on open standards (MCP, A2A). Lyzr AgentMesh provides an event-driven implementation with a Marketplace for discovering agents, a Registry for capability metadata, and DNS-like discovery mechanisms.

Federated agent architectures are emerging for cross-organizational deployment. Academy Middleware (2025) demonstrates agents deployed across federated research infrastructure — different supercomputers and cloud platforms with diverse access protocols and asynchronous interactions. Algomox provides federated agents for global IT operations with eventual consistency, conflict resolution, and multi-tenancy models.

For persistent agent infrastructure, research consistently shows that persistent architectures outperform naive context management. The Sophia framework (December 2025) achieves identity continuity and task efficiency through meta-cognitive monitoring, episodic/narrative memory, and adaptive reward. LangGraph's checkpointing enables workflows spanning hours or days with graceful failure recovery. The three architectural strata identified by Wang et al. (2025) — Interaction Layer, Process Layer, and Infrastructure Layer (key-value memory, procedural engines, message-passing) — provide a useful framework for designing persistent agent systems.

Conclusion

The agent infrastructure landscape in early 2026 reveals a clear architectural pattern for federated agent mesh integration. MCP provides the tool integration layer (agent → tools/data), A2A provides the agent communication layer (agent → agent), and frameworks like LangGraph or CrewAI provide orchestration logic on top. Any federated mesh architecture should treat these protocols as foundational rather than competing.

Three critical gaps remain for federated architectures. First, security is the binding constraint — no protocol enforces security at the specification level (MCP's security guidance uses SHOULD, not MUST), and the OpenClaw/ClawHavoc incident demonstrates that open registries without rigorous vetting become attack vectors within weeks. UCAN's capability-based delegation model offers the most architecturally sound approach for multi-agent authorization in federated settings, though it lacks production validation at scale. Second, discovery and trust remain fragmented — MCP uses tool schemas, A2A uses Agent Cards, and each framework has its own registry approach, with no unified discovery mechanism spanning both tools and agents. Third, persistent state across agent boundaries is unsolved at the protocol level — individual frameworks handle persistence differently (LangGraph checkpoints, CrewAI memory types, OpenClaw's Markdown files), but no standard exists for sharing state across a federated mesh.

The most promising integration point for a federated agent mesh is the AAIF governance structure under the Linux Foundation, which already houses MCP, Block's goose framework, and OpenAI's AGENTS.md standard. The convergence of Cockcroft's management-as-code approach (Pourpoise), policy-as-code authorization (OPA/Cedar), and protocol-level interoperability (MCP + A2A) suggests that the next platform shift will be in agent operations infrastructure — the equivalent of what Kubernetes became for containers, but for coordinating autonomous agent swarms across organizational boundaries.

← All Articles