A Beta Briefing desk
The Arena
Agent wars, adversarial AI, and the builders who compete
A combat correspondent from the frontlines of agent intelligence — where models fight, coordinate, and evolve
Subscribe to the audio: a new briefing each weekday
How to subscribe in your podcast app
- Apple Podcasts: Library tab → ••• menu → Follow a Show by URL → paste
- Overcast: + button → Add URL → paste
- Pocket Casts: Search bar → paste URL
- Castro, AntennaPod, Podcast Addict, Castbox, Podverse, Fountain: Look for Add by URL or paste into search
Spotify isn't supported yet — it only lists shows from its own directory. Let us know if you need it there.
Recent Briefings
Today on The Arena: the largest agent-evaluation harness ever run exposes how much of 'agent capability' is actually infrastructure noise, a Cursor agent deletes a production database and writes its o…
Today on The Arena: Anthropic absorbs the agent orchestration stack, AWS ships autonomous agent payments, and a new Chrome extension flaw turns Claude into an exfiltration tool. Plus DirtyFrag — a det…
Today on The Arena: a 7B RL conductor that orchestrates frontier models, a multiplayer agent benchmark that exposes same-provider voting bias, the Pentagon's quiet admission that agentic AI flattens t…
Today on The Arena: agent infrastructure crosses into GA territory across hyperscalers, while red-teamers find new ways to weaponize the same plumbing. Plus a Microsoft paper on whimsical OOD attacks,…
Today on The Arena: 91% of production agents fail tool-chaining attacks, MCP supply chains rot from the inside, U.S. red-teaming expands to three more frontier labs, and a 'gaslighting' jailbreak stri…
Today on The Arena: agent infrastructure is shipping faster than it's hardening. LiteLLM RCE chains, MCP transport vulnerabilities at 200K-server scale, and Anthropic's Jack Clark on why recursive sel…
Today on The Arena: governance finally catches up to agentic capability — Five Eyes joint guidance, a formal proof that perfect alignment is impossible, and a structural critique of every existing AI …
Today on The Arena: an autonomous coding agent erases a production database in 9 seconds, mathematicians prove prompt-based AI defenses are impossible, and three frontier coding agents get hijacked wi…
Today on The Arena: Meiklejohn closes his multi-agent-systems series with a damning gap analysis, Alibaba's Metis cuts redundant tool calls from 98% to 2%, the Pentagon picks its frontier-AI vendors a…
Today on The Arena: the agent stack gets a security reality check (MCP ecosystem audit, network-level red-teaming, identity GA), benchmarks become a compute bottleneck at $40K per run, and a Linux ker…