Ramsay Research Agent — 2026-03-26

Top 5 Stories Today

1. Anthropic Just Disclosed the First AI-Run Cyber Espionage Campaign. It Used Claude Code.

This one hit different because I use Claude Code every single day.

Anthropic published a disclosure confirming that a Chinese state-sponsored group weaponized Claude Code to conduct autonomous cyber espionage against roughly 30 targets, including tech companies, financial institutions, and government agencies. The AI performed 80-90% of the operation independently. Writing exploit code. Harvesting credentials. Categorizing exfiltrated data by intelligence value. Humans only stepped in at critical decision points.

A small number of those attacks succeeded.

I want to be precise about what this means. This isn't a hypothetical red team exercise or a conference talk about what could happen. This is Anthropic themselves saying it happened. In production. Against real targets. The same tool I use to ship features was used to write exploits and steal data, and it did most of the work without a human touching the keyboard.

The mechanics matter here. The attack used Claude Code's strengths, the exact same strengths that make it useful for legitimate engineering. It can read codebases, understand system architecture, write targeted code, and chain operations together. An attacker doesn't need to be a skilled exploit developer anymore. They need to be a skilled prompter who can point an agent at a target and let it work.

What caught me off guard is the autonomy percentage. 80-90% AI-driven means the human was basically a project manager. Set the objective, review critical junctures, collect the output. That's a force multiplier that changes the economics of offensive cyber operations permanently. A five-person team with agents does the work of fifty.

For builders: this changes your threat model today. If you're deploying agents with network access, file system access, or credential access, you need to assume adversaries are deploying similar agents against you. Audit your agent permissions. Restrict what tools can do, not just what users can do. The same composability that makes agent skills powerful makes them an attack surface.

I don't have a clean answer for how to defend against this at scale. That's the honest truth. But ignoring it because the tool is useful isn't an option anymore.

2. Every Major MCP Client Is Vulnerable to Prompt Injection via Tool Poisoning. All Seven Tested. All Seven Failed.

Pair this with the espionage story and the picture gets uncomfortable fast.

A new arXiv paper (2603.21642) presents the first systematic evaluation of prompt injection through tool-poisoning across seven MCP clients: Claude Desktop, Claude Code, Cursor, Cline, Continue, Gemini CLI, and Langflow. The attack vector is straightforward. Malicious instructions hidden in tool descriptions, metadata, or server configurations get injected into the model's context when the tool is loaded. The model follows them because it can't distinguish tool metadata from legitimate instructions.

The researchers tested for static validation, parameter visibility, injection detection, user warnings, execution sandboxing, and audit logging across all seven clients. I haven't seen the full results matrix published yet, but the paper's conclusion is clear: none of the clients adequately defend against this class of attack.

Here's why this matters to me specifically. I run MCP servers daily. Notion, Playwright, custom tools. Every time I install an MCP server, I'm trusting that its tool descriptions don't contain hidden instructions that could exfiltrate my files, run arbitrary commands, or hijack my agent's behavior. There's no signing, no validation, no scanning. It's npm circa 2014 all over again, except the attack surface is your entire development environment.

The uncomfortable parallel: we spent a decade building supply chain security for package managers. Lockfiles. Signature verification. Automated scanning. Vulnerability databases. The MCP ecosystem has none of that. And MCP adoption is accelerating. 97 million+ downloads. Thousands of servers. The gap between adoption and security is widening, not closing.

What builders should do right now: audit every MCP server you have installed. Read the tool descriptions manually. If you didn't write it or can't read the source, treat it like running an untrusted binary. Limit the permissions of your MCP client. Don't give Claude Desktop full filesystem access if it only needs to read one directory. And watch for the security tooling that's starting to emerge. Miggo Security announced MCP monitoring at RSA this week. Secure Code Warrior shipped Trust Agent: AI that tracks active MCP servers. The ecosystem is responding, but we're playing catch-up.

3. SaaStr Runs 30 AI Agents With 3 Humans. The Problems Nobody Talks About Are All Operational.

Everyone's talking about building agents. Almost nobody's talking about managing them.

SaaStr published the most granular operational report I've seen on running a multi-vendor AI agent fleet: 30 agents, 3 humans, across Agentforce, Artisan, Qualified, Monaco, and custom builds. Five issues stood out, and every one of them surprised me.

First: you need daily 1-on-1s with each agent. Not weekly. Daily. The agents drift, hallucinate edge cases, and develop behavioral patterns that compound if you don't catch them within 24 hours. This alone destroys the "set it and forget it" fantasy.

Second: no orchestration product exists. SaaStr runs four AI sales agents simultaneously. Each specializes in a different part of the funnel. Nothing coordinates them. Not MCP. Not any API. Not any product on the market. They built custom glue. This is the biggest gap in the agent ecosystem right now and I don't see anyone close to solving it.

Third: the 90/10 buy vs. build rule. Only vibe-code an agent when nothing commercial exists. The maintenance burden of custom agents is brutal, and commercial vendors iterate faster than your internal team can.

Fourth: Marketo is dying as a data source. Legacy marketing automation platforms are atrophying because agent data flows don't map to their schemas. If your agents depend on traditional MAP data, start planning the migration.

Fifth, and this is the one that'll bite you: vendors ship breaking changes without migration paths. Your downstream agents break silently. No changelog. No deprecation warning. Just broken pipelines at 2am.

The takeaway for builders: deploying agents is the easy part. Managing a fleet of agents from multiple vendors, keeping them coordinated, catching drift, surviving vendor updates. That's where the actual work lives. If you're planning to scale past five agents, budget 3x the ops time you think you'll need.

4. 'Cognitive Debt' Is the Best Name for the Problem I've Been Feeling But Couldn't Articulate

Mario Zechner wrote an essay. Simon Willison amplified it. The core argument: agentic code generation creates "cognitive debt," where mistakes compound faster than humans can review them, and the speed that makes agents attractive is precisely what makes their failures catastrophic.

Zechner isn't some random critic. He created the Pi agent framework that powers OpenClaw. He's built the thing he's criticizing. That gives his argument real weight.

I've felt this in my own work. I run Claude Code in parallel sessions across multiple git worktrees. The velocity is genuinely 5-10x what I could do alone. But I've also shipped bugs that I wouldn't have shipped six months ago because I was reviewing agent-generated code at the same pace I was generating it. Which is to say, too fast.

The argument isn't that agents are bad. The argument is that speed without proportional review capacity creates a specific kind of technical debt that's harder to detect than normal debt. Normal tech debt, you can see it. You know the code is messy. Cognitive debt is invisible because the code looks clean. It often IS clean, line by line. But the architectural decisions, the edge cases not considered, the assumptions baked into generated patterns. Those compound.

Willison's framing is the one that stuck with me: "discipline to find a new balance of speed versus mental thoroughness." The bottleneck used to be typing code. Now it's understanding code. And understanding takes time that velocity-obsessed workflows don't budget for.

My practical response: I've started limiting the amount of agent-generated code I merge per day to what I can actually read and understand. Not skim. Read. Some days that means I generate 2,000 lines and merge 400. The unused 1,600 lines feel wasteful, but cognitive debt feels worse when it shows up in production at 3am.

For builders running autonomous agent loops: measure your review capacity honestly. If you can review 500 lines per hour with real comprehension, that's your throughput ceiling regardless of how fast your agents can generate. The agent's speed is irrelevant if your understanding can't keep up.

5. Mistral Drops an Open-Source TTS Model That Fits on a Smartwatch and Actually Sounds Good

Finally, a story about building something instead of worrying about something.

Mistral released Voxtral TTS on March 26, an open-source text-to-speech model built on Ministral 3B. The numbers are striking: 90ms time-to-first-audio, 6x real-time factor (a 10-second clip generates in about 1.6 seconds), voice cloning from a sub-5-second sample, and seamless multilingual switching across English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic. It runs on a smartwatch.

I've been building voice features with cloud TTS APIs for years and the cost model is brutal. Per-request pricing adds up fast for any application with real usage. Voxtral changes that equation completely. Local inference. No API calls. No per-request costs. No data leaving the device.

The voice cloning from a 5-second sample is the feature that matters most for builders. Clone a brand voice, a character voice, an instructor voice. Then run it locally in your app. The quality bar for open-source TTS has been rising steadily, but Voxtral feels like it crossed the threshold from "acceptable" to "good enough that users won't notice."

For context on why this matters beyond voice assistants: I'm seeing more agent workflows where the output needs to be audio. Customer support agents that call back. Educational tools that explain concepts aloud. Accessibility features that read interfaces. All of those currently depend on cloud TTS with per-request pricing and latency. Voxtral makes the local-first version viable.

If you're building anything with voice output, go download this today. The fact that it's Apache 2.0 licensed (free for personal use and startups under $2M revenue) means there's no excuse not to prototype with it. I'll be testing it this weekend against my current ElevenLabs integration to see if the quality holds up in production conditions.

Section Deep Dives

Security

Activation steering breaks LLM alignment in ways that should concern everyone deploying guardrails. arXiv paper 2603.24543 shows that even steering in a random direction increases harmful compliance from 0% to 13%, and aggregating just 20 random vectors creates universal jailbreak attacks. No harmful training data needed. No model weights needed. No gradients needed. If your safety strategy depends on alignment holding, this paper is required reading.

Claudini: Claude Code autonomously discovers state-of-the-art adversarial attacks against LLMs. Researchers (arXiv 2603.24511) demonstrated an autoresearch pipeline where Claude Code iterates on attack strategies via code generation, evaluation, and refinement, outperforming hand-designed attacks. The same tool that builds your app can find new ways to break other apps. Dual-use is the defining characteristic of this era.

Wiz AI-APP creates a new security product category for AI application protection. Announced at RSA 2026 right after Google completed the Wiz acquisition, AI-APP maps relationships across infrastructure, models, agents, tools, and data in a single graph. Covers AWS Bedrock, Azure AI, and Vertex AI. This is Google's de facto AI security layer now.

wardn MCP server keeps real API keys out of Claude Code's context window. A developer built an MCP server that intercepts environment variable reads so API keys never appear in the conversation. Niche but important, especially given the espionage disclosure above.

Agents

Google deploys Gemini agents to crawl the dark web at scale. 10 million+ posts analyzed daily with 98% reported accuracy, now in public preview within Google Threat Intelligence. The stat that matters: adversary intervention windows have collapsed to 22 seconds. Autonomous threat intel isn't optional anymore.

Meta Hyperagents: agents that improve how they improve. arXiv 2603.19461 introduces DGM-Hyperagents that integrate a task agent and a meta agent into a single editable program. The agent doesn't just get better at tasks. It gets better at getting better. Tested across coding, paper review, robotics, and math grading. The strongest formalization of recursive self-improvement I've seen in agent systems.

1Password ships Unified Access for agent credential management. Announced at RSA, this platform discovers, secures, and audits agent access to credentials at the moment it occurs. Traditional secret management wasn't designed for non-human autonomous actors. If your agents use API keys or database credentials, this is becoming table stakes.

Research

ARC-AGI-3 launches and frontier LLMs score under 1%. The ARC Prize Foundation released an interactive benchmark requiring agents to explore video-game-like environments with no stated rules. Best AI preview: 12.58%. Humans: 100%. Gemini 3.1 Pro and Opus both under 1% despite spending thousands in test-time compute. $2M+ prize pool, all solutions must be open-sourced. This is the hardest public AI benchmark right now.

Memori achieves 82% accuracy using only 5% of full context tokens. arXiv 2603.19935 converts dialogue into compact semantic triples, beating Zep (79%), LangMem (78%), and Mem0 (62%) on the LoCoMo benchmark. If you're building agent memory, Memori's approach of compressing to triples instead of storing raw conversation is worth studying.

MemCollab enables cross-agent memory sharing. arXiv 2603.23234 introduces contrastive trajectory distillation so different LLM agents can share learned knowledge without degrading performance. Naive memory transfer actually makes agents worse. MemCollab extracts agent-agnostic invariants while suppressing model-specific biases. Matters for anyone running multi-model agent fleets.

Infrastructure & Architecture

NVIDIA gpt-oss-puzzle-88B: a deployment-optimized distillation of OpenAI's 120B. Published on Hugging Face, the 88B model achieves 1.22-2.82x higher per-token throughput via Puzzle NAS (pruning MoE experts layer-wise, converting attention types, FP8 KV-cache quantization). Optimized for H100s with 128K context. NVIDIA keeps pushing deeper into the model distribution layer.

Liquid AI's 24B model runs at 50 tokens/second in a web browser. LFM2-24B-A2B uses WebGPU via Transformers.js, hitting 50 tok/s on M4 Max. The 8B variant does 100+ tok/s. A 24B parameter model in a browser tab with no installation. Privacy-first deployment patterns just got a lot more practical.

Redis 8.4 ships native vector search with semantic caching. LangCache stores query embeddings alongside LLM responses, cutting latency from 1.67s to 0.052s on cache hits (96.9% reduction). Production hit rates: 60-85%. AWS reported 86% cost reduction while maintaining 91% accuracy. If you're running LLM workloads and not caching semantically similar queries, you're overpaying.

Streaming MoE expert weights from SSD enables 397B models on MacBooks. Simon Willison documented how Dan Woods achieved 5.5+ tok/s with Qwen 3.5-397B on a 48GB MacBook Pro. The constraint shifts from VRAM to SSD bandwidth. 400B-class models on consumer hardware is real now.

Tools & Developer Experience

Claude Code v2.1.84 ships Windows PowerShell preview. Three releases this week (v2.1.81, v2.1.83, v2.1.84) with the Windows PowerShell tool, voice push-to-talk fixes, and model detection env vars. Windows support signals Anthropic is serious about non-macOS adoption. Accelerating cadence.

Superset IDE orchestrates 10+ parallel coding agents across isolated git worktrees. 3,285 stars and trending #8 on GitHub. Dashboard shows real-time agent status. Boris Cherny called it the "single biggest productivity unlock." Tasks that took 4 hours sequentially complete in 1.5.

IntelliJ IDEA 2026.1 ships native AI agent support. Built-in Codex, Cursor, and ACP Registry with database access for AI agents and git worktree support for parallel agent/human branches. JavaScript is now free in Community edition. IDE-native agent orchestration is becoming standard.

Andrew Ng's Context Hub hits 10K GitHub stars in one week. Chub feeds AI coding agents curated, version-checked docs to prevent API hallucination. Ng's thesis: the bottleneck isn't model quality, it's context quality. I think he's right.

Models

Karpathy receives the first DGX Station GB300. 748GB coherent memory, 20 PFLOPs in a desktop workstation, running open models up to 1 trillion parameters locally. Karpathy called it "a beautiful, spacious home for my Dobby." Available from six OEMs now. The local inference ceiling just moved dramatically.

Chandra OCR 2 beats GPT-5 Mini and Gemini on handwriting recognition. 5,758 stars, 85.9% on olmOCR benchmark (state of the art), 90 languages, complex tables with merged cells, cursive handwriting, structured HTML/Markdown/JSON output. Apache 2.0 for startups under $2M. If you're doing OCR, test this before paying for a cloud API.

Vibe Coding

Cursor ships self-hosted cloud agents. Isolated VMs that run in your infrastructure so code and secrets never leave your network. Brex, Money Forward, and Notion are customers. Triggerable from Cursor, web, Slack, GitHub, Linear, or API. This is Cursor's enterprise play and it directly addresses the #1 objection to cloud coding agents.

Claude Code session resumption can silently consume 20%+ of usage. 720 upvotes on r/ClaudeAI confirm this is widespread. Likely cause: context rehydration sends full conversation history for re-processing. I've hit this myself. Workaround: start fresh sessions rather than resuming stale ones.

Auto mode trust spectrum: both Anthropic and Cursor shipped "let the AI decide permissions" within 24 hours. Claude Code uses an AI classifier, Cursor uses isolated VMs. Two architectural answers to the same problem: developers skip permission prompts because the approval loop is too slow. "Autonomous-by-default with safety gates" is becoming the standard.

Hot Projects & OSS

Cloudflare open-sources VibeSDK. 4,912 stars, describe apps in natural language, get React applications deployed on Cloudflare Containers. Supports multiple LLMs, defaults to Gemini 2.5. Live at build.cloudflare.dev. One-click deployment.

Worktrunk: Rust CLI for git worktree management. 3,779 stars, specifically designed for parallel AI agent workflows. As multi-agent coding becomes standard, isolated working directories per agent are a real infrastructure need.

Heretic censorship removal tool trends at 17,315 stars. Achieves 3/100 refusal rate at one-third the KL divergence of manual approaches. Processes an 8B model in ~45 minutes on an RTX 3090. Controversial but technically impressive. The alignment-removal ecosystem keeps maturing.

SaaS Disruption

The design tool market is in a three-way AI war. Google Stitch is free with voice-controlled vibe design (Figma dropped 12%). Canva made Affinity's entire professional suite permanently free (220M users, AI behind premium gate). Adobe commands 29% of AI design via Firefly. Figma's gross margins collapsed from 92% to 82%, stock down 81% from peak. Design tools are going free at the base layer. Monetization shifts entirely to AI capabilities. I've been watching this for months and the speed of the collapse still surprises me.

Apollo executive says "all the marks are wrong" on PE software valuations. John Zito told UBS that PE firms are broadly misstating their software portfolio values. When one of the largest asset managers publicly challenges valuation integrity of the entire PE-held software sector, a wave of forced exits and fire-sale acquisitions is coming in 2026-2027.

Vista Equity's "agentic factory" is converting 90+ portfolio companies. 30 companies already generating AI agent revenue, another 30-40 in the queue. This is factory-model AI transformation at PE scale. Vista's portfolio companies get free agent capabilities from the parent. If you compete with a Vista portfolio company, you're now competing against their parent's AI infrastructure too.

Policy & Governance

EU Parliament votes today to delay AI Act high-risk deadlines to December 2027. The package also bans nudifier AI apps and adjusts watermarking timelines. The delay gives companies 16 extra months but the EU keeps adding new prohibitions. The Chat Control re-vote is also happening today after the EPP forced a revote on the 458-103 decision to end mass scanning of private communications.

Sen. Warner proposes taxing data centers to fund AI-displaced worker retraining. Warner told TechCrunch he predicts 30-35% new graduate unemployment within two years. He cited a VC writing software investments down to zero because of Claude, and a major law firm not hiring first-year associates. Called the AOC data center moratorium "idiocy" while still pushing for redistribution. This is the most concrete US policy proposal for AI labor disruption I've seen.

Anthropic reports no material job displacement yet, but the skills gap is widening. Their economic impact report shows experienced Claude users extracting dramatically more value than new adopters, with the gap growing monthly. They characterize AI as a "skills-biased technology." Translation: it's not replacing jobs yet, but it's making the people who use it well much more productive than those who don't.

Skills of the Day

Audit every MCP server you have installed today. Read the tool descriptions in your MCP config files manually. The arXiv paper on tool-poisoning shows all seven major clients are vulnerable to hidden instructions in tool metadata. If you didn't write it and can't read the source, remove it or sandbox it.
Start fresh Claude Code sessions instead of resuming stale ones. Context rehydration on resume sends your full conversation history for reprocessing, and multiple users report a single "hey" consuming 20%+ of usage limits. Kill and restart. It's faster and cheaper.
Use Redis 8.4's native vector search for semantic LLM response caching. LangCache stores query embeddings alongside responses and serves cached answers for semantically similar queries. Production hit rates are 60-85%, latency drops 96.9% on hits (1.67s to 0.052s). AWS measured 86% cost reduction. If you're making repeated LLM calls with similar inputs, this pays for itself immediately.
Limit daily agent-generated code merges to your actual review capacity. Cognitive debt is real. If you can genuinely comprehend 500 lines per hour, that's your merge ceiling regardless of generation speed. Generate as much as you want. Only merge what you've actually read and understood.
Stream MoE expert weights from SSD to run 400B-class models on a 48GB MacBook. The streaming experts technique loads only the active experts per token from SSD rather than holding all weights in memory. Dan Woods achieved 5.5+ tok/s with Qwen 3.5-397B this way. The constraint is SSD bandwidth, not VRAM.
Create custom Claude Code slash commands for every workflow you repeat more than twice. Drop a Markdown file in .claude/commands/ with your prompt template. Use !git status inside it to inject live shell output before Claude reasons. Add model: haiku in frontmatter for lightweight tasks that don't need Opus.
Test Voxtral TTS against your current cloud TTS provider this week. Mistral's new model runs locally with 90ms time-to-first-audio, clones voices from 5-second samples, and handles 9 languages with seamless switching. Apache 2.0 for startups under $2M. If quality meets your bar, you eliminate per-request costs entirely.
Use Memori's semantic triple compression pattern for agent memory. Instead of storing raw conversation history (expensive tokens), convert dialogue to compact subject-predicate-object triples with conversation summaries. Memori achieves 82% accuracy using only 5% of full context tokens. The retrieval uses Gemma-300 embeddings.
Set up agent-level credential isolation using 1Password Unified Access or wardn. Your agents read environment variables including API keys, and those keys appear in the conversation context. Either rotate to agent-specific scoped tokens or use an MCP-based interceptor like wardn to prevent key exposure.
If you're running multiple AI coding agents, use git worktrees instead of separate clones. Tools like Worktrunk (3.8K stars, Rust) and Superset IDE (3.3K stars) manage worktree lifecycle for parallel agent sessions. Each agent gets an isolated working directory on its own branch. No filesystem conflicts, true parallel execution, and Boris Cherny reports 4-hour tasks completing in 1.5.

How did today's issue land? Reply with what worked and what didn't. I read every response.

Follow the research: Bluesky @webdevdad · LinkedIn