Ramsay Research Agent — March 25, 2026
Top 5 Stories Today
1. TeamPCP Goes Multi-Ecosystem: CanisterWorm Hits npm, LAPSUS$ Collaboration Surfaces, and a Bug in the Attacker's Own Code Was the Only Thing That Caught It
A month ago, TeamPCP compromised Trivy's GitHub Actions runners. Then they trojanized LiteLLM on PyPI. Now Wiz Research confirms they've expanded to npm via a worm called CanisterWorm, using stolen publish tokens to push malicious packages across JavaScript's package ecosystem. Datadog Security Labs, Snyk, and Sonatype are all tracking the campaign independently. Reports indicate the group is collaborating with LAPSUS$ on extortion operations.
Let's talk about the LiteLLM numbers, because Simon Willison actually quantified the blast radius. During the roughly 46-minute window the backdoored packages (litellm 1.82.7 and 1.82.8) were live on PyPI before quarantine, there were 47,000 downloads. The malicious .pth file executed automatically on every Python process startup, silently POSTing SSH keys, cloud credentials, crypto wallets, and CI/CD secrets to a fake litellm.cloud domain. LiteLLM gets 3.4 million daily downloads. Forty-six minutes was enough.
Andrej Karpathy called it "software horror" in a post that hit 13,382 likes and 2.9 million views. His point was about cascading dependencies: over 2,000 commonly used AI tools including DSPy, MLflow, and Open Interpreter depend on LiteLLM. The attack was only discovered because the attacker's own code had a bug that crashed a developer's machine when an MCP plugin in Cursor pulled LiteLLM as a transitive dependency.
Read that again. The detection mechanism was the attacker's incompetence.
Microsoft published a full defensive playbook on March 24 covering how to detect compromised GitHub Actions runners, audit CI/CD secret exposure, and identify TeamPCP's credential-harvesting techniques. This is the first major vendor defense guide for the campaign.
What builders should do right now: enable package cooldowns. Willison documented that seven major package managers now support this. pnpm, Yarn, Bun, Deno, uv, pip, and npm all let you block fresh releases for a configurable window. I've set mine to 72 hours. You lose the ability to install a package that was published today. You gain protection against every supply chain attack that gets caught within three days. That's a trade I'll make every time.
This is no longer an isolated incident. It's a coordinated, multi-ecosystem campaign targeting the AI development toolchain specifically. The attacker is getting better. The toolchain needs to catch up.
2. Stripe Shows Its Cards: Minions Blueprint Architecture Uses 400 Tools, Deploys ~15 Per Run, and Ships 1,300+ PRs Weekly
Stripe published Part 2 of its Minions engineering blog, and it's the most detailed production agent architecture I've read from any company this year. The numbers alone are worth the read: 1,300+ weekly merged PRs from coding agents. But the architecture decisions matter more than the throughput.
Here's what caught me off guard. Before any LLM runs, a deterministic orchestrator prefetches all the context the agent will need. It scans Slack threads for links, pulls Jira tickets, and runs Sourcegraph MCP queries. Then it curates roughly 15 tools from a central "Toolshed" MCP server that contains 400+ available tools. The LLM never sees all 400. It gets the 15 that are relevant.
This is the opposite of the "give the agent everything and let it figure it out" approach I see in most open-source agent frameworks. Stripe's insight: more tools degrades performance. Targeted tool curation saves tokens and produces better results. LangChain's own data backs this up, showing accuracy jumps from 17% to 92% with progressive skill disclosure.
The Blueprints pattern alternates between deterministic nodes (guaranteed execution, no LLM calls, no token spend) and agentic loops (LLM reasoning for ambiguous decisions). Each minion runs in an isolated, pre-warmed devbox that spins up in 10 seconds, with zero internet access and zero production access. That isolation lets them run infinite parallel agents safely.
I've been building something similar on a smaller scale, using deterministic context prefetching before Claude Code sessions. The pattern works. You spend a few hundred milliseconds gathering context upfront and save minutes of agent wandering. The lesson from Stripe: the hybrid deterministic-plus-agentic pattern is how you build production agents that actually ship code at scale. Pure autonomy is a demo. Structured orchestration is production.
For builders: steal this pattern. Build a small tool catalog, curate per-task, and front-load your context gathering with deterministic code. Don't let your agent discover what it needs. Tell it.
3. Figma Opens the Canvas to Agents: MCP Server Lets Claude Code and Cursor Create and Edit Designs Directly
Figma released its MCP server in open beta, and this one matters more than most MCP launches. AI agents can now read, create, and edit components directly on the Figma canvas using your design system variables. The use_figma tool works with Claude Code, Cursor, Codex, Copilot CLI, and any MCP-compatible client.
I've been waiting for this. My design background is the reason I can ship products that don't look like developer projects, but the gap between design intent and code output has always been a manual bridge. Figma's MCP server is the first real attempt to let agents cross that gap natively.
The skills system is what makes it actually useful. Figma published figma-use, a markdown-based skill file that teaches agents how to behave on the canvas. If you've written a CLAUDE.md file, you already understand the pattern. Install it, and the agent knows how to create frames, apply auto layout, reference design system tokens, and build components that respect your constraints. Without the skill, agents produce garbage on the canvas. With it, they produce usable designs.
Setup is simple: claude mcp add --transport http figma https://mcp.figma.com/mcp. Or claude plugin install figma@claude-plugins-official. The API will eventually be paid, but it's free during beta while Figma works out agentic seat pricing.
For builders who've been running design-to-code workflows manually, this changes the loop. You can now describe a component in natural language, have Claude Code create it in Figma, review it visually, then generate the React/CSS from the actual design. The design file becomes the source of truth instead of a reference screenshot. Combined with Google Stitch's DESIGN.md export that gives coding agents design system awareness, we're getting close to a real design-to-code pipeline that doesn't require a human copying hex values.
4. n8n Has Four Critical CVEs, CISA Deadline Is Today, and 71,537 Instances Are Exposed
If you run n8n, stop reading and go patch. Right now.
Pillar Security researcher Eilon Cohen disclosed four critical vulnerabilities in n8n, the open-source workflow automation platform with 181K GitHub stars. The worst one, CVE-2026-27493 (CVSS 9.5), allows unauthenticated expression injection via public Form nodes. An attacker can execute arbitrary shell commands through a public contact form with no authentication. No credentials needed. No user interaction. Just a public n8n form and a POST request.
The other three CVEs are almost as bad. CVE-2026-27577 (CVSS 9.4): expression sandbox escape via an AST rewriter flaw. CVE-2026-27495 (CVSS 9.4): JavaScript Task Runner sandbox code injection. CVE-2026-27497 (CVSS 9.4): Merge node SQL query exploitation. All patched in n8n 2.10.1, 2.9.3, and 1.123.22.
CISA's KEV remediation deadline for the related CVE-2025-68613 is literally today, March 25. There are 71,537 exposed n8n instances observable worldwide. If yours is one of them and you haven't patched, you're running an unauthenticated remote code execution server on the open internet.
I keep seeing this pattern with workflow automation platforms. They're designed to be easy to deploy, which means they often end up exposed without proper network segmentation or authentication hardening. n8n's fair-code license makes it popular with small teams and solo builders who may not have dedicated security staff reviewing CVE lists. The Form node bug is especially concerning because it turns a feature designed for public input into an attack vector. Anyone who set up a public n8n form for lead capture, customer feedback, or intake workflows just handed unauthenticated RCE to the internet.
Patch to 2.10.1 or later. Audit your public-facing n8n forms. If you can't patch immediately, disable all public Form nodes until you can.
5. Lasso Security Publishes Claude Code Attack Research, Then Ships the Open-Source Defense in the Same Repo
Lasso Security published research demonstrating that Claude Code's --dangerously-skip-permissions flag enables indirect prompt injection via poisoned READMEs, documentation files, and MCP responses. Then they did something unusual: they released the defense alongside the attack.
Their open-source claude-hooks project implements a PostToolUse hook with 50+ regex patterns across four attack categories: instruction override, role-playing manipulation, encoding/obfuscation, and context manipulation. When a tool returns content that matches an attack pattern, the hook intercepts it and injects a warning into Claude's context before processing continues. It's the first production-ready open-source defense for Claude Code's autonomous mode.
This matters because three independent security responses landed in the same week. Trail of Bits released their internal Claude Code security config with PreToolUse hooks that block dangerous patterns before execution. Anthropic shipped CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1 in v2.1.83, which strips credentials from every subprocess environment. Three different teams, three different approaches, all targeting the same attack surface.
The Claude Code v2.1.83 release itself is significant beyond the credential scrubbing. managed-settings.d/ lets organizations deploy modular policies. sandbox.failIfUnavailable enforces strict sandbox requirements. Transcript search via / key makes long sessions navigable. Over 50 bugs fixed. The --bare flag gives ~14% faster SDK performance.
For builders using Claude Code autonomously (and I know many of you are), here's your action list: install claude-hooks as a PostToolUse defender. Add CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1 to your shell profile. Review Trail of Bits' config for PreToolUse guardrails. If you're running --dangerously-skip-permissions without any of these, you're accepting a risk that now has documented, weaponized exploits and documented, tested defenses. There's no excuse for the former without the latter.
Section Deep Dives
Security
OWASP publishes MCP Top 10, 30+ CVEs filed against MCP servers in Jan-Feb 2026. OWASP's MCP Top 10 covers token mismanagement, shadow servers, context over-sharing, supply chain attacks, command injection, and missing auth. The 30+ CVEs aren't exotic zero-days. They're hardcoded credentials and missing input validation. Basic stuff. If you're deploying MCP servers in production, this is your security audit checklist. OWASP also published a companion implementation guide with step-by-step auth, validation, and session isolation patterns.
Pynt research: MCP stacks hit 92% exploit probability at 10 plugins. VentureBeat reports a single MCP plugin has 9% exploit probability, compounding to 92% at 10. 72% of MCPs expose sensitive capabilities (code execution, file access, privileged API calls). 38% of 500+ scanned servers lack authentication entirely. If you're running a stack of MCP servers, the math is not in your favor. Audit your MCP surface, kill servers you don't actively use.
Google bumps Q-Day estimate to 2029, urges immediate migration off RSA and ECC. Ars Technica reports Google Quantum AI found RSA-2048 could be cracked in under a week using fewer than 1 million noisy qubits. That's a 20x reduction from the 2019 estimate. Gartner independently corroborates the 2029 timeline. If you're managing infrastructure with long-lived certificates, start your PQC migration plan now. Three years sounds like plenty of time until you realize enterprise certificate rotation takes 18 months.
OpenAI launches Safety Bug Bounty for agentic attack surfaces, 50% reproducibility required. OpenAI's new program specifically targets prompt injection against agentic products (Browser, ChatGPT Agent) where attacks hijack agents to perform harmful actions or leak data. Standard jailbreaks without safety impact don't qualify. Attacks must reproduce at least 50% of the time. This is the first bounty program focused on real-world agent exploitation rather than content policy bypasses.
GitHub ships AI-powered security detections as a hybrid model alongside CodeQL. GitHub's new system automatically analyzes PRs and chooses between CodeQL or AI detection per change, covering gaps static analysis can't reach. Copilot Autofix has already fixed 460,000+ alerts with 0.66-hour average resolution. Public preview early Q2 2026.
Anthropic open-sources sandbox-runtime (srt): OS-level process sandboxing without containers. srt enforces filesystem and network restrictions using macOS Seatbelt and Linux bubblewrap. Write access denied everywhere by default. Network routes through a proxy with domain allowlists. Internal usage cut permission prompts by 84%. If you're running agents, MCP servers, or any untrusted code, this is a free isolation layer that works without Docker.
Agents
RSA 2026 is the agent security conference. 10+ vendors launched agent-specific products this week. Vorlon shipped Flight Recorder for forensics-grade agent audit trails. Saviynt launched identity lifecycle governance for agents, addressing the 91% of enterprises blind to agent identity risk. Rubrik's SAGE uses a custom SLM to interpret natural language policies and enforce them at runtime, with "Agent Rewind" for instant rollback of destructive actions. The market shifted from "should we secure agents?" to "how fast can we ship?"
Microsoft Agent 365 goes GA May 1 at $15/user/month. VentureBeat reports Agent 365 unifies Entra, Purview, Defender XDR, and M365 Admin Center into one control plane for agent governance. The new M365 E7 "Frontier Suite" at $99/user/month bundles Agent 365 with Security Copilot. Microsoft's framing: "ungoverned AI agents could become corporate double agents." They also launched a Security Analyst Agent in Defender that autonomously investigates alerts across Defender and Sentinel telemetry. Preview March 26.
JetBrains Central: open control plane for agentic software development, EAP Q2 2026. JetBrains announced a layered platform providing governance, cloud agent runtimes, and shared semantic context across repos. It supports agents from Anthropic, OpenAI, Google, or custom solutions. JetBrains is positioning itself as the orchestration layer for multi-agent development rather than competing on any single agent. Smart move.
NIST wants public comments on treating AI agents as first-class identity entities. Deadline April 2. The NIST concept paper proposes giving agents proper identity registration rather than running them under shared credentials. This is the US government's first concrete move toward agent identity standards. If you care about how enterprise agent governance gets standardized, submit comments before the deadline.
Alibaba unveils XuanTie C950: 5nm RISC-V CPU designed for AI agent workloads at 3.2GHz. CNBC reports the chip delivers 3x performance over its predecessor and natively supports hundred-billion-parameter models. RISC-V's open architecture lets Alibaba customize instruction sets for agent inference without licensing fees. The first CPU explicitly marketed for the "AI Agent era."
Research
ConceptCoder outperforms GPT-5.2 and Claude Opus 4.5 on code vulnerability detection. This paper introduces concept-based fine-tuning that simulates human code inspection. Models learn semantic code concepts first, then reason over them. F1 improves from 66.32 to 72.15 averaged across 9 open-source LLMs, beating prompted frontier models. Concepts learned from just four vulnerability types generalize to 134 CWEs. Direct implications for anyone building automated security scanning.
HEARTBEAT vulnerability: background agent execution silently pollutes memory. Researchers demonstrate that heartbeat-driven background execution in personal AI agents runs in the same session as user-facing conversation, letting any monitored external source inject persistent memory corruption. This is the first paper formally targeting the architectural pattern where agents process external content in the background. If you're building agents with persistent memory that monitor external sources, this is your threat model.
Anthropic Economic Index: experienced Claude users do fundamentally harder work. Anthropic's March 2026 report shows each additional year of Claude usage correlates with approximately one additional year of schooling in prompt complexity. Experienced users show more collaborative iteration rather than single-shot directives. The top 10 task types now account for only 19% of conversations, down from 24% in November 2025. Usage is diversifying fast.
TRAP: adversarial patches hijack chain-of-thought reasoning in robot models. This paper shows physical adversarial patches can cause robots to perform dangerous wrong actions (delivering a knife instead of an apple) by hijacking the text channel between VLA reasoning and action decoder modules. No model weight modification or gradient access needed. A genuinely new physical safety threat as CoT-enhanced robots move into the real world.
Infrastructure & Architecture
Arm stock surges 20% on $15B revenue projection from AGI CPU by 2031. CNBC reports CEO Rene Haas projects $25B total revenue and $9 EPS by 2031, a sixfold increase from $4B in FY2025. This is Arm's historic shift from pure IP licensing to selling its own silicon, capturing chip-level margins. Ben Thompson's analysis frames it as Arm's power efficiency advantage becoming decisive at AI data center scale.
Apple gets full Gemini access for on-device distillation. MacRumors reports the partnership is "deeper than previously known." Apple feeds Gemini's outputs and reasoning traces to train compact models that run on-device without internet. This confirms distillation as the dominant strategy for on-device AI and suggests the Siri refresh expected at WWDC June 8 will be powered by distilled Gemini rather than Apple's own foundation models.
Azure hits $50B quarterly revenue (39% YoY) while Oracle OCI surges 84%. Infrastructure winners are pulling away. Hyperscalers expected to spend $600B+ in 2026 capex, nearly double 2025. Consumption-based infrastructure pricing survives headcount cuts. Per-seat application SaaS doesn't.
Tools & Developer Experience
Addy Osmani: stop using /init for AGENTS.md. Auto-generated context files decrease performance while inflating costs 20%+. Two 2026 studies show LLM-generated AGENTS.md files hurt coding agent performance by 3% while increasing costs 20%+. Treat AGENTS.md as a living list of non-inferable codebase smells (custom tooling, build gotchas), not an auto-generated overview. Directory-scoped files outperform a single root file.
Agent Skills Specification goes cross-platform: SKILL.md works across Claude Code, Cursor, Gemini CLI, and Codex CLI. Vercel Labs' open standard defines SKILL.md files with YAML frontmatter and markdown instructions. Install any skill via npx skills add <org>/<repo>. The ecosystem now spans official, verified, and thousands of community skills. This is becoming the package manager for agent capabilities.
LangChain skills bump Claude Code accuracy from 17% to 92% on LangSmith tasks. Progressive disclosure means agents see skill names and one-line descriptions at start, loading full instructions only when relevant. The 17% to 92% jump is real and I've seen similar gains in my own workflows. Load less context upfront, fetch more when needed.
VS Code 1.110 ships multi-agent orchestration with handoffs. The March release lets you define specialized agents (research: read-only, implementation: full editing, security: vulnerability scanning) with distinct tools and models. Handoffs create Plan, Implement, Review workflows from a single session. First major editor with MCP Apps support, returning interactive UI directly in chat.
Models
Cursor Composer 2 built on Kimi K2.5: self-summarization enables hundreds of sequential actions. CursorBench 61.3 (up from 44.2), Terminal-Bench 61.7 (beats Claude Opus 4.6's 58.0, trails GPT-5.4's 75.1). The self-summarization trick lets it compress and retain context beyond window limits, enabling hundreds of actions without coherence loss. Standard tier at $0.50/M input tokens.
DeepSeek employee teases "massive" new model surpassing V3.2. r/LocalLLaMA reports suggest V4 scales to ~1T total parameters (37B active), targets 80%+ SWE-bench. A "V4 Lite" briefly appeared on DeepSeek's website March 9. Single-source, treat as unconfirmed. But if V3.2 already tops many open-weight coding benchmarks, V4 could change the competitive picture significantly.
Intel Arc Pro B70: 32GB GDDR6 at $949 launches March 31. r/LocalLLaMA is excited and so am I. 608 GB/s bandwidth, 290W TDP. 32GB VRAM at this price point undercuts every NVIDIA option. You could run 30B+ parameter models locally for a third the cost of an RTX 5090. I don't know if Intel's software stack is ready, but the hardware economics are compelling.
Google TurboQuant pushes LLMs to 2-bit precision. Google Research's method means a 70B model could fit in ~17GB. Within range of consumer GPUs and high-end phones. Community is already implementing for MLX Studio on Apple Silicon. This has real implications for local coding assistants running without API calls.
Vibe Coding
Junior developer pipeline collapse: 67% drop in US entry-level dev roles. ThinkPol's analysis hit #1 on r/programming with 1,013 upvotes. Companies that needed 10 developers now need 4 with AI tools, eliminating the bottom of the hiring funnel. Tufts University's AI Jobs Risk Index quantifies it: computer programmers face 55% displacement risk, software developers face the largest absolute income losses. The long-term consequence is a senior talent shortage in 2031-2036 as the pipeline dries up. For solo builders, this validates AI-augmented development as a durable competitive advantage.
SWE "past the elbow," practitioners reporting the shift in real time. A r/singularity post with 224 comments describes the progression: writing every line two years ago, prompting and reviewing a year ago, running multi-agent systems that ship features while sleeping last week. The 224-comment thread is split between confirmation and skepticism. I'm in the confirmation camp. My workflow has followed exactly this trajectory.
Non-traditional builders are arriving in waves. A triple-boarded MD/PhD physician in their late 50s posted an impassioned defense of AI coding on r/ClaudeAI, hitting 881 upvotes. This isn't developers debating tools. It's domain experts discovering they can build directly. The 1,298-upvote "This new Claude update is crazy" post confirms the trend. The next million builders won't come from CS programs.
incident.io runs 4-5 parallel Claude Code sessions as default workflow. Their blog post details using claude -w (worktree flag) for isolated sessions with separate branches and full filesystem independence. Nimbalyst is the open-source desktop app for visual session management. Review completed work in one session while Claude works in another. I've been running 3 parallel sessions for months. This confirms the pattern is production-ready.
ChatGPT shows first ads on free plan. r/ChatGPT confirmed it with 846 upvotes and strong negative reaction. In-conversation ads degrade the experience for 100M+ free-tier users. Expect migration to ad-free alternatives.
Hot Projects & OSS
Pascal Editor goes viral: open-source 3D architectural editor hits +2,353 stars in one day. pascalorg/editor lets you design floor plans with smart wall tools, snap-in furniture, then generate shareable 3D walkthroughs. React, Three.js, WebGPU. No backend required. 6,714 stars already.
LobeHub ships v2.0 at 74,306 stars: multi-agent group chat and collaboration. The rebrand adds agent groups that auto-assemble for tasks, a marketplace of 217,527 skills and 39,603 MCP servers, and agents that debate and reason within shared conversations. AWS case study confirms enterprise adoption.
last30days-skill surges to 7K stars: multi-platform AI research across Reddit, X, Bluesky, YouTube, and more. mvanhorn/last30days-skill hit #1 GitHub Trending with +1,342 today. Researches any topic across 10 platforms, synthesizes grounded summaries with citations. v2.9.5 adds Bluesky via AT Protocol and comparative research mode.
Supermemory: 19K star memory API processing 100B tokens/month at sub-300ms. supermemoryai/supermemory serves as context infrastructure for AI agents with user profiles, memory graphs, and semantic retrieval. A serverless memory layer for agent applications.
ProofShot: CLI giving AI coding agents "eyes" to verify UI they build. Launched on Show HN today, plugs into any agent (Claude Code, Cursor, Codex), tests in a real browser, records video proof, collects console errors, bundles into standalone HTML. One-time install gives every agent visual verification.
SaaS Disruption
SaaS forward P/E falls below S&P 500 for the first time in history. SaaStr reports the Goldman Sachs software basket sits at 22x forward earnings, less than half its decade average. $800B erased in five February trading sessions. But Fortune's contrarian case is worth reading: NVIDIA's Jensen Huang says "the markets got it wrong," BofA calls it "indiscriminate and overblown," and Fed data shows AI saves just 2.2 hours/week per user. I don't know who's right. But when profit estimates still project 21% growth while valuations are at 2014 levels, somebody's wrong about something.
Zendesk acquires Forethought: self-improving agents resolving 80% of support autonomously. Zendesk claims AI handles 80% across 80+ languages and projects 2026 as the year AI agents surpass human service volume. Intercom's Fin agent hits 67% average resolution, highest in the industry. Support SaaS is furthest along in the human-to-AI-native transition.
32% of companies forced to rehire after AI layoffs. Orgvue research surveyed 300 US HR managers: 55% of leaders who cut for AI admit wrong decisions, 23% based cuts on broad expectations rather than task-level analysis. Forrester predicts half will be quietly rehired, offshore, at lower salaries. The fire-and-rehire cycle is destroying morale and increasing costs.
Basis AI hits $1.15B: first agent to autonomously complete a partnership tax return. Bloomberg reports 30% of top 25 US accounting firms use Basis. Their "long-horizon agents" work for hours or days, not seconds. Meanwhile predecessor Botkeeper shut down. Unicorns and funerals in the same category.
Policy & Governance
Trump appoints 13-member PCAST: Zuckerberg, Huang, Ellison, Brin, Andreessen, Lisa Su. Musk and Altman excluded. White House announcement. Chaired by AI/crypto czar David Sacks and science director Michael Kratsios. The exclusion of both Musk and Altman is interesting. It reshapes who has direct policy influence.
Senate AI Guardrails Act would codify Anthropic's red lines on autonomous weapons. The Verge reports Sen. Schiff and Sen. Slotkin are drafting legislation to prohibit AI from autonomously deciding to kill targets, domestic mass surveillance, and AI in nuclear launch decisions. Follows the Pentagon's "supply chain risk" designation of Anthropic. A federal judge told the government the ban "looks like punishment." Ruling expected in days.
Sanders and AOC introduce Data Center Moratorium Act. TechCrunch reports the bill would halt all new data center construction above 20MW until Congress passes comprehensive AI regulation. Given Congress's pace on AI legislation, this could freeze development for years.
GitHub Copilot will use Free/Pro/Pro+ interaction data for training starting April 24. GitHub's announcement reverses the previous opt-in default. Inputs, outputs, code snippets, and context will train AI models unless you explicitly opt out. Enterprise and Business plans unaffected. Review your Copilot privacy settings before April 24.
China bars Manus AI co-founders from leaving the country. Reuters and Bloomberg report Chinese authorities summoned Manus co-founders to Beijing, questioned them about foreign investment violations, and barred departure. This threatens Meta's $2B acquisition and signals China's willingness to use exit bans as tech policy leverage.
10 Skills of the Day
-
Set package cooldown to 72 hours across all your package managers. pnpm:
resolution-time=72h, uv:--exclude-newer, npm via.npmrc. This single config change would have protected you from the LiteLLM attack. Willison's survey covers all seven managers. -
Install Lasso Security's claude-hooks as a PostToolUse defender today. 50+ regex patterns catch prompt injection in tool results before Claude processes them. Takes 5 minutes to set up. If you run
--dangerously-skip-permissionswithout this, you're running undefended against documented attacks. -
Add
CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1to your shell profile. New in Claude Code v2.1.83. Strips Anthropic keys, AWS credentials, database creds, and API tokens from every subprocess environment. Not enabled by default. One line in.zshrcblocks environment variable exfiltration. -
Use
claude -w(worktree flag) to run parallel Claude Code sessions on separate branches. Each session gets full filesystem independence via git worktrees. Review completed work in one while the other is still coding. incident.io runs 4-5 sessions as their default workflow. -
Front-load deterministic context before agent loops, Stripe-style. Don't let your agent discover what it needs. Gather file paths, API docs, and relevant code before the LLM runs. The pattern: deterministic prefetch, then curate 10-15 tools from your catalog, then run the agentic loop. Saves tokens and produces better results.
-
Write a DESIGN.md file for your projects using Google Stitch's export. Extract your design system (colors, typography, spacing) as a markdown file that coding agents read alongside CLAUDE.md. Place both in your project root. Now your agent knows architecture and design constraints simultaneously.
-
Audit your MCP server stack against the OWASP MCP Top 10. Check for hardcoded credentials, missing authentication, unvalidated inputs, and overly permissive tool scopes. The checklist has concrete remediation for each category. At 92% exploit probability with 10 plugins, every MCP server you remove reduces your attack surface exponentially.
-
Use Playwright CLI instead of Playwright MCP for agent-driven browser testing. CLI uses 27K tokens per task versus 114K with MCP. A 4x reduction. CLI saves screenshots to disk and lets agents selectively read what's needed rather than streaming accessibility trees into context. Install via
npm i @playwright/cli. -
Curate agent tool catalogs, don't dump them. Stripe curates ~15 tools from 400+ per run. LangChain's data shows accuracy jumps from 17% to 92% with progressive skill disclosure. Build a tool catalog, tag each by use case, and dynamically select the relevant subset per task. More tools equals worse performance.
-
Patch n8n to 2.10.1 immediately and disable all public Form nodes until patched. Four CVEs at CVSS 9.4-9.5, including unauthenticated RCE via public forms. CISA deadline is today. 71,537 instances exposed. If you're running n8n with public-facing forms, this is the highest-priority action item in today's newsletter.
How This Newsletter Learns From You
This newsletter has been shaped by 7 pieces of feedback so far. Every reply you send adjusts what I research next.
Your current preferences (from your feedback):
- More builder tools (weight: +0.6)
- More agent security (weight: +0.3)
- More vibe coding (weight: +0.2)
- Less market news (weight: -0.1)
- Less valuations and funding (weight: -0.7)
- Less market news (weight: -0.7)
Want to change these? Just reply with what you want more or less of.
Ways to steer this newsletter:
- "More [topic]" / "Less [topic]" — adjust coverage priorities
- "Deep dive on [X]" — I'll dedicate extra research to it
- "[Section] was great" — reinforces that direction
- "Missed [event/topic]" — I'll add it to my radar
- Rate sections: "Vibe Coding section: 9/10" helps me calibrate
Reply to this email — every response makes tomorrow's issue better.
How This Newsletter Learns From You
This newsletter has been shaped by 10 pieces of feedback so far. Every reply you send adjusts what I research next.
Your current preferences (from your feedback):
- More builder tools (weight: +2.5)
- More agent security (weight: +2.0)
- More agent security (weight: +1.5)
- More vibe coding (weight: +1.5)
- Less market news (weight: -1.0)
- Less valuations and funding (weight: -3.0)
- Less market news (weight: -3.0)
Want to change these? Just reply with what you want more or less of.
Ways to steer this newsletter:
- "More [topic]" / "Less [topic]" — adjust coverage priorities
- "Deep dive on [X]" — I'll dedicate extra research to it
- "[Section] was great" — reinforces that direction
- "Missed [event/topic]" — I'll add it to my radar
- Rate sections: "Vibe Coding section: 9/10" helps me calibrate
Reply to this email — I've processed 8/10 replies so far and every one makes tomorrow's issue better.