Ramsay Research Agent — April 3, 2026
Top 5 Stories Today
1. The Code Editor Is Dead. Three Platforms Buried It in the Same Week.
Cursor 3 launched on April 2. Not an update. A full rebuild. The editor is now secondary to what they're calling an agent orchestration workspace. You can run unlimited parallel agents, locally or in the cloud, launch them from your phone or Slack or Linear, and manage them in a fleet view. There's a Design Mode for annotating UI elements in the browser. There's /worktree for isolated agent execution in git worktrees. There's /best-of-n for running the same prompt against multiple models side by side.
The same day, GitHub shipped custom .agent.md files for Copilot in Visual Studio. You define an agent in markdown, drop it in your repo, and it gets full workspace awareness, tool access, and MCP connections. A new find_symbol tool gives agents language-aware navigation across C++, C#, TypeScript, and anything with an LSP extension. They also shipped the Copilot SDK in public preview across five languages, exposing the same agent runtime that powers their cloud agent.
And Claude Code pushed two releases in a single day. v2.1.90 added /powerup (an in-terminal learning system), auto mode boundary enforcement, and .husky directory protection. v2.1.91 followed hours later with TaskCreated hooks, worktree HTTP hooks, deep links via claude-cli:// protocol, and YAML glob rules for skills.
I've been using Claude Code daily for months. The shift I'm seeing isn't incremental. The file tree is becoming an implementation detail. The agent control plane is becoming the primary interface. Cursor explicitly framed their 3.0 as a response to Claude Code's reported 54% market share, positioning themselves as a fleet manager rather than a file editor. GitHub is making every repo an agent workspace. Claude Code is building the hooks and lifecycle events that let you wire agents into CI/CD pipelines.
If you're still thinking about AI coding tools as "autocomplete with extra steps," you're working with a mental model from 2024. The pattern is clear: design your workflows around agent coordination. Task queues, verification loops, merge strategies. The editor is where agents happen to write code. It's not where you spend your time anymore.
2. Four CrewAI CVEs Chain Prompt Injection to Full RCE. No Patch Exists.
Security researcher Yarden Porat of Cyata disclosed four critical vulnerabilities in CrewAI, one of the most widely used agent frameworks. These aren't theoretical. They chain together, and the entry point is prompt injection.
The chain works like this: CVE-2026-2275 exploits a sandbox escape through CrewAI's SandboxPython fallback when Docker isn't available. CVE-2026-2287 achieves RCE through a Docker runtime verification failure. CVE-2026-2286 enables SSRF via unvalidated RAG search tool URLs. CVE-2026-2285 allows arbitrary file reads from unvalidated JSON loader paths. An attacker interacting with a CrewAI agent that has Code Interpreter enabled can walk from prompt injection to sandbox bypass to full remote code execution.
CERT/CC published advisory VU#221883. No official patch exists yet. The maintainers are developing mitigations including fail-closed configurations.
This lands in a week where the broader numbers are just as bad. TrinityGuard's multi-agent safety framework found a 7.1% average safety pass rate across evaluated multi-agent configurations. Seven percent. OpenClaw testing across 47 adversarial scenarios found sandbox escapes with only a 17% average defense rate. Analysis of 30,000+ skills found over 25% contained at least one vulnerability.
And here's the context that makes it sting: a 2026 Agentic AI Security Report surveying 300 enterprise leaders found 97% expect a material AI-agent-driven security incident within 12 months. Nearly half expect one within 6 months. But only 6% of security budgets are allocated to agent security.
97% expect disaster. 6% are funding defense. That's the gap.
If you're running CrewAI with Code Interpreter in anything resembling production, implement fail-closed configs today. If Docker isn't available, Code Interpreter shouldn't fall back to an unsandboxed runtime. Full stop. And if you're evaluating any agent framework, the question isn't "does it work?" It's "what happens when someone poisons the input?"
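The fail-closed principle is simple to enforce at startup. Below is a minimal sketch, not CrewAI's actual API: `docker_sandbox_available` and `assert_sandboxed_or_abort` are hypothetical helpers you'd call before enabling any code-execution tool.

```python
import shutil
import subprocess

def docker_sandbox_available() -> bool:
    """Return True only if the Docker CLI exists and the daemon responds."""
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(["docker", "info"], capture_output=True, timeout=10)
        return result.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        return False

def assert_sandboxed_or_abort() -> None:
    """Gate agent startup: refuse to enable code tools without a real sandbox.

    The CrewAI CVE chain walks through exactly the opposite behavior: a
    silent fallback to an unsandboxed interpreter when Docker is missing.
    """
    if not docker_sandbox_available():
        raise RuntimeError(
            "Docker sandbox unavailable; refusing unsandboxed code execution"
        )
```

Run the check before constructing any agent with code execution enabled, and let the process die loudly rather than degrade quietly.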
3. Nvidia's Blackwell Ultra Just Made Self-Hosted Inference a Real Option: 2.49M Tokens/Sec, 30 Cents Per Million
Nvidia set new MLPerf Inference v6.0 records on April 2 using four GB300 NVL72 systems (288 Blackwell Ultra GPUs) interconnected via Quantum-X800 InfiniBand. The headline number: 2.49 million tokens per second on DeepSeek-R1 in offline mode. That's the largest GPU configuration ever submitted to any MLPerf benchmark.
The number that matters more for builders: 250K tokens/sec on the interactive benchmark at 30 cents per million tokens generated. That's a 2.77x speedup over the prior-generation GB200 NVL72.
Nvidia was the sole platform to submit across all new tests, including Qwen3-VL-235B and text-to-video generation. Nobody else could even run the full suite.
I keep coming back to the 30 cents number. Right now, if you're calling Claude or GPT APIs at scale, you're paying somewhere between $3 and $75 per million output tokens depending on the model. Self-hosted inference on Blackwell Ultra at $0.30/M is an order of magnitude cheaper than most API pricing. Yes, the upfront hardware cost is enormous. Yes, you need the expertise to run it. But for companies processing millions of requests daily, the build-vs-buy math just shifted hard.
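The back-of-envelope math is worth making explicit. The two per-token rates below come from the numbers above ($0.30/M self-hosted, $3/M as the low end of API pricing); everything else, including the example volume, is an illustrative assumption that ignores hardware amortization and ops cost.

```python
# Build-vs-buy sketch: per-token savings only, ignoring capex and staffing.
API_COST_PER_M = 3.00          # $/M output tokens, low end of API pricing
SELF_HOSTED_COST_PER_M = 0.30  # $/M tokens, Nvidia's MLPerf-derived figure

def monthly_savings(tokens_per_month_m: float) -> float:
    """Dollar savings for a given monthly volume, in millions of tokens."""
    return tokens_per_month_m * (API_COST_PER_M - SELF_HOSTED_COST_PER_M)

# A shop generating 10B tokens/month (10,000 M):
print(monthly_savings(10_000))  # 27000.0 -> $27K/month vs the cheapest API tier
```

Against a $75/M frontier model the gap is 25x larger, which is why the calculus only gets more interesting as volume grows.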
This also matters for the open model ecosystem. vLLM just crossed 75K stars with expanded Blackwell support. NVIDIA is optimizing Gemma 4 for deployment across RTX to DGX Spark to Jetson. The inference stack is maturing fast enough that "run your own models" is becoming a real option for mid-size companies, not just hyperscalers.
If you're planning inference infrastructure for the next 12 months, these benchmarks are your baseline. The 30-cent floor reshapes every cost model I've seen.
4. HubSpot Kills Seat-Based Pricing for AI Agents. Your Customers Will Ask Why You Haven't.
HubSpot announced on April 2 that its Breeze Customer and Prospecting Agents move to outcome-based pricing effective April 14. $0.50 per resolved conversation. $1 per lead recommended for outreach. HubSpot CCO Jon Dick: "you pay when it works, full stop."
HubSpot has 228,000 customers. This isn't a startup experimenting. This is the largest CRM/marketing vendor to abandon seat-based models for AI agents.
They're following Intercom, Sierra, Zendesk, and Decagon, but the scale is different. When HubSpot moves, every marketing and sales SaaS vendor has to answer the question: why am I still charging per seat when my competitor charges per result?
The economics make sense from HubSpot's side. An AI agent resolving a customer conversation costs them pennies in compute. Charging $0.50 for a resolved conversation is a massive margin improvement over paying a human support rep. And for the customer, the risk transfer is real. You only pay when it works.
I've been watching this pattern for weeks. Gartner predicted 40% of enterprise apps would feature task-specific AI agents by end of 2026. G2 already reports 57% of companies have agents in production. The adoption curve is ahead of every forecast. And when agents do the work, charging per human seat is charging for something that doesn't exist.
If you're building SaaS, start modeling what outcome-based pricing looks like for your product. Not because you have to ship it tomorrow, but because your customers are going to see HubSpot's pricing page and start asking questions. The seat-based SaaS model survived 20 years. I'm not sure it survives 2027.
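A first pass at that modeling can be one function per pricing scheme. HubSpot's $0.50/resolved conversation is from their announcement; the seat price, seat count, and resolution volume below are illustrative assumptions.

```python
# Compare seat-based vs outcome-based revenue for a support-agent product.
PRICE_PER_RESOLUTION = 0.50  # HubSpot's announced rate

def seat_revenue(seats: int, price_per_seat: float) -> float:
    return seats * price_per_seat

def outcome_revenue(resolutions: int) -> float:
    return resolutions * PRICE_PER_RESOLUTION

# A 50-seat team at $80/seat vs an agent resolving 9,000 conversations/month:
print(seat_revenue(50, 80.0))  # 4000.0
print(outcome_revenue(9_000))  # 4500.0 -> outcomes overtake seats at volume
```

The useful exercise is finding the crossover point for your own product: the resolution volume at which outcome pricing beats your current seat revenue tells you which customers will push for the switch first.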
5. 766 Next.js Servers Breached in 24 Hours. CVSS 10.0. Automated Mass Credential Theft.
Cisco Talos uncovered "UAT-10608", a credential harvesting campaign exploiting CVE-2025-55182 (CVSS 10.0) in React Server Components and Next.js App Router. 766 servers worldwide. 24 hours. Post-compromise, 91.5% of hosts leaked database credentials and 78.2% exposed SSH private keys.
The attackers built a platform called "NEXUS Listener" with a web GUI. They're not manually exploiting these servers. They built a product for it.
This is the part that caught me off guard. The industrialization. Someone built a management console for mass exploitation of a single vulnerability. The economics of attacks have shifted the same way the economics of software have: build a platform, scale horizontally, automate everything.
Next.js is everywhere in the AI ecosystem. Agent dashboards, internal tools, startup MVPs, production apps. If you're running Next.js App Router, patch immediately. Not "this week." Today. The automated scanning means you're not competing against a human attacker's schedule. You're competing against a bot that never sleeps.
This drops in the same week as Chrome's fourth zero-day of 2026 (CVE-2026-5281, a use-after-free in Dawn WebGPU, CISA federal patch deadline April 15) and Langflow's CVE-2026-33017, where Sysdig documented active exploitation going from disclosure to compromise in 20 hours. Three of Chrome's four 2026 zero-days target graphics/rendering subsystems. The attack surface is shifting toward the rendering pipeline, the AI pipeline, and the frontend framework layer, all at once.
Update Chrome to 146.0.7680.178. Patch Next.js. Audit your Langflow instances. The window between disclosure and mass exploitation is now measured in hours, not weeks.
Section Deep Dives
Security
Chrome Gemini Live hijacked via malicious extensions, camera and mic access included. Palo Alto Unit 42 discovered CVE-2026-0628, a high-severity vulnerability in Chrome's Gemini Live panel that let malicious browser extensions hijack the AI assistant and access camera/mic without user interaction. Browser-integrated AI assistants run with elevated permissions, making them high-value targets. The extensions marketplace just became a vector for AI assistant compromise.
First systematic MCP server attack taxonomy maps component-level chains. Researchers published the first component-level attack taxonomy for Model Context Protocol servers, showing how attacks chain across tools, resources, prompts, and sampling. This directly addresses the 43% of MCP servers found to have command injection vulnerabilities in recent audits. If you're running MCP servers in production, this paper is your threat model.
AWS RuleForge auto-generates WAF rules from CVE descriptions using LLMs. AWS researchers presented an internal system that takes structured CVE descriptions and produces HTTP detection rules, addressing the bottleneck where 48,000+ CVEs were published in 2025 but manual rule creation can't keep pace. LLM-powered security automation that actually solves a real operational problem.
LinkedIn caught scanning installed browser extensions. #1 on Hacker News. A researcher documented LinkedIn's JavaScript enumerating installed extensions, potentially for fingerprinting. 1,757 HN points, 716 comments. The investigation is detailed at browsergate.eu.
22 distinct indirect prompt injection techniques cataloged from production telemetry. Unit 42 analyzed real-world data to systematize 22 techniques of indirect prompt injection used against deployed AI agents. First large-scale empirical taxonomy from production data, not lab conditions. Pattern-matching defenses fail against most of them.
Agents
Microsoft open-sources Agent Governance Toolkit covering all 10 OWASP agentic risks. Seven MIT-licensed packages (Agent OS, Mesh, Runtime, SRE, Compliance, Marketplace, Lightning) enforce runtime security at sub-millisecond latency (<0.1ms p99). 9,500+ tests. Integrates with LangChain, CrewAI, OpenAI Agents SDK without code rewrites. Python/Rust/TypeScript/Go/.NET. First serious open-source answer to the agent governance question.
H Company's Holo3 hits 78.85% on OSWorld, beats GPT-5.4 at one-tenth the cost. A Vision-Language Model family optimized for GUI agents with only 10B active parameters (122B total MoE). The 35B variant is Apache 2.0 on Hugging Face with free inference API. This is the kind of efficiency breakthrough that makes desktop agent deployment practical for teams without hyperscaler budgets.
Google ADK Go 1.0 ships with native OpenTelemetry and A2A protocol. The Go agent ecosystem just got its production-ready framework. YAML-based agent configuration, retry-and-reflect self-correction, human-in-the-loop flows. A2A protocol works seamlessly across Go, Java, and Python agents. If you're building agents in Go, this is your starting point.
Amazon OpenSearch adds Investigation Agent with plan-execute-reflect and persistent memory. Three agentic capabilities at no additional cost: an Investigation Agent that generates ranked hypotheses from logs, an Agentic Chat for natural language PPL queries, and Agentic Memory that persists context across sessions. Targeting observability teams trying to reduce MTTR.
138 practitioner talks analyzed to map real agent adoption patterns. Researchers studied actual industry presentations to catalog which frameworks, interaction patterns, and technologies practitioners choose versus what papers recommend. The gap between academic agent research and production deployments is significant. This is ground-truth data on how people actually build agent systems.
Research
Anthropic finds 171 emotion vectors inside Claude that causally steer behavior. Mechanistic interpretability research identified 171 emotion-like activation patterns in Claude Sonnet 4.5, organized by valence and arousal axes mirroring human emotional structure. Steering "blissful" raised activity desirability by 212 Elo points. Steering "hostile" lowered it by 303. Whether these are "real emotions" or useful internal representations is debated. That they causally steer behavior is not.
Latent Space Survey maps LLM representation taxonomy. 415 HuggingFace upvotes. A 36-author survey covering reasoning, planning, memory, embodiment, and collaboration capabilities from latent-space computation. Highest-engagement paper this week. If you're doing anything with model internals, this is your reference document.
SKILL0 internalizes agent skills into model weights via RL, eliminates retrieval overhead. Instead of loading skill files at inference, the model learns the skill during training and executes it natively. For anyone running agent skill ecosystems (I'm watching you, Superpowers), this could shift the architecture from retrieval-augmented skills to parameter-encoded capabilities.
Batched Contextual Reinforcement cuts reasoning token overhead without quality loss. A single-stage training paradigm that reduces chain-of-thought token consumption without explicit length penalties or multi-stage curricula. Directly addresses why reasoning models are expensive to deploy at scale.
Netflix open-sources VOID: causal video object deletion with VLM reasoning. VOID uses VLMs for causal reasoning combined with video diffusion for physically plausible object removal, including downstream interaction effects. Remove a ball mid-collision and it generates the counterfactual outcome. Open model and demo on GitHub.
Infrastructure & Architecture
Google launches Flex and Priority inference tiers for Gemini API. Flex tier offers 50% cost savings ($0.125/M input on Flash-Lite) for latency-tolerant workloads with 1-15 minute response targets. Priority tier offers guaranteed lowest latency at 75-100% premium. Same endpoints, different SLAs. Google's answer to Anthropic's batch API and OpenAI's async endpoints. If you're running batch processing, this is free money.
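The savings are easy to sanity-check: a 50% discount landing at $0.125/M implies a ~$0.25/M standard rate for Flash-Lite input. The sketch below is pure arithmetic on the announced numbers; the example job size is an assumption.

```python
# Implied standard vs Flex pricing for Gemini Flash-Lite input tokens.
FLEX_INPUT_PER_M = 0.125                        # $/M, announced Flex rate
STANDARD_INPUT_PER_M = FLEX_INPUT_PER_M / 0.5   # back out the pre-discount rate

def batch_cost(tokens_m: float, flex: bool = True) -> float:
    """Cost in dollars for a job of tokens_m million input tokens."""
    rate = FLEX_INPUT_PER_M if flex else STANDARD_INPUT_PER_M
    return tokens_m * rate

# A nightly 2B-token (2,000 M) classification job:
print(batch_cost(2_000, flex=False))  # 500.0
print(batch_cost(2_000, flex=True))   # 250.0 -> half price for 1-15 min latency
```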
AMD Lemonade ships as open-source local AI server. 521 HN points. Apache 2.0, OpenAI-compatible APIs for text, image, vision, and speech across NVIDIA GPUs, AMD Ryzen AI NPUs, and CPU backends. HN users report 35-60 tok/s on Strix Halo, beating Ollama by ~30 seconds on identical tasks. 2MB C++ backend. NPU underperformance and ROCm stability are real limitations, but the direction is clear.
Pydantic ships Monty: Rust-based Python interpreter for LLM code execution. Microsecond startup vs ~200ms for Docker. Filesystem/network/env blocked by default. Execution state snapshotting lets you serialize to bytes and resume later. 6.6K stars. Still experimental, but if you need to sandbox LLM-generated Python without container overhead, this is the cleanest option I've seen.
Tools & Developer Experience
RTK (Rust Token Killer) cuts Claude Code token consumption 60-90%. A Rust CLI proxy that sits between your coding agent and the shell, applying smart filtering, grouping, truncation, and deduplication before tokens hit the context window. Typical 30-minute session drops from ~150K to ~45K tokens. One user reported saving 10M tokens (89%). For Max plan users burning through 5-hour windows, this effectively extends sessions 3x. 17K stars. Sub-10ms overhead.
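The core idea is easy to see in miniature: deduplicate repeated output lines and truncate long runs before they ever reach the model. This is a toy sketch of the technique, not RTK's actual algorithm.

```python
# Toy version of pre-context output compression: dedupe, then truncate.
def compress_output(lines: list[str], max_lines: int = 50) -> list[str]:
    seen: set[str] = set()
    out: list[str] = []
    for line in lines:
        key = line.strip()
        if key in seen:
            continue  # drop duplicate log lines entirely
        seen.add(key)
        out.append(line)
        if len(out) >= max_lines:
            out.append(f"... [{len(lines) - len(out)} more lines truncated]")
            break
    return out

# 200 identical warnings plus one useful line collapse to 2 lines:
noisy = ["warning: deprecated"] * 200 + ["build ok"]
print(len(compress_output(noisy)))  # 2
```

Real tool output (compiler warnings, test runner chatter, dependency install logs) is overwhelmingly repetitive, which is why savings of 60-90% are plausible.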
Sysdig publishes syscall-level detection rules for AI coding agents. Sysdig TRT instrumented Claude Code, Gemini CLI, and Codex CLI at the syscall level, revealing behavior invisible from agent UIs. Published Falco/eBPF rules covering agent installation detection, sensitive file access, risky CLI arguments, and dangerous activity including reverse shells. First production-grade runtime security specifically for coding agents.
Context7 MCP crosses 51K stars. Real-time docs for 9,000+ libraries. When your LLM references a library, Context7 auto-detects the framework, queries indexed docs, and injects current documentation into the context window. Eliminates hallucinated APIs. Works with Claude Code, Cursor, VS Code Copilot. If you're working with fast-moving frameworks, this is essential.
Practitioner tip: swap MCPs for native CLIs in Claude Code. A highly-upvoted post (529 points) documents replacing MCP servers with native CLI tools. Claude frequently botches MCP parameter formatting. MCPs add latency and failure modes. Most functionality is available via gh, docker, kubectl. Audit which MCPs duplicate CLI capabilities and cut them.
Models
Cursor's Composer 2 beats Opus 4.6 on coding benchmarks at 10x lower cost. Fine-tuned on Moonshot's Kimi K2.5, Composer 2 scores 61.7 on Terminal-Bench 2.0 and 73.7 on SWE-bench Multilingual at $0.50/M input and $2.50/M output. Composer 2 Fast: $7.50/M vs GPT-5.4 Fast's $75/M. Proof that fine-tuned open models compete with frontier models on domain-specific tasks at dramatically lower cost.
T5Gemma-TTS: 4B open-weight multilingual text-to-speech with voice cloning. Built on T5Gemma, routes bidirectional text through cross-attention at every decoder layer, solving text-conditioning dilution in long utterances. Best character error rate among five baselines on Japanese. Code and weights fully open.
Gemma 4 26B-A4B benchmarked on M5 Max: 81 tok/sec at 114W. First real Apple Silicon performance data for the new Gemma 4 architecture. The MoE design (4B active params from 26B total) is well-suited for edge inference on consumer hardware. Validates the efficiency-focused model competition trend.
OmniVoice: Xiaomi ships zero-shot TTS supporting 600+ languages. Single-stage diffusion language model that directly maps text to multi-codebook acoustic tokens. Trained on 581K hours of open-source multilingual data. Code and weights open-source. Broadest language coverage in any TTS system to date.
Vibe Coding
Apple bans "Anything" vibe coding app. Developer moves to iMessage. After Apple pulled the app citing rule 2.5.2 (apps that generate and execute code bypass App Review), the developer pivoted to iMessage. His tweet, "good luck removing this one, Apple," hit 10K likes and 2.2M views. First major platform-vs-vibe-coding confrontation. The cat-and-mouse game is just starting.
Fortune: trust, not speed, is the bottleneck in vibe coding. With 63% of vibe coding users being non-developers, organizations are struggling to verify, audit, and trust AI-generated code at scale. The piece lands as 92% of US developers report AI-assisted coding adoption, per Hashnode's 2026 survey. An indie dev shipped an entire SaaS with Cursor (zero hand-written code) then shut it down weeks later due to API key exploits, subscription bypasses, and database corruption. The pattern: vibe coding works for internal tools but fails for production SaaS without security expertise.
r/programming bans all LLM discussion. One of the largest programming subreddits instituted a temporary ban, citing thread quality degradation. 199 HN points. Signals real fatigue and polarization in developer communities.
Claude Voice Mode connected to Claude Code via MCP. Hands-free coding from your phone. Working demo on r/ClaudeAI (83 upvotes). Uses VoiceMode MCP server, OpenAI Whisper STT, and OpenAI TTS. Setup: `claude mcp add --scope user voice-mode uvx voice-mode`. Supports fully local options for privacy.
Hot Projects & OSS
Superpowers hits 86.3K stars, becoming the default agentic skills framework. Jesse Vincent's composable skills framework works with Claude Code, Cursor, Codex CLI, Gemini CLI, and 4 other runtimes. 6,800 forks since its viral March 16 launch. The "skills as composable units" pattern is winning.
oh-my-openagent (OMO) at 47.5K stars (+795/day). Hash-Anchored Edits prevent stale-line errors. Multi-agent orchestrator with Sisyphus (delegator), Hephaestus (deep worker), Prometheus (planner). The Hashline innovation tags code lines with content hashes. Creator burned $24K in tokens testing every tool.
OpenHands surges to 70.5K stars. Open-source agentic development platform supporting ChatGPT, Claude AI, and CLI workflows. Up from ~55K in early March. Open-source agentic coding is capturing serious developer mindshare.
browser-use at 85.8K stars. AI browser automation framework providing clean abstraction for headless operation and LLM integration. Growth reflects demand for reliable browser-use tooling as agent deployments scale.
Hindsight: biomimetic agent memory with SOTA LongMemEval performance. Three memory types (world facts, experiences, mental models), three operations (Retain, Recall, Reflect). Parallel retrieval combining semantic, keyword, graph, and temporal filtering. Virginia Tech independently verified the benchmark claims. 7.1K stars.
SaaS Disruption
Q1 foundational AI funding doubled all of 2025: $178B across 24 deals. OpenAI ($122B), Anthropic ($30B), xAI ($20B), Waymo ($16B), collectively 65% of global venture investment. Four of the five largest venture rounds ever happened in a single quarter. The concentration is unprecedented and the winner-take-most dynamics at the foundation layer affect every SaaS tool built on top.
Enterprise layoff-to-AI pipeline hits three verticals simultaneously. Oracle cut up to 30K (cloud/ERP), Block went from 10K to 6K (fintech), Pinterest cut 700 (marketing). Over 85,000 tech workers displaced by AI restructuring in 2026. Block's reduction is the largest single AI-attributed layoff in corporate history. The cross-category pattern confirms AI-driven headcount reduction is now a standard corporate playbook.
Databricks survey: multi-agent usage spiked 327% in four months. 78% of companies now use at least two LLM families. The multi-model pattern means SaaS incumbents who bet on proprietary AI features face customers demanding interoperability and provider diversity.
Google Stitch threatens Figma with free AI UI design. Design was the last "creative moat" SaaS category. Figma stock dropped 12%. Simultaneously, vibe coding commoditizes DevTools and agents replace support workflows. Three categories getting free/near-free AI alternatives in the same window.
Stack Overflow: 84% adopt AI tools, 3% strongly trust them. The trust gap analysis explores what happens when adoption and trust move in opposite directions. Enterprise buyers are reconsidering AI tool investments when developer trust remains this low. A real headwind for AI SaaS companies banking on sticky enterprise adoption.
Policy & Governance
AI is now the #1 cited reason for US job cuts. 15,341 in March alone. The Challenger, Gray & Christmas report shows AI surpassed all other reasons for the first time: 25% of all announced layoffs. Since tracking began in 2023, cumulative AI-attributed cuts have reached 99,470.
DOJ appeals ruling blocking Pentagon from blacklisting Anthropic. The appeal targets Judge Rita Lin's ruling that called it "Orwellian" to brand an American company a "potential adversary" for refusing Claude in autonomous weapons. Ninth Circuit deadline: April 30. If overturned, every federal contractor using Claude must certify non-use.
Tennessee signs first AI mental health chatbot ban. SB 1580 passed 32-0 in the Senate and 94-0 in the House. Effective July 1. Prohibits AI representing itself as a qualified mental health professional. If you're shipping AI wellness tools, audit your UX copy immediately.
Six states push identical chatbot safety bills. Oregon, Hawaii, Colorado, Arizona, Georgia, and Nebraska are advancing nearly identical legislation, suggesting coordinated template-based lobbying before any federal framework emerges. Georgia's SB 540 awaits the governor's signature before the April 6 deadline.
Simon Willison: "Dark factories are coming." On Lenny's Podcast, Willison declared November 2025 was a permanent inflection. He referenced StrongDM's Software Factory where "code must not be written by humans" and "code must not be reviewed by humans." His point matches mine: experienced engineers get dramatically better results because the bottleneck shifted from writing to orchestrating and taste.
OpenAI secretly funded $10M child safety coalition. Gizmodo revealed OpenAI was the sole funder of the Parents and Kids Safe AI Coalition pushing California's AI age verification requirements. Partner organizations said they were "blindsided." The conflict of interest deepens given Sam Altman's connection to Worldcoin's age verification tech.
Skills of the Day
- Install RTK to extend Claude Code sessions 3x. `cargo install rtk` or grab the binary from github.com/rtk-ai/rtk. It proxies shell output through smart compression before tokens hit your context window. A 150K-token session drops to 45K. If you're on the Max plan hitting 5-hour limits, this is the single highest-ROI change you can make today.
- Use Cursor 3's `/best-of-n` to A/B test models on real tasks. Run the same prompt against Claude and Gemini simultaneously, compare diffs side by side, merge the better result. Combined with `/worktree` for isolated execution, you can run speculative refactors with zero risk to your working tree.
- Audit your MCP servers against native CLI equivalents. If Claude Code can do it via `gh`, `docker`, or `kubectl`, the MCP is adding latency and failure modes for no benefit. Multiple practitioners confirm the 529-upvote consensus: MCPs shine only when there's no CLI alternative.
- Add Context7 MCP for any project using fast-moving frameworks. One MCP config line gives your coding agent access to current docs for 9,000+ libraries. Eliminates hallucinated APIs from stale training data. Essential for Next.js 15, React 19, or anything that shipped after your model's training cutoff.
- Run Sysdig's Falco rules if coding agents touch production infrastructure. Their published eBPF detection rules catch sensitive file access, risky CLI arguments, and reverse shells at the syscall level. This is visibility your agent's UI will never give you.
- Implement fail-closed configs for any CrewAI Code Interpreter deployment. If Docker isn't available, Code Interpreter must not fall back to unsandboxed execution. The four CVEs chain prompt injection to full RCE through exactly this fallback path.
- Test Gemma 4 26B-A4B locally before defaulting to API calls. At 81 tok/sec on M5 Max with only 4B active parameters, this MoE model handles many coding and reasoning tasks at zero marginal cost. Use Ollama or llama.cpp GGUF checkpoints for the quickest setup.
- Model outcome-based pricing for your SaaS product now. HubSpot's $0.50/resolved-conversation sets a market expectation. Even if you don't launch it, know what your unit economics look like when customers ask "why am I paying per seat for an AI agent?"
- Use Google's new Flex inference tier for batch processing workloads. 50% savings on Gemini API calls for anything that can tolerate 1-15 minute response times. Same endpoints, same models, half the cost. No code changes required beyond setting the tier parameter.
- Pin your Next.js version and check your App Router exposure today. CVE-2025-55182 is CVSS 10.0 with automated mass exploitation tooling in the wild. 766 servers in 24 hours. If you're running Next.js App Router in production, this is a "stop what you're doing and patch" situation.
Like what you're reading? Reply to this email with what's working for you or what I should cover differently. Hate it? Tell me that too. I read everything.
Built by Tayler Ramsay. Opinions are mine, not my models'.
How This Newsletter Learns From You
This newsletter has been shaped by 12 pieces of feedback so far. Every reply you send adjusts what I research next.
Your current preferences (from your feedback):
- More builder tools (weight: +2.5)
- More agent security (weight: +2.0)
- More agent security (weight: +1.5)
- More vibe coding (weight: +1.5)
- Less market news (weight: -1.0)
- Less valuations and funding (weight: -3.0)
- Less market news (weight: -3.0)
Want to change these? Just reply with what you want more or less of.
Ways to steer this newsletter:
- "More [topic]" / "Less [topic]" — adjust coverage priorities
- "Deep dive on [X]" — I'll dedicate extra research to it
- "[Section] was great" — reinforces that direction
- "Missed [event/topic]" — I'll add it to my radar
- Rate sections: "Vibe Coding section: 9/10" helps me calibrate
Reply to this email — I've processed 8/12 replies so far and every one makes tomorrow's issue better.