Seroter's Daily Reading — #755 (April 2, 2026)

Audio summary of Richard Seroter reading list 755: Gemma 4 under Apache 2.0, Spec-Driven Development, Cursor 3, voice AI debugging with Google ADK, executive presence pitfalls, Amex AI deployment, MCP observability, AI-generated code plateauing at 30%, and more.

🎧 Listen to this episode (8.9 minutes)

📰 Original post on seroter.com


This is Seroter’s Daily Reading, number 755, from April 2nd, 2026.

Richard’s list today is dominated by AI tooling and developer workflows, with a few pieces that step back and ask bigger questions about how we build software and how we organize ourselves. Let’s get into it.

The headliner is Google’s release of Gemma 4, their most capable open model family yet. These come in four sizes, from a tiny 2 billion effective parameter model that runs on phones and Raspberry Pis, all the way up to a 31 billion parameter dense model that currently sits at number three on the Arena AI text leaderboard. The big deal here isn’t just performance. It’s the license. Gemma 4 ships under Apache 2.0, which is a major shift from previous Gemma releases that had custom terms. These models support multimodal inputs including vision and audio, handle context windows up to 256 thousand tokens, and are trained on over 140 languages. They’re also built for agentic workflows with native function calling and structured JSON output. Google is clearly positioning this as the open counterpart to their proprietary Gemini lineup, and the combination of frontier-class performance with a truly open license is significant for anyone building local-first or on-device AI.
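To make the structured-output claim concrete, here is a minimal sketch of consuming a JSON function call from a model. The reply string, field names, and tool name are hypothetical stand-ins for whatever your inference runtime actually returns; this is not a Gemma-specific API, just the parsing and validation step on the application side:

```python
import json

# Hypothetical raw reply from a model asked to emit a structured function call;
# in practice this string would come from your local inference runtime.
raw_reply = '{"function": "get_weather", "arguments": {"city": "Austin", "unit": "celsius"}}'

def parse_function_call(reply: str) -> dict:
    """Parse a structured JSON function call and check the fields we rely on."""
    call = json.loads(reply)
    if "function" not in call or "arguments" not in call:
        raise ValueError("model reply is missing required fields")
    return call

call = parse_function_call(raw_reply)
print(call["function"])            # which tool the model wants to invoke
print(call["arguments"]["city"])   # a parsed argument, ready to pass along
```

The point of native structured output is exactly that this parse step becomes boring: valid JSON in, a dict out, no regex scraping of prose.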

Next up, Mofi wrote a piece on what he calls Spec-Driven Development. The core argument is that we’ve entered a fifth generation of programming where AI generates the code, but unstructured AI coding creates fragility. He cites research showing that undirected AI coding increases code complexity by 41 percent and can introduce security vulnerabilities in over 90 percent of outputs. The proposed solution is to make formal specifications the source of truth instead of code itself. The spec gets written in machine-readable formats like OpenAPI or YAML, and AI becomes a kind of constrained compiler that generates code from those specs. There’s a three-tier maturity model ranging from writing specs first but discarding them, up to a world where humans never edit code directly. It’s a thought-provoking framework, even if most teams are nowhere near that third tier yet.
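The "spec as source of truth" idea can be sketched in a few lines. Everything below is illustrative: a spec fragment expressed as a Python dict stands in for real OpenAPI YAML, and a hand-written handler stands in for AI-generated code. The key move is that conformance is checked against the spec, not against the code:

```python
# A minimal, hypothetical spec fragment (stand-in for an OpenAPI/YAML file):
# it declares the response shape for GET /users/{id}.
SPEC = {
    "path": "/users/{id}",
    "response": {"id": int, "name": str, "active": bool},
}

def get_user(user_id: int) -> dict:
    """Stand-in for an AI-generated handler 'compiled' from the spec."""
    return {"id": user_id, "name": "Ada", "active": True}

def conforms(payload: dict, schema: dict) -> bool:
    """The spec, not the code, is the source of truth: every declared field
    must be present with the declared type, and no extra fields allowed."""
    return set(payload) == set(schema) and all(
        isinstance(payload[k], t) for k, t in schema.items()
    )

print(conforms(get_user(7), SPEC["response"]))  # True
```

In the third-tier world the article describes, this check is what humans review; the handler body is regenerated whenever the spec changes.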

Cursor announced version 3, and it’s a big step toward what they call “the third era of software development.” The new interface is built from scratch around agents rather than files. You can run multiple agents in parallel across different repositories, seamlessly hand off work between local and cloud environments, and manage everything from a unified workspace. The traditional IDE with its file tree and editor tabs is becoming a secondary surface. The direction is clear: the developer’s job is increasingly about supervising and directing agents rather than writing code line by line. Cursor is betting hard on that future, and version 3 looks like their most concrete move yet.

Annie Wang wrote a deep technical post about building a real-time voice AI stylist with Google’s Agent Development Kit, or ADK. The interesting part isn’t the stylist itself. It’s the debugging story. She hit an annoying lag problem that turned out to have nothing to do with the network or the model’s speed. The real issue was how she was signaling conversation turns. In voice AI, there’s a critical distinction between streaming data and signaling that a turn is complete. ADK has two methods for this, and using the wrong one silently causes the model to wait for a turn boundary that never comes. Everything looks like latency, but it’s actually a logic error. It’s a great example of how real-time voice AI is fundamentally different from text-based chat, and how the documentation doesn’t always make the sharp edges obvious.

Harvard Business Review ran a piece on when executive presence backfires. The gist is that the traits we associate with executive presence (confidence, decisiveness, commanding attention) can actually work against leaders when overdone or applied in the wrong context. It’s the kind of nuanced leadership advice that’s easy to nod along with but hard to act on in practice.

American Express shared how they’re deploying AI across the company. About 11,000 engineers are using AI tools and have cut coding times by more than 30 percent. Travel advisors in 19 countries use AI for recommendations, and the sales team is using generative AI for real-time prospect leads and automated follow-ups. CEO Steve Squeri called it “a deliberate redesign of how we operate.” Notably, they emphasize that the goal isn’t headcount reduction. It’s about deeper customer relationships and better tools. Though, as the article points out, other companies like Block and Meta are making very different choices.

The MCP Toolbox for Databases team, in collaboration with Agnost AI, shipped built-in observability features including distributed tracing for all transport types, latency histograms, and active session counters. They now follow the OpenTelemetry MCP semantic conventions. The practical value here is debugging. When your AI agent is slow, is it the network, the server, or the database query? With proper traces, you can answer that in seconds instead of hours. As more of our architectures include MCP components, this kind of observability tooling is going to become essential.

DX published their quarterly data on AI-generated code, and the share of merged code that is AI-authored has ticked up from 22 to 27.4 percent, based on developer self-reporting across more than 500 organizations. The plateau below 30 percent is interesting. Despite significantly more capable models shipping in late 2025, the percentage hasn’t jumped dramatically. The likely explanation is that most teams haven’t fully adapted their workflows yet. Daily users show the biggest increase, while weekly and monthly users barely moved. The tools got better, but the habits haven’t caught up.

Google published a tutorial on building an AI Meeting Prep Agent with Workspace Studio. It’s a no-code setup that triggers 15 minutes before any calendar event, scans your emails and documents for context about the attendees and topic, and sends a briefing to Google Chat. It’s the kind of practical workflow automation that sounds simple but could genuinely save time for people drowning in back-to-back meetings.

The Dart and Flutter team shared how they’re thinking about AI in 2026. They’ve carved their developer audience into three personas: the traditional developer who wants AI to stay out of the way, the AI-assisted developer using agents for boilerplate, and the AI-first developer building apps with natural language. Their principles include keeping Dart human-readable first, being agent-agnostic through open standards like MCP, and solving what they call the “verification tax,” the time developers spend auditing AI-generated code.

Docker published advice on defending your software supply chain, driven by recent high-profile breaches. And finally, the Google Security team wrote about their continuous approach to mitigating indirect prompt injections in Google Workspace, detailing how they protect users when AI assistants process untrusted content from emails and documents.

Those are your twelve for today. The throughline this time is the tension between AI’s growing capability and the human systems trying to keep up, whether that’s developer workflows, organizational structures, or security models. The tools keep getting better. The question is whether our processes evolve fast enough to use them well. That’s reading list 755. See you next time.


Articles covered in this episode:

  1. Gemma 4: Byte for byte, the most capable open models — Google Blog
  2. Spec Driven Development — Mofi, Google Cloud / Medium
  3. Meet the new Cursor — Cursor Blog
  4. I Built a Real-Time Voice AI Stylist with Google ADK — Annie Wang, Google Cloud / Medium
  5. When Executive Presence Backfires — Harvard Business Review
  6. How Amex deploys AI tools — CIO Dive
  7. From Lag to Lightning: Optimizing MCP Toolbox with Built-in Observability — MCP Toolbox / Medium
  8. AI-generated merged code holds steady at ~30% — DX Newsletter
  9. Google Workspace Studio Tutorial: Building an AI Meeting Prep Agent — Google Cloud / Medium
  10. How Dart and Flutter are thinking about AI in 2026 — Flutter Blog
  11. Defending Your Software Supply Chain — Docker Blog
  12. Google Workspace’s continuous approach to mitigating indirect prompt injections — Google Security Blog

Source: Richard Seroter’s Daily Reading List #755
