by Simon Willison's Weblog
ggml.ai joins Hugging Face to ensure the long-term progress of Local AI (https://github.com/ggml-org/llama.cpp/discussions/19759) I don't normally cover acquisition news like this, but I have some
Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost. [...] At
Gemini 3.1 Pro (https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro/) The first in the Gemini 3.1 series, priced the same as Gemini 3 Pro ($2/million input,
I've long been resistant to the idea of accepting sponsorship for my blog. I value my credibility as an independent voice, and I don't want to risk compromising that reputation. Then I learned about
SWE-bench February 2026 leaderboard update (https://www.swebench.com/) SWE-bench is one of the benchmarks that the labs love to list in their model releases. The official leaderboard is infrequently
LadybirdBrowser/ladybird: Abandon Swift adoption (https://github.com/LadybirdBrowser/ladybird/commit/e87f889e31afbb5fa32c910603c7f5e781c97afd) Back in August 2024
25+ years into my career as a programmer I think I may finally be coming around to preferring type hints or even strong typing. I resisted those in the past because they slowed down the rate at which
The A.I. Disruption We’ve Been Waiting for Has Arrived (https://www.nytimes.com/2026/02/18/opinion/ai-software.html?unlocked_article_code=1.NFA.UkLv.r-XczfzYRdXJ&smid=url-share) New opinion piece
LLMs are eating specialty skills. There will be less use of specialist front-end and back-end developers as the LLM-driving skills become more important than the details of platform usage. Will this
Introducing Claude Sonnet 4.6 (https://www.anthropic.com/news/claude-sonnet-4-6) Sonnet 4.6 is out today, and Anthropic claim it offers similar performance to November's Opus 4.5
Rodney v0.4.0 (https://github.com/simonw/rodney/releases/tag/v0.4.0) My Rodney (https://github.com/simonw/rodney) CLI tool for browser automation attracted quite the flurry of PRs since I announced
This is the story of the United Space Ship Enterprise. Assigned a five year patrol of our galaxy, the giant starship visits Earth colonies, regulates commerce, and explores strange new worlds and
First kākāpō chick in four years hatches on Valentine's Day (https://www.doc.govt.nz/news/media-releases/2026-media-releases/first-kakapo-chick-in-four-years-hatches-on-valentines-day/) First
But the intellectually interesting part for me is something else. I now have something close to a magic box where I throw in a question and a first answer comes back basically for free, in terms of
Given the threat of cognitive debt (https://simonwillison.net/tags/cognitive-debt/) brought on by AI-accelerated software development leading to more projects and less deep understanding of how they
Qwen3.5: Towards Native Multimodal Agents (https://qwen.ai/blog?id=qwen3.5) Alibaba's Qwen just released the first two models in the Qwen 3.5 series - one open weights, one proprietary. Both are
I introduced Showboat (https://simonwillison.net/2026/Feb/10/showboat-and-rodney/) a week ago - my CLI tool that helps coding agents create Markdown documents that demonstrate the code that they have
I'm a very heavy user of Claude Code on the web (https://code.claude.com/docs/en/claude-code-on-the-web), Anthropic's excellent but poorly named cloud version of Claude Code where everything runs in
The AI Vampire (https://steve-yegge.medium.com/the-ai-vampire-eda6e4f07163) Steve Yegge's take on agent fatigue, and its relationship to burnout. Let's pretend you're the only person at your
I'm occasionally accused of using LLMs to write the content on my blog. I don't do that, and I don't think my writing has much of an LLM smell to it... with one notable exception: # Finally, do
We coined a new term on the Oxide and Friends podcast (https://simonwillison.net/2026/Jan/8/llm-predictions-for-2026/) last month (primary credit to Adam Leventhal) covering the sense of
Gwtar: a static efficient single-file HTML format (https://gwern.net/gwtar) Fascinating new project from Gwern Branwen and Said Achmiz that targets the challenge of combining large numbers of assets
It's wild that the first commit to OpenClaw was on November 25th 2025 (https://github.com/openclaw/openclaw/commit/f6dd362d39b8e30bd79ef7560aab9575712ccc11), and less than three months later it's hit
I saw yet another “CSS is a massively bloated mess” whine and I’m like. My dude. My brother in Chromium. It is trying as hard as it can to express the totality of visual presentation and
How Generative and Agentic AI Shift Concern from Technical Debt to Cognitive Debt (https://margaretstorey.com/blog/2026/02/09/cognitive-debt/) This piece by Margaret-Anne Storey is the best
Launching Interop 2026 (https://hacks.mozilla.org/2026/02/launching-interop-2026/) Jake Archibald reports on Interop 2026, the initiative between Apple, Google, Igalia, Microsoft, and Mozilla to
Someone has to prompt the Claudes, talk to customers, coordinate with other teams, decide what to build next. Engineering is changing and great engineers are more important than ever. — Boris
The retreat challenged the narrative that AI eliminates the need for junior developers. Juniors are more profitable than they have ever been. AI tools get them past the awkward initial net-negative
Someone asked (https://news.ycombinator.com/item?id=47008560#47008978) if there was an Anthropic equivalent to OpenAI's IRS mission statements over time. Anthropic are a "public benefit corporation"
As a USA 501(c)(3) (https://en.wikipedia.org/wiki/501(c)(3)_organization) the OpenAI non-profit has to file a tax return each year with the IRS. One of the required fields on that tax return is to
Introducing GPT‑5.3‑Codex‑Spark (https://openai.com/index/introducing-gpt-5-3-codex-spark/) OpenAI announced a partnership with Cerebras on January 14th
Claude Code was made available to the general public in May 2025. Today, Claude Code’s run-rate revenue has grown to over $2.5 billion; this figure has more than doubled since the beginning of
Covering electricity price increases from our data centers (https://www.anthropic.com/news/covering-electricity-price-increases) One of the sub-threads of the AI energy usage discourse has been the
Gemini 3 Deep Think (https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-deep-think/) New from Google. They say it's "built to push the frontier of intelligence and
An AI Agent Published a Hit Piece on Me (https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/) Scott Shambaugh helps maintain the excellent and venerable matplotlib
In my post about my Showboat project (https://simonwillison.net/2026/Feb/10/showboat-and-rodney/) I used the term "overseer" to refer to the person who manages a coding agent. It turns out that's a
An AI-generated report, delivered directly to the email inboxes of journalists, was an essential tool in the Times’ coverage. It was also one of the first signals that conservative media was
Skills in OpenAI API (https://developers.openai.com/cookbook/examples/skills_in_api) OpenAI's adoption of Skills continues to gain ground. You can now use Skills directly in the OpenAI API with
GLM-5: From Vibe Coding to Agentic Engineering (https://z.ai/blog/glm-5) This is a huge new MIT-licensed model: 754B parameters and 1.51TB on Hugging Face (https://huggingface.co/zai-org/GLM-5)
cysqlite - a new sqlite driver (https://charlesleifer.com/blog/cysqlite---a-new-sqlite-driver/) Charles Leifer has been maintaining pysqlite3 (https://github.com/coleifer/pysqlite3) - a fork of the
A key challenge working with coding agents is having them both test what they’ve built and demonstrate that software to you, their overseer. This goes beyond automated tests - we need artifacts
Structured Context Engineering for File-Native Agentic Systems (https://arxiv.org/abs/2602.05447) New paper by Damon McMillan exploring challenging LLM context tasks involving large SQL schemas (up
AI Doesn’t Reduce Work—It Intensifies It (https://hbr.org/2026/02/ai-doesnt-reduce-work-it-intensifies-it) Aruna Ranganathan and Xingqi Maggie Ye from Berkeley Haas School of Business report
Friend and neighbour Karen James (https://www.etsy.com/shop/KarenJamesMakes) made me a Kākāpō mug. It has a charismatic Kākāpō, four Kākāpō chicks (in celebration of the 2026 breeding season
People on the orange site are laughing at this, assuming it's just an ad and that there's nothing to it. Vulnerability researchers I talk to do not think this is a joke. As an erstwhile vuln
Vouch (https://github.com/mitchellh/vouch) Mitchell Hashimoto's new system to help address the deluge of worthless AI-generated PRs faced by open source projects now that the friction involved in
Claude: Speed up responses with fast mode (https://code.claude.com/docs/en/fast-mode) New "research preview" from Anthropic today: you can now access a faster version of their frontier model Claude
I am having more fun programming than I ever have, because so many more of the programs I wish I could find the time to write actually exist. I wish I could share this joy with the people who are
Last week I hinted at (https://simonwillison.net/2026/Jan/28/the-five-levels/) a demo I had seen from a team implementing what Dan Shapiro called the Dark Factory
I don't know why this week became the tipping point, but nearly every software engineer I've talked to is experiencing some degree of mental health crisis. [...] Many people assuming I meant job
There's a jargon-filled headline for you! Everyone's building sandboxes (https://simonwillison.net/2026/Jan/8/llm-predictions-for-2026/#1-year-we-re-finally-going-to-solve-sandboxing) for running
An Update on Heroku (https://www.heroku.com/blog/an-update-on-heroku/) An ominous headline to see on the official Heroku blog and yes, it's bad news. Today, Heroku is transitioning to a sustaining
When I want to quickly implement a one-off experiment in a part of the codebase I am unfamiliar with, I get codex to do extensive due diligence. Codex explores relevant slack channels, reads related
Mitchell Hashimoto: My AI Adoption Journey (https://mitchellh.com/writing/my-ai-adoption-journey) Some really good and unconventional tips in here for getting to a place with coding agents where
Two major new model releases today, within about 15 minutes of each other. Anthropic released Opus 4.6 (https://www.anthropic.com/news/claude-opus-4-6). Here's its pelican
Most people's mental model of Claude Code is that "it's just a TUI" but it should really be closer to "a small game engine". For each frame our pipeline constructs a scene graph with React then: ->
Qwen3-TTS Family is Now Open Sourced: Voice Design, Clone, and Generation (https://qwen.ai/blog?id=qwen3tts-0115) I haven't been paying much attention to the state-of-the-art in speech generation
SSH has no Host header (https://blog.exe.dev/ssh-host-header) exe.dev (https://exe.dev/) is a new hosting service that, for $20/month, gives you up to 25 VMs "that share 2 CPUs and 8GB RAM".
[...] i was too busy with work to read anything, so i asked chatgpt to summarize some books on state formation, and it suggested circumscription theory. there was already the natural boundary of my
Last week Cursor published Scaling long-running autonomous coding (https://cursor.com/blog/scaling-agents), an article describing their research efforts into coordinating large numbers of autonomous
If you tell a friend they can now instantly create any app, they’ll probably say “Cool! Now I need to think of an idea.” Then they will forget about it, and never build a thing. The problem is
Don't "Trust the Process" (https://www.youtube.com/watch?v=4u94juYwLLM) Jenny Wen, Design Lead at Anthropic (and previously Director of Design at Figma) gave a provocative keynote at Hatch
Kākāpō Cam: Rakiura live stream (https://www.doc.govt.nz/our-work/kakapo-recovery/what-we-do/kakapo-cam-rakiura-live-stream/) Critical update for this year's Kākāpō breeding season
the browser is the sandbox (https://aifoc.us/the-browser-is-the-sandbox/) Paul Kinlan is a web platform developer advocate at Google and recently turned his attention to coding agents. He quickly
One of my favourite features of ChatGPT is its ability to write and execute code in a container. This feature launched as ChatGPT Code Interpreter nearly three years ago
Someone asked (https://news.ycombinator.com/item?id=46765460#46765823) on Hacker News if I had any tips for getting coding agents to write decent quality tests. Here's what I said: I work in Python
Kimi K2.5: Visual Agentic Intelligence (https://www.kimi.com/blog/kimi-k2-5.html) Kimi K2 landed in July (https://simonwillison.net/2025/Jul/11/kimi-k2/) as a 1 trillion parameter open weight LLM.
One Human + One Agent = One Browser From Scratch (https://emsh.cat/one-human-one-agent-one-browser/) embedding-shapes was so infuriated (https://emsh.cat/cursor-implied-success-without-evidence/) by
The Five Levels: from Spicy Autocomplete to the Dark Factory (https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) Dan Shapiro proposes a five
My blog uses aggressive caching: it sits behind Cloudflare with a 15 minute cache header, which guarantees it can survive even the largest traffic spike to any given page. I've recently added a
Datasette 1.0a24 (https://docs.datasette.io/en/latest/changelog.html#a24-2026-01-29) New Datasette alpha this morning. Key new features: • Datasette's Request object can now handle
We gotta talk about AI as a programming tool for the arts (https://www.tiktok.com/@chris_ashworth/video/7600801037292768525) Chris Ashworth is the creator and CEO of QLab
The hottest project in AI right now is Clawdbot, renamed to Moltbot (https://x.com/openclaw/status/2016058924403753024), renamed to OpenClaw (https://openclaw.ai/blog/introducing-openclaw). It's an
Getting agents using Beads requires much less prompting, because Beads now has 4 months of “Desire Paths” design, which I’ve talked about before. Beads has evolved a very complex command-line
Singing the gospel of collective efficacy (https://interconnected.org/home/2026/01/30/efficacy) Lovely piece from Matt Webb about how you can "just do things" to help make your community better for
Originally in 2019, GPT-2 was trained by OpenAI on 32 TPU v3 chips for 168 hours (7 days), with $8/hour/TPUv3 back then, for a total cost of approx. $43K. It achieves 0.256525 CORE score, which is an
TIL: Running OpenClaw in Docker (https://til.simonwillison.net/llms/openclaw-docker) I've been running OpenClaw (https://openclaw.ai/) using Docker on my Mac. Here are the first in my ongoing notes
A Social Network for A.I. Bots Only. No Humans Allowed. (https://www.nytimes.com/2026/02/02/technology/moltbook-ai-social-media.html?unlocked_article_code=1.JFA.kBCd.hUw-s4vvfswK&smid=url-share) I
Introducing the Codex app (https://openai.com/index/introducing-the-codex-app/) OpenAI just released a new macOS app for their Codex coding agent. I've had a few days of preview access - it's a
This is the difference between Data and a large language model, at least the ones operating right now. Data created art because he wanted to grow. He wanted to become something. He wanted to
I just sent the January edition of my sponsors-only monthly newsletter (https://github.com/sponsors/simonw/). If you are a sponsor (or if you start a sponsorship now) you can access it here
Introducing Deno Sandbox (https://deno.com/blog/introducing-deno-sandbox) Here's a new hosted sandbox product from the Deno team. It's actually unrelated to Deno itself - this is part of their Deno
I've been exploring Go for building small, fast and self-contained binary applications recently. I'm enjoying how there's generally one obvious way to do things and the resulting code is boring and
Voxtral transcribes at the speed of sound (https://mistral.ai/news/voxtral-transcribe-2) Mistral just released Voxtral Transcribe 2 - a family of two new models, one open weights, for transcribing
Spotlighting The World Factbook as We Bid a Fond Farewell (https://www.cia.gov/stories/story/spotlighting-the-world-factbook-as-we-bid-a-fond-farewell/) Somewhat devastating news today from CIA: