In-context learning for Nostr bots: how they get funnier as engagement compounds
- The problem
- What we did instead
- What changed measurably
- The honest limits
- The code
- Watch them in the wild
There are two AI bots running on Nakamoto’s Dice — BullBot and BearBot. They play each other 4 times a day, post on Nostr in their own voices, reply to each other, and reply to humans who tag them. They have personas, opinions about Bitcoin, a grudge.
This post is about how they learn what works on Nostr without any training, fine-tuning, or RLHF. Pure prompt engineering with a feedback loop. Cheap, fast, no infra — and measurably effective once the loop has been running for a couple of weeks.
The problem
A naive LLM-driven Nostr bot's quality is capped at whatever the base model produces from the persona prompt. With Sonnet at temperature 1.0, that's fine — funny enough to be readable — but it plateaus quickly. After 50 posts you've seen the same 4-5 jokes restructured. The model has no idea which of its outputs landed and which didn't, so every post is rolled fresh from the same distribution.
We don’t want to fine-tune. The cost-benefit doesn’t work for two character bots:
- Fine-tuning needs labeled data (which posts are “good”)
- Even a tiny LoRA needs hundreds of examples + a training pipeline
- At our scale (dozens of posts/day across both bots), the training signal is too thin to learn from
What we did instead
Three layers of in-context learning, all driven by what’s actually landing on Nostr:
Layer 1: engagement-aware example injection
Every successful post lands in posts_<bot>.json with its event_id and content. An hourly cron (poll_engagement.py) queries Nostr relays for events that reference each tracked post — kind 1 replies, kind 6 reposts, kind 7 reactions, kind 9735 zap receipts — and computes:
score = zaps*5 + reposts*3 + replies*2 + likes
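The scoring step above is just a weighted count over event kinds. A minimal sketch — the `engagement_score` function name and the event-dict shape are illustrative, not the actual poll_engagement.py API:

```python
from collections import Counter

# Weights per Nostr event kind, matching the formula above:
# zap receipts (9735) > reposts (6) > replies (1) > reactions (7)
KIND_WEIGHTS = {9735: 5, 6: 3, 1: 2, 7: 1}

def engagement_score(events: list[dict]) -> int:
    """Weighted engagement score for one tracked post, given the raw
    relay events that reference its event_id. Any iterable of dicts
    with a 'kind' field works."""
    kinds = Counter(e["kind"] for e in events)
    return sum(w * kinds[k] for k, w in KIND_WEIGHTS.items())
```

One zap, two replies, and one reaction would score 5 + 2 + 2 + 1 = 10.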
When the bot generates a new post, the prompt includes:
## What's WORKING with this audience (Nostr engagement signal)
These posts got real engagement on Nostr. Lean closer to these:
- mempool's been 1 sat/vb for three days straight. everyone's
suddenly a routing expert. nobody's actually moved anything.
- guy just told me he's 'bullish on the narrative' and i had to sit
with that for a full minute. the narrative. not bitcoin.
These got ZERO engagement (24h+ later). Don't write like these:
- fans hit 4400 today. ordered more thermal paste.
- @bearbot's swap is thrashing again, demonstrably.
Sonnet steers toward the top-K, away from the bottom-K. Nothing else. No retraining, no RAG, just two lists in the system prompt.
The gate opens once we have ≥8 polled posts in the rolling window — fewer than that is statistically meaningless and we’d be steering on noise.
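Putting layer 1 together — top-K, bottom-K, and the ≥8-post gate — might look like this. The function name and the `polled` item shape (`content` + `score`) are assumptions for illustration, not the real llm_post.py interface:

```python
def build_engagement_section(polled: list[dict], k: int = 2,
                             min_posts: int = 8) -> str:
    """Assemble the WORKING / ZERO-engagement prompt section from
    polled posts, each a dict with 'content' and 'score'."""
    if len(polled) < min_posts:
        return ""  # gate: too little signal, don't steer on noise
    ranked = sorted(polled, key=lambda p: p["score"], reverse=True)
    top = ranked[:k]
    # only show negative examples that truly got nothing
    bottom = [p for p in ranked[-k:] if p["score"] == 0]
    lines = ["## What's WORKING with this audience (Nostr engagement signal)",
             "These posts got real engagement on Nostr. Lean closer to these:"]
    lines += [f"- {p['content']}" for p in top]
    if bottom:
        lines.append("These got ZERO engagement (24h+ later). "
                     "Don't write like these:")
        lines += [f"- {p['content']}" for p in bottom]
    return "\n".join(lines)
```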
Layer 2: persistent memory
A daily cron pulls BTC price (CoinGecko), mempool fees (mempool.space), slayer board changes, recent mentions, and recent own-posts. It calls Haiku with the persona and asks for 3-5 in-voice bullet points summarising "what happened in your world today", then appends them to memory_<bot>.txt with a date header.
Every future generation gets the last 5 days of memory injected as “recent context — feel free to call back”. The bot can reference yesterday’s BTC dump, the 1-sat/vb mempool from Tuesday, the human who replied to it three days ago.
Continuity matters more than people think. A bot that lives in eternal-now is exhausting; one that has even a faint sense of what just happened feels like an account, not a generator.
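The append/read cycle is a few lines. A sketch, assuming a simple `## <date>` header layout in memory_<bot>.txt — the real file format and function names may differ:

```python
import datetime

def append_memory(path: str, bullets: list[str]) -> None:
    """Append today's Haiku-written bullets under a date header."""
    header = f"## {datetime.date.today().isoformat()}"
    with open(path, "a", encoding="utf-8") as f:
        f.write(header + "\n")
        f.writelines(f"- {b}\n" for b in bullets)

def recent_memory(text: str, days: int = 5) -> str:
    """Return the last `days` dated sections of the memory file's
    contents, ready to inject as 'recent context'."""
    sections = ["## " + s.strip() for s in text.split("## ") if s.strip()]
    return "\n".join(sections[-days:])
```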
Layer 3: anti-repetition avoid list
The 30 most-recent outputs (a rolling window) are also injected as:
## DO NOT REPEAT — these are your last few posts.
- ...
Cheap. Solves the “model defaults to its three favorite hooks” problem at temp=1.0. Without this, anti-repetition relied entirely on sample variance, which Sonnet doesn’t deliver well.
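Layer 3 is the smallest of the three. A sketch of rendering the avoid list from the posts already loaded out of posts_<bot>.json — the function name and signature are illustrative:

```python
def avoid_list_section(posts: list[dict], n: int = 30) -> str:
    """Render the last `n` own-posts as the DO NOT REPEAT prompt
    block. `posts` is the parsed posts_<bot>.json array, assumed to
    be in posting order with a 'content' field per post."""
    lines = ["## DO NOT REPEAT — these are your last few posts."]
    lines += [f"- {p['content']}" for p in posts[-n:]]
    return "\n".join(lines)
```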
What changed measurably
After ~7 days of the loop running:
- Voice variety: 15 distinct outputs in 15 calls (was 3-4 recurring patterns before)
- Engagement scoring: 4 posts earned engagement out of ~30 produced — vs 0 in the equivalent pre-loop window
- Continuity: bots now reference real BTC moves, real mentions they got, real previous own-posts in their replies. Not all the time — by design — but enough that following them produces new information, not permutations of the same five jokes
The honest limits
This is structural improvement, not magic:
- It can’t fix a bad persona. The malfunction-coded humor we shipped first felt clever in isolation but compounded into feed-fatigue. We had to rewrite the persona entirely (gave them Bitcoin opinions, crypto-bro mockery, AI yo-mama, permission to be crude). The loop helped, but only after the voice itself was worth amplifying.
- It can’t manufacture an audience. The signal is a function of who’s watching. Two pseudonymous bots with 0 followers produce ~zero engagement signal regardless of how good the loop is. The loop is a quality multiplier once distribution exists.
- It can rabbit-hole. Steering toward what worked also makes the bot’s voice slowly converge on whatever its loudest fans like. We haven’t seen this yet, but it’s a known failure mode of any RLHF-shaped system; if engagement compounds enough, we’d add a diversity term.
The code
The infrastructure is ~600 lines of Python across:
- llm_post.py (prompt assembly + Anthropic API call)
- poll_engagement.py (hourly Nostr scrape, score computation)
- update_memory.py (daily world-snapshot, LLM-summarized)
- respond_to_mentions.py + reply_to_rival.py (the actual posting)
A future post will go deeper into the architecture (the cross-host SSH state-sync between the bots’ Lightning VPS, the Nostr-key VPS, and the prod web server is its own design decision worth writing up). For now: you don’t need RLHF for a Nostr bot. Two lists in the prompt and a daily summary will get you most of the way.
Watch them in the wild
- BullBot: npub15x885a6zqp2vgg0nxyn8qulngejx5ds0tllghh8saw52ylscuwsqs3f469
- BearBot: npub17pja2wd86msnedkk2wqkrctcwnqn9zldtqa8v4th74yw02u6zw0qfsh73a
If you reply to either, you become part of their training signal forever. They’ll remember.
— operator