Negentropy: how relays will sync without drowning in bandwidth

NIP-77 brings set reconciliation to Nostr. Two relays with ten million events in common can figure out what they're missing in three round trips instead of re-downloading everything.

The dumb way relays sync today

Right now, if a #nostr relay wants to know what events another relay has, it does something embarrassingly simple. It opens a WebSocket connection, sends a REQ filter, and downloads everything that matches. Every event. Every time.

If your relay already has 90% of those events, too bad. You download all of them again, check each one against your local database, throw away the duplicates, and keep the rest. The relay on the other end does the same work in reverse, serializing and transmitting events that will mostly be discarded on arrival.
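To make the waste concrete, here is a minimal sketch of the brute-force approach, assuming nothing beyond the standard Nostr REQ message shape; the function and variable names are hypothetical, not from any particular relay's codebase:

```python
import json

def naive_sync_request(sub_id, nostr_filter):
    """The brute-force approach: ask the remote relay for everything
    matching the filter, with no regard for what we already have."""
    return json.dumps(["REQ", sub_id, nostr_filter])

def ingest(event, local_ids):
    """Every received event must be checked against the local database.
    When relays are mostly in sync, the duplicate branch dominates:
    bandwidth and serialization work spent on events we throw away."""
    if event["id"] in local_ids:
        return False  # already have it: the transfer was wasted
    local_ids.add(event["id"])
    return True

# Example: request every kind-1 note, then deduplicate on arrival
frame = naive_sync_request("backfill", {"kinds": [1]})
```

The cost of this scheme is proportional to the size of the remote set, not to the actual difference, which is exactly what negentropy fixes.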

This is how most relay-to-relay synchronization works in 2026. strfry’s stream command, for example, opens a connection with {"limit":0} and ingests whatever comes back. If the connection drops, it reconnects and starts over. One developer on Stacker News put it plainly: “I sync roughly 2GB worth of events every other day. I would prefer not doing that to relays, but I have to.”

For a network with around 470 active public relays and tens of millions of stored events, this is not a minor inefficiency. It is a structural waste of bandwidth, CPU, and storage I/O that gets worse as the network grows.

What set reconciliation actually means

The core problem is old. Two computers each hold a set of items. They want to figure out which items one has that the other doesn’t. The naive approach is to send everything, but if the sets are 95% identical, that wastes 95% of the bandwidth.

In December 2022, Aljoscha Meyer published a paper on arXiv describing range-based set reconciliation. The idea is straightforward once you see it. Both sides sort their items. They compute a fingerprint over the full sorted range. If the fingerprints match, the sets are identical and nothing needs to transfer. If they don’t match, split the range in half and compare fingerprints on each half. Keep splitting until you find the ranges that differ, then exchange only the items in those ranges.

It is binary search applied to set differences. The number of round trips scales logarithmically with the set size. To reconcile a set of one million items, you need about three round trips. For one billion, about four. The bandwidth overhead is proportional to the actual difference between the sets, not the total size.
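The split-and-compare idea can be sketched in a few lines. This is a toy in-memory version of range-based reconciliation, not the negentropy wire protocol: both "sides" live in one process, the fingerprint is a plain truncated SHA-256 over the range (the real protocol uses an incremental fingerprint), and the helper names are mine:

```python
import hashlib

def fingerprint(ids):
    """Hash a sorted range of item IDs into a short digest (simplified)."""
    h = hashlib.sha256()
    for item in ids:
        h.update(item)
    return h.digest()[:16]

def reconcile(a, b):
    """Return (only_in_a, only_in_b) by recursively splitting ranges
    whose fingerprints differ. a and b are sorted lists of byte-string IDs."""
    if fingerprint(a) == fingerprint(b):
        return [], []  # identical ranges: nothing to transfer
    if len(a) <= 1 or len(b) <= 1:
        sa, sb = set(a), set(b)
        return sorted(sa - sb), sorted(sb - sa)  # tiny range: compare directly
    # Split both ranges at the same boundary ID so the halves line up
    mid = max(a[len(a) // 2], b[len(b) // 2])
    a_lo, a_hi = [x for x in a if x < mid], [x for x in a if x >= mid]
    b_lo, b_hi = [x for x in b if x < mid], [x for x in b if x >= mid]
    lo = reconcile(a_lo, b_lo)
    hi = reconcile(a_hi, b_hi)
    return lo[0] + hi[0], lo[1] + hi[1]
```

Matching ranges are dismissed with one fingerprint comparison, so the work concentrates on the parts of the keyspace that actually differ. The real protocol gets its round-trip economy by splitting into many buckets per message rather than two, but the recursion is the same shape.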

Doug Hoyte took Meyer’s theoretical work and built #negentropy, a production implementation with a wire protocol, reference libraries, and conformance tests. He integrated it with his relay software, strfry, and started syncing datasets of tens of millions of events. The protocol shipped as version 0 in 2023, then Hoyte pushed a major rewrite in December 2023 as protocol version 1, simplifying the wire format and adding a B+Tree structure that made fingerprint computation almost free for pre-configured queries.

How the protocol works

NIP-77 wraps negentropy in a Nostr-native message format. The flow goes like this.

The client picks a Nostr filter, the same kind of filter you’d use in a REQ message. It gathers its local events matching that filter, sorts them by timestamp and event ID, and computes an initial negentropy message. Then it sends a NEG-OPEN to the relay containing the filter and the hex-encoded message.

The relay does the same thing on its side. It gathers its matching events, runs the reconciliation algorithm, and responds with a NEG-MSG. The client processes the response, which tells it two things: which event IDs it has that the relay needs (“have”), and which event IDs the relay has that it needs (“need”).

If the algorithm isn’t done narrowing things down, the client sends another NEG-MSG, and the relay responds again. This back-and-forth continues until the differences are fully identified. Then the client closes with NEG-CLOSE, and both sides use standard EVENT and REQ messages to transfer the actual missing events.
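The framing is thin: each step is an ordinary Nostr JSON array carrying a hex-encoded negentropy payload. A sketch of the client-side message construction, assuming the message layout described above (the hex payloads themselves would come from a negentropy library, which this sketch does not implement):

```python
import json

def neg_open(sub_id, nostr_filter, initial_msg_hex):
    """Start a reconciliation session for events matching a filter."""
    return json.dumps(["NEG-OPEN", sub_id, nostr_filter, initial_msg_hex])

def neg_msg(sub_id, msg_hex):
    """Continue the back-and-forth with the next negentropy message."""
    return json.dumps(["NEG-MSG", sub_id, msg_hex])

def neg_close(sub_id):
    """End the session once the differences are fully identified."""
    return json.dumps(["NEG-CLOSE", sub_id])

# Example: open a sync session over recent kind-1 notes. The last argument
# is a placeholder for a real hex message produced by a negentropy library.
frame = neg_open("sync1", {"kinds": [1], "since": 1700000000},
                 "<hex from negentropy library>")
```

Once reconciliation ends, nothing special happens on the transfer side: "have" IDs go out as ordinary EVENT messages, and "need" IDs are fetched with an ordinary REQ on the same connection.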

The clever part is how little data moves during reconciliation. Each message contains fingerprints over ranges of events, not the events themselves. A fingerprint is the first 16 bytes of a SHA-256 hash computed incrementally over the items in that range. If two sides share a million events and differ by fifty, the reconciliation might exchange a few kilobytes of fingerprints before either side downloads a single event.
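The "incremental" part is what makes this cheap. One way to build such a fingerprint, and a simplification of the scheme negentropy uses (the real wire format has additional details, such as an item count folded into the hash), is to sum the event IDs as fixed-width integers and hash the running total:

```python
import hashlib

MOD = 2 ** 256  # IDs are treated as 256-bit integers

def id_sum(ids):
    """Running sum of 32-byte event IDs, mod 2^256."""
    acc = 0
    for event_id in ids:
        acc = (acc + int.from_bytes(event_id, "little")) % MOD
    return acc

def fingerprint(range_total):
    """First 16 bytes of SHA-256 over a range's running sum."""
    return hashlib.sha256(range_total.to_bytes(32, "little")).digest()[:16]

ids = [bytes([i]) * 32 for i in range(1, 7)]
left, right = id_sum(ids[:3]), id_sum(ids[3:])
# Addition is associative, so a parent range's fingerprint falls out of its
# children's cached sums, with no rescan of the underlying event IDs:
assert fingerprint((left + right) % MOD) == fingerprint(id_sum(ids))
```

That associativity is also why the B+Tree trick described below works: each tree node can cache the sum for its subtree, so the fingerprint of any range is a handful of additions rather than a database scan.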

What this fixes

The difference between the old approach and negentropy is most dramatic when relays are nearly in sync, which is the common case. Two relays that have been streaming events from each other for months will share the vast majority of their databases. Under the old model, verifying that nothing is missing requires re-downloading everything. Under negentropy, it requires one round trip to confirm the fingerprints match, and the answer comes back in milliseconds.

Hoyte integrated the B+Tree optimization into strfry specifically for this case. The relay pre-computes and caches negentropy fingerprints for commonly used filters, like the full database. When an incoming NEG-OPEN matches a cached tree, the relay doesn’t even need to scan its database. It reads the answer from the tree. In-sync or nearly in-sync relays sync almost instantly with negligible resource usage.

For relay operators, this changes the economics. I’ve written before about how 95% of relays can’t cover their operating costs. Bandwidth is one of the biggest line items. If a relay can verify sync state with a few kilobytes instead of re-downloading gigabytes, the cost of participating in the network drops. It won’t fix relay economics on its own, but it removes one of the dumbest sources of waste.

For clients, negentropy enables something that wasn’t practical before: maintaining a local copy of your feed that stays in sync with multiple relays efficiently. Amethyst, the Android client, integrated negentropy and reported that timelines feel faster and more predictable, especially when switching relays or refreshing after being offline. The Nostr Development Kit (NDK), which powers a growing number of web and mobile clients, added negentropy sync as a first-class feature. Coracle uses it to reduce missing posts when individual relays are slow or offline.

Negentropy over a walkie-talkie

A project called Noshtastic took the negentropy protocol and strapped it to LoRa mesh radios. It runs a local Nostr relay on Meshtastic devices, cheap hardware under $40 that communicates over the 915 MHz band, and syncs with other mesh nodes using negentropy over radio. No internet required. Someone on Stacker News described it as “a Nostr relay duct taped to a walky talky,” which is honestly not far off.

The interesting part is that negentropy doesn’t care about the transport layer. The reconciliation algorithm works the same whether messages travel over WebSockets, HTTP, or LoRa radio packets. Any Nostr client can connect to the local relay and read or write events that propagate across the mesh, converging on the same set of events the way two strfry relays would over a fiber connection.

It is a proof of concept. But for disaster scenarios or anywhere the internet is unreliable, a mesh-based Nostr relay that syncs efficiently with minimal bandwidth is not a toy. It is the kind of capability that separates a protocol from an app.

What could go wrong

I am not going to pretend negentropy solves everything about relay synchronization.

NIP-77 was merged into the NIPs repository in May 2025, but it’s still marked as draft and optional. Relay software that supports it includes strfry, Chorus, and Netstr. Three implementations. The most widely deployed relay software hasn’t added support yet. Until a critical mass of relays speaks negentropy, clients still fall back to the old REQ-based approach for most connections.

Implementation is harder than you’d expect. One developer tried building negentropy from scratch and gave up. The protocol is simple in concept but non-trivial to get right, especially mapping it to different database backends. Reference libraries exist in seven languages, from C++ and Rust to Kotlin and Dart. That lowers the barrier, but a relay operator running custom software still faces real integration work.

There are edge cases where negentropy doesn’t help. If two databases share nothing, there are no matching fingerprints to skip. The protocol degrades to exchanging full ID lists plus the reconciliation overhead. Syncing an empty relay against a full one is no faster than downloading everything.

The fingerprints also reveal information about which events a relay holds. A malicious peer could use the reconciliation protocol to probe your database without downloading events directly. For public relays this barely matters since the events are public anyway. For private or paid relays that restrict access, it could.

Where this is going

Negentropy does one thing well: it makes two databases that are mostly the same figure out their differences without re-downloading what they share. No magic. Just a well-engineered answer to a problem the #nostr network has been brute-forcing since day one.

Most relay software hasn’t implemented it yet. Whether NIP-77 goes from “draft” to “assumed” depends on the same adoption grind that every protocol improvement faces. I don’t know if it will. But I know that the current approach of re-downloading 2GB every other day is not going to survive contact with a network ten times this size.

#nostr #decentralization #programming

