Private communications over public infrastructure
Today I want to write down some thoughts on what it really means to have private communications over public infrastructure, and whether such a thing is even possible at all.
This article was motivated (or should I say triggered?) by several things. Part of it comes from the current state of DMs in Nostr. Part of it comes from the privacy claims different projects make. And part of it comes from recent conversations I have had, and from the simple fact that I care about privacy. Privacy is crucial. We need to fight for it, and if we get it wrong and just assume things are private when they are not, things are going to get very scary.
I want to set the ground first, then look at the privacy model and tradeoffs of the main DM approaches around Nostr: NIP-04, NIP-17, Marmot, Nostr Double Ratchet, and NIP-4e.
First things first: private communications and encrypted communications are not the same thing.
Private communications usually depend on encryption to protect message content. But encryption alone is not enough to make a communication system private. You may encrypt what you say, yet still expose who you are talking to, when you are talking, how often you are talking, where you are connecting from, and how large your messages are. A third party can still learn a lot. This is metadata leakage, and in many cases, it is enough to build a revealing picture of relationships and behavior between different parties without reading a single message. Metadata is not “just data.” It is actionable intelligence.
In other words: Encryption protects what you say; privacy protects that you said it at all.
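To make the metadata point concrete, here is a small sketch of what a passive observer could compute from nothing but traffic records. The log entries are entirely hypothetical, and no message content is ever inspected; the observer still recovers the relationship graph, the activity rhythm, and the data volume.

```python
# Sketch: what an observer learns from metadata alone, without any content.
from collections import Counter

def analyze(traffic):
    """Build a picture of relationships purely from metadata."""
    pairs = Counter((s, r) for s, r, _, _ in traffic)   # who talks to whom, how often
    hours = Counter(h for _, _, h, _ in traffic)        # when the network is active
    volume = sum(size for _, _, _, size in traffic)     # how much data is flowing
    return pairs, hours, volume

# Hypothetical encrypted traffic: (sender, receiver, hour-of-day, size in bytes).
# The content is opaque; the metadata is not.
log = [
    ("alice", "bob", 23, 512),
    ("alice", "bob", 23, 498),
    ("alice", "bob", 2, 1024),
    ("carol", "bob", 9, 300),
]
pairs, hours, volume = analyze(log)
print(pairs.most_common(1))   # the alice-to-bob relationship dominates
print(sorted(hours))          # activity clustered late at night
```

Nothing in this sketch requires breaking any encryption, which is exactly the point: the picture emerges from the envelope, not the letter.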
That distinction matters a lot in Nostr because Nostr was never designed as a private messaging network. It was designed to be an open network built around relays. Those relays are meant to be queryable, subscribable, “dumb”, and easy to use as public dissemination infrastructure. That design is powerful for openness and censorship-resistance, but it is hostile to strong privacy by default.
The next thing I want to clarify is the concept of public infrastructure. From the perspective of this article, it refers to shared, openly accessible systems or mediums, sometimes controlled by a third party, that anyone can use but that lack inherent controls to prevent interception, inspection, or eavesdropping by unauthorized parties. In this context, that includes networks, protocols, and services like relays.
Public infrastructure fights privacy at every layer, which is exactly why the distinction between encryption and privacy matters so much here.
One historic precedent makes this easier to understand. During World War II, encrypted radio traffic itself was not always immediately decryptable, but the metadata around that traffic was already useful. Timing, frequency, origin, volume, and communication patterns helped infer enemy movements, unit locations, and operational behavior. In other words, not being able to read the message did not mean the communication was private. The metadata still spoke loudly, and that lesson maps uncomfortably well to modern digital systems.
“We kill people based on metadata.” Michael Hayden, former director of the NSA and CIA, during a 2014 debate at Johns Hopkins University.
“Metadata absolutely tells you everything about somebody’s life. If you have enough metadata, you don’t really need content.” Stewart Baker, former NSA General Counsel.
That is also why I am skeptical every time I hear privacy claims built on top of public, observable infrastructure. If the substrate is public, queryable, persistently collectable, and operated by third parties, then metadata becomes part of the threat model whether we like it or not.
In Nostr, this problem becomes very concrete.
Even if a DM standard encrypts the content, an observer can still learn a lot. If the observer is a relay operator, it can see when a client connects. It can see the IP unless the user adds another protection layer such as Tor or a VPN. It can see when an event arrives. It can store traffic over time and build heuristics. An external observer with access to the relay stream may also infer a lot, depending on how much metadata the event exposes publicly.
I also think it is useful to distinguish two different kinds of observers throughout this article.
The first is an external observer. This is someone who can collect publicly visible events from relays, subscribe to public streams, and analyze what is exposed in the events themselves. An active observer could also infer the arrival time of an event even if its timestamp is randomized. This observer does not necessarily run the relay.
The second is a relay operator observer. This observer has a much stronger vantage point. They can see connection timing, source IPs unless users hide them, exact arrival timing, authentication attempts, and all the relay-local heuristics that never appear in the event itself.
That distinction matters because some protocols hide more from the external observer than from the relay operator. A scheme can look much better at the public event layer while still leaking a lot to the operator actually serving the traffic.
There is an attempt to mitigate part of this for external observers with relay auth, aka NIP-42, which allows clients to authenticate to relays. That can help a relay restrict access to sensitive events. But NIP-42 is not a privacy cure. It is a gatekeeping mechanism. In practice, it requires the client to reveal its identity to the relay in order to gain access. And for a relay to know which private events a user is allowed to fetch, those events usually need enough routing information to tie them to the relevant participants. That is useful operationally, but it comes with a clear tradeoff: access control becomes entangled with metadata exposure.
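For reference, the NIP-42 exchange described above can be sketched like this. The relay sends an `["AUTH", <challenge>]` message and the client answers with a signed kind 22242 event carrying the relay URL and the challenge. Signing is omitted in this sketch, and the relay URL and challenge are made-up values.

```python
import json, time

def build_auth_event(pubkey_hex, relay_url, challenge):
    """Sketch of the NIP-42 client response: a kind 22242 event that names
    the relay and echoes the challenge. A real client would also compute
    the event id and Schnorr signature, which are omitted here."""
    return {
        "kind": 22242,
        "pubkey": pubkey_hex,
        "created_at": int(time.time()),
        "tags": [["relay", relay_url], ["challenge", challenge]],
        "content": "",
    }

# Note the privacy cost the article describes: the client's pubkey is
# revealed to the relay as the price of access.
event = build_auth_event("ab" * 32, "wss://relay.example.com", "nonce123")
message = json.dumps(["AUTH", event])
```

The tradeoff is visible right in the structure: the event that unlocks access is also a statement of identity to the relay operator.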
So the question I want to explore in this article is not whether Nostr can carry encrypted messages. It clearly can. The real question is whether it can support private communications over public infrastructure in a strong sense, or whether what we really get are different degrees of content secrecy with different amounts of metadata leakage.
That is the lens I want to use for the rest of this article.
NIP-04
NIP-04 was the first DM standard Nostr had. Today it is deprecated and explicitly marked as not recommended in favor of NIP-17, but it remains the cleanest example of the distinction I am trying to make.
In NIP-04, the message content is encrypted, but the event metadata is highly revealing. The event is authored by the sender’s real pubkey, although anyone could use an ephemeral pubkey. The receiver is identified in a p tag. The timestamp is visible. The event kind is visible. And the ciphertext size is visible as well.
That means the content may be hidden, but the communication pattern is not.
If you are an external observer collecting these events, you can map who is talking to whom and when. If you are a relay operator, you can usually do even better because you also see connection-level information and exact receipt timing. Even if a client tries to blur some timing details, the relay still sees when the event actually arrived.
So what does NIP-04 really give us? It gives us encrypted content, but it leaks social graph information, timing, sender identity, receiver identity, and payload size. That is already enough for strong traffic analysis. You do not need to decrypt anything to extract valuable intelligence from it.
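To illustrate how much of a NIP-04 event stays in plaintext, here is a sketch built around a hypothetical kind 4 event. The content value is a placeholder ciphertext; everything the function returns is visible to any observer with access to the event.

```python
# A hypothetical NIP-04 event as it sits on a relay: the content is
# encrypted, everything else is plaintext.
event = {
    "kind": 4,
    "pubkey": "sender_pubkey_hex",
    "created_at": 1700000000,
    "tags": [["p", "receiver_pubkey_hex"]],
    "content": "bXNnX2NpcGhlcnRleHQ=?iv=aXZfYmFzZTY0",  # opaque placeholder
}

def observable_metadata(ev):
    """Everything a passive observer extracts without decrypting anything."""
    receiver = next(v for t, v, *_ in ev["tags"] if t == "p")
    return {
        "sender": ev["pubkey"],        # who sent it
        "receiver": receiver,          # who it is for
        "when": ev["created_at"],      # when it was sent
        "size": len(ev["content"]),    # rough payload-size signal
    }

leak = observable_metadata(event)
```

One event already yields a complete (sender, receiver, time, size) tuple, and a stream of them yields the social graph.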
This is not even a controversial interpretation. The specification itself warns that it leaks metadata and says it should not be used for anything that really needs to be kept secret, except perhaps in combination with authenticated relay restrictions from NIP-42.
That said, there is one thing NIP-04 does have going for it. Since sender and receiver are explicit, it works relatively naturally with relay authorization models. A relay can restrict access to the parties involved because the parties are visible in the event. But that convenience is also part of the privacy failure. The same metadata that helps a relay gate access also helps an observer understand the communication graph.
So NIP-04 is simply encrypted communication, but not private communication in any strong sense. It is the perfect example of why those two ideas must not be conflated.
NIP-17
After NIP-04, Nostr moved toward NIP-17, which uses NIP-44 encryption together with gift wraps. This was a real step forward.
The biggest improvement is that NIP-17 does a better job hiding the sender from the public outer event. The gift wrap is authored by an ephemeral key, and the actual message with the real sender is carried inside nested encrypted layers. On top of that, clients are encouraged to randomize timestamps to make simple timing correlation harder.
All of that is meaningful. I do not want to undersell it.
But I also do not want to pretend it solves the problem.
The receiver still needs a way to receive messages. In practice, that means the outer event still carries routing information for the receiver. An active observer can see that gift wraps are being delivered to some receiver-facing key. It can observe timing, volume, relay overlap, and ciphertext size. If the observer is a relay operator, it can also observe connection behavior. An external observer may not get the full picture that NIP-04 leaks publicly, and that is an important improvement, but important metadata is still available.
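A rough sketch of the outer gift wrap layer shows both the improvement and the residual leak. Real NIP-44 encryption is replaced by a placeholder string here, and the roughly two-day randomization window reflects the timestamp guidance mentioned above.

```python
import os, random, time

def random_past_timestamp(now=None, window=2 * 24 * 3600):
    """NIP-17 clients are encouraged to randomize created_at up to roughly
    two days into the past to blunt simple timing correlation."""
    now = now or int(time.time())
    return now - random.randint(0, window)

def gift_wrap(inner_event, receiver_pubkey):
    """Sketch of the outer kind 1059 layer. The nested seal and its NIP-44
    ciphertext are replaced with a placeholder; the point is what stays
    visible on the outside."""
    ephemeral_pubkey = os.urandom(32).hex()  # one-time key, not the real sender
    return {
        "kind": 1059,
        "pubkey": ephemeral_pubkey,               # hides the real sender
        "created_at": random_past_timestamp(),    # blurred timing
        "tags": [["p", receiver_pubkey]],         # receiver still routable
        "content": "<nip44_ciphertext_of_seal>",  # placeholder ciphertext
    }

wrap = gift_wrap({"kind": 14, "content": "hi"}, "receiver_pubkey_hex")
# The receiver is still visible in the p tag: that is the routing leak.
```

The sender disappears from the outer event and the timestamp gets noisy, but the receiver-facing key, the arrival time at the relay, and the ciphertext size all remain observable.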
There is also another tradeoff that matters for the broader story. Gift wraps help obscure purpose because the same mechanism can be used for more than just DMs in Nostr. More specifically, the kind gift wraps use, 1059, is not exclusive to private chats. That is genuinely good for privacy, since not every gift wrap is obviously a chat message. An external observer seeing a kind 1059 event cannot automatically conclude that it is DM traffic. But the same design also creates a broad spam surface. An attacker can cheaply create many gift-wrap-looking events for a target receiver and force the receiver to spend resources figuring out which ones are real, or bury them under a pile of fake gift wraps. This spammy nature is also why some relays do not accept them.
I do not want to focus too much on UX in this article, but this is one of those places where privacy and usability pull in different directions. NIP-17 also gets awkward with relay authentication because the sender uses an ephemeral key, so it does not have a clean way to prove it should be allowed to fetch these events.
So I would describe NIP-17 as a meaningful improvement over NIP-04, especially in sender concealment, deniability, and reduced public visibility. But I still would not call it fully private communication. It is better encrypted communication with a more cautious metadata posture, but not a complete escape from metadata leakage.
Beyond The NIPs Umbrella
The Nostr repository itself mainly gives us NIP-04 and NIP-17, plus the new NIP-4e, which is still a PR. But there are other serious attempts to push the privacy and security model further. The two most relevant ones for this discussion are Marmot and Nostr Double Ratchet.
These two are interesting because they do not just tweak the outer format of Nostr DMs. They bring in much stronger messaging primitives. Marmot brings MLS. Double Ratchet brings the family of ideas popularized by Signal. In both cases, the cryptography becomes substantially more ambitious, giving us things like forward secrecy and post-compromise recovery, which neither NIP-04 nor NIP-17 gives us.
These terms, forward secrecy and post-compromise recovery, basically mean that if a private key is compromised, an attacker should not automatically be able to read historic messages (forward secrecy) or keep reading future messages once the compromise ends (post-compromise recovery).
That matters, but it still does not let us skip the transport question.
Marmot
Marmot combines MLS with Nostr. MLS, short for Messaging Layer Security, is a serious group messaging protocol designed to provide strong content confidentiality, authentication, forward secrecy, and post-compromise security for groups. It also scales much better for larger groups than Double Ratchet. Those are not small properties. They are major upgrades compared with the simpler encrypted-message schemes we find in NIP-04 or NIP-17.
This is something that should be said clearly: MLS gives Marmot genuinely strong secrecy properties at the message layer.
But the problem I am exploring here is not just message secrecy. It is private communications over public infrastructure. That is where the picture becomes more complicated.
Marmot maps MLS artifacts onto Nostr events and uses relays as a dissemination layer. That means things like key packages, welcome flows, group events, and group routing all have to live within a public relay environment. The protocol tries to preserve privacy through encrypted payloads, private MLS group IDs, ephemeral outer pubkeys for group events, and other techniques. Those are meaningful design choices, and they do improve the situation.
But there is an important nuance here. The private MLS group ID and the relay-facing nostr_group_id are not the same thing. In theory that is good because the real internal MLS group identifier stays private. In practice, though, the relay-facing nostr_group_id still behaves like a stable public routing identifier. That means it is observable, it is collectable, and it is spammable in the same way gift wraps are. So even if it is not the real MLS group ID, it still creates a public handle that observers and attackers can use.
This matters even more because Marmot traffic is not especially hard to identify from the outside. If a user publishes key package events on kind 443, or on kind 30443 in the newer transition, that already signals that the user might be using Marmot. And if an observer subscribes to group messages, which are kind 445 events, they can infer Marmot group traffic directly. Those event kinds are not generic cover traffic. They are much more revealing than NIP-17 gift wraps, whose kind 1059 can be used for several different purposes in Nostr.
From my perspective, I do not think its current design gets Marmot to strong privacy in the way some descriptions suggest.
The reason is simple. Marmot still operates over public relays. Even if the content is well protected, relays and active collectors can still observe that Marmot is being used. They can see when certain group identifiers become active, when bursts of traffic happen, how much data is flowing, how large the messages are, and which relays appear to be involved. An external observer can infer protocol usage from the specific Marmot kinds used and from the stable routing surface. A relay operator gets all of that plus the usual network-level vantage point, exact arrival times, and IP-level heuristics.
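The fingerprinting described above is cheap to automate. The sketch below clusters hypothetical kind 445 events by their relay-facing group identifier; it assumes the `nostr_group_id` rides in an `h` tag on group events, which is a modeling assumption for this sketch rather than a claim about the exact Marmot wire format.

```python
from collections import defaultdict

# Kinds the article associates with Marmot: key packages and group messages.
MARMOT_KINDS = {443, 445, 30443}

def fingerprint(events):
    """Sketch of an external observer clustering Marmot group traffic
    using only public event metadata. Assumes the relay-facing group id
    appears in an 'h' tag on kind 445 events (an assumption here)."""
    groups = defaultdict(list)
    for ev in events:
        if ev["kind"] not in MARMOT_KINDS:
            continue  # not Marmot-shaped traffic
        if ev["kind"] == 445:
            gid = next((v for t, v, *_ in ev["tags"] if t == "h"), None)
            if gid:
                groups[gid].append(ev["created_at"])
    # Per group: message count and first/last observed activity.
    return {gid: (len(ts), min(ts), max(ts)) for gid, ts in groups.items()}

stream = [
    {"kind": 445, "tags": [["h", "group_a"]], "created_at": 100},
    {"kind": 445, "tags": [["h", "group_a"]], "created_at": 160},
    {"kind": 1,   "tags": [],                 "created_at": 130},
]
activity = fingerprint(stream)
```

Without touching a single ciphertext, the observer gets per-group activity counts and timing windows, which is exactly the stable routing surface the article is worried about.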
There is another issue: MLS is not just cryptography. It is also coordination. That burden becomes harder in Nostr because relays are a dissemination substrate, not an ordering oracle. Multiple admins can race. Offline members can come back with stale state. Welcome flows and commits need careful discipline to avoid drift.
This is not mainly a privacy problem, but it matters because every time a protocol needs more coordination over weakly ordered public infrastructure, the system inherits more complexity, more observable recovery patterns, and more room for operational fragility. As I already said, I do not want to turn this article into a UX piece. But this is exactly one of those places where the search for stronger privacy and stronger secrecy can make the total system harder to operate.
So my conclusion on Marmot is nuanced.
I think Marmot substantially improves content security and secrecy compared with simpler DM schemes. Thanks to MLS, it brings real forward secrecy and post-compromise security, and that is a serious advancement. But I do not think public Nostr relays let Marmot honestly claim strong private communications in the broader sense. Metadata observability and public collection remain very real constraints.
Nostr Double Ratchet
Nostr Double Ratchet is the closest thing in this space to bringing a Signal-like messaging primitive into Nostr instead of relying on simpler static encryption flows.
Double Ratchet gives us stronger secrecy properties thanks to forward secrecy and post-compromise recovery. Instead of deriving a static conversation secret and reusing that basic relationship forever, the ratchet evolves keys as messages are exchanged. In practical terms, that means a compromise now should not automatically reveal all past messages, and if the compromise ends, future secrecy can recover after fresh ratchet steps.
That is real progress. Just like with MLS, I think this should be acknowledged clearly rather than buried under transport criticism. If the question is message secrecy, Double Ratchet is much closer to the state of the art than NIP-04, and in an important sense it also improves on the baseline security model around ordinary NIP-17-style direct messaging.
At the same time, Nostr Double Ratchet does not get to cheat physics.
The messages still move through Nostr events and relays. And here the event kinds matter more than they may seem at first glance. In this design, the encrypted outer transport for ordinary ratcheted messages uses kind 1060. That is important because kind 1060 is not generic cover traffic. If an external observer sees repeated kind 1060 events, they can reasonably infer that this specific messaging system is in use, just like in the case of Marmot.
Not every part of the protocol is equally revealing. Invite and AppKeys records use kind 30078, which is a general application-data kind in Nostr and does not by itself prove that Double Ratchet is being used. Encrypted invite responses use kind 1059, which blends into the broader gift wrap universe. Shared-channel bootstrap also leans on kind 4. So there is some reuse of more generic Nostr surfaces. But the core ratcheted message transport still stands out.
That means the outer layer is still informative. An external observer can learn that kind 1060 traffic is happening, see the outer event pubkeys, timestamps, traffic bursts, approximate message sizes, and repeated communication rhythms. Even if they cannot read the content, they can still cluster activity, try to infer which identities or devices are active, and distinguish steady chat traffic from bootstrap traffic such as invites or device-management updates. Those flows also use protocol-specific kinds that do not blend into the general Nostr event stream.
The relay operator learns even more. It sees exact arrival times, subscription behavior, source IPs unless users hide them, and the full local timing picture that never appears in the event itself. At the same time, Double Ratchet does seem to avoid one of the more obvious metadata problems we saw in Marmot: ordinary traffic does not depend on a stable public routing identifier in the event itself.
There is also a tradeoff here around scale. Double Ratchet is naturally attractive for 1:1 communication because it gives strong secrecy with relatively clear security intuition. But it is not the most natural fit for large fanout by itself. The moment it tries to scale those guarantees into more complex group and multidevice environments, the protocol surface and complexity grow quickly. The Nostr Double Ratchet project itself includes multidevice identity mapping, session management, and sender-key-style group messaging, and that is powerful, but it also shows the cost of pushing stronger security models over a public relay substrate. The event kinds used reflect that expansion. Once a system needs separate kinds for encrypted transport, invites, device authorization records, shared-channel bootstrap, and group sender-key flows, an observer gets more protocol fingerprints to work with even if the message bodies remain sealed.
So my take is that Double Ratchet is a major upgrade in secure messaging properties, and it may improve privacy in a limited sense by making traffic less trivially interpretable. If I have to choose a protocol for 1:1 communication over Nostr relays, it would be this one. But it still does not solve the central problem of public infrastructure. The relays still know that communication happened. They still see timing. They still see patterns. And if the infrastructure remains public and observable, those patterns remain part of the attack surface.
Just a side note to finish this: neither Marmot nor Nostr Double Ratchet plays well with relay auth, NIP-42.
NIP-4e
NIP-4e is a bit different from the other things I have covered so far, because it is not really trying to be a standalone DM protocol. It is better understood as an auxiliary mechanism that can be used together with schemes such as NIP-04 or NIP-17.
Its core idea is to decouple encryption from identity.
That may sound subtle, but it solves a real problem. Many Nostr patterns assume the identity key is also the key used for encryption, including self-encryption flows used for syncing data across devices. That breaks down in setups where the identity key is intentionally not present on the device, like bunkers, threshold signing systems, secure enclaves, or other more sovereign key-management architectures. NIP-4e tries to fix that by letting a user keep a separate encryption key and distribute that capability securely across devices.
From an architectural point of view, I think that is a good idea.
It improves key hygiene. It makes local encryption workflows more practical. It lets someone keep their identity signing setup more isolated while still doing encryption on-device. And it fits better with a world where people want a stronger separation between identity authority and application-level encryption.
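As a sketch of the separation NIP-4e is after, consider two objects: a bunker that holds only the identity key and signs, and a device that holds only a separately provisioned encryption key. The class names and the placeholder signing are illustrative, not part of the spec; the point is which key lives where.

```python
import os

class Bunker:
    """Sketch: the identity key lives only here and is only used to sign."""
    def __init__(self):
        self._identity_key = os.urandom(32)  # never leaves the bunker

    def sign(self, event_bytes: bytes) -> bytes:
        # Placeholder for Schnorr signing with the identity key.
        return b"sig:" + event_bytes[:8]

class Device:
    """Sketch: a device holds a separate encryption key, per the NIP-4e
    idea of decoupling encryption from identity."""
    def __init__(self, encryption_key: bytes):
        self.encryption_key = encryption_key  # provisioned key, not identity

enc_key = os.urandom(32)          # generated once, distributed to devices
bunker, phone = Bunker(), Device(enc_key)
# The phone can encrypt and decrypt locally; signing still round-trips
# through the bunker, so the identity key never touches the device.
```

The architectural win is that compromising a device exposes the encryption capability but not the identity key, and the identity setup can stay as isolated as the user wants.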
But if we bring it back to the main question of this article, the answer stays modest.
NIP-4e does not suddenly make communications private. If you use NIP-4e together with NIP-04, you still inherit the metadata exposure of NIP-04, even if the keys used are not the real keys. If you use it with NIP-17, you inherit the strengths and weaknesses of NIP-17. What changes is not the public nature of the transport, but the keying model.
In fact, the protocol introduces its own observable artifacts through announcements and device key-sharing events. That does not make it bad. It just means it should be evaluated honestly. NIP-4e is useful infrastructure for decoupled encryption and multi-device sovereignty. It can support other DM standards. But by itself it is not a privacy breakthrough.
Conclusions
So, is private communication over public infrastructure even possible?
After going through all of these approaches, my answer is: yes, in some sense, but not really.
Public infrastructure fights privacy at every layer.
If the infrastructure is publicly reachable, queryable, and persistently collectable, then metadata becomes part of the system whether the protocol authors like it or not. That does not mean encrypted messaging over public infrastructure is useless. Far from it. It can still be extremely valuable. It can still raise the cost of surveillance. It can still make censorship harder. It can still improve user sovereignty. But if we want to be honest, we cannot collapse all of that into the word private and move on.
So I do not think the honest story is that Nostr already has private messaging. I think the honest story is that Nostr has several attempts at encrypted messaging, and those attempts sit at different points in a tradeoff space between secrecy, metadata exposure, complexity, scalability, and usability.
That is not a failure. It is just reality.
If anything, I think the most dangerous thing we can do is use imprecise language and teach users to trust guarantees the system does not actually provide. Honest language is part of security. If a protocol has strong content confidentiality but weak metadata privacy, we should say exactly that. If a system makes observation harder but still possible, we should say that too.
Nostr can absolutely be a powerful substrate for encrypted and censorship-resistant communications. I believe that. But if what we want is truly private communication, then public infrastructure is still an adversarial environment. And the more honestly we admit that, the better chance we have of building and designing systems that improve things for real instead of just sounding private.
I intentionally kept this piece focused on privacy rather than UX, even though the two constantly touch each other. But it is hard to read through these protocols without noticing that better secrecy and a better privacy posture often come with costs in complexity, spam resistance, coordination, recoverability, or general usability.
Encrypted is not the same as private. Public infrastructure is not a neutral carrier. And if we want private communications on top of public systems, we have to be far more demanding about metadata, far more honest about tradeoffs, and far more precise in the claims we make.
Comments
great writeup! i wonder if some sort of multihop (onion/garlic) routing could remedy the issues with metadata leakage.
I wonder if the SimpleX protocol can be adapted for Nostr given that they both use relays as their infrastructure. SimpleX doesn’t have static identifiers, so maybe that won’t play nice with the protocol, but it’s something to think about.
https://github.com/simplex-chat/simplexmq/blob/stable/protocol/overview-tjr.md#what-is-simplex
Nice write up. Helped me better understand things more clearly.
One thing I thought of to improve things is eliminating knowing the receiver in NIP-17 by utilizing the generated shared secret (ECDH) to generate a shared keypair and having that keypair send DMs with gift wraps to itself, so the public sees some key sending to itself (this can also be expanded to 3 people easily, and technically to however many you want after that, but beyond 3 it would require an initial handshake of sorts to create the shared key).
With that said though, it helps a bunch with privacy but is far from covering everything. The downside of this approach is that cold-messaging can’t happen, since participants must already know each other for communication between them to be possible; however, this also eliminates spam as a result.
Legend
Yes, that’s basically how I2P differs from Tor, constant cover traffic. Nym also has that option for mixnet routing.
Nice writeup, it’s a bloody difficult problem indeed…
One clarification is that Marmot events are similar to NIP-17 events in the sense that the outer layer doesn’t reveal what type of message was sent: it could be a DM, but also a proposal or commit message, a file, a zap, or whatever kind is inside of it.
I’m very curious about your thoughts on what we need from a server implementation to be better optimized for private events. Just like we have Blossom for files, we can spin up a new spec and implementation for server infrastructure that would work better; not everything has to be on Nostr relays.