Queue-Driven Archive Persistence in Continuum
- Introduction
- The Original Problem
- Crossing the Architectural Boundary
- Queue States
- Queue-Driven Exporters
- Decoupling the Global Index
- External Nostr Clients
- Background Processing
- Why This Became Surprisingly Difficult
- Current Status
Andrew G. Stanton - Saturday, May 23, 2026
Introduction
Today I completed a major architectural shift in Continuum’s archive system.
What originally started as a relatively simple “sync current database state into Git” workflow evolved into something much deeper:
a queue-driven archival pipeline designed to preserve the full event history of authored and discovered Nostr events over time.
This turned out to be significantly more difficult than expected because the problem itself changed.
The original archive model treated the current SQLite database as the source of truth.
The new system treats the event stream itself as the source of truth.
That distinction matters a lot.
The Original Problem
The previous archive workflow worked roughly like this:
- Read the current SQLite database
- Find the latest kind:0, kind:1, and kind:30023 events
- Export those into Git-backed archive repositories
- Rebuild a global index from the exported files
At first glance, this seems reasonable.
But there was a subtle problem:
The live database only stores the latest known state of many objects.
For example:
- profile updates (kind:0)
- notes and replies (kind:1)
- long-form articles (kind:30023)
If multiple edits or updates occurred between archive sync cycles, only the final state would survive in the database.
Intermediate versions could disappear entirely before the archive system ever saw them.
That meant the archive was not truly preserving the event history over time.
It was only preserving snapshots of the latest known state.
Crossing the Architectural Boundary
The realization that changed everything was this:
The current SQLite database is not the archival source of truth.
The event stream is.
Once I accepted that, the architecture had to change.
Instead of exporting directly from the current database state, Continuum now stores full raw event snapshots into an archive_sync_queue table as events are created, updated, or discovered during refresh/import operations.
Importantly, the queue stores the entire raw event JSON, not merely event ids.
That distinction matters because it allows every individual version of an event to be preserved independently.
Even if the live database later overwrites the latest visible state, the queued historical versions still exist.
This transforms the archive process from:
current DB snapshot -> export
to:
queued event snapshots -> export
That is a very different system.
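To make the idea concrete, here is a minimal sketch of what such a queue could look like, using Python's built-in sqlite3 module. Only the table name archive_sync_queue and the raw_json column come from the description above. The other columns, the SCHEMA constant, and the enqueue_event helper are assumptions for illustration, not Continuum's actual code.

```python
import json
import sqlite3

# Hypothetical schema. Only archive_sync_queue and raw_json are named in the
# article; the remaining columns are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS archive_sync_queue (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    event_id  TEXT NOT NULL UNIQUE,
    kind      INTEGER NOT NULL,
    raw_json  TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'pending',
    queued_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def enqueue_event(conn: sqlite3.Connection, raw_json: str) -> bool:
    """Queue the full raw event; return True if a new row was added."""
    event = json.loads(raw_json)
    cur = conn.execute(
        # OR IGNORE makes re-queuing the same event id a harmless no-op.
        "INSERT OR IGNORE INTO archive_sync_queue (event_id, kind, raw_json) "
        "VALUES (?, ?, ?)",
        # The raw string is stored verbatim, not a re-serialized copy.
        (event["id"], event["kind"], raw_json),
    )
    conn.commit()
    return cur.rowcount == 1
```

Because a Nostr event id is a hash of the event's contents, each edit of a profile or article arrives as a new event with a new id. Keying the queue on the event id therefore keeps every version as its own row, while a repeated enqueue of the same version is ignored.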
Queue States
The queue currently tracks three states:
- pending
- processing
- committed
The flow now works like this:
- An event is queued as pending
- The archive worker claims the row and marks it processing
- The worker exports the event into the archive repositories
- Git operations complete successfully
- The queue row is finally marked committed
If export or Git operations fail, rows are automatically reset back to pending for retry later.
This became important because archive persistence now has to handle:
- partial failures
- retries
- duplicate processing attempts
- concurrent worker protection
- Git/network failures
- idempotency
These are problems that didn’t really exist in the earlier “scan current DB” approach.
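The state transitions described in this section can be sketched roughly as follows. This is an illustration under assumptions, not the actual worker: the function names claim_pending and process_batch, the id and status columns, and the export_and_commit callback (standing in for the export plus Git steps) are all hypothetical.

```python
import sqlite3


def claim_pending(conn: sqlite3.Connection, limit: int = 50):
    """Move up to `limit` pending rows to processing and return them."""
    claimed = []
    with conn:  # commits the updates on success, rolls back on error
        rows = conn.execute(
            "SELECT id, raw_json FROM archive_sync_queue "
            "WHERE status = 'pending' ORDER BY id LIMIT ?",
            (limit,),
        ).fetchall()
        for row_id, raw_json in rows:
            # Guarded update: it only succeeds while the row is still
            # pending, so a second worker cannot claim the same row.
            cur = conn.execute(
                "UPDATE archive_sync_queue SET status = 'processing' "
                "WHERE id = ? AND status = 'pending'",
                (row_id,),
            )
            if cur.rowcount == 1:
                claimed.append((row_id, raw_json))
    return claimed


def process_batch(conn, export_and_commit, limit: int = 50) -> int:
    """Run one worker pass; return the number of rows marked committed."""
    claimed = claim_pending(conn, limit)
    if not claimed:
        return 0
    try:
        # Hypothetical callback: writes archive files and runs Git.
        export_and_commit([raw_json for _, raw_json in claimed])
        new_status = "committed"
    except Exception:
        # Any export or Git failure sends the rows back for a later retry.
        new_status = "pending"
    with conn:
        conn.executemany(
            "UPDATE archive_sync_queue SET status = ? "
            "WHERE id = ? AND status = 'processing'",
            [(new_status, row_id) for row_id, _ in claimed],
        )
    return len(claimed) if new_status == "committed" else 0
```

A row is only marked committed after the callback returns without raising. One thing this sketch does not cover is a worker that crashes mid-batch and leaves rows stuck in processing; a real system would need some way to detect and reset those.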
Queue-Driven Exporters
The archive exporters for:
- kind:0 profiles
- kind:1 notes/replies
- kind:30023 long-form articles
were all rewritten to consume queue rows instead of querying the live database directly.
This was another important shift.
Previously, exporters looked up the latest database rows during export time.
Now the exporters consume immutable queued snapshots directly from archive_sync_queue.raw_json.
This means the archive process no longer depends on the live database remaining unchanged between creation time and export time.
That greatly improves archival durability.
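As a rough illustration of an exporter that works only from the queued snapshot, the sketch below writes one file per event version. The directory layout (kind, then pubkey, then event id) and the export_snapshot name are assumptions for this example; the real archive repositories may be organized differently.

```python
import json
from pathlib import Path


def export_snapshot(archive_root, raw_json: str) -> Path:
    """Write one queued snapshot to its own history file and return the path."""
    event = json.loads(raw_json)
    # Hypothetical layout: <root>/kind-<kind>/<pubkey>/<event id>.json
    target = (
        Path(archive_root)
        / f"kind-{event['kind']}"
        / event["pubkey"]
        / f"{event['id']}.json"
    )
    target.parent.mkdir(parents=True, exist_ok=True)
    # Skip files that already exist so a retried export changes nothing.
    if not target.exists():
        target.write_text(raw_json, encoding="utf-8")
    return target
```

Note that the function never touches the live database. Everything it needs is in raw_json, and because the file name comes from the event id, two versions of the same profile or article land in two separate files.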
Decoupling the Global Index
Another important architectural improvement was separating:
- Commit Pending Events to Archive
- Rebuild Global Archive Index + RSS
These are now independent operations.
The first operation persists queued events into the Git-backed archive repositories.
The second operation rebuilds the global archive index entirely from committed archive history files.
This ended up being one of the most important design improvements.
The global archive index no longer depends on the live Continuum database at all.
It can now be reconstructed entirely from committed archive history stored in Git.
That means the archive itself has become self-describing.
The live database is no longer required to regenerate the archive index.
That creates a much more durable and portable archive model long-term.
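A minimal sketch of what rebuilding from committed history alone might look like, assuming the archive holds one JSON file per event version. The rebuild_index name and the shape of the index entries are hypothetical, and RSS generation is left out.

```python
import json
from pathlib import Path


def rebuild_index(archive_root) -> list:
    """Build index entries purely from archive files, newest first."""
    root = Path(archive_root)
    entries = []
    for path in sorted(root.rglob("*.json")):
        event = json.loads(path.read_text(encoding="utf-8"))
        entries.append(
            {
                "id": event["id"],
                "kind": event["kind"],
                "pubkey": event["pubkey"],
                "created_at": event["created_at"],
                # Relative path, so the index stays valid if the repo moves.
                "path": path.relative_to(root).as_posix(),
            }
        )
    # Newest first; the id breaks ties so the order is deterministic.
    entries.sort(key=lambda e: (e["created_at"], e["id"]), reverse=True)
    return entries
```

The only input is the archive directory. No database connection appears anywhere, which is the property that makes the index reproducible from a fresh clone of the archive repositories.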
External Nostr Clients
One corner case appeared during testing.
If I created an event in another Nostr client:
- Refreshing Continuum would correctly import the event into the local database
- The dashboard would display it normally
- But the event would never enter the archive queue
That meant externally-created events would never be archived.
The old shell-based sync process accidentally handled this because it scanned the current database state directly.
The new queue-driven model required explicit handling for externally-discovered events.
The solution was to enqueue newly discovered events during refresh/import operations if:
- the event does not already exist locally
- and the event is not already queued
This preserves the strengths of the original database-scanning model while still gaining the benefits of queue-driven persistence.
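The two conditions above could be checked along these lines during a refresh. This is a sketch under assumptions: the events table standing in for the live database, the event_id column, and the enqueue_if_new function are all hypothetical names.

```python
import json
import sqlite3


def enqueue_if_new(conn: sqlite3.Connection, raw_json: str) -> bool:
    """Queue a discovered event unless it is already known or queued."""
    event = json.loads(raw_json)
    # Condition 1: the event does not already exist locally.
    known = conn.execute(
        "SELECT 1 FROM events WHERE id = ?", (event["id"],)
    ).fetchone()
    # Condition 2: the event is not already in the archive queue.
    queued = conn.execute(
        "SELECT 1 FROM archive_sync_queue WHERE event_id = ?", (event["id"],)
    ).fetchone()
    if known or queued:
        return False
    with conn:
        conn.execute(
            "INSERT INTO archive_sync_queue (event_id, raw_json, status) "
            "VALUES (?, ?, 'pending')",
            (event["id"], raw_json),
        )
    return True
```

In this sketch the check has to run before the import step writes the event into the local table. Otherwise the first condition would already be false for every freshly imported event and nothing would be queued.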
Background Processing
The system is now structured so archive persistence can eventually run automatically in the background.
For example:
- Commit Pending Events every 10 minutes
- Rebuild Global Index periodically only when needed
The expensive global index rebuild can also be skipped entirely when no new commits occurred.
This separation became important because rebuilding the archive index is much more expensive than simply committing pending events into the archive repositories.
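One possible way to skip the rebuild when nothing changed is to remember which archive commit was last indexed. In the sketch below, current_head is assumed to be the archive repository's latest commit id (for example, the output of git rev-parse HEAD), while the marker file and the rebuild_if_changed helper are hypothetical.

```python
from pathlib import Path


def rebuild_if_changed(current_head: str, marker_path, rebuild) -> bool:
    """Run `rebuild` only if the archive head differs from the last indexed one."""
    marker = Path(marker_path)
    last_indexed = (
        marker.read_text(encoding="utf-8").strip() if marker.exists() else None
    )
    if last_indexed == current_head:
        return False  # no new commits since the last rebuild, skip the work
    rebuild()
    # Record the head only after a successful rebuild, so a failed run
    # is retried on the next pass.
    marker.write_text(current_head, encoding="utf-8")
    return True
```

A scheduler could then call the cheap commit step every 10 minutes and this check right after it, paying for the full rebuild only when the commit step actually produced new history.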
Why This Became Surprisingly Difficult
At first, this seemed like a relatively small feature.
In reality, it became a much larger architectural transition.
The system moved from:
Export the current database state
to:
Capture and preserve an event history over time
That introduces an entirely new class of problems:
- queue durability
- idempotency
- retries
- eventual consistency
- partial failure recovery
- Git commit semantics
- concurrent worker protection
- historical version preservation
- decoupling live state from archival state
In other words:
this stopped being a simple export script and became an archival pipeline.
Current Status
The core system is now functioning:
- queue-driven archival
- durable version preservation
- Git-backed persistence
- automatic recovery from partial failures
- decoupled global index rebuilding
- queue-aware exporters
- support for externally-created events
- archive reconstruction from committed history
There is still cleanup and automation work remaining, but the foundation is now in place.
And importantly:
multiple edits occurring between sync intervals are no longer lost.
Every version can now be preserved independently over time.
NOTE: I had ChatGPT create most of this article (and it’s pretty accurate), so please forgive any AI-isms you see in the description. I’m trying to save time, and this mostly just describes what I have been working on since May 21.