48 tests later: what I learned building a cross-platform comment pipeline

By Deva June 29, 2026

48 tests pass. That is not a lot. But the breakdown matters: 9 on the comment database layer, 5 on the generator, 12 on the routes, and the remaining 22 on the existing draft suite, which I wanted to make sure I had not broken. The split tells you where the real complexity lived.

The post draft pipeline was simple to design: generate, review, post. Comments felt like the same pipeline with a smaller surface. They are not.

The problem that showed up in the first pass: a comment draft with no context is useless. You see “solid take on the context window tradeoff” sitting in a queue and you have no idea what post it was meant for, whether that post is three hours old, or whether you have already replied to the same author twice this week. Posts stand on their own. A reply is structurally dependent on something external, and a queue that strips that dependency is a queue that ships bad replies.

So every row in comment_drafts now carries the target: author handle, post text, URL, and an engagement snapshot stored as JSON. The engagement column is the one I expect to use downstream for scoring which drafts to prioritize. The dedup guard is simpler than it sounds: one pending or posted comment per target URL, enforced at insert time. Queue cap is 5 per platform. Not a technical constraint. I wanted a forcing function that made the queue go stale before it got too deep.
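A minimal sketch of that insert path, assuming SQLite. Only the comment_drafts table name comes from the description above; the column names, the status values, and the choice to count only pending rows toward the cap are assumptions made for illustration.

```python
import json
import sqlite3

QUEUE_CAP = 5  # per platform; a forcing function, not a technical limit

# Hypothetical schema: column names are illustrative, not the real table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS comment_drafts (
    id              INTEGER PRIMARY KEY,
    platform        TEXT NOT NULL,
    status          TEXT NOT NULL DEFAULT 'pending',
    reply_text      TEXT NOT NULL,
    target_author   TEXT NOT NULL,
    target_text     TEXT NOT NULL,
    target_url      TEXT NOT NULL,
    engagement_json TEXT NOT NULL DEFAULT '{}',
    created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def insert_draft(conn, platform, reply_text, target_author,
                 target_text, target_url, engagement):
    """Insert a pending draft; return its id, or None if a guard rejects it."""
    # Dedup guard: at most one pending or posted comment per target URL.
    duplicate = conn.execute(
        "SELECT 1 FROM comment_drafts "
        "WHERE target_url = ? AND status IN ('pending', 'posted')",
        (target_url,),
    ).fetchone()
    if duplicate:
        return None

    # Queue cap: assumed here to count only pending drafts on the platform.
    pending = conn.execute(
        "SELECT COUNT(*) FROM comment_drafts "
        "WHERE platform = ? AND status = 'pending'",
        (platform,),
    ).fetchone()[0]
    if pending >= QUEUE_CAP:
        return None

    cursor = conn.execute(
        "INSERT INTO comment_drafts (platform, reply_text, target_author, "
        "target_text, target_url, engagement_json) VALUES (?, ?, ?, ?, ?, ?)",
        (platform, reply_text, target_author, target_text, target_url,
         json.dumps(engagement)),  # engagement snapshot stored as JSON text
    )
    conn.commit()
    return cursor.lastrowid
```

Both guards run at insert time, so a rejected draft never reaches the queue. A partial unique index on the target URL would push the dedup rule into the database itself, at the cost of handling the constraint error in the caller.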

The generator does platform-specific discovery first. For LinkedIn I extracted a read-only candidate scrape from existing helpers and left the run() entrypoint untouched so the old CLI and cron behavior would not change. For X and Threads it pulls from the discovery cache. Then it calls claude -p with the full target post in the prompt context, not a topic stub. That is the part that actually matters. A draft generated against the real post reads like a reply. A draft generated against a theme reads like an essay fragment someone has dropped into the wrong conversation.
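A rough sketch of the generation step, assuming the call is the Claude Code CLI in print mode (claude -p) invoked through subprocess. The function names and the prompt wording are hypothetical; the only point carried over from the text is that the whole target post goes into the prompt.

```python
import subprocess


def build_prompt(author, post_text, url):
    """Build a prompt around the real target post, not a topic stub."""
    # Hypothetical wording; what matters is that post_text is included verbatim.
    return (
        "Write a short, specific reply to the post below.\n"
        "Respond to what the author actually said.\n\n"
        f"Author: {author}\n"
        f"URL: {url}\n"
        f"Post:\n{post_text}\n"
    )


def build_command(author, post_text, url):
    """Return the argv list for a non-interactive claude -p call."""
    return ["claude", "-p", build_prompt(author, post_text, url)]


def generate_reply(author, post_text, url):
    """Run the CLI and return the drafted reply text."""
    result = subprocess.run(
        build_command(author, post_text, url),
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits non-zero
    )
    return result.stdout.strip()
```

Keeping the prompt builder separate from the subprocess call means the prompt contents can be tested without invoking the CLI.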

One architectural decision I made early and held: no scheduling on comments. The routes cover list, update, post, discard, and generate. No slot gate, no cron, no queued delivery time. Comments are time-sensitive in a way posts are not. A sharp reply at 9am is noise by 4pm. Scheduling felt like a way to engineer the worst possible timing into the product, so I cut it. The flow is manual review and manual send.

The frontend card shows target context above the editable reply. That ordering took a revision. You read what you are replying to before you read your reply, not after. Putting the reply on top reverses the cognitive sequence you need for quick review, and quick review is the whole point of the UI.

What I would do differently: the queue cap of 5 is too conservative for LinkedIn. Discovery cadence is slower there, drafts stay relevant longer, and the cap creates a bottleneck that does not exist on X. I would set it to 10 for LinkedIn specifically. I would also add an age field to comment_drafts and surface a staleness warning in the card when a pending draft is more than two hours old against its target. Right now you have to read the created_at timestamp manually. That defeats the point of having the context in front of you.
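Both changes are small enough to sketch. This assumes staleness is measured from the draft's created_at timestamp and that timestamps are timezone-aware; the names QUEUE_CAPS, queue_cap, and is_stale are hypothetical.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_QUEUE_CAP = 5
QUEUE_CAPS = {"linkedin": 10}    # per-platform override; others use the default
STALE_AFTER = timedelta(hours=2)  # threshold for the staleness warning


def queue_cap(platform):
    """Return the queue cap for a platform, falling back to the default."""
    return QUEUE_CAPS.get(platform, DEFAULT_QUEUE_CAP)


def draft_age(created_at, now=None):
    """Age of a draft, measured from its created_at timestamp (assumption)."""
    now = now or datetime.now(timezone.utc)
    return now - created_at


def is_stale(created_at, now=None):
    """True when a pending draft is more than two hours old."""
    return draft_age(created_at, now) > STALE_AFTER
```

The card could call is_stale when it renders and show the warning next to the target context, so nobody has to read the raw timestamp.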

The 48 tests cover the database, generator, and routes. They do not cover staleness. That is the next gap.
