Bitcoin Block Data Parser — Design

Sources

Offline Node Data

Operator has an offline Bitcoin node synced up to ~1 month ago. This is the test data source.

What we need from the node:

  • Raw block data (blk*.dat files), or
  • RPC access (if the node can be started offline with -connect=0 -listen=0, which disables outbound and inbound peer connections), or
  • Exported block data (JSON via getblock RPC in a loop)

Best approach for offline node:

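# Export block data over local RPC (node started with -connect=0 -listen=0, so it has no peers)
# HEIGHT is a placeholder for the block height to export
HASH=$(bitcoin-cli -rpcconnect=127.0.0.1 -rpcport=8332 getblockhash "$HEIGHT")
echo "$HASH" >> hashes.txt
bitcoin-cli -rpcconnect=127.0.0.1 -rpcport=8332 getblock "$HASH" 2 >> blocks.json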
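
The two RPC calls can be looped over a height range. A minimal Node.js sketch, assuming bitcoin-cli is on PATH and can reach the local node; exportBlocks and its injectable runCli parameter are hypothetical names introduced here, and the output is one JSON object per line rather than a single JSON document.

```javascript
const { execFileSync } = require('node:child_process');
const fs = require('node:fs');

// Default runner: invokes bitcoin-cli and returns trimmed stdout.
// Assumption: bitcoin-cli is on PATH and configured for the local node.
function defaultRunCli(args) {
  return execFileSync('bitcoin-cli', args, {
    encoding: 'utf8',
    maxBuffer: 1 << 30, // verbose block JSON can be many megabytes
  }).trim();
}

// Append blocks startHeight..endHeight (inclusive) to outPath, one JSON object per line.
// runCli is injectable so the loop can be exercised without a running node.
function exportBlocks(startHeight, endHeight, outPath, runCli = defaultRunCli) {
  let count = 0;
  for (let height = startHeight; height <= endHeight; height++) {
    const hash = runCli(['getblockhash', String(height)]);
    // Verbosity 2 returns the block as JSON with full transaction details.
    const block = JSON.parse(runCli(['getblock', hash, '2']));
    fs.appendFileSync(outPath, JSON.stringify(block) + '\n');
    count += 1;
  }
  return count;
}
```

Writing one block per line keeps the export appendable and lets a later parser stream it line by line instead of loading one large JSON array.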

Alternative: Copy blk*.dat files

# Copy from node machine to SSD
rsync -av /home/user/.bitcoin/blocks/blk*.dat /media/user/shared-rw/bitcoin/blocks/
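
Once copied, each blk*.dat file is a sequence of records: 4 network magic bytes, a 4-byte little-endian length, then the serialized block. A minimal sketch of splitting one file into raw blocks; splitBlkFile is a hypothetical helper name. It assumes mainnet and unobfuscated files (recent Bitcoin Core releases can XOR-obfuscate block files with a key stored in blocks/xor.dat, which this sketch does not handle). Blocks appear in the order the node received them, not in height order.

```javascript
// Mainnet network magic that precedes every block record in blk*.dat.
const MAINNET_MAGIC = Buffer.from('f9beb4d9', 'hex');

// Split the contents of one blk*.dat file into raw serialized blocks.
// Record layout: 4-byte magic, 4-byte little-endian length, block bytes.
function splitBlkFile(buf) {
  const blocks = [];
  let offset = 0;
  while (offset + 8 <= buf.length) {
    if (!buf.subarray(offset, offset + 4).equals(MAINNET_MAGIC)) {
      // Zero padding or other bytes between records: advance and rescan.
      offset += 1;
      continue;
    }
    const size = buf.readUInt32LE(offset + 4);
    const start = offset + 8;
    if (start + size > buf.length) break; // truncated final record
    blocks.push(buf.subarray(start, start + size));
    offset = start + size;
  }
  return blocks;
}
```

Usage would be along the lines of splitBlkFile(fs.readFileSync(path)), with each returned Buffer handed to a block deserializer.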

Parser 1: Sat Hodl Wave Parser (Hedblitz Claims)

What is Hodl Wave?

A hodl wave analysis measures how long coins have sat unspent, that is, the age distribution of the UTXO set. “Hedblitz claims” likely refers to claims based on coin age: proving that you held coins for a certain period.

What we parse:

For each output still unspent as of the block being analyzed:

  1. Find the transaction that created it (the txid in the UTXO's outpoint)
  2. Determine the block height of the creating transaction
  3. Calculate age in blocks: current_height - creation_height (roughly 144 blocks per day)
  4. Categorize by age brackets:
    • < 1 day
    • 1-7 days
    • 7-30 days
    • 30-90 days
    • 90-365 days
    • 1-2 years
    • 2-5 years
    • 5+ years
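
The bracket assignment can be sketched as follows. This is a minimal sketch, assuming age is measured in blocks and converted at 144 blocks per day (the 10-minute target spacing) instead of using block timestamps; ageBracket is a hypothetical helper name, and the bracket keys are taken from the output format in this document.

```javascript
// Assumption: about 144 blocks per day (10-minute target spacing).
const BLOCKS_PER_DAY = 144;

// Exclusive upper bound of each bracket in days, ascending.
const AGE_BRACKETS = [
  { key: '0_1d', maxDays: 1 },
  { key: '1_7d', maxDays: 7 },
  { key: '7_30d', maxDays: 30 },
  { key: '30_90d', maxDays: 90 },
  { key: '90_365d', maxDays: 365 },
  { key: '1_2y', maxDays: 2 * 365 },
  { key: '2_5y', maxDays: 5 * 365 },
  { key: '5y_plus', maxDays: Infinity },
];

// Map a UTXO's creation height to its age bracket at currentHeight.
function ageBracket(currentHeight, creationHeight) {
  if (creationHeight > currentHeight) {
    throw new RangeError('creationHeight is above currentHeight');
  }
  const ageDays = (currentHeight - creationHeight) / BLOCKS_PER_DAY;
  // The last bracket has an infinite bound, so find() always matches.
  return AGE_BRACKETS.find((b) => ageDays < b.maxDays).key;
}
```

Lower bounds are inclusive here, so a UTXO exactly 144 blocks old falls into the 1-7 days bracket.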

For Hedblitz Claims:

A “claim” would be a cryptographic proof that:

  • You control a UTXO of age N
  • The UTXO has value V
  • The UTXO has not been spent since creation

This could be the basis for a “proof of loyalty” TXXM type.

Output Format:

{
  "block_height": 850000,
  "block_hash": "...",
  "timestamp": 1720000000,
  "utxo_count": 5000,
  "total_sats": 140000000,
  "age_distribution": {
    "0_1d": {"count": 100, "sats": 1000000},
    "1_7d": {"count": 200, "sats": 2000000},
    "7_30d": {"count": 300, "sats": 3000000},
    "30_90d": {"count": 400, "sats": 4000000},
    "90_365d": {"count": 500, "sats": 5000000},
    "1_2y": {"count": 1000, "sats": 10000000},
    "2_5y": {"count": 1500, "sats": 15000000},
    "5y_plus": {"count": 1000, "sats": 100000000}
  }
}
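
A minimal sketch of assembling this report from UTXOs that have already been assigned a bracket. buildHodlWaveReport and the input shapes (header with height, hash, time; UTXOs with bracket and sats) are assumptions made for illustration, not an existing API.

```javascript
// Bracket keys as used in the output format above.
const BRACKET_KEYS = [
  '0_1d', '1_7d', '7_30d', '30_90d', '90_365d', '1_2y', '2_5y', '5y_plus',
];

// header: { height, hash, time }; utxos: [{ bracket, sats }]  (hypothetical shapes)
// Plain numbers are safe for sats: the 21M BTC cap is 2.1e15 sats, below 2^53.
function buildHodlWaveReport(header, utxos) {
  const ageDistribution = {};
  for (const key of BRACKET_KEYS) ageDistribution[key] = { count: 0, sats: 0 };

  let totalSats = 0;
  for (const { bracket, sats } of utxos) {
    if (!BRACKET_KEYS.includes(bracket)) {
      throw new Error(`unknown bracket: ${bracket}`);
    }
    ageDistribution[bracket].count += 1;
    ageDistribution[bracket].sats += sats;
    totalSats += sats;
  }

  return {
    block_height: header.height,
    block_hash: header.hash,
    timestamp: header.time,
    utxo_count: utxos.length,
    total_sats: totalSats,
    age_distribution: ageDistribution,
  };
}
```

Built this way, utxo_count and total_sats always equal the sums over the brackets, which gives a cheap consistency check on any report.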

Parser 2: OP_RETURN Metaprotocol Filter

What we parse:

Every transaction’s outputs. For each OP_RETURN output:

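  1. Extract the data push (up to 80 bytes under the long-standing default relay policy; consensus rules allow more, so mined blocks can contain larger pushes)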
  2. Check if data starts with a known metaprotocol prefix
  3. If yes → EXCLUDE (not Kapnet)
  4. If no → check for Kapnet prefix
  5. If Kapnet → INCLUDE and decode TXXM
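
Step 1 can be sketched as follows, working on the scriptPubKey hex that getblock returns at verbosity 2. This is a minimal sketch: extractOpReturnData is a hypothetical helper name, and it returns only the first data push after OP_RETURN, ignoring any further pushes.

```javascript
const OP_RETURN = 0x6a;
const OP_PUSHDATA1 = 0x4c;
const OP_PUSHDATA2 = 0x4d;
const OP_PUSHDATA4 = 0x4e;

// Return the first data push after OP_RETURN as a Buffer, or null when the
// script is not an OP_RETURN output or the push is malformed.
function extractOpReturnData(scriptHex) {
  const script = Buffer.from(scriptHex, 'hex');
  if (script.length < 2 || script[0] !== OP_RETURN) return null;

  const opcode = script[1];
  let len;
  let start;
  if (opcode >= 0x01 && opcode <= 0x4b) {
    // Direct push: the opcode itself is the byte count.
    len = opcode;
    start = 2;
  } else if (opcode === OP_PUSHDATA1 && script.length >= 3) {
    len = script[2];
    start = 3;
  } else if (opcode === OP_PUSHDATA2 && script.length >= 4) {
    len = script.readUInt16LE(2);
    start = 4;
  } else if (opcode === OP_PUSHDATA4 && script.length >= 6) {
    len = script.readUInt32LE(2);
    start = 6;
  } else {
    return null;
  }

  if (start + len > script.length) return null; // declared length overruns the script
  return script.subarray(start, start + len);
}
```

The returned Buffer can then be compared against the prefix lists in steps 2 to 5.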

Metaprotocol Prefixes to EXCLUDE:

Protocol        Prefix (ASCII)  Notes
Ordinals        ord             Ordinal inscriptions
BRC-20          brc-20          BRC-20 token operations
SRC-20          src-20          Stamps protocol
Runes           runes           Runes protocol
BitStore        b               BitStore data
Snow            snow            Snow protocol
Counterparty    CNTRPRTY        Counterparty tokens
OMNI            omni            Omni layer (USDT etc)
Colored Coins   CLCT            Early colored coins
Open Assets     OA              Open Assets protocol
RGB             RGB             RGB protocol
Taproot Assets  tap             Taproot Assets
Atomicals       atom            Atomicals protocol
MRI             mri             MRI protocol

Note: this list needs checking against each protocol's specification before use. Ordinals inscriptions and BRC-20 payloads, for example, are carried in taproot witness envelopes rather than OP_RETURN outputs, and Runes runestones are marked by OP_RETURN followed by OP_13 rather than an ASCII prefix, so an ASCII prefix match on OP_RETURN data alone would not catch them.

Kapnet Whitelist Prefixes:

Protocol           Prefix (ASCII)  Notes
Kapnet TXXM        kapnet          Coordination data
Kapnet Anchor      kanchor         Chain anchor
Kapnet Governance  kgov            Governance TXXM
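
A minimal sketch of the exclude-then-whitelist check, using the prefixes from the two lists in this section as ASCII byte strings. classifyOpReturn and matchPrefix are hypothetical names. Prefixes are tried longest first, so that brc-20 is reported as brc-20 and not as the single-byte BitStore prefix b, which is broad enough to exclude any payload starting with that byte.

```javascript
// Prefix lists copied from this section (ASCII).
const EXCLUDE_PREFIXES = [
  'ord', 'brc-20', 'src-20', 'runes', 'b', 'snow', 'CNTRPRTY',
  'omni', 'CLCT', 'OA', 'RGB', 'tap', 'atom', 'mri',
];
const KAPNET_PREFIXES = ['kapnet', 'kanchor', 'kgov'];

// Return the longest prefix that the data starts with, or null.
function matchPrefix(data, prefixes) {
  const longestFirst = [...prefixes].sort((a, b) => b.length - a.length);
  const hit = longestFirst.find((p) =>
    data.subarray(0, p.length).equals(Buffer.from(p, 'ascii')));
  return hit ?? null;
}

// Classify one OP_RETURN payload (a Buffer): exclusions are checked first,
// then the Kapnet whitelist; anything else is unknown.
function classifyOpReturn(data) {
  const excluded = matchPrefix(data, EXCLUDE_PREFIXES);
  if (excluded) return { kind: 'excluded', prefix: excluded };

  const kapnet = matchPrefix(data, KAPNET_PREFIXES);
  if (kapnet) return { kind: 'whitelisted', prefix: kapnet };

  return { kind: 'unknown', prefix: null };
}
```

For example, classifyOpReturn(Buffer.from('kapnet:...')) yields kind whitelisted with prefix kapnet, and a payload matching neither list is counted as unknown.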

Output Format:

{
  "block_height": 850000,
  "total_transactions": 2500,
  "op_return_count": 500,
  "excluded": {
    "ord": 200,
    "brc-20": 150,
    "src-20": 50,
    "runes": 30,
    "other": 20
  },
  "whitelisted": {
    "kapnet": 5,
    "kanchor": 2,
    "kgov": 0
  },
  "unknown": 43,
  "kapnet_txxms": [
    {
      "txid": "...",
      "vout": 0,
      "data": "kapnet:...",
      "decoded": { ... }
    }
  ]
}
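
A minimal sketch of tallying per-block counts into this shape. tallyOpReturns is a hypothetical name, and its input is assumed to be a list of { kind, prefix } classifications, one per OP_RETURN output. The kapnet_txxms list is left out because the TXXM encoding is not specified here.

```javascript
// Excluded prefixes reported individually; all other exclusions go to "other".
const REPORTED_EXCLUDED = ['ord', 'brc-20', 'src-20', 'runes'];
const WHITELIST = ['kapnet', 'kanchor', 'kgov'];

// classified: [{ kind: 'excluded' | 'whitelisted' | 'unknown', prefix }]
function tallyOpReturns(blockHeight, totalTransactions, classified) {
  const excluded = Object.fromEntries(
    [...REPORTED_EXCLUDED, 'other'].map((k) => [k, 0]));
  const whitelisted = Object.fromEntries(WHITELIST.map((k) => [k, 0]));
  let unknown = 0;

  for (const { kind, prefix } of classified) {
    if (kind === 'excluded') {
      excluded[REPORTED_EXCLUDED.includes(prefix) ? prefix : 'other'] += 1;
    } else if (kind === 'whitelisted') {
      if (!WHITELIST.includes(prefix)) {
        throw new Error(`not a whitelist prefix: ${prefix}`);
      }
      whitelisted[prefix] += 1;
    } else {
      unknown += 1;
    }
  }

  return {
    block_height: blockHeight,
    total_transactions: totalTransactions,
    op_return_count: classified.length,
    excluded,
    whitelisted,
    unknown,
  };
}
```

The excluded, whitelisted and unknown counts always add up to op_return_count, as they do in the example output (450 + 7 + 43 = 500).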

Implementation Strategy

Option A: Rust Binary (Fast, Use Existing Rust Toolchain)

kapnet-block-parser/
├── Cargo.toml (depends on rust-bitcoin 0.32, which is already in the workspace)
├── src/
│   ├── main.rs       — CLI entry point
│   ├── hodlwave.rs   — Hodl wave parser
│   ├── op_return.rs  — OP_RETURN parser + metaprotocol filter
│   ├── types.rs      — Shared types
│   └── output.rs     — JSON output

Pros: Fast, reuses existing rust-bitcoin, runs on SSD toolchain
Cons: Needs cc (C compiler) to build, which is not available in the AppVM

Option B: Node.js Script (Quick, Available Now)

kapnet-block-parser/
├── package.json
├── src/
│   ├── hodlwave.js   — Hodl wave logic
│   ├── op_return.js  — OP_RETURN filter
│   └── index.js      — CLI

Pros: Runs now, no build needed, nostr-tools already installed
Cons: Slower for large block data, no native rust-bitcoin

Option C: Hybrid (Recommended)

  • Rust binary for heavy parsing (build on SSD toolchain, run anywhere)
  • Node.js wrapper for Nostr integration (publish results as TXXM envelopes)

Chain Analysis: What Blocks to Parse

Approach 1: Full Chain Walk

Parse every block from genesis to tip. Comprehensive but slow.

  • ~850,000 blocks
  • ~700GB of raw block data
  • Days to weeks to process

Approach 2: Sample Analysis

Parse specific block ranges:

  • Every 1000th block (850 samples) — quick overview
  • Last 1000 blocks (most recent) — fresh data
  • Specific halving epochs — historical comparison
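
The three sampling rules can be combined into one height list. A minimal sketch, assuming "halving epochs" means the first block of each epoch (multiples of 210,000); sampleHeights and its stride and recent options are hypothetical names.

```javascript
// Blocks between subsidy halvings (consensus constant).
const HALVING_INTERVAL = 210000;

// Return the sorted, de-duplicated heights to parse for a chain tip at tipHeight.
function sampleHeights(tipHeight, { stride = 1000, recent = 1000 } = {}) {
  const heights = new Set();

  // Every stride-th block, starting at genesis.
  for (let h = 0; h <= tipHeight; h += stride) heights.add(h);

  // The most recent `recent` blocks up to and including the tip.
  for (let h = Math.max(0, tipHeight - recent + 1); h <= tipHeight; h++) {
    heights.add(h);
  }

  // First block of each halving epoch after genesis.
  for (let h = HALVING_INTERVAL; h <= tipHeight; h += HALVING_INTERVAL) {
    heights.add(h);
  }

  return [...heights].sort((a, b) => a - b);
}
```

With the default stride the halving boundaries are already covered by the every-1000th rule; they only add heights when a stride that does not divide 210,000 is chosen.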

Approach 3: OP_RETURN Focus (Fastest)

Parse only blocks containing OP_RETURN transactions.

  • Skip blocks with no OP_RETURN (most early blocks; OP_RETURN outputs only became standard in 2014)
  • Index OP_RETURN-containing blocks first
  • Parse only those

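Recommendation for test: Approach 3 for OP_RETURN, Approach 2 for hodl wave. Note that a hodl wave snapshot at a sampled height still needs the full UTXO set at that height, which depends on every earlier block.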

Deliverables

  1. Block data acquisition — get node data to SSD
  2. Hodl wave parser — age distribution per block range
  3. OP_RETURN filter — metaprotocol exclusion + Kapnet whitelist
  4. TXXM decoder — decode Kapnet TXXMs from OP_RETURN data
  5. Reporter soul — automated analysis + Nostr publishing
