Hosting a Foundational Store
This guide explains how to host a Foundational Store for Substreams data processing pipelines. Foundational Stores provide persistent, fork-aware key-value storage that can be populated by Substreams modules and served to other modules.
What is a Foundational Store?
A Foundational Store is a high-performance, multi-backend key-value storage system designed for Substreams ingestion and serving. It provides:
Fork-aware storage: Handles blockchain reorganizations automatically
Multiple backends: Supports Badger (embedded) and PostgreSQL
Block-level versioning: Every entry tagged with block number for historical queries
High-performance serving: gRPC API for data retrieval
Streaming ingestion: Continuous processing via Substreams sink
Foundational Stores are typically populated by Substreams modules that extract and transform blockchain data, then serve that data to other Substreams modules for efficient lookups.
Hosting a Foundational Store
To host a Foundational Store, you need:
A Substreams module that produces the data to store
The foundational-store binary
A storage backend (Badger or PostgreSQL)
Step 1: Prepare Your Data Source
Your Substreams module should output data in the format expected by the Foundational Store. The data is stored as key-value pairs with block-level versioning.
Example: The SPL Initialized Account module tracks Solana SPL token account initializations and stores account-to-owner mappings.
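A minimal manifest for such a module might look like the following sketch. The package name, module name, and type path shown here are illustrative assumptions, not values taken from the actual spl-initialized-account package; use the SinkEntries type defined in your own proto files.

```yaml
specVersion: v0.1.0
package:
  name: spl_initialized_account   # illustrative package name
  version: v0.1.0

modules:
  - name: map_initialize_account  # must match --output-module-name
    kind: map
    inputs:
      - source: sf.solana.type.v1.Block
    output:
      # assumption: the sink expects your module to emit SinkEntries;
      # substitute the fully qualified type from your proto package
      type: proto:sf.substreams.sink.types.v1.SinkEntries
```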
Step 2: Build the Foundational Store Binary
git clone https://github.com/streamingfast/substreams-foundational-store
cd substreams-foundational-store
go build -o foundational-store ./cmd/foundational-store
Step 3: Run the Server
Start the server with your Substreams module as the data source:
./foundational-store server \
--dsn "badger:///path/to/data" \
--type-url "your.module.Type" \
--manifest-path "../your-substreams-module/substreams.yaml" \
--output-module-name "your_output_module" \
--endpoint "your-blockchain.streamingfast.io:443" \
--start-block 1000000
Configuration Options:
--dsn: Storage backend (Badger: badger:///path, PostgreSQL: postgres://user:pass@host/db)
--type-url: Protobuf type URL for stored values
--manifest-path: Path to your Substreams manifest
--output-module-name: Name of the output module in your Substreams package
--endpoint: StreamingFast endpoint for blockchain data
--start-block: Block number to start ingestion from
Step 4: Monitor and Maintain
The server exposes:
gRPC API on port 50051 (default)
Prometheus metrics on port 9102 (default)
Health checks via gRPC reflection
Monitor ingestion progress through cursor files and Prometheus metrics.
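Assuming the default metrics port above, a minimal Prometheus scrape configuration for the server might look like this (job name and target host are illustrative):

```yaml
scrape_configs:
  - job_name: foundational-store   # illustrative job name
    static_configs:
      # default Prometheus metrics port noted above
      - targets: ["localhost:9102"]
```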
Using SinkEntries for Data Ingestion
Foundational Stores ingest data through SinkEntries messages, which allow batch operations with conditional insertion.
SinkEntries Structure
message SinkEntries {
repeated Entry entries = 1;
bool if_not_exist = 2;
}
entries: Array of key-value pairs to store
if_not_exist: Controls insertion behavior
Conditional Insertion with if_not_exist
The if_not_exist flag determines how entries are inserted:
When false (default): entries are inserted regardless of existing keys, potentially overwriting previous values
When true: entries are only inserted if the key does not already exist in the store
The check is performed against both the in-memory cache and persistent storage to ensure consistency.
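The semantics described above can be sketched with a small in-memory model. This is an illustration only, not the store's actual implementation: the two dicts stand in for the in-memory cache and the persistent backend that the real sink consults.

```python
def apply_sink_entries(cache, store, entries, if_not_exist):
    """Apply a batch of (key, value) entries.

    cache and store are dicts modeling the in-memory cache and the
    persistent backend; entries is a list of (key, value) pairs.
    """
    for key, value in entries:
        # if_not_exist: skip any key already present in either layer
        if if_not_exist and (key in cache or key in store):
            continue
        # otherwise insert, overwriting any previous value
        cache[key] = value
    return cache
```

With if_not_exist set to true, a second initialization of the same account key is a no-op; with it set to false, the later entry overwrites the stored value.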
Example: Producing Data for a Foundational Store
The spl-initialized-account module shows how to produce data that populates a Foundational Store.
It processes Solana transactions and outputs account-owner mappings as SinkEntries:
#[substreams::handlers::map]
fn map_initialize_account(txn: Transaction) -> Result<SinkEntries, Error> {
let mut entries = vec![];
// Extract account initialization data
let account = txn.accounts[0].clone();
let owner = extract_owner_from_instruction(&txn)?;
// Create entry with conditional insertion to prevent duplicates
entries.push(Entry {
key: Key { bytes: account.to_bytes().to_vec() },
value: Any::pack(&AccountOwner {
account: account.clone(),
owner,
})?,
});
Ok(SinkEntries {
entries,
if_not_exist: true, // Only store if account not already initialized
})
}
This data is then ingested by the Foundational Store sink for later querying.
Best Practices
Hosting
Configure appropriate batch sizes and flush intervals for your throughput needs
Implement proper backup strategies for your storage backend
Performance Tuning
Adjust --batch-size and --max-batch-time based on your data volume
Use SSD storage for the Badger backend