Manifests Reference
StreamingFast Substreams manifests reference
This reference documentation provides a guide for all fields and values used in a Substreams manifest.
Tip: When writing and checking your substreams.yaml file, it may help to validate your manifest against our JSON schema to ensure there are no problems. JSON schemas can be used in JetBrains IDEs and VS Code. Our manifest schema can be seen here.
Manifests overview
In simple terms, a Substreams manifest (substreams.yaml) is a configuration file (a YAML file) for your Substreams. The manifest file is used for defining properties specific to the current Substreams module and identifying the dependencies between the inputs and outputs of modules. For example, the following manifest receives a raw Ethereum block as input (sf.ethereum.type.v2.Block) and outputs a custom object (eth.example.MyBlock).
modules:
  - name: map_block
    kind: map
    initialBlock: 12287507
    inputs:
      - source: sf.ethereum.type.v2.Block
    output:
      type: proto:eth.example.MyBlock
Among other things, the manifest allows you to define:
How many modules your Substreams uses, along with their corresponding inputs and outputs.
The schema(s) (i.e. the data model) your Substreams uses.
How you will consume the data emitted by your Substreams (SQL, Webhooks...).
specVersion
Excerpt pulled from the example Substreams manifest.
specVersion: v0.1.0
Use v0.1.0 for the specVersion field.
package
Excerpt pulled from the example Substreams manifest.
package:
  name: module_name_for_project
  version: v0.5.0
  doc: |
    Documentation heading for the package.
    More detailed documentation for the package.
package.name
The package.name field is used to identify the package.
The package.name field is also used to infer the output filename when the pack command is run with substreams.yaml passed as the manifest for the Substreams package.
The content of the name field must match the regular expression: ^([a-zA-Z][a-zA-Z0-9_]{0,63})$. For consistency, use the snake_case naming convention.
The regular expression ruleset translates to the following:
64 characters maximum
Separate words by using _
Starts by using a-z or A-Z and can contain numbers thereafter
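The naming rules above can be checked mechanically. The following is a minimal sketch that applies the regular expression quoted in this section; the helper name is_valid_name is hypothetical and is not part of the Substreams tooling.

```python
import re

# Pattern copied from the package.name rules above.
NAME_PATTERN = re.compile(r"^([a-zA-Z][a-zA-Z0-9_]{0,63})$")


def is_valid_name(name: str) -> bool:
    """Return True when `name` satisfies the package.name pattern.

    is_valid_name is a hypothetical helper used for illustration only.
    """
    # fullmatch rejects a trailing newline, which `$` alone would tolerate.
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("module_name_for_project", "1st_module", "my-module"):
        print(candidate, is_valid_name(candidate))
```

The same check applies to module names, which follow the same rules.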
package.version
The package.version field identifies the version of the package for the Substreams module.
package.url
The package.url field identifies and helps users discover the source of the Substreams package.
package.doc
The package.doc field is the documentation string of the package. The first line is used by the different UIs as a short-form description.
This field should be written in Markdown format.
imports
The imports section allows you to import third-party Substreams packages. It adds local references to modules in those packages, and pulls WASM code, Protobuf and modules into the current Package.
Relying on imports rather than copying source code from third-party packages allows you to leverage server-side caches, and lower your costs.
Example:
imports:
  sol: https://spkg.io/streamingfast/solana-explorer-v0.2.0.spkg
  # or:
  ethereum: substreams-ethereum-v1.0.0.spkg
  token: ../eth-token/substreams.yaml
...
modules:
  ...
  inputs:
    - map: sol:map_block_without_votes
  # replacing:
  # inputs:
  #   - source: sf.solana.type.v1.Block
Note the : separator, which indicates that the module comes from the imported namespace, as defined under imports.
The filename can be absolute or relative or a remote path prefixed by http:// or https://. It can also be an IPFS reference.
protobuf
The protobuf section points to the Google Protocol Buffer (protobuf) definitions used by the Rust modules in the Substreams module.
protobuf:
  files:
    - google/protobuf/timestamp.proto
    - pcs/v1/pcs.proto
    - pcs/v1/database.proto
  importPaths:
    - ./proto
    - ../../external-proto
The Substreams packager loads files from any of the listed importPaths.
Protobufs and modules are packaged together to help Substreams clients decode the incoming streams. Protobufs are not sent to the Substreams server in network requests.
Learn more about Google Protocol Buffers in the official documentation provided by Google.
protobuf.descriptorSets
The descriptorSets field allows you to import precompiled Protocol Buffer definitions from the Buf Schema Registry (BSR). This is useful when you want to consume protobuf types from external packages without copying .proto files into your project.
Descriptor sets are precompiled binary representations of protobuf schemas that can be directly loaded by Substreams, enabling efficient type resolution and validation.
Format 1: Separate version field
protobuf:
  descriptorSets:
    - module: buf.build/streamingfast/substreams-sink-sql
      version: v0.1.0
Format 2: Inline version with @ notation
protobuf:
  descriptorSets:
    - module: buf.build/streamingfast/[email protected]
Available Fields:
module (required): The full path to the Buf module in the format buf.build/organization/repository. Can optionally include the version using @version notation.
version (optional): Either a valid semantic version (e.g., v0.1.0, v1.2.3) or latest.
symbols (optional): An array of specific protobuf symbols to import from the descriptor set. If omitted, all types from the descriptor set are available.
localPath (optional): Local filesystem path where the descriptor set should be cached or stored.
Complete Example with All Fields:
protobuf:
  descriptorSets:
    - module: buf.build/streamingfast/substreams-sink-sql
      version: v1.0.0
      symbols:
        - sf.substreams.sink.sql.v1.Service
        - sf.substreams.sink.sql.v1.Table
      localPath: ./proto-cache/sink-sql.binpb
    - module: buf.build/streamingfast/[email protected]
      symbols:
        - sf.substreams.entity.v1.EntityChanges
Version Validation Rules:
Important:
Versions must be valid semantic versions (e.g., v1.0.0, v0.2.5)
When using inline @version notation:
  Only semantic versions are allowed (e.g., [email protected])
  @latest is not allowed; use version: latest as a separate field or omit the version
You cannot specify the version both inline (with @) and as a separate field; choose one format
To use the latest version, either:
  Omit the version field entirely
  Use version: latest as a separate field
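The validation rules above can be sketched as a small function. This is an illustration under assumptions, not the Substreams implementation: the function name resolve_descriptor_set, the simplified vMAJOR.MINOR.PATCH pattern, and the module path buf.build/example-org/example-repo are all hypothetical.

```python
import re
from typing import Optional, Tuple

# Simplified semantic-version check (vMAJOR.MINOR.PATCH). This is an assumption:
# the text only gives examples such as v0.1.0 and v1.2.3.
SEMVER = re.compile(r"v\d+\.\d+\.\d+")


def resolve_descriptor_set(module: str, version: Optional[str] = None) -> Tuple[str, str]:
    """Return (module_path, version), or raise ValueError when a rule is violated.

    Hypothetical helper that mirrors the rules listed above.
    """
    path, separator, inline = module.partition("@")
    if separator:
        if version is not None:
            raise ValueError("version given both inline and as a separate field")
        if not SEMVER.fullmatch(inline):
            # This also rejects the inline form "@latest".
            raise ValueError(f"inline version must be a semantic version, got {inline!r}")
        return path, inline
    if version is None or version == "latest":
        # Omitting the version and `version: latest` both select the latest version.
        return path, "latest"
    if not SEMVER.fullmatch(version):
        raise ValueError(f"version must be a semantic version or 'latest', got {version!r}")
    return path, version


if __name__ == "__main__":
    example = "buf.build/example-org/example-repo"  # hypothetical module path
    print(resolve_descriptor_set(example, "v0.1.0"))
    print(resolve_descriptor_set(example + "@v1.2.3"))
    print(resolve_descriptor_set(example))
```

Treating an omitted version the same as latest follows the last rule in the list; the actual resolution against the Buf Schema Registry is outside the scope of this sketch.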
binaries
The binaries field specifies the WASM binary code to use when executing modules.
The modules[].binary field uses a default value of default.
binaries:
  default:
    type: wasm/rust-v1
    file: ./target/wasm32-unknown-unknown/release/my_package.wasm
  other:
    type: wasm/rust-v1
    file: ./snapshot_of_my_package.wasm
Important: Defining the default binary is required when creating a Substreams manifest.
See the binary field under modules to see its use.
binaries[name].type
The type of code and implied virtual machine for execution. Only one virtual machine is available, and it uses the value wasm/rust-v1.
binaries[name].file
The binaries[name].file field references a locally compiled WASM module. Paths for the binaries[name].file field are absolute or relative to the manifest's directory. The standard location of the compiled WASM module is the root directory of the Substreams module.
Tip: The WASM file referenced by the binary field is picked up and packaged into an .spkg when invoking the pack and run commands through the substreams CLI.
network
The network field specifies the blockchain where the Substreams will be executed.
network: solana
or
network: ethereum
image
The image field specifies the icon displayed for the Substreams package, which is used in the Substreams Registry. The path is relative to the folder where the manifest is.
image: ./ethereum-icon.png
sink
The sink field specifies the sink you want to use to consume your data (for example, a database).
Sink module
Specifies the name of the module that emits the data to the sink. For example, db_out or graph_out.
Sink type
Specifies the service used to consume the data. For example, sf.substreams.sink.sql.v1.Service for databases.
Sink config
Specifies the configuration specific to every sink. This field is different for every sink.
Database Config
sink:
  module: db_out
  type: sf.substreams.sink.sql.v1.Service
  config:
    schema: "./schema.sql"
    engine: clickhouse
    postgraphile_frontend:
      enabled: false
    pgweb_frontend:
      enabled: false
    dbt_config:
      enabled: true
      files: "./path/to/folder"
      run_interval_seconds: 300
schema: SQL file specifying the schema.
engine: postgres or clickhouse.
postgraphile_frontend.enabled: enables or disables the Postgraphile portal.
pgweb_frontend.enabled: enables or disables the PGWeb portal.
dbt_config: specifies the configuration of the dbt engine.
  enabled: enables or disables the dbt engine.
  files: path to the dbt models.
  run_interval_seconds: execution interval in seconds.
modules
This example shows one map module, named events_extractor, and one store module, named totals:
- name: events_extractor
  kind: map
  initialBlock: 5000000
  binary: default # Implicit
  inputs:
    - source: sf.ethereum.type.v2.Block
    - store: myimport:prices
  output:
    type: proto:my.types.v1.Events
  doc:
    This module extracts events
    Use in such and such situations
- name: totals
  kind: store
  updatePolicy: add
  valueType: int64
  inputs:
    - source: sf.ethereum.type.v2.Block
    - map: events_extractor
Module name
The identifier for the module. It must start with a letter, followed by at most 63 characters of [a-zA-Z0-9_], for a maximum of 64 characters. The same rules that apply to the package.name field apply to the module name, including the convention to use snake_case names.
The module name is the reference identifier used on the command line for the substreams run command. The module name is also used in the inputs defined in the Substreams manifest.
The module name also corresponds to the name of the Rust function invoked on the compiled WASM code upon execution. It matches the name of the function annotated with #[substreams::handlers::map] in the Rust code. Maps and stores both work in the same fashion.
Important: When importing another package, all module names are prefixed by the package's name and a colon. Prefixing ensures there are no name clashes across multiple imported packages and almost any name can be safely used for a module name.
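As an illustration of the prefixing convention, the sketch below splits a module reference such as sol:map_block_without_votes into its namespace and module name. The helper split_module_ref is hypothetical; how the Substreams tooling parses references internally is not described here.

```python
from typing import Optional, Tuple


def split_module_ref(ref: str) -> Tuple[Optional[str], str]:
    """Split "namespace:module" into (namespace, module).

    Hypothetical helper: a local module without a prefix yields (None, module).
    """
    namespace, _, name = ref.rpartition(":")
    return (namespace or None, name)


if __name__ == "__main__":
    print(split_module_ref("sol:map_block_without_votes"))
    print(split_module_ref("events_extractor"))
```

Because a colon is not a valid character in a module name, it cannot be confused with part of the name itself.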
Module initialBlock
The initial block for the module is where Substreams begins processing data for a module. The runtime never processes blocks prior to the initial block of any given module.
If all the inputs have the same initialBlock, the field can be omitted and its value is inferred from the module's inputs.
initialBlock becomes mandatory when inputs have different values.
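The inference rule can be sketched as follows. This is an illustration only: the function infer_initial_block is hypothetical, and it assumes the caller has already collected the initialBlock of each input.

```python
from typing import List, Optional


def infer_initial_block(declared: Optional[int], input_initial_blocks: List[int]) -> int:
    """Pick a module's initialBlock following the rule described above.

    Hypothetical helper; `input_initial_blocks` holds the initialBlock of each input.
    """
    if declared is not None:
        return declared  # an explicit value is used as-is
    distinct = set(input_initial_blocks)
    if len(distinct) == 1:
        return distinct.pop()  # all inputs agree, so the value can be inferred
    raise ValueError("initialBlock cannot be inferred; set it explicitly on the module")


if __name__ == "__main__":
    print(infer_initial_block(None, [5000000, 5000000]))
    print(infer_initial_block(12287507, [100, 200]))
```

When the inputs disagree and no value is declared, the sketch raises an error, which mirrors the statement that initialBlock becomes mandatory in that case.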
Module kind
There are two module types for modules[].kind:
map
store
Module updatePolicy
Specifies the merge strategy for two contiguous partial stores produced by parallelized operations.
The values for modules[].updatePolicy are defined using specific rules stating:
set, the last key wins the merge strategy
set_if_not_exists, the first key wins the merge strategy
append, concatenates two keys' values
add, sum the two keys' values
min, min between two keys' values
max, max between two keys' values
set_sum, either set the value or sum the two keys' values
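To make the merge strategies concrete, here is a minimal sketch that merges two partial stores represented as plain Python dictionaries. It is not the Substreams runtime: the names MERGERS and merge_partials are hypothetical, and set_sum is left out because the text does not spell out when it sets and when it sums.

```python
from typing import Any, Callable, Dict

# Illustrative merge functions. `left` comes from the earlier partial store and
# `right` from the later, contiguous one. set_sum is intentionally omitted.
MERGERS: Dict[str, Callable[[Any, Any], Any]] = {
    "set": lambda left, right: right,                # last key wins
    "set_if_not_exists": lambda left, right: left,   # first key wins
    "append": lambda left, right: left + right,      # concatenate values
    "add": lambda left, right: left + right,         # sum values
    "min": min,
    "max": max,
}


def merge_partials(policy: str, earlier: Dict[str, Any], later: Dict[str, Any]) -> Dict[str, Any]:
    """Merge two partial stores according to `policy` (hypothetical helper)."""
    merge = MERGERS[policy]
    merged = dict(earlier)
    for key, value in later.items():
        # Keys present in only one partial store are carried over unchanged.
        merged[key] = merge(merged[key], value) if key in merged else value
    return merged


if __name__ == "__main__":
    first, second = {"a": 1, "b": 5}, {"a": 3, "c": 7}
    for policy in ("set", "set_if_not_exists", "add", "min", "max"):
        print(policy, merge_partials(policy, first, second))
```

The policy only matters for keys present in both partial stores, which is why a single policy per store is enough to make parallel segments mergeable.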
Tip: The module updatePolicy field is only available for modules of kind: store.
Module valueType
Specifies the data type of all values in the store, and determines which WASM imports are available to the module for writing to the store.
The values for modules[].valueType can use various types including:
bigfloat
bigint
int64
bytes
string
proto:path.to.custom.protobuf.Model
Tip: The module valueType field is only available for modules of kind: store.
Module binary
An identifier referring to the binaries section of the Substreams manifest.
The modules[].binary field overrides which binary is used from the binaries declaration section. This means multiple WASM files can be bundled in the Package.
modules:
  - name: hello
    binary: other
    ...
The default value for binary is default. Therefore, a default binary must be defined under binaries.
Module inputs
inputs:
  - params: string
  - source: sf.ethereum.type.v2.Block
  - store: my_store
    mode: deltas
  - store: my_store # defaults to mode: get
  - map: my_map
The inputs field is a list of input structures. One of the following keys is required for every object.
The key types for inputs include:
source
store, used to define mode keys
map
params
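A manifest linter could check each input entry along these lines. This is a sketch under assumptions: the helper input_kind is hypothetical, it assumes exactly one of the four keys per entry, and it accepts only the two mode values shown in the example above (get and deltas).

```python
from typing import Any, Dict

INPUT_KINDS = ("params", "source", "store", "map")
STORE_MODES = ("get", "deltas")  # the two modes shown in the example above


def input_kind(entry: Dict[str, Any]) -> str:
    """Return which kind of input `entry` declares, or raise ValueError.

    Hypothetical helper; assumes exactly one input key per entry.
    """
    kinds = [key for key in INPUT_KINDS if key in entry]
    if len(kinds) != 1:
        raise ValueError(f"expected exactly one of {INPUT_KINDS}, got keys {sorted(entry)}")
    if "mode" in entry:
        if kinds[0] != "store":
            raise ValueError("mode is only meaningful for store inputs")
        if entry["mode"] not in STORE_MODES:
            raise ValueError(f"unknown store mode {entry['mode']!r}")
    return kinds[0]


if __name__ == "__main__":
    print(input_kind({"source": "sf.ethereum.type.v2.Block"}))
    print(input_kind({"store": "my_store", "mode": "deltas"}))
```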
You can find more details about inputs in the Developer Guide's section about Modules.
Module output
output:
  type: proto:eth.erc721.v1.Transfers
The value for type is always prefixed using proto: followed by a definition specified in the protobuf definitions, and referenced in the protobuf section of the Substreams manifest.
Tip: The module output field is only available for modules of kind: map.
Module doc
This field should contain Markdown documentation of the module. Use it to describe how to use the params, or what to expect from the module.
params
The params mapping changes the default values for modules' parameterizable inputs.
modules:
  ...
params:
  module_name: "default value"
  "imported:module": "overridden value"
You can override those values with the -p parameter of substreams run.
When rolling out your consuming code (in this example, Python), you can use something like:
# pkg is assumed to be the already-loaded Substreams package object.
my_mod = [mod for mod in pkg.modules.modules if mod.name == "store_pools"][0]
my_mod.inputs[0].params.value = "myvalue"
This assignment would be made just before starting the stream.
Params that are defined under networks do not need to be repeated here (their value will be overwritten).
network
The network field specifies the default network to be used with this Substreams. It will help the client choose an endpoint if necessary, and will be used as the default value when applying the values defined under networks.
networks
The networks field allows specifying per-network params and initialBlock for each module:
networks:
  mainnet:
    initialBlock:
      mod1: 200
      lib:mod1: 400
    params:
      mod2: "addr=0x1234"
  sepolia:
    [...]
You can override values for modules imported from other .spkg.
Every local module specified under networks must have a value for each network.
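The interaction between network, networks, params and initialBlock can be sketched as below. This is an illustration under assumptions, not how the Substreams CLI is implemented: the function apply_network is hypothetical and works on the manifest after it has been parsed into a plain dictionary.

```python
from typing import Any, Dict, Optional


def apply_network(manifest: Dict[str, Any], network: Optional[str] = None) -> Dict[str, Dict[str, Any]]:
    """Return the effective params and initialBlock values for one network.

    Hypothetical helper; `manifest` is the parsed substreams.yaml as a dict.
    """
    # Fall back to the manifest's default network when none is requested.
    chosen = network or manifest.get("network")
    overrides = manifest.get("networks", {}).get(chosen, {})

    params = dict(manifest.get("params", {}))
    params.update(overrides.get("params", {}))  # network values overwrite top-level params

    initial_blocks = {
        module["name"]: module["initialBlock"]
        for module in manifest.get("modules", [])
        if "initialBlock" in module
    }
    initial_blocks.update(overrides.get("initialBlock", {}))
    return {"params": params, "initialBlock": initial_blocks}


if __name__ == "__main__":
    manifest = {
        "network": "mainnet",
        "modules": [{"name": "mod1", "initialBlock": 100}, {"name": "mod2"}],
        "params": {"mod2": "addr=0x0000"},
        "networks": {
            "mainnet": {
                "initialBlock": {"mod1": 200, "lib:mod1": 400},
                "params": {"mod2": "addr=0x1234"},
            }
        },
    }
    print(apply_network(manifest))
```

The values in the example dictionary other than those under networks.mainnet are made up for the demonstration.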

