Manifests Reference
StreamingFast Substreams manifests reference
This reference documentation provides a guide for all fields and values used in a Substreams manifest.
Tip: When writing and checking your substreams.yaml file, it may help to validate your manifest against our JSON schema to ensure there are no problems. JSON schemas can be used in JetBrains IDEs and VSCode. Our manifest schema can be seen here.
Manifests overview
In simple terms, a Substreams manifest (substreams.yaml) is a YAML configuration file for your Substreams. The manifest file is used for defining properties specific to the current Substreams module and identifying the dependencies between the inputs and outputs of modules. For example, the following manifest receives a raw Ethereum block as input (sf.ethereum.type.v2.Block) and outputs a custom object (eth.example.MyBlock).
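A minimal sketch of such a manifest is shown here to make the input/output relationship concrete. The package name, module name, .proto file name and WASM path are hypothetical placeholders, and the protobuf files key is assumed from common Substreams manifests rather than taken from this page.

```yaml
specVersion: v0.1.0
package:
  name: my_example                # hypothetical package name
  version: v0.1.0

protobuf:
  files:
    - myblock.proto               # hypothetical file defining eth.example.MyBlock
  importPaths:
    - ./proto

binaries:
  default:
    type: wasm/rust-v1
    file: ./target/wasm32-unknown-unknown/release/my_example.wasm  # hypothetical build output

modules:
  - name: map_my_block            # hypothetical module name
    kind: map
    inputs:
      - source: sf.ethereum.type.v2.Block   # raw Ethereum block as input
    output:
      type: proto:eth.example.MyBlock       # custom object as output
```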
Among other things, the manifest allows you to define:
How many modules your Substreams uses, along with their corresponding inputs and outputs.
The schema(s) (i.e. the data model) your Substreams uses.
How you will consume the data emitted by your Substreams (SQL, Webhooks...).
specVersion
Excerpt pulled from the example Substreams manifest.
Use v0.1.0 for the specVersion field.
package
Excerpt pulled from the example Substreams manifest.
package.name
The package.name field is used to identify the package.
The package.name field is also used to infer the output filename when the pack command is run with substreams.yaml passed as a flag for the Substreams package.
The content of the name field must match the regular expression: ^([a-zA-Z][a-zA-Z0-9_]{0,63})$. For consistency, use the snake_case naming convention.
The regular expression ruleset translates to the following:
64 characters maximum
Separate words by using _
Starts with a-z or A-Z and can contain numbers thereafter
package.version
The package.version field identifies the version of the package for the Substreams module.
Note: The package.version must respect Semantic Versioning, version 2.0.
package.url
The package.url field identifies and helps users discover the source of the Substreams package.
package.doc
The package.doc field is the documentation string of the package. The first line is used by the different UIs as a short-form description.
This field should be written in Markdown format.
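The package fields described above could be combined as in the following sketch. The name, URL and documentation text are hypothetical, and the v-prefixed version string is an assumption based on the v0.1.0 form used for specVersion.

```yaml
package:
  name: my_example_package        # hypothetical; must match ^([a-zA-Z][a-zA-Z0-9_]{0,63})$
  version: v0.1.0                 # assumed v-prefixed Semantic Versioning 2.0 string
  url: https://github.com/example/my-example-package   # hypothetical source location
  doc: |
    Short description, shown by UIs as the short-form summary.

    Longer documentation for the package, written in Markdown.
```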
imports
The imports section allows you to import third-party Substreams packages. It adds local references to modules in those packages, and pulls the WASM code, Protobuf definitions and modules into the current package.
Relying on imports rather than copying source code from third-party packages allows you to leverage server-side caches and lower your costs.
Example:
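The snippet below is a sketch of an import being declared and then referenced from a module input. The import keys, package file names and module names are hypothetical.

```yaml
imports:
  ethereum: ./substreams-ethereum-v1.0.0.spkg         # hypothetical local package file
  tokens: https://example.com/eth-token-v0.1.0.spkg   # hypothetical remote package

modules:
  - name: my_module                     # hypothetical local module
    kind: map
    inputs:
      - map: tokens:map_transfers       # hypothetical module from the imported "tokens" namespace
    output:
      type: proto:eth.example.MyBlock
```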
Note the : separator, which signifies that the module comes from the imported namespace, as defined under imports.
The filename can be an absolute path, a relative path, or a remote path prefixed by http:// or https://. It can also be an IPFS reference.
protobuf
The protobuf section points to the Google Protocol Buffer (protobuf) definitions used by the Rust modules in the Substreams module.
The Substreams packager loads files in any of the listed importPaths.
Note: The imports section of the manifest also affects which .proto files are used in the final Substreams package.
Protobufs and modules are packaged together to help Substreams clients decode the incoming streams. Protobufs are not sent to the Substreams server in network requests.
Learn more about Google Protocol Buffers in the official documentation provided by Google.
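As an illustration of the protobuf section, the following sketch lists one definition file and two import paths. The file and directory names are hypothetical, and the files key is assumed from common Substreams manifests since this page only names importPaths.

```yaml
protobuf:
  files:
    - example/v1/myblock.proto    # hypothetical .proto file, resolved against importPaths
  importPaths:
    - ./proto                     # searched for the files listed above
    - ../../external-proto        # hypothetical additional search path
```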
binaries
The binaries field specifies the WASM binary code to use when executing modules.
The modules[].binary field uses a default value of default.
Important: Defining the default binary is required when creating a Substreams manifest.
See the binary field under modules to see its use.
binaries[name].type
The type of code and implied virtual machine for execution. Only one virtual machine is available, selected with the value wasm/rust-v1.
binaries[name].file
The binaries[name].file field references a locally compiled WASM module. Paths for the binaries[name].file field are absolute or relative to the manifest's directory. The standard location of the compiled WASM module is the root directory of the Substreams module.
Tip: The WASM file referenced by the binary field is picked up and packaged into an .spkg when invoking the pack and run commands through the substreams CLI.
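A binaries section with the required default entry and one extra entry might look like the sketch below. The file paths and the second binary's name are hypothetical.

```yaml
binaries:
  default:                        # required entry, used when a module does not set binary
    type: wasm/rust-v1
    file: ./target/wasm32-unknown-unknown/release/my_example.wasm  # hypothetical path
  other:                          # hypothetical second binary, selected with binary: other
    type: wasm/rust-v1
    file: ./other_example.wasm    # hypothetical path, relative to the manifest's directory
```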
network
The network field specifies the blockchain where the Substreams will be executed.
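For instance, the field could be set as in this sketch. The network identifiers shown are assumptions for illustration; use the identifier of the chain you target.

```yaml
network: mainnet      # assumed identifier for Ethereum mainnet
# or, for another chain:
# network: solana     # assumed identifier
```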
image
The image field specifies the icon displayed for the Substreams package, which is used in the Substreams Registry. The path is relative to the folder where the manifest is located.
sink
The sink field specifies the sink you want to use to consume your data (for example, a database or a subgraph).
Sink module
Specifies the name of the module that emits the data to the sink. For example, db_out or graph_out.
Sink type
Specifies the service used to consume the data. For example, sf.substreams.sink.subgraph.v1.Service for subgraphs, or sf.substreams.sink.sql.v1.Service for databases.
Sink config
Specifies the configuration specific to every sink. This field is different for every sink.
Database Config
schema: SQL file specifying the schema.
engine: postgres or clickhouse.
postgraphile_frontend.enabled: enables or disables the Postgraphile portal.
pgweb_frontend.enabled: enables or disables the PGWeb portal.
dbt_config: specifies the configuration of the dbt engine.
dbt_config.enabled: enables or disables the dbt engine.
dbt_config.files: path to the dbt models.
dbt_config.run_interval_seconds: execution interval in seconds.
Subgraph Config
schema: path to the GraphQL schema.
subgraph_yaml: path to the Subgraph manifest.
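Putting the sink fields together for a database sink could look like the following sketch. The file paths and the interval are hypothetical values; the key names follow the lists above.

```yaml
sink:
  module: db_out                              # module that emits the data to the sink
  type: sf.substreams.sink.sql.v1.Service     # SQL database sink
  config:
    schema: ./schema.sql                      # hypothetical path to the SQL schema
    engine: postgres                          # or clickhouse
    postgraphile_frontend:
      enabled: true
    pgweb_frontend:
      enabled: false
    dbt_config:
      enabled: true
      files: ./dbt                            # hypothetical path to the dbt models
      run_interval_seconds: 300               # hypothetical interval
```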
modules
This example shows one map module, named events_extractor, and one store module, named totals:
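A sketch of what such a modules section could contain is given below. The initialBlock value and the output type are hypothetical; the remaining keys are the ones documented in the subsections that follow.

```yaml
modules:
  - name: events_extractor
    kind: map
    initialBlock: 5000000                 # hypothetical starting block
    binary: default                       # implicit, shown for clarity
    inputs:
      - source: sf.ethereum.type.v2.Block
    output:
      type: proto:my.types.v1.Events      # hypothetical protobuf message

  - name: totals
    kind: store
    updatePolicy: add                     # sums values when partial stores are merged
    valueType: int64
    inputs:
      - map: events_extractor             # consumes the output of the map module above
```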
Module name
The identifier for the module: a letter followed by up to 63 characters from [a-zA-Z0-9_], for a maximum of 64 characters. The same rules that apply to the package.name field apply to the module name, including the convention to use snake_case names.
The module name is the reference identifier used on the command line for the substreams run command. The module name is also used in the inputs defined in the Substreams manifest.
The module name also corresponds to the name of the Rust function invoked on the compiled WASM code upon execution. It is the name of the function annotated with #[substreams::handlers::map] in the Rust code. Maps and stores both work in the same fashion.
Important: When importing another package, all module names are prefixed by the package's name and a colon. Prefixing ensures there are no name clashes across multiple imported packages, so almost any name can be safely used for a module name.
Module initialBlock
The initial block for the module is where Substreams begins processing data for that module. The runtime never processes blocks prior to a module's initial block.
If all the inputs have the same initialBlock, the field can be omitted and its value is inferred from the module's inputs.
initialBlock becomes mandatory when inputs have different values.
Module kind
There are two module types for modules[].kind:
map
store
Module updatePolicy
Specifies the merge strategy for two contiguous partial stores produced by parallelized operations.
The values for modules[].updatePolicy are defined using specific rules stating:
set: the last key wins the merge strategy
set_if_not_exists: the first key wins the merge strategy
append: concatenates two keys' values
add: sums the two keys' values
min: takes the minimum of the two keys' values
max: takes the maximum of the two keys' values
set_sum: either sets the value or sums the two keys' values
Tip: The module updatePolicy field is only available for modules of kind: store.
Module valueType
Specifies the data type of all values in the store, and determines which WASM imports are available to the module for writing to the store.
The values for modules[].valueType can use various types including:
bigfloat
bigint
int64
bytes
string
proto:path.to.custom.protobuf.Model
Tip: The module valueType field is only available for modules of kind: store.
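To show updatePolicy and valueType used together with a custom type, here is a sketch of a store module that keeps the latest protobuf value per key. The module names and the protobuf message are hypothetical.

```yaml
modules:
  - name: store_latest_model                         # hypothetical store module
    kind: store
    updatePolicy: set                                # last key wins when partial stores are merged
    valueType: proto:path.to.custom.protobuf.Model   # values are stored as this protobuf message
    inputs:
      - map: events_extractor                        # hypothetical upstream map module
```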
Module binary
An identifier referring to the binaries section of the Substreams manifest.
The modules[].binary field overrides which binary is used from the binaries declaration section. This means multiple WASM files can be bundled in the package.
The default value for binary is default. Therefore, a default binary must be defined under binaries.
Module inputs
The inputs field is a list of input structures. One of the following four keys is required for every object.
The key types for inputs include:
source
store, used to define mode keys
map
params
You can find more details about inputs in the Developer Guide's section about Modules.
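The four input kinds listed above could appear together as in this sketch. The module names are hypothetical, and the mode values (get or deltas) and the params: string form are assumptions based on common Substreams manifests rather than on this page.

```yaml
modules:
  - name: map_example                     # hypothetical module
    kind: map
    inputs:
      - params: string                    # parameter input, assumed to be declared first
      - source: sf.ethereum.type.v2.Block # raw chain data
      - store: totals                     # hypothetical store module
        mode: deltas                      # assumed values: get or deltas
      - map: events_extractor             # output of another map module
    output:
      type: proto:eth.example.MyBlock
```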
Module output
The value for type is always prefixed using proto: followed by a definition specified in the protobuf definitions, and referenced in the protobuf section of the Substreams manifest.
Tip: The module output field is only available for modules of kind: map.
Module doc
This field should contain Markdown documentation of the module. Use it to describe how to use the params, or what to expect from the module.
params
The params mapping changes the default values for modules' parameterizable inputs.
You can override those values with the -p parameter of substreams run.
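A params mapping might look like the following sketch, which sets a default for one local module and one imported module. The module names and values are hypothetical.

```yaml
params:
  map_example: "default value here"           # hypothetical local module with a params input
  tokens:map_transfers: "overridden value"    # hypothetical module from an imported package
```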
When rolling out your consuming code (in this example, Python), you can use something like:
which would be inserted just before starting the stream.
Params that are defined under networks do not need to be repeated here (their value will be overwritten).
network
The network field specifies the default network to be used with this Substreams. It helps the client choose an endpoint if necessary, and is used as the default value when applying the values defined under networks.
networks
The networks section allows specifying per-network params and initialBlock for each module:
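The sketch below illustrates the shape of such a section together with a default network. The network identifiers, module names, block numbers and parameter values are hypothetical.

```yaml
network: mainnet              # default network, assumed identifier

networks:
  mainnet:
    initialBlock:
      mod1: 200               # hypothetical module and block number
      mod2: 300
    params:
      mod1: "addr=0x1234"     # hypothetical parameter values
      mod2: "addr=0x2345"
  sepolia:
    initialBlock:
      mod1: 400
      mod2: 500
    params:
      mod1: "addr=0x3456"
      mod2: "addr=0x4567"
```

Each module listed here has a value for both networks, matching the rule that every local module specified under networks needs a value for each network.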
You can override values for modules imported from other .spkg files.
Every local module specified under networks must have a value for each network.