Run a Substreams
Running a Substreams means executing a Substreams package and consuming the data it emits. That data can be consumed in a variety of ways, such as through the CLI, a SQL database, or simple files.
A typical local Substreams project contains Protobufs (1), modules (2) and a manifest (substreams.yaml) file (3):
If you are currently working on a local project, you can run your Substreams by specifying the location of the manifest:
The --start-block flag specifies the starting block of the Substreams (i.e. the block at which the Substreams will start indexing data).
Note: In Solana, --start-block specifies the slot, not the block number.
run
First, start the substreams CLI, passing it the run command.
Substreams requires a server address to connect to for data retrieval. The data provider at that address is a running Firehose instance.
-e mainnet.eth.streamingfast.io:443
Inform Substreams where to find the substreams.yaml configuration file.
Note: The substreams.yaml configuration file argument in the command is optional if you are within the root folder of your Substreams and your manifest file is named substreams.yaml.
The map_transfers module is defined in the manifest, and it is the module run by Substreams.
Start mapping at the specific block 12292922 by passing the flag and block number --start-block 12292922.
Cease block processing by using --stop-block +1.
The +1 option requests a single block. In the example, the next block is 12292923.
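The pieces described above can be put together into a single invocation. The following is a minimal sketch that only assembles and prints the command, so it can be inspected without the CLI installed. The variable names are illustrative and not part of Substreams. It also works through the arithmetic behind --stop-block +1, under the assumption that the stop block is exclusive, which is consistent with +1 requesting a single block.

```shell
#!/bin/sh
# Values taken from the walkthrough above; the variable names are illustrative.
ENDPOINT="mainnet.eth.streamingfast.io:443"   # address of the Firehose-backed data provider
MANIFEST="substreams.yaml"                    # optional when run from the project root
MODULE="map_transfers"                        # module defined in the manifest
START_BLOCK=12292922
BLOCK_COUNT=1                                 # the "+1" in --stop-block +1

# Assumption: the stop block is exclusive, so "+1" resolves to START_BLOCK + 1.
STOP_BLOCK=$((START_BLOCK + BLOCK_COUNT))

# Assemble the command; it is printed, not executed.
CMD="substreams run -e $ENDPOINT $MANIFEST $MODULE --start-block $START_BLOCK --stop-block +$BLOCK_COUNT"

echo "$CMD"
echo "resolved stop block: $STOP_BLOCK"
```

To actually run the Substreams, paste the printed command into a terminal where the substreams CLI is installed and configured.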
If Substreams is installed and configured correctly, messages are printed to the terminal.
The substreams run command outputs:
If you have an spkg file, you can run it by providing a path to the file:
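As a rough sketch, the package path takes the place of the manifest path in the run command. The file name ./my-project-v0.1.0.spkg is hypothetical, and the example assumes the package defines a map_transfers module. The script only prints the command rather than executing it.

```shell
#!/bin/sh
# Hypothetical package path; replace it with the path to your own .spkg file.
PACKAGE="./my-project-v0.1.0.spkg"
ENDPOINT="mainnet.eth.streamingfast.io:443"
MODULE="map_transfers"   # assumed to be defined inside the package

# The .spkg path is passed where the substreams.yaml path would otherwise go.
CMD="substreams run -e $ENDPOINT $PACKAGE $MODULE --start-block 12292922 --stop-block +1"

echo "$CMD"
```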
Substreams has two modes for executing your module(s): development mode and production mode. The mode affects several important aspects of execution, including:
The time required to reach the first byte.
The speed that large ranges get executed.
The module logs and outputs sent back to the client.
Differences between production and development modes include:
Forward parallel execution is enabled in production mode and disabled in development mode.
The time required to reach the first byte is shorter in development mode than in production mode.
Specific attributes of development mode include:
The client will receive all of the executed module's logs.
It's possible to request specific store snapshots in the execution tree.
Output from multiple modules is possible.
In most cases, you will run production mode, using a Substreams sink. Development mode is enabled by default in the CLI unless the -p flag is specified.
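The mode switch can be sketched as a small helper that appends -p only when production mode is wanted. The function name build_run_cmd and its argument are illustrative and not part of the CLI. The helper prints the command rather than running it.

```shell
#!/bin/sh
# build_run_cmd is a hypothetical helper: it prints the run command for a given mode.
build_run_cmd() {
  mode="$1"
  cmd="substreams run substreams.yaml map_transfers"
  # Development mode is the CLI default; -p switches to production mode.
  if [ "$mode" = "production" ]; then
    cmd="$cmd -p"
  fi
  echo "$cmd"
}

build_run_cmd development
build_run_cmd production
```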
Examples: (given the dependencies: [block] --> [map_pools] --> [store_pools] --> [map_transfers])
Running the substreams run substreams.yaml map_transfers command executes in development mode and only prints the map_transfers module's outputs and logs.
Running the substreams run substreams.yaml map_transfers --debug-modules-output=map_pools,map_transfers,store_pools command executes in development mode and only prints the outputs of the map_pools, map_transfers, and store_pools modules.
Running the substreams run substreams.yaml map_transfers -s 1000 -t +5 --debug-modules-initial-snapshot=store_pools command executes in development mode and prints all the entries in the store_pools module at block 999, then continues with outputs and logs from the map_transfers module in blocks 1000 through 1004.
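The block numbers in the last example follow from simple arithmetic on the -s and -t values. The sketch below reproduces them, assuming the stop block is exclusive and the initial snapshot reflects the store just before the first processed block. The variable names are illustrative.

```shell
#!/bin/sh
START=1000   # from -s 1000
COUNT=5      # from -t +5

# Assumption: the stop block is exclusive, so the last processed block is STOP - 1.
STOP=$((START + COUNT))
LAST=$((STOP - 1))

# Assumption: the initial snapshot reflects the store just before the first processed block.
SNAPSHOT=$((START - 1))

echo "store_pools snapshot at block $SNAPSHOT"
echo "map_transfers outputs and logs for blocks $START through $LAST"
```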