ERC-4626: scanning vault data
Here is an example of how to read ERC-4626 vault historical data. We need this data to run the other notebooks here,
which analyse the performance of the vaults. The tutorial shows how to create a local
vault database with a list of vaults across all chains. It is based on the open source pipeline in the web3-ethereum-defi repository.
You need
- Expert Python knowledge to work with complex Python projects
- JSON-RPC archive nodes for various chains, e.g. from dRPC
- UNIX or Windows Subsystem for Linux (WSL) environment
Preferably
- screen, tmux or a similar utility to run long-running processes in the background on servers
- Some hours of patience
This is a three-step scripted process:
- For each chain
  - Discover the ERC-4626 vaults on the chain
  - Scan their historical prices
- And afterwards
  - Clean the scanned price data
Note
The open source pipeline code must be updated to accommodate new chains: their chain ids, names and such.
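The three steps can also be driven from a single Python script. The following is a minimal sketch and not part of the repository: `build_steps` and `run_pipeline` are hypothetical helper names, and the sketch assumes it is run from the repository root with one archive node URL per chain.

```python
import os
import subprocess
import sys

# Script paths as used in this tutorial, relative to the repository root
PER_CHAIN_SCRIPTS = [
    "scripts/erc-4626/scan-vaults.py",
    "scripts/erc-4626/scan-prices.py",
]
CLEAN_SCRIPT = "scripts/erc-4626/clean-prices.py"


def build_steps(rpc_urls):
    """Return (script, extra_env) pairs in the order they should run."""
    steps = []
    for url in rpc_urls:
        # Discover vaults, then scan prices, for each chain in turn
        for script in PER_CHAIN_SCRIPTS:
            steps.append((script, {"JSON_RPC_URL": url}))
    # Cleaning runs once, after all chains have been scanned
    steps.append((CLEAN_SCRIPT, {}))
    return steps


def run_pipeline(rpc_urls):
    """Run every step in order, stopping at the first failing script."""
    for script, extra_env in build_steps(rpc_urls):
        env = {**os.environ, **extra_env}
        subprocess.run([sys.executable, script], env=env, check=True)
```

Calling `run_pipeline([...])` with your own RPC URLs would run the scripts one after another. Because a full scan can take hours, it is worth starting such a driver inside screen or tmux.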
Scanning vaults for a single chain
Discovering vaults
To scan a single chain, we first need to discover the vaults on that chain. This is done by the scan-vaults.py script.
# Point to HTTPS RPC server for your chain
export JSON_RPC_URL=...
python scripts/erc-4626/scan-vaults.py
This script will create the file ~/.tradingstrategy/vaults/vault-db.pickle with the vaults found on the chain,
plus all vaults from other chains we have scanned so far.
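To check that the database file was written, you can unpickle it and look at what it contains. This is a minimal sketch: `load_vault_db` and `describe` are hypothetical helpers, the internal layout of the pickled object is not assumed, and unpickling most likely requires the web3-ethereum-defi package to be importable in the same Python environment.

```python
import pickle
from pathlib import Path

# Default location used in this tutorial
DEFAULT_DB_PATH = Path.home() / ".tradingstrategy" / "vaults" / "vault-db.pickle"


def load_vault_db(path=DEFAULT_DB_PATH):
    """Unpickle the vault database and return whatever object it contains.

    Only unpickle files you created yourself: pickle can execute code on load.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj):
    """Summarise an unpickled object without assuming its layout."""
    size = len(obj) if hasattr(obj, "__len__") else None
    return {"type": type(obj).__name__, "size": size}
```

For example, `print(describe(load_vault_db()))` prints the type of the stored object and, if it is a sized container, how many entries it holds.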
The script prints its progress to the console.
Scanning historical prices
After discovering the vaults on a chain, we scan their historical performance.
This is done by the scan-prices.py script. It reads the vaults from the database file created by the previous step,
then polls JSON-RPC archive nodes to extract historical prices and parameters like performance fees.
The scan process is stateful:
- It can resume: you can rerun the script and it will continue from where the previous scan ended
- Using the state, we filter out vaults that are not interesting, e.g. vaults that become
dead after a certain point of time, to keep the number of JSON-RPC calls lower. This means that some vault data might be incorrectly discarded if it does not pass our filters for being a viable vault.
The default scan interval is 1h.
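To picture how a resumable scan at a fixed interval can work, here is a minimal sketch. It is not the pipeline's actual implementation: the function names are hypothetical, and the 12 second block time in the usage note is Ethereum mainnet's, used only for illustration.

```python
def blocks_per_interval(interval_seconds, block_time_seconds):
    """Convert a sampling interval to a block step, at least one block."""
    return max(1, round(interval_seconds / block_time_seconds))


def next_scan_range(last_scanned_block, first_block, chain_head, step):
    """Return the blocks to poll on this run, resuming after saved state.

    last_scanned_block: block stored in the state, or None on the first run
    first_block: block where the vault first appeared
    chain_head: latest block available from the node
    step: number of blocks between two samples
    """
    if last_scanned_block is None:
        start = first_block
    else:
        # Continue one step after the last sample we already have
        start = last_scanned_block + step
    return list(range(start, chain_head + 1, step))
```

With a 1h interval and 12 second blocks, `blocks_per_interval(3600, 12)` gives a step of 300 blocks. A first run over blocks 1000 to 1700 would then sample blocks 1000, 1300 and 1600, and a rerun with the saved state 1600 would only request the newer blocks.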
This will write
- ~/.tradingstrategy/vaults/vault-prices-1h.parquet file with the historical prices
- ~/.tradingstrategy/vaults/vault-reader-state-1h.parquet to store the latest block scanned for each vault
export JSON_RPC_URL=...
python scripts/erc-4626/scan-prices.py
Output looks like:
Scanning vault historical prices on chain 999: Hyperliquid
Chain Hyperliquid has 12 vaults in the vault detection database
After filtering vaults for non-interesting entries, we have 6 vaults left
Loading token metadata for 6 addresses using 8 workers: 0%| | 0/1 [00:00<?, ?it/s]
Preparing historical multicalls for 6 readers using 12 workers: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:02<00:00, 2.92 readers/s]
Reading historical vault price data for chain 999 with 12 workers, blocks 68,843 - 2,206,919: 3it [00:02, 1.15it/s, Active vaults=2, Last block at=2025-03-11 01:12:36]
Token cache size is 802,816
Scan complete
{'chain_id': 999,
'chunks_done': 1,
'existing': True,
'existing_row_count': 119592,
'file_size': 1164518,
'output_fname': PosixPath('/Users/moo/.tradingstrategy/vaults/vault-prices.parquet'),
'rows_deleted': 0,
'rows_written': 15}
Cleaning data
The raw vault data contains a lot of abnormalities like almost infinite profits, broken smart contracts, missing names and so on.
Cleaning only supports stablecoin-denominated vaults, i.e. vaults whose denomination token is a stablecoin. The cleaning process currently discards the data for other denominations. If you need to access e.g. ETH-denominated vaults, you need to clean the data yourself.
Cleaning also prepares the data for analysis:
- Denormalise vault data to a single Parquet/DataFrame that can be handled without the
vault-db.pickle file, in any programming environment
- Calculate 1h returns for each vault
- Calculate rolling returns and similar performance metrics
The script will
- Read ~/.tradingstrategy/vaults/vault-prices-1h.parquet
- Write ~/.tradingstrategy/cleaned-vaults/vault-prices-1h.parquet
python scripts/erc-4626/clean-prices.py
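The idea of computing 1h returns and discarding abnormal values can be illustrated with plain Python. This is a simplified sketch and not the logic of clean-prices.py: the function names and the `max_abs_return` threshold are hypothetical and chosen only for the example.

```python
def hourly_returns(share_prices):
    """Return period-over-period returns for a list of hourly share prices.

    The first entry has no previous price, so its return is None.
    A zero previous price also yields None instead of dividing by zero.
    """
    if not share_prices:
        return []
    returns = [None]
    for prev, cur in zip(share_prices, share_prices[1:]):
        returns.append(cur / prev - 1 if prev else None)
    return returns


def drop_abnormal(returns, max_abs_return=0.5):
    """Replace returns whose magnitude exceeds the threshold with None."""
    return [
        r if r is not None and abs(r) <= max_abs_return else None
        for r in returns
    ]
```

For a price series such as `[1.0, 1.01, 1.0201, 1000.0, 1.03]`, the jump to 1000.0 and the drop back produce returns far outside the threshold, so `drop_abnormal` blanks both while keeping the two ordinary returns of about 1%.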
Scanning all chains
There is a scan-vaults-all-chains.sh shell script (https://github.com/tradingstrategy-ai/web3-ethereum-defi/blob/master/scripts/erc-4626/scan-vaults-all-chains.sh) to scan vaults across multiple chains.
You need to feed it multiple RPC endpoints like:
export JSON_RPC_ETHEREUM=...
export JSON_RPC_BASE=...
SCAN_PRICES=true scripts/erc-4626/scan-vaults-all-chains.sh
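Before starting a multi-hour scan, it can help to check that an RPC endpoint is configured for every chain you expect to cover. The sketch below assumes the variables follow the JSON_RPC_<CHAIN> pattern of the two examples above. The exact variable names the shell script reads should be checked in the script itself, and `rpc_env_var` and `missing_rpc_vars` are hypothetical helpers.

```python
import os


def rpc_env_var(chain_name):
    """Build the assumed environment variable name for a chain."""
    return "JSON_RPC_" + chain_name.upper().replace("-", "_").replace(" ", "_")


def missing_rpc_vars(chain_names, environ=None):
    """Return the variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [
        rpc_env_var(name)
        for name in chain_names
        if not environ.get(rpc_env_var(name))
    ]
```

For example, `missing_rpc_vars(["ethereum", "base"])` returns an empty list when both JSON_RPC_ETHEREUM and JSON_RPC_BASE are exported in the current shell.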
Further reading
See the ERC-4626 API documentation.