hyperliquid.vault_scanner

Documentation for the eth_defi.hyperliquid.vault_scanner Python module.

Hyperliquid vault scanner with DuckDB storage.

This module provides functionality for scanning all Hyperliquid vaults and storing historical snapshots in a DuckDB database for tracking TVL, PnL, and other metrics over time.

Example usage:

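from eth_defi.hyperliquid.session import create_hyperliquid_session
from eth_defi.hyperliquid.vault_scanner import scan_vaults

# Scan all vaults and store snapshots in the database (uses the default path)
session = create_hyperliquid_session()
db = scan_vaults(session)
db.close()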

Module Attributes

AGE_THRESHOLD

Age threshold for disabling low-TVL vaults. Only vaults older than this threshold can be disabled for low TVL.

Functions

calculate_total_pnl(pnl_all_time)

Calculate the total PnL from the all-time PnL history array.

scan_vaults(session[, db_path, stats_url, ...])

Scan all Hyperliquid vaults and store snapshots in DuckDB.

Classes

ScanDisabled

Reasons why a vault may be excluded from scanning.

VaultSnapshot

A point-in-time snapshot of a Hyperliquid vault's state.

VaultSnapshotDatabase

DuckDB database for storing Hyperliquid vault snapshots over time.

HYPERLIQUID_VAULT_METADATA_DATABASE = PosixPath('/home/runner/.tradingstrategy/hyperliquid/vaults.duckdb')

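Default path for the Hyperliquid vault metadata database. The value shown was captured on the documentation build machine; it corresponds to ~/.tradingstrategy/hyperliquid/vaults.duckdb in the user's home directory.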

MIN_TVL_THRESHOLD = Decimal('1000')

Minimum TVL threshold in USD for scanning vaults. Vaults below this threshold AND older than AGE_THRESHOLD are marked as disabled with ScanDisabled.not_enough_tvl.

AGE_THRESHOLD = datetime.timedelta(days=30)

Age threshold for disabling low-TVL vaults. Only vaults older than this threshold can be disabled for low TVL.
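The interplay of the two thresholds can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper name `should_disable_for_low_tvl` is hypothetical, the constants are local copies of the documented values, and treating a vault with an unknown creation time as too young to disable is an assumption.

```python
import datetime
from decimal import Decimal

# Local copies of the documented module constants
MIN_TVL_THRESHOLD = Decimal("1000")
AGE_THRESHOLD = datetime.timedelta(days=30)


def should_disable_for_low_tvl(tvl, create_time, now):
    """Hypothetical helper: True only if the vault is below the TVL
    threshold AND older than the age threshold."""
    if create_time is None:
        # Assumption: with no creation time we cannot show the vault is old enough
        return False
    is_old_enough = (now - create_time) > AGE_THRESHOLD
    return tvl < MIN_TVL_THRESHOLD and is_old_enough


now = datetime.datetime(2025, 6, 1)
old = now - datetime.timedelta(days=90)
young = now - datetime.timedelta(days=5)

print(should_disable_for_low_tvl(Decimal("250"), old, now))    # small and old
print(should_disable_for_low_tvl(Decimal("250"), young, now))  # small but young
print(should_disable_for_low_tvl(Decimal("5000"), old, now))   # old but large
```

Under these assumptions, only the first vault would be marked with ScanDisabled.not_enough_tvl; a new vault gets time to attract deposits before it is dropped from scanning.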

class ScanDisabled

Bases: enum.Enum

Reasons why a vault may be excluded from scanning.

Stored as VARCHAR in DuckDB using the enum value (snake_case string).

not_enough_tvl = 'not_enough_tvl'

Vault TVL is below the threshold for scanning

manual = 'manual'

Vault has been manually disabled from scanning
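Storing the enum by its string value means a round trip through a VARCHAR column is a plain value lookup. The sketch below uses a local mirror of the documented enum so that it runs without the library; the helpers `to_column` and `from_column` are hypothetical names and are not part of the module.

```python
import enum


class ScanDisabled(enum.Enum):
    """Local mirror of the documented enum, for illustration only."""

    not_enough_tvl = "not_enough_tvl"
    manual = "manual"


def to_column(reason):
    """Hypothetical helper: optional enum member -> string for a VARCHAR column."""
    return reason.value if reason is not None else None


def from_column(value):
    """Hypothetical helper: stored string (or NULL/None) -> optional enum member."""
    return ScanDisabled(value) if value is not None else None


stored = to_column(ScanDisabled.not_enough_tvl)
restored = from_column(stored)
print(stored, restored)
```

Looking a member up by value (`ScanDisabled("manual")`) raises ValueError for unknown strings, which surfaces unexpected database contents early.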

class VaultSnapshot

Bases: object

A point-in-time snapshot of a Hyperliquid vault’s state.

Contains the key metrics we want to track over time for each vault.

snapshot_timestamp: datetime.datetime

When this snapshot was taken

vault_address: eth_typing.evm.HexAddress

Vault’s blockchain address

name: str

Vault display name

leader: eth_typing.evm.HexAddress

Vault manager/operator address

is_closed: bool

Whether vault is closed for deposits

relationship_type: str

Vault relationship type (normal, child, parent)

create_time: datetime.datetime | None

Vault creation timestamp

tvl: decimal.Decimal

Total Value Locked (USD)

apr: float | None

Annual Percentage Rate (as decimal, e.g., 0.15 = 15%)

total_pnl: decimal.Decimal | None

All-time PnL (the last value of the cumulative pnl_all_time array; see calculate_total_pnl)

follower_count: int | None

Number of followers/depositors in the vault. Note: the Hyperliquid API returns at most 100 followers, so this value is capped at 100.

scan_disabled_reason: eth_defi.hyperliquid.vault_scanner.ScanDisabled | None

Reason why this vault is disabled from future scans, or None if enabled

__init__(snapshot_timestamp, vault_address, name, leader, is_closed, relationship_type, create_time, tvl, apr, total_pnl, follower_count, scan_disabled_reason=None)

Return type

None

calculate_total_pnl(pnl_all_time)

Calculate the total PnL from the all-time PnL history array.

The pnl_all_time array contains cumulative PnL values. The last value represents the current total all-time PnL.

Parameters

pnl_all_time (list[str] | None) – List of PnL values as strings from VaultSummary.pnl_all_time

Returns

Total all-time PnL as Decimal, or None if no data

Return type

decimal.Decimal | None
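Because the array is cumulative, the calculation described above reduces to reading the last element. The following is a minimal stand-in written from that description, not the library's source; the function name `total_pnl_from_history` is hypothetical.

```python
from decimal import Decimal


def total_pnl_from_history(pnl_all_time):
    """Return the last cumulative PnL value as a Decimal, or None if there is no data."""
    if not pnl_all_time:
        # Covers both None and an empty list
        return None
    # Values arrive as strings; Decimal parses them without float rounding
    return Decimal(pnl_all_time[-1])


print(total_pnl_from_history(["10.5", "42.0", "-3.25"]))
print(total_pnl_from_history([]))
print(total_pnl_from_history(None))
```

Parsing the strings with Decimal rather than float keeps the stored value exact.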

class VaultSnapshotDatabase

Bases: object

DuckDB database for storing Hyperliquid vault snapshots over time.

Stores point-in-time snapshots of vault metrics including TVL, PnL, APR, and follower count. Each snapshot is keyed by timestamp and vault address.

Example:

from pathlib import Path
from eth_defi.hyperliquid.vault_scanner import VaultSnapshotDatabase

db = VaultSnapshotDatabase(Path("vaults.duckdb"))

# Query recent snapshots
df = db.get_latest_snapshots()
print(df)

db.close()

__init__(path)

Initialise the database connection.

Parameters

path (pathlib.Path) – Path to the DuckDB file. Parent directories will be created if needed.

insert_snapshot(snapshot)

Insert a single vault snapshot into the database.

Parameters

snapshot (eth_defi.hyperliquid.vault_scanner.VaultSnapshot) – VaultSnapshot to insert

insert_snapshots(snapshots)

Bulk insert vault snapshots into the database.

Parameters

snapshots (Iterator[eth_defi.hyperliquid.vault_scanner.VaultSnapshot]) – Iterator of VaultSnapshot objects to insert

get_latest_snapshots()

Get the most recent snapshot for each vault.

Returns

DataFrame with the latest snapshot for each vault address

Return type

pandas.DataFrame
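The "latest snapshot per vault" selection can be illustrated with a plain SQL query. The sketch below deliberately swaps in the standard-library sqlite3 module for DuckDB so it runs anywhere; the table name `vault_snapshots`, its reduced column set, and the query are assumptions for illustration and may differ from what the library actually executes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: one row per (snapshot_timestamp, vault_address)
conn.execute(
    "CREATE TABLE vault_snapshots ("
    "snapshot_timestamp TEXT, vault_address TEXT, tvl TEXT, "
    "PRIMARY KEY (snapshot_timestamp, vault_address))"
)
rows = [
    ("2025-01-01T00:00:00", "0xaaa", "1000"),
    ("2025-01-02T00:00:00", "0xaaa", "1200"),
    ("2025-01-01T00:00:00", "0xbbb", "500"),
]
conn.executemany("INSERT INTO vault_snapshots VALUES (?, ?, ?)", rows)

# Join each vault against its own maximum timestamp to keep only the newest row
latest = conn.execute(
    """
    SELECT s.vault_address, s.snapshot_timestamp, s.tvl
    FROM vault_snapshots AS s
    JOIN (
        SELECT vault_address, MAX(snapshot_timestamp) AS latest_ts
        FROM vault_snapshots
        GROUP BY vault_address
    ) AS m
      ON s.vault_address = m.vault_address
     AND s.snapshot_timestamp = m.latest_ts
    ORDER BY s.vault_address
    """
).fetchall()
print(latest)
conn.close()
```

Vaults scanned at different times each contribute their own newest row, so the result is not necessarily a single scan run.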

get_vault_history(vault_address)

Get all snapshots for a specific vault.

Parameters

vault_address (eth_typing.evm.HexAddress) – The vault’s blockchain address

Returns

DataFrame with all snapshots for the vault, ordered by timestamp

Return type

pandas.DataFrame

get_snapshots_at_time(timestamp)

Get all vault snapshots at a specific timestamp.

Parameters

timestamp (datetime.datetime) – The snapshot timestamp to query

Returns

DataFrame with all vault snapshots at that timestamp

Return type

pandas.DataFrame

get_snapshot_timestamps()

Get all unique snapshot timestamps in the database.

Returns

List of snapshot timestamps, ordered from oldest to newest

Return type

list[datetime.datetime]

get_count()

Get total number of snapshot records in the database.

Returns

Total count of snapshot records

Return type

int

get_vault_count()

Get number of unique vaults in the database.

Returns

Count of unique vault addresses

Return type

int

get_disabled_vault_addresses()

Get vault addresses that have scan_disabled_reason set in their latest snapshot.

Returns

Set of vault addresses that should be skipped during scanning

Return type

set[eth_typing.evm.HexAddress]
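Only the latest snapshot of each vault decides whether it is skipped, so a vault that was disabled earlier and later re-enabled is scanned again. A pure-Python sketch of that selection follows, using tuples instead of database rows; the function name `disabled_addresses` and the tuple layout are illustrative assumptions.

```python
import datetime


def disabled_addresses(snapshots):
    """snapshots: iterable of (timestamp, vault_address, scan_disabled_reason) tuples.

    Returns the set of addresses whose most recent snapshot has a reason set.
    """
    latest = {}
    for ts, address, reason in snapshots:
        # Keep only the newest (timestamp, reason) pair per address
        if address not in latest or ts > latest[address][0]:
            latest[address] = (ts, reason)
    return {address for address, (_, reason) in latest.items() if reason is not None}


day1 = datetime.datetime(2025, 1, 1)
day2 = datetime.datetime(2025, 1, 2)
history = [
    (day1, "0xaaa", "not_enough_tvl"),  # disabled, then re-enabled on day 2
    (day2, "0xaaa", None),
    (day1, "0xbbb", None),
    (day2, "0xbbb", "manual"),          # disabled in the latest snapshot
]
print(disabled_addresses(history))
```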

save()

Force a checkpoint to ensure data is written to disk.

close()

Close the database connection.

is_closed()

Check if the database connection is closed.

Return type

bool

scan_vaults(session, db_path=PosixPath('/home/runner/.tradingstrategy/hyperliquid/vaults.duckdb'), stats_url='https://stats-data.hyperliquid.xyz/Mainnet/vaults', fetch_follower_counts=True, timeout=30.0, limit=None, max_workers=16)

Scan all Hyperliquid vaults and store snapshots in DuckDB.

This function fetches all vault summaries from the Hyperliquid API, calculates key metrics (TVL, PnL, etc.), and stores a timestamped snapshot for each vault in the database.

Example:

from eth_defi.hyperliquid.session import create_hyperliquid_session
from eth_defi.hyperliquid.vault_scanner import scan_vaults

session = create_hyperliquid_session()
db = scan_vaults(session)

# Get latest snapshot for each vault
df = db.get_latest_snapshots()
print(f"Scanned {len(df)} vaults")

db.close()

Parameters
  • session (requests.sessions.Session) – HTTP session for API requests. Use eth_defi.hyperliquid.session.create_hyperliquid_session() to create one.

  • db_path (pathlib.Path) – Path to the DuckDB database file. Defaults to ~/.tradingstrategy/hyperliquid/vaults.duckdb.

  • stats_url (str) – Hyperliquid stats-data API URL for vault listing

  • fetch_follower_counts (bool) – If True, fetch detailed vault info to get follower counts. This requires an additional API call per vault and is slower.

  • timeout (float) – HTTP request timeout in seconds

  • limit (int | None) – Limit the number of vaults to scan. Internal testing only.

  • max_workers (int) – Maximum number of parallel workers for fetching vault details. Defaults to 16.

Returns

VaultSnapshotDatabase instance with the newly inserted snapshots

Return type

eth_defi.hyperliquid.vault_scanner.VaultSnapshotDatabase

Note

The session’s rate limiter restricts requests to 1/second by default. Having many parallel workers does not speed up processing - they will queue behind the rate limiter. With ~8000 vaults and 1 req/sec, a full scan takes approximately 2-3 hours. If you encounter 429 errors after retries are exhausted, the Hyperliquid API is rate limiting you beyond what the client-side limiter can prevent.
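The duration quoted in the note follows from simple arithmetic, sketched below. The helper `estimate_scan_duration` is hypothetical, and the default of one request per vault is an assumption; retries and any extra calls lengthen the real figure.

```python
import datetime


def estimate_scan_duration(vault_count, requests_per_second=1.0, requests_per_vault=1):
    """Lower-bound estimate of scan time when every request waits on the rate limiter."""
    total_requests = vault_count * requests_per_vault
    return datetime.timedelta(seconds=total_requests / requests_per_second)


# ~8000 vaults at 1 request/second
print(estimate_scan_duration(8000))  # 2:13:20
```

The worker count does not appear in the formula: once every request passes through the same limiter, throughput is bounded by the limiter rather than by max_workers.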