Getting Started with HyperSync

HyperSync is Envio's high-performance blockchain data engine that provides up to 2000x faster access to blockchain data compared to traditional RPC endpoints. This guide will help you understand how to effectively use HyperSync in your applications.

Core Concepts

HyperSync revolves around two main concepts:

  1. Queries - Define what blockchain data you want to retrieve
  2. Output Configuration - Specify how you want that data formatted and delivered

Think of queries as your data filter and the output configuration as your data processor.
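In practice the two always travel together: you build a query, build an output configuration, and hand both to the client. Here is a minimal sketch, assuming a client and a field_selection have already been set up as shown later in this guide:

# The query is the data filter: what to fetch and from which blocks
query = hypersync.Query(
    from_block=12345678,
    field_selection=field_selection,
)

# The output configuration is the data processor: how results are formatted and delivered
config = hypersync.StreamConfig(
    hex_output=hypersync.HexOutput.PREFIXED,
)

# Both are passed to the client together
await client.collect_parquet("data", query, config)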

Building Effective Queries

Queries are the heart of working with HyperSync. They allow you to filter for specific blocks, logs, transactions, and traces.

Query Structure

A basic HyperSync query contains:

query = hypersync.Query(
    from_block=12345678,                # Required: starting block number
    to_block=12345778,                  # Optional: ending block number
    field_selection=field_selection,    # Required: which fields to return
    logs=[log_selection],               # Optional: filter for specific logs
    transactions=[tx_selection],        # Optional: filter for specific transactions
    traces=[trace_selection],           # Optional: filter for specific traces
    include_all_blocks=False,           # Optional: include blocks with no matches
    max_num_blocks=1000,                # Optional: limit number of blocks processed
    max_num_transactions=5000,          # Optional: limit number of transactions processed
    max_num_logs=5000,                  # Optional: limit number of logs processed
    max_num_traces=5000                 # Optional: limit number of traces processed
)

Field Selection

Field selection allows you to specify exactly which data fields you want to retrieve. This improves performance by only fetching what you need:

field_selection = hypersync.FieldSelection(
    # Block fields you want to retrieve
    block=[
        BlockField.NUMBER,
        BlockField.TIMESTAMP,
        BlockField.HASH
    ],

    # Transaction fields you want to retrieve
    transaction=[
        TransactionField.HASH,
        TransactionField.FROM,
        TransactionField.TO,
        TransactionField.VALUE
    ],

    # Log fields you want to retrieve
    log=[
        LogField.ADDRESS,
        LogField.TOPIC0,
        LogField.TOPIC1,
        LogField.TOPIC2,
        LogField.TOPIC3,
        LogField.DATA,
        LogField.TRANSACTION_HASH
    ],

    # Trace fields you want to retrieve (if applicable)
    trace=[
        TraceField.ACTION_FROM,
        TraceField.ACTION_TO,
        TraceField.ACTION_VALUE
    ]
)

Filtering for Specific Data

For most use cases, you'll want to filter for specific logs, transactions, or traces:

Log Selection Example

# Filter for Transfer events from the USDC contract
log_selection = hypersync.LogSelection(
    address=["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"],  # USDC contract
    topics=[
        ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"]  # Transfer event signature hash
    ]
)
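The topic value above is simply the keccak256 hash of the canonical event signature (parameter names and indexed keywords stripped). If you ever need to derive a topic0 yourself, here is a small sketch assuming the eth_utils package is available (any keccak256 implementation works the same way):

from eth_utils import keccak

# topic0 = keccak256 of the canonical event signature
topic0 = "0x" + keccak(text="Transfer(address,address,uint256)").hex()
print(topic0)  # 0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef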

Transaction Selection Example

# Filter for transactions to the Uniswap V3 router
tx_selection = hypersync.TransactionSelection(
    to=["0xE592427A0AEce92De3Edee1F18E0157C05861564"]  # Uniswap V3 Router
)
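Selections are not used on their own; they plug into the matching fields of the query. For example, combining the transaction filter above with a field selection and block range:

# Fetch transactions sent to the Uniswap V3 router within a block range
query = hypersync.Query(
    from_block=12345678,
    to_block=12345778,
    field_selection=field_selection,
    transactions=[tx_selection],
)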

Processing the Results

HyperSync provides multiple ways to process query results:

Stream to Parquet Files

Parquet is the recommended format for large data sets:

# Configure output format
config = hypersync.StreamConfig(
    hex_output=hypersync.HexOutput.PREFIXED,
    event_signature="Transfer(address indexed from, address indexed to, uint256 value)"
)

# Stream results to Parquet files in the given directory
await client.collect_parquet("data_directory", query, config)
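After collection, the files can be loaded with any Parquet reader. A quick sketch using pandas, assuming collect_parquet has written a logs.parquet file into the output directory (the exact file layout may vary):

import pandas as pd

# Assumption: the output directory contains one Parquet file per table, e.g. logs.parquet
logs = pd.read_parquet("data_directory/logs.parquet")
print(logs.head())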

Stream to JSON Files

For smaller datasets or debugging:

# Stream results to JSON
await client.collect_json("output.json", query, config)

Process Data in Memory

For immediate processing:

# Process data directly as it streams in
async for result in client.stream(query, config):
    for log in result.logs:
        # Process each log
        print(f"Transfer from {log.event_params['from']} to {log.event_params['to']}")

Tips and Best Practices

Performance Optimization

  • Use Appropriate Batch Sizes: Adjust batch size based on your chain and use case:

    config = hypersync.ParquetConfig(
        path="data",
        hex_output=hypersync.HexOutput.PREFIXED,
        batch_size=1000000,  # Process 1M blocks at a time
        concurrency=10,      # Use 10 concurrent workers
    )
  • Enable Trace Logs: Set RUST_LOG=trace to see detailed progress:

    export RUST_LOG=trace
  • Paginate Large Queries: HyperSync requests have a 5-second time limit. For large data sets, paginate results:

    current_block = start_block
    while current_block < end_block:
        query.from_block = current_block
        query.to_block = min(current_block + 1000000, end_block)
        result = await client.collect_parquet("data", query, config)
        current_block = result.end_block + 1

Network-Specific Considerations

  • High-Volume Networks: For networks like Ethereum Mainnet, use smaller block ranges or more specific filters
  • Low-Volume Networks: For smaller chains, you can process the entire chain in one query

Complete Example

Here's a complete example that fetches USDC Transfer events over a 100,000-block range:

import hypersync
from hypersync import (
    LogSelection,
    LogField,
    BlockField,
    FieldSelection,
    TransactionField,
    HexOutput
)
import asyncio

async def collect_usdc_transfers():
    # Initialize client
    client = hypersync.HypersyncClient(
        hypersync.ClientConfig(
            url="https://ethereum.hypersync.xyz",
            bearer_token="your-token-here",  # Get from https://docs.envio.dev/docs/HyperSync/api-tokens
        )
    )

    # Define field selection
    field_selection = hypersync.FieldSelection(
        block=[BlockField.NUMBER, BlockField.TIMESTAMP],
        transaction=[TransactionField.HASH],
        log=[
            LogField.ADDRESS,
            LogField.TOPIC0,
            LogField.TOPIC1,
            LogField.TOPIC2,
            LogField.DATA,
        ]
    )

    # Define query for USDC transfers
    query = hypersync.Query(
        from_block=12000000,
        to_block=12100000,
        field_selection=field_selection,
        logs=[
            LogSelection(
                address=["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"],  # USDC contract
                topics=[
                    ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"]  # Transfer signature
                ]
            )
        ]
    )

    # Configure output
    config = hypersync.StreamConfig(
        hex_output=HexOutput.PREFIXED,
        event_signature="Transfer(address indexed from, address indexed to, uint256 value)"
    )

    # Collect data to a Parquet file
    result = await client.collect_parquet("usdc_transfers", query, config)
    print(f"Processed blocks {query.from_block} to {result.end_block}")

asyncio.run(collect_usdc_transfers())

Decoding Event Logs

When working with blockchain data, event logs contain encoded data that needs to be properly decoded to extract meaningful information. HyperSync provides powerful decoding capabilities to simplify this process.

Understanding Log Structure

Event logs in Ethereum have the following structure:

  • Address: The contract that emitted the event
  • Topic0: The event signature hash (keccak256 of the event signature)
  • Topics 1-3: Indexed parameters (up to 3)
  • Data: Non-indexed parameters packed together
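For an ERC-20 Transfer(address indexed from, address indexed to, uint256 value) event, that layout looks like this (the addresses and value below are hypothetical, shown only to illustrate the encoding):

# A raw ERC-20 Transfer log before decoding (hypothetical addresses and value)
raw_log = {
    # Contract that emitted the event (USDC)
    "address": "0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48",
    # Topic0: keccak256("Transfer(address,address,uint256)")
    "topic0": "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef",
    # Topic1: indexed `from` address, left-padded to 32 bytes
    "topic1": "0x0000000000000000000000001111111111111111111111111111111111111111",
    # Topic2: indexed `to` address, left-padded to 32 bytes
    "topic2": "0x0000000000000000000000002222222222222222222222222222222222222222",
    # Data: the non-indexed `value`, ABI-encoded as a 32-byte uint256 (1,000,000 = 1 USDC at 6 decimals)
    "data": "0x" + format(1_000_000, "064x"),
}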

Using the Decoder

HyperSync's client libraries include a Decoder class that can parse these raw logs into structured data:

// Import the Decoder from the HyperSync Node client
import { Decoder } from "@envio-dev/hypersync-client";

// Create a decoder with event signatures
const decoder = Decoder.fromSignatures([
  "Transfer(address indexed from, address indexed to, uint256 amount)",
  "Approval(address indexed owner, address indexed spender, uint256 amount)",
]);

// Decode logs
const decodedLogs = await decoder.decodeLogs(logs);

Single vs. Multiple Event Types

HyperSync provides flexibility to decode different types of event logs:

  • Single Event Type: For processing one type of event (e.g., only Swap events)

  • Multiple Event Types: For processing different events from the same contract (e.g., Transfer and Approval)

Working with Decoded Data

After decoding, you can access the log parameters in a structured way:

  • Indexed parameters: Available in decodedLog.indexed array
  • Non-indexed parameters: Available in decodedLog.body array

Each parameter object contains:

  • name: The parameter name from the signature
  • type: The Solidity type
  • val: The actual value

For example, to access parameters from a Transfer event:

// Access indexed parameters (from, to)
const from = decodedLog.indexed[0]?.val.toString();
const to = decodedLog.indexed[1]?.val.toString();

// Access non-indexed parameters (amount)
const amount = decodedLog.body[0]?.val.toString();

Benefits of Using the Decoder

  • Type Safety: Values are properly converted to their corresponding types
  • Simplified Access: Direct access to named parameters
  • Batch Processing: Decode multiple logs with a single call
  • Multiple Event Support: Handle different event types in the same processing pipeline

Next Steps

Now that you understand the basics of using HyperSync, check our client documentation for detailed API references and examples in other languages.