Getting Started with HyperSync

HyperSync is Envio's high-performance blockchain data engine that provides up to 2000x faster access to blockchain data compared to traditional RPC endpoints. This guide will help you understand how to effectively use HyperSync in your applications.

Quick Start Video

Watch this quick tutorial to see HyperSync in action:

Core Concepts

HyperSync revolves around two main concepts:

Queries - Define what blockchain data you want to retrieve
Output Configuration - Specify how you want that data formatted and delivered

Think of queries as your data filter and the output configuration as your data processor.

Building Effective Queries

Queries are the heart of working with HyperSync. They allow you to filter for specific blocks, logs, transactions, and traces.

Query Structure

A basic HyperSync query contains:

query = hypersync.Query(
    from_block=12345678,               # Required: Starting block number
    to_block=12345778,                 # Optional: Ending block number
    field_selection=field_selection,   # Required: What fields to return
    logs=[log_selection],              # Optional: Filter for specific logs
    transactions=[tx_selection],       # Optional: Filter for specific transactions
    traces=[trace_selection],          # Optional: Filter for specific traces
    include_all_blocks=False,          # Optional: Include blocks with no matches
    max_num_blocks=1000,               # Optional: Limit number of blocks processed
    max_num_transactions=5000,         # Optional: Limit number of transactions processed
    max_num_logs=5000,                 # Optional: Limit number of logs processed
    max_num_traces=5000                # Optional: Limit number of traces processed
)

Field Selection

Field selection allows you to specify exactly which data fields you want to retrieve. This improves performance by only fetching what you need:

field_selection = hypersync.FieldSelection(
    # Block fields you want to retrieve
    block=[
        BlockField.NUMBER,
        BlockField.TIMESTAMP,
        BlockField.HASH
    ],

    # Transaction fields you want to retrieve
    transaction=[
        TransactionField.HASH,
        TransactionField.FROM,
        TransactionField.TO,
        TransactionField.VALUE
    ],

    # Log fields you want to retrieve
    log=[
        LogField.ADDRESS,
        LogField.TOPIC0,
        LogField.TOPIC1,
        LogField.TOPIC2,
        LogField.TOPIC3,
        LogField.DATA,
        LogField.TRANSACTION_HASH
    ],

    # Trace fields you want to retrieve (if applicable)
    trace=[
        TraceField.ACTION_FROM,
        TraceField.ACTION_TO,
        TraceField.ACTION_VALUE
    ]
)

Filtering for Specific Data

For most use cases, you'll want to filter for specific logs, transactions, or traces:

Log Selection Example

# Filter for Transfer events from USDC contract
log_selection = hypersync.LogSelection(
    address=["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"],  # USDC contract
    topics=[
        ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"]  # Transfer event signature
    ]
)

Transaction Selection Example

# Filter for transactions to the Uniswap V3 router
tx_selection = hypersync.TransactionSelection(
    to=["0xE592427A0AEce92De3Edee1F18E0157C05861564"]  # Uniswap V3 Router
)

Processing the Results

HyperSync provides multiple ways to process query results:

Stream to Parquet Files

Parquet is the recommended format for large data sets:

# Configure output format
config = hypersync.StreamConfig(
    hex_output=hypersync.HexOutput.PREFIXED,
    event_signature="Transfer(address indexed from, address indexed to, uint256 value)"
)

# Stream results to a Parquet file
await client.collect_parquet("data_directory", query, config)

Stream to JSON Files

For smaller datasets or debugging:

# Stream results to JSON
await client.collect_json("output.json", query, config)

Process Data in Memory

For immediate processing:

# Process data directly
async for result in client.stream(query, config):
    for log in result.logs:
        # Process each log
        print(f"Transfer from {log.event_params['from']} to {log.event_params['to']}")

Tips and Best Practices

Performance Optimization

Use Appropriate Batch Sizes: Adjust batch size based on your chain and use case:

config = hypersync.ParquetConfig(
    path="data",
    hex_output=hypersync.HexOutput.PREFIXED,
    batch_size=1000000,  # Process 1M blocks at a time
    concurrency=10,      # Use 10 concurrent workers
)

Enable Trace Logs: Set RUST_LOG=trace to see detailed progress:
```
export RUST_LOG=trace
```

Paginate Large Queries: HyperSync requests have a 5-second time limit. For large data sets, paginate results:

current_block = start_block
while current_block < end_block:
    query.from_block = current_block
    query.to_block = min(current_block + 1000000, end_block)
    result = await client.collect_parquet("data", query, config)
    current_block = result.end_block + 1

Network-Specific Considerations

High-Volume Networks: For networks like Ethereum Mainnet, use smaller block ranges or more specific filters
Low-Volume Networks: For smaller chains, you can process the entire chain in one query

Complete Example

Here's a complete example that fetches all USDC Transfer events:

import hypersync
from hypersync import (
    LogSelection,
    LogField,
    BlockField,
    FieldSelection,
    TransactionField,
    HexOutput
)
import asyncio

async def collect_usdc_transfers():
    # Initialize client
    client = hypersync.HypersyncClient(
        hypersync.ClientConfig(
            url="https://ethereum.hypersync.xyz",
            bearer_token="your-token-here",  # Get from https://docs.envio.dev/docs/HyperSync/api-tokens
        )
    )

    # Define field selection
    field_selection = hypersync.FieldSelection(
        block=[BlockField.NUMBER, BlockField.TIMESTAMP],
        transaction=[TransactionField.HASH],
        log=[
            LogField.ADDRESS,
            LogField.TOPIC0,
            LogField.TOPIC1,
            LogField.TOPIC2,
            LogField.DATA,
        ]
    )

    # Define query for USDC transfers
    query = hypersync.Query(
        from_block=12000000,
        to_block=12100000,
        field_selection=field_selection,
        logs=[
            LogSelection(
                address=["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"],  # USDC contract
                topics=[
                    ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"]  # Transfer signature
                ]
            )
        ]
    )

    # Configure output
    config = hypersync.StreamConfig(
        hex_output=HexOutput.PREFIXED,
        event_signature="Transfer(address indexed from, address indexed to, uint256 value)"
    )

    # Collect data to a Parquet file
    result = await client.collect_parquet("usdc_transfers", query, config)
    print(f"Processed blocks {query.from_block} to {result.end_block}")

asyncio.run(collect_usdc_transfers())

Decoding Event Logs

When working with blockchain data, event logs contain encoded data that needs to be properly decoded to extract meaningful information. HyperSync provides powerful decoding capabilities to simplify this process.

Understanding Log Structure

Event logs in Ethereum have the following structure:

Address: The contract that emitted the event
Topic0: The event signature hash (keccak256 of the event signature)
Topics 1-3: Indexed parameters (up to 3)
Data: Non-indexed parameters packed together

Using the Decoder

HyperSync's client libraries include a Decoder class that can parse these raw logs into structured data:

// Create a decoder with event signatures
const decoder = Decoder.fromSignatures([
  "Transfer(address indexed from, address indexed to, uint256 amount)",
  "Approval(address indexed owner, address indexed spender, uint256 amount)",
]);

// Decode logs
const decodedLogs = await decoder.decodeLogs(logs);

Single vs. Multiple Event Types

HyperSync provides flexibility to decode different types of event logs:

Single Event Type: For processing one type of event (e.g., only Swap events)
- See complete example: run-decoder.js
Multiple Event Types: For processing different events from the same contract (e.g., Transfer and Approval)
- See complete example: run-decoder-multi.js

Working with Decoded Data

After decoding, you can access the log parameters in a structured way:

Indexed parameters: Available in decodedLog.indexed array
Non-indexed parameters: Available in decodedLog.body array

Each parameter object contains:

name: The parameter name from the signature
type: The Solidity type
val: The actual value

For example, to access parameters from a Transfer event:

// Access indexed parameters (from, to)
const from = decodedLog.indexed[0]?.val.toString();
const to = decodedLog.indexed[1]?.val.toString();

// Access non-indexed parameters (amount)
const amount = decodedLog.body[0]?.val.toString();

Benefits of Using the Decoder

Type Safety: Values are properly converted to their corresponding types
Simplified Access: Direct access to named parameters
Batch Processing: Decode multiple logs with a single call
Multiple Event Support: Handle different event types in the same processing pipeline

Next Steps

Now that you understand the basics of using HyperSync:

Browse the Python Client or other language-specific clients
Learn about advanced query options
See example queries for common use cases
Get your API token to start building

For detailed API references and examples in other languages, check our client documentation.

Getting Started with HyperSync

Quick Start Video​

Core Concepts​

Building Effective Queries​

Query Structure​

Field Selection​

Filtering for Specific Data​

Log Selection Example​

Transaction Selection Example​

Processing the Results​

Stream to Parquet Files​

Stream to JSON Files​

Process Data in Memory​

Tips and Best Practices​

Performance Optimization​

Network-Specific Considerations​

Complete Example​

Decoding Event Logs​

Understanding Log Structure​

Using the Decoder​

Single vs. Multiple Event Types​

Working with Decoded Data​

Benefits of Using the Decoder​

Next Steps​