# HyperSync Query

This guide explains how to structure queries for HyperSync to efficiently retrieve blockchain data. You'll learn both the basics and advanced techniques to make the most of HyperSync's powerful querying capabilities.

:::note HyperFuel Limitations
Not all features implemented in HyperSync are available in HyperFuel (the Fuel implementation of HyperSync). For example, as of this writing, stream and collect functions aren't implemented in the Fuel client.
:::

## Client Examples

HyperSync offers client libraries in multiple languages, each with its own comprehensive examples. Instead of providing generic examples here, we recommend exploring the language-specific examples:

| Client      | Example Links                                                                                |
| ----------- | -------------------------------------------------------------------------------------------- |
| **Node.js** | [Example Repository](https://github.com/enviodev/hypersync-client-node/tree/main/examples)   |
| **Python**  | [Example Repository](https://github.com/enviodev/hypersync-client-python/tree/main/examples) |
| **Rust**    | [Example Repository](https://github.com/enviodev/hypersync-client-rust/tree/main/examples)   |
| **Go**      | [Example Repository](https://github.com/enviodev/hypersync-client-go/tree/main/examples)     |

Additionally, we maintain a comprehensive collection of real-world examples covering various use cases across different languages:

- [**30 HyperSync Examples**](https://github.com/enviodev/30-hypersync-examples) - A diverse collection of practical examples demonstrating HyperSync's capabilities in Python, JavaScript, TypeScript, Rust, and more.

For more details on client libraries, see the [HyperSync Clients](./hypersync-clients) documentation.

:::tip Visual Query Builder
Need help building queries? Try our **[Intuitive Query Builder](http://builder.hypersync.xyz)** to construct queries visually and see the results in real-time.
:::

:::tip Developer Tip
Set the `RUST_LOG` environment variable to `trace` for more detailed logs when using client libraries.
:::

## Table of Contents

1. [Understanding HyperSync Queries](#understanding-hypersync-queries)
2. [Query Execution Process](#query-execution-process)
3. [Query Structure Reference](#query-structure-reference)
4. [Data Schema](#data-schema)
5. [Response Structure](#response-structure)
6. [Stream and Collect Functions](#stream-and-collect-functions)
7. [Working with Join Modes](#working-with-join-modes)
8. [Best Practices](#best-practices)

## Understanding HyperSync Queries

A HyperSync query defines what blockchain data you want to retrieve and how you want it returned. Unlike regular RPC calls, HyperSync queries offer:

- **Flexible filtering** across logs, transactions, traces, and blocks
- **Field selection** to retrieve only the data you need
- **Automatic pagination** to handle large result sets
- **Join capabilities** that link related blockchain data together

### Core Concepts

- **Selections**: Define criteria for filtering blockchain data (logs, transactions, traces, blocks)
- **Field Selection**: Specify which fields to include in the response
- **Limits**: Control query execution time and response size
- **Joins**: Determine how related data is connected in the response

## Query Execution Process

### How Data is Organized

HyperSync organizes blockchain data into groups of contiguous blocks. When executing a query:

1. The server identifies which block group contains the starting block
2. It processes data groups sequentially until it hits a limit
3. Results are returned along with a `next_block` value for pagination

### Query Limits

HyperSync enforces several types of limits to ensure efficient query execution:

| Limit Type        | Description                                      | Behavior                                                  |
| ----------------- | ------------------------------------------------ | --------------------------------------------------------- |
| **Time**          | Server-configured maximum execution time         | May slightly exceed limit to complete current block group |
| **Response Size** | Maximum data returned                            | May slightly exceed limit to complete current block group |
| **to_block**      | User-specified ending block (exclusive)          | Never exceeds this limit                                  |
| **`max_num_*`**   | User-specified maximum number of results by type | May slightly exceed limit to complete current block group |

### Execution Steps

1. Server receives query and identifies the starting block group
2. It scans each block group, applying selection criteria
3. It joins related data according to the specified join mode
4. When a limit is reached, it finishes processing the current block group
5. It returns results with pagination information
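
The steps above explain why most limits can be slightly exceeded while `to_block` never is. The following is a toy model of that behaviour, not the server's implementation: the fixed `group_size`, the `logs_per_block` list, and the `run_query` function are all assumptions made for illustration.

```python
def run_query(logs_per_block, group_size, from_block, to_block=None, max_num_logs=None):
    """Toy model: scan fixed-size block groups until a limit is reached.

    logs_per_block[i] is the number of matching logs in block i.
    Returns (logs_returned, next_block).
    """
    chain_end = len(logs_per_block)
    end = chain_end if to_block is None else min(to_block, chain_end)  # to_block is exclusive
    returned = 0
    block = from_block
    while block < end:
        # Finish the group that contains `block`, but never read past `end`.
        group_end = min((block // group_size + 1) * group_size, end)
        returned += sum(logs_per_block[block:group_end])
        block = group_end
        # The limit is only checked after a whole group, so it can be overshot.
        if max_num_logs is not None and returned >= max_num_logs:
            break
    return returned, block


one_log_per_block = [1] * 300  # hypothetical chain: 300 blocks, one matching log each
print(run_query(one_log_per_block, 100, from_block=0, max_num_logs=150))
print(run_query(one_log_per_block, 100, from_block=0, to_block=250))
```

In the first call the limit of 150 logs is reached partway through the second group, so the model returns all 200 logs from the two completed groups. In the second call the scan stops exactly at block 250.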

### Understanding Pagination

HyperSync uses a time-based pagination model that differs from traditional RPC calls:

- By default, HyperSync has a **5-second query execution limit**
- Within this time window, it processes as many blocks as possible
- For example, starting with `from_block: 0` might progress to block 10 million in a single request
- Each response includes a `next_block` value indicating where to resume for the next query
- This differs from RPC calls where you typically specify fixed block ranges (e.g., 0-1000)

#### Understanding nextBlock

`nextBlock` is the block number immediately after the last block included in the response. Use it as the `fromBlock` of your next query if you want to continue scanning. Resuming from `nextBlock` gives you a continuous, non-overlapping scan—no gaps, no duplicates.

**Usage pattern:** Call `get` or `getEvents`, process the page, then if `nextBlock` is less than your desired end (`toBlock` or `archiveHeight`), set `fromBlock = nextBlock` and repeat:

```javascript
let query = { fromBlock: 0, logs: [/* log selections */], fieldSelection: {/* fields to return */} };
while (true) {
  const res = await client.get(query);
  // Process res.data...
  const targetEnd = query.toBlock ?? res.archiveHeight;
  if (res.nextBlock >= targetEnd) break;
  query = { ...query, fromBlock: res.nextBlock };
}
```

For most use cases, the `stream` function handles pagination automatically, making it the recommended approach for processing large ranges of blocks.

### Reverse Search

HyperSync supports searching from the head of the chain backwards, which is useful for:

- Block explorers showing the most recent activity
- UIs displaying latest transactions for a user
- Any use case where recent data is more relevant

To use reverse search, add the `reverse: true` parameter to your stream call:

```javascript
// Example of reverse search to get recent transactions
const receiver = await client.stream(query, { reverse: true });

let count = 0;
while (true) {
  let res = await receiver.recv();
  if (res === null) {
    break;
  }
  for (const tx of res.data.transactions) {
    console.log(JSON.stringify(tx, null, 2));
  }
  count += res.data.transactions.length;
  if (count >= 20) {
    break;
  }
}
```

With reverse search, HyperSync starts from the latest block and works backwards, allowing you to efficiently access the most recent blockchain data first.

## Query Structure Reference

A complete HyperSync query can include the following components:

### Core Query Parameters

```rust
struct Query {
    /// The block to start the query from
    from_block: u64,

    /// The block to end the query at (exclusive)
    /// If not specified, the query runs until the end of available data
    to_block: Optional<u64>,

    /// Log selection criteria (OR relationship between selections)
    logs: Array<LogSelection>,

    /// Transaction selection criteria (OR relationship between selections)
    transactions: Array<TransactionSelection>,

    /// Trace selection criteria (OR relationship between selections)
    traces: Array<TraceSelection>,

    /// Whether to include all blocks in the requested range
    /// Default: only return blocks related to matched transactions/logs
    include_all_blocks: bool,

    /// Fields to include in the response
    field_selection: FieldSelection,

    /// Maximum results limits (approximate)
    max_num_blocks: Optional<usize>,
    max_num_transactions: Optional<usize>,
    max_num_logs: Optional<usize>,
    max_num_traces: Optional<usize>,

    /// Data relationship model (Default, JoinAll, or JoinNothing)
    join_mode: JoinMode,
}
```
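
As a rough illustration, a query for ERC-20 `Transfer` logs over a 100-block range could be assembled like this. It is a sketch that assumes the key names mirror the struct above; client libraries may expose different casing (the JavaScript example earlier uses `fromBlock`), and the block numbers and limit are arbitrary. The contract address is USDC on Ethereum mainnet.

```python
import json

# keccak256("Transfer(address,address,uint256)")
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
USDC = "0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48"

query = {
    "from_block": 17_000_000,
    "to_block": 17_000_100,  # exclusive, so this covers exactly 100 blocks
    "logs": [
        {"address": [USDC], "topics": [[TRANSFER_TOPIC0]]},
    ],
    "field_selection": {
        "block": ["number", "timestamp"],
        "transaction": ["hash", "from", "to"],
        "log": ["block_number", "transaction_hash", "address",
                "topic0", "topic1", "topic2", "data"],
    },
    "max_num_logs": 1000,  # approximate limit, see Query Limits
}

print(json.dumps(query, indent=2))
```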

### Selection Types

#### Log Selection

```rust
struct LogSelection {
    /// Contract addresses to match (empty = match all)
    address: Array<Address>,

    /// Topics to match by position (empty = match all)
    /// Each array element corresponds to a topic position (0-3)
    /// Within each position, any matching value will satisfy the condition
    topics: Array<Array<Topic>>,
}
```
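
The matching rules in the comments above can be sketched as a small local filter. This is an illustration rather than HyperSync's implementation: it assumes topic positions combine with AND (as in `eth_getLogs` filters), that address and topic conditions must both hold, and that a log is a dict with `address` and `topics` keys. The helper names are hypothetical.

```python
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def address_to_topic(address):
    """Left-pad a 20-byte address to the 32-byte form used in indexed topics."""
    return "0x" + address[2:].lower().rjust(64, "0")


def log_matches(selection, log):
    """True if the log satisfies one LogSelection."""
    addresses = [a.lower() for a in selection.get("address", [])]
    if addresses and log["address"].lower() not in addresses:
        return False
    for position, allowed in enumerate(selection.get("topics", [])):
        if not allowed:
            continue  # empty position acts as a wildcard
        topics = log["topics"]
        if position >= len(topics):
            return False
        if topics[position].lower() not in [t.lower() for t in allowed]:
            return False
    return True


def matches_any(selections, log):
    """Selections are OR-ed together."""
    return any(log_matches(s, log) for s in selections)


recipient = "0x" + "ab" * 20  # hypothetical wallet
selection = {
    "address": [],  # any contract
    "topics": [[TRANSFER_TOPIC0], [], [address_to_topic(recipient)]],
}
log = {
    "address": "0x" + "cd" * 20,
    "topics": [TRANSFER_TOPIC0, address_to_topic("0x" + "11" * 20), address_to_topic(recipient)],
}
print(matches_any([selection], log))
```

Leaving position 1 empty while filling position 2 selects transfers to `recipient` from any sender.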

#### Transaction Selection

```rust
struct TransactionSelection {
    /// Sender addresses (empty = match all)
    /// Has AND relationship with 'to' field
    from: Array<Address>,

    /// Recipient addresses (empty = match all)
    /// Has AND relationship with 'from' field
    to: Array<Address>,

    /// Method signatures to match (first 4 bytes of input)
    sighash: Array<Sighash>,

    /// Transaction status to match (1 = success, 0 = failure)
    status: Optional<u8>,

    /// Transaction types to match (e.g., 0 = legacy, 2 = EIP-1559)
    type: Array<u8>,

    /// Created contract addresses to match
    contract_address: Array<Address>,
}
```
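
Because `from` and `to` are AND-ed inside one selection while separate selections are OR-ed, "all transactions involving a wallet" needs two selections rather than one. The sketch below models that locally; `tx_matches` is a hypothetical helper, and it assumes every non-empty field in a selection must match.

```python
def tx_matches(selection, tx):
    """True if the transaction satisfies one TransactionSelection (simplified)."""
    senders = [a.lower() for a in selection.get("from", [])]
    recipients = [a.lower() for a in selection.get("to", [])]
    sighashes = [s.lower() for s in selection.get("sighash", [])]
    if senders and tx["from"].lower() not in senders:
        return False
    if recipients and (tx.get("to") or "").lower() not in recipients:
        return False
    # The sighash is the first 4 bytes of input: "0x" plus 8 hex characters.
    if sighashes and tx["input"][:10].lower() not in sighashes:
        return False
    return True


wallet = "0x" + "ab" * 20  # hypothetical wallet
outgoing = {"from": wallet, "to": "0x" + "cd" * 20, "input": "0xa9059cbb" + "00" * 64}

only_self_transfers = [{"from": [wallet], "to": [wallet]}]  # from AND to
sent_or_received = [{"from": [wallet]}, {"to": [wallet]}]   # selection OR selection

print(any(tx_matches(s, outgoing) for s in only_self_transfers))
print(any(tx_matches(s, outgoing) for s in sent_or_received))
```

Here `0xa9059cbb` is the sighash of the ERC-20 `transfer(address,uint256)` method, so `{"sighash": ["0xa9059cbb"]}` would also match `outgoing`.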

#### Block Selection

```rust
struct BlockSelection {
    /// Block hashes to match (empty = match all)
    hash: Array<Hash>,

    /// Miner/validator addresses to match (empty = match all)
    miner: Array<Address>,
}
```

#### Trace Selection

```rust
struct TraceSelection {
    /// Sender addresses (empty = match all)
    /// Has AND relationship with 'to' field
    from: Array<Address>,

    /// Recipient addresses (empty = match all)
    /// Has AND relationship with 'from' field
    to: Array<Address>,

    /// Created contract addresses to match
    address: Array<Address>,

    /// Call types to match (e.g., "call", "delegatecall")
    call_type: Array<String>,

    /// Reward types to match (e.g., "block", "uncle")
    reward_type: Array<String>,

    /// Trace types to match (e.g., "call", "create", "suicide", "reward")
    kind: Array<String>,

    /// Method signatures to match (first 4 bytes of input)
    sighash: Array<Sighash>,
}
```

#### Field Selection

```rust
struct FieldSelection {
    /// Block fields to include in response
    block: Array<String>,

    /// Transaction fields to include in response
    transaction: Array<String>,

    /// Log fields to include in response
    log: Array<String>,

    /// Trace fields to include in response
    trace: Array<String>,
}
```

## Data Schema

HyperSync organizes blockchain data into four main tables. Below are the available fields for each table.

:::info Field Naming
When specifying fields in your query, always use snake_case names (e.g., `block_number`, not `blockNumber`).
:::

### Block Fields

```python
class BlockField(StrEnum):
    # Fields present on all EVM chains
    NUMBER = 'number'                                   # Block number
    HASH = 'hash'                                       # Block hash
    PARENT_HASH = 'parent_hash'                         # Parent block hash
    SHA3_UNCLES = 'sha3_uncles'                         # SHA3 of uncles data
    LOGS_BLOOM = 'logs_bloom'                           # Bloom filter for logs
    TRANSACTIONS_ROOT = 'transactions_root'             # Root of transaction trie
    STATE_ROOT = 'state_root'                           # Root of state trie
    RECEIPTS_ROOT = 'receipts_root'                     # Root of receipts trie
    MINER = 'miner'                                     # Miner/validator address
    EXTRA_DATA = 'extra_data'                           # Extra data field
    SIZE = 'size'                                       # Block size in bytes
    GAS_LIMIT = 'gas_limit'                             # Block gas limit
    GAS_USED = 'gas_used'                               # Total gas used in block
    TIMESTAMP = 'timestamp'                             # Block timestamp (Unix time)

    # Optional fields — not present on all EVM chains (may be null)
    NONCE = 'nonce'                                     # Block nonce (absent on some L2s)
    DIFFICULTY = 'difficulty'                           # Block difficulty (PoW chains only)
    TOTAL_DIFFICULTY = 'total_difficulty'               # Total chain difficulty (PoW chains only)
    UNCLES = 'uncles'                                   # Uncle block hashes (absent on some L2s)
    MIX_HASH = 'mix_hash'                               # Mix hash (absent on some L2s)
    BASE_FEE_PER_GAS = 'base_fee_per_gas'               # EIP-1559 base fee (post-London chains only)
    BLOB_GAS_USED = 'blob_gas_used'                     # Total blob gas used (EIP-4844 chains only)
    EXCESS_BLOB_GAS = 'excess_blob_gas'                 # Excess blob gas (EIP-4844 chains only)
    PARENT_BEACON_BLOCK_ROOT = 'parent_beacon_block_root' # Parent beacon block root (EIP-4844 chains only)
    WITHDRAWALS_ROOT = 'withdrawals_root'               # Root of withdrawals trie (post-Shanghai chains only)
    WITHDRAWALS = 'withdrawals'                         # Validator withdrawals (post-Shanghai chains only)
    L1_BLOCK_NUMBER = 'l1_block_number'                 # L1 block number (Arbitrum only)
    SEND_COUNT = 'send_count'                           # Send count (Arbitrum only)
    SEND_ROOT = 'send_root'                             # Send root (Arbitrum only)
```

### Transaction Fields

```python
class TransactionField(StrEnum):
    # Block-related fields
    BLOCK_HASH = 'block_hash'           # The Keccak 256-bit hash of the block
    BLOCK_NUMBER = 'block_number'       # Block number containing the transaction

    # Transaction identifiers
    HASH = 'hash'                       # Transaction hash (keccak hash of RLP encoded signed transaction)
    TRANSACTION_INDEX = 'transaction_index' # Index of the transaction in the block

    # Transaction participants
    FROM = 'from'                       # 160-bit address of the sender
    TO = 'to'                           # 160-bit address of the recipient (null for contract creation)

    # Gas information
    GAS = 'gas'                         # Gas limit set by sender
    GAS_PRICE = 'gas_price'             # Wei paid per unit of gas
    GAS_USED = 'gas_used'               # Actual gas used by the transaction
    CUMULATIVE_GAS_USED = 'cumulative_gas_used' # Total gas used in the block up to this transaction
    EFFECTIVE_GAS_PRICE = 'effective_gas_price' # Sum of base fee and tip paid per unit of gas

    # EIP-1559 fields
    MAX_PRIORITY_FEE_PER_GAS = 'max_priority_fee_per_gas' # Max priority fee (a.k.a. GasTipCap)
    MAX_FEE_PER_GAS = 'max_fee_per_gas' # Max fee per gas (a.k.a. GasFeeCap)

    # Transaction data
    INPUT = 'input'                     # Transaction input data or contract initialization code
    VALUE = 'value'                     # Amount of ETH transferred in wei
    NONCE = 'nonce'                     # Number of transactions sent by the sender

    # Signature fields
    V = 'v'                             # Replay protection value (based on chain_id)
    R = 'r'                             # The R field of the signature
    S = 's'                             # The S field of the signature
    Y_PARITY = 'y_parity'               # Signature Y parity
    CHAIN_ID = 'chain_id'               # Chain ID for replay protection (EIP-155)

    # Contract-related fields
    CONTRACT_ADDRESS = 'contract_address' # Address of created contract (for contract creation txs)

    # Transaction result fields
    STATUS = 'status'                   # Success (1) or failure (0)
    LOGS_BLOOM = 'logs_bloom'           # Bloom filter for logs produced by this transaction
    ROOT = 'root'                       # State root (pre-Byzantium)

    # EIP-2930 fields
    ACCESS_LIST = 'access_list'         # List of addresses and storage keys to pre-warm

    # EIP-4844 (blob transactions) fields
    MAX_FEE_PER_BLOB_GAS = 'max_fee_per_blob_gas' # Max fee per data gas (blob fee cap)
    BLOB_VERSIONED_HASHES = 'blob_versioned_hashes' # List of blob versioned hashes

    # Transaction type
    # Note: older client versions call this field 'kind'; newer versions call it 'type'.
    KIND = 'type'                       # Transaction type (0=legacy, 1=EIP-2930, 2=EIP-1559, 3=EIP-4844, 4=EIP-7702)

    # L2-specific fields (for rollups)
    L1_FEE = 'l1_fee'                   # Fee for L1 data (L1GasPrice × L1GasUsed)
    L1_GAS_PRICE = 'l1_gas_price'       # Gas price on L1
    L1_GAS_USED = 'l1_gas_used'         # Amount of gas consumed on L1
    L1_FEE_SCALAR = 'l1_fee_scalar'     # Multiplier for L1 fee calculation
    GAS_USED_FOR_L1 = 'gas_used_for_l1' # Gas spent on L1 calldata in L2 gas units
```
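
As a worked example of combining the gas fields, the fee a sender paid can be estimated as `gas_used × effective_gas_price`, plus `l1_fee` on rollups that report a separate L1 data fee. This sketch assumes quantities arrive either as integers or as `0x`-prefixed hex strings and that missing optional fields are `None`; the exact representation depends on the client, and some rollups account for L1 costs differently.

```python
def to_int(value):
    """Accept ints or 0x-prefixed hex strings; treat a missing value as 0."""
    if value is None:
        return 0
    if isinstance(value, int):
        return value
    return int(value, 16)


def total_fee_wei(tx):
    """Execution fee plus the optional L1 data fee, in wei."""
    execution_fee = to_int(tx["gas_used"]) * to_int(tx["effective_gas_price"])
    return execution_fee + to_int(tx.get("l1_fee"))


# A plain ETH transfer: 21,000 gas at 30 gwei.
tx = {"gas_used": "0x5208", "effective_gas_price": "0x6fc23ac00"}
fee = total_fee_wei(tx)
print(fee, "wei =", fee / 10**18, "ETH")
```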

### Log Fields

```python
class LogField(StrEnum):
    # Log identification
    LOG_INDEX = 'log_index'             # Index of the log in the block
    TRANSACTION_INDEX = 'transaction_index' # Index of the transaction in the block

    # Transaction information
    TRANSACTION_HASH = 'transaction_hash' # Hash of the transaction that created this log

    # Block information
    BLOCK_HASH = 'block_hash'           # Hash of the block containing this log
    BLOCK_NUMBER = 'block_number'       # Block number containing this log

    # Log content
    ADDRESS = 'address'                 # Contract address that emitted the event
    DATA = 'data'                       # Non-indexed data from the event

    # Topics (indexed parameters)
    TOPIC0 = 'topic0'                   # Event signature hash
    TOPIC1 = 'topic1'                   # First indexed parameter
    TOPIC2 = 'topic2'                   # Second indexed parameter
    TOPIC3 = 'topic3'                   # Third indexed parameter

    # Reorg information
    REMOVED = 'removed'                 # True if log was removed due to chain reorganization
```
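
To show how the topic and data fields fit together, here is a minimal decoder for an ERC-20 `Transfer(address indexed from, address indexed to, uint256 value)` event. It assumes each log is available as a dict keyed by the field names above with hex-string values; the client libraries also ship their own decoders, which are likely the better choice in practice.

```python
def topic_to_address(topic):
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def decode_transfer(log):
    """Decode from/to from the indexed topics and value from the data field."""
    return {
        "from": topic_to_address(log["topic1"]),
        "to": topic_to_address(log["topic2"]),
        "value": int(log["data"], 16),
    }


sample_log = {  # hypothetical values
    "topic0": "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef",
    "topic1": "0x" + "0" * 24 + "11" * 20,
    "topic2": "0x" + "0" * 24 + "22" * 20,
    "data": "0x" + hex(1_000_000)[2:].rjust(64, "0"),
}
print(decode_transfer(sample_log))
```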

### Trace Fields

```python
class TraceField(StrEnum):
    # Trace identification
    TRANSACTION_HASH = 'transaction_hash'   # Hash of the transaction
    TRANSACTION_POSITION = 'transaction_position' # Index of the transaction in the block
    SUBTRACES = 'subtraces'                 # Number of sub-traces created during execution
    TRACE_ADDRESS = 'trace_address'         # Array indicating position in the trace tree

    # Block information
    BLOCK_HASH = 'block_hash'               # Hash of the block containing this trace
    BLOCK_NUMBER = 'block_number'           # Block number containing this trace

    # Transaction participants
    FROM = 'from'                           # Address of the sender
    TO = 'to'                               # Address of the recipient (null for contract creation)

    # Value and gas
    VALUE = 'value'                         # ETH value transferred (in wei)
    GAS = 'gas'                             # Gas limit
    GAS_USED = 'gas_used'                   # Gas actually used

    # Call data
    INPUT = 'input'                         # Call data for function calls
    INIT = 'init'                           # Initialization code for contract creation
    OUTPUT = 'output'                       # Return data from the call

    # Contract information
    ADDRESS = 'address'                     # Contract address (for creation/destruction)
    CODE = 'code'                           # Contract code

    # Trace types and categorization
    TYPE = 'type'                           # Trace type (call, create, suicide, reward)
    CALL_TYPE = 'call_type'                 # Call type (call, delegatecall, staticcall, etc.)
    REWARD_TYPE = 'reward_type'             # Reward type (block, uncle)

    # Other actors
    AUTHOR = 'author'                       # Address of receiver for reward transactions

    # Result information
    ERROR = 'error'                         # Error message if failed
```
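
The `trace_address` field is enough to rebuild the call tree of a transaction: its length is the call depth, and sorting by it yields depth-first order. The sketch below assumes the traces of a single transaction are available as dicts keyed by the field names above; `render_trace_tree` is a hypothetical helper.

```python
def render_trace_tree(traces):
    """Return one indented line per trace, in depth-first call order."""
    # Lists compare lexicographically, so [] < [0] < [0, 0] < [1].
    ordered = sorted(traces, key=lambda t: t["trace_address"])
    return [
        "  " * len(t["trace_address"]) + f'{t["call_type"]} -> {t["to"]}'
        for t in ordered
    ]


traces = [  # hypothetical traces of one transaction, in arbitrary order
    {"trace_address": [1], "call_type": "call", "to": "0xccc"},
    {"trace_address": [], "call_type": "call", "to": "0xaaa"},
    {"trace_address": [0, 0], "call_type": "staticcall", "to": "0xddd"},
    {"trace_address": [0], "call_type": "delegatecall", "to": "0xbbb"},
]
for line in render_trace_tree(traces):
    print(line)
```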

For a complete list of all available fields, refer to the [HyperSync API Reference](https://docs.envio.dev/docs/HyperSync/hypersync-query).

## Response Structure

When you execute a HyperSync query, the response includes both metadata and the requested data:

```rust
struct QueryResponse {
    /// Current height of the blockchain in HyperSync
    archive_height: Optional<u64>,

    /// Block number immediately after the last block included in this response.
    /// Use as from_block in your next query for pagination.
    next_block: u64,

    /// Query execution time in milliseconds
    total_execution_time: u64,

    /// The actual blockchain data matching your query
    data: ResponseData,

    /// Information to help handle chain reorganizations
    rollback_guard: Optional<RollbackGuard>,
}
```

The `next_block` value tells you where to resume scanning. See [Understanding nextBlock](#understanding-nextblock) for a clear definition and usage pattern.

### Rollback Guard

The optional `rollback_guard` lets you detect chain reorganizations (reorgs) between successive queries, so you can re-fetch any data that has become stale.

```rust
struct RollbackGuard {
    /// Last block scanned in this query
    block_number: u64,
    /// Timestamp of the last block scanned
    timestamp: i64,
    /// Hash of the last block scanned
    hash: Hash,

    /// First block scanned in this query
    first_block_number: u64,
    /// Parent hash of the first block scanned
    first_parent_hash: Hash,
}
```

The guard is optional (`Optional<RollbackGuard>` in the response above): it is present whenever the response covers blocks near the chain tip (where reorgs can still happen) and absent for queries that return no data.

#### How HyperSync handles reorgs internally

As HyperSync ingests new blocks it checks each block's `parent_hash` against the previous block's `hash`. When a mismatch is detected, HyperSync re-syncs the affected blocks and continues serving the canonical chain.

A single query response is always internally consistent: you will never receive a mix of blocks from different forks. The rollback guard exists to detect reorgs that happen _between_ successive queries, where data you fetched earlier may now be stale.

#### Detecting a reorg

After each query, store the guard's `block_number` and `hash`. On the next query, compare:

- `rollback_guard.hash` from the previous response (the last block you saw)
- `rollback_guard.first_parent_hash` from the new response (the parent of the first block in the new batch)

If they match, the chain is intact. If they differ, a reorg occurred somewhere between the two queries.

```
Query N:    rollback_guard.hash              = 0xABC...   (stored)

Query N+1:  rollback_guard.first_parent_hash = 0xABC...   match    -> no reorg
                                             = 0xDEF...   mismatch -> reorg
```

#### Recovering from a reorg

The guard tells you _that_ a reorg happened but not how deep. To find the depth, keep enough history to cover your chain's reorg threshold (for example, 200 blocks for Polygon) and walk backwards: re-fetch each stored block's hash and compare. The first block whose hash still matches is the last canonical block; rewind your downstream state to there and resume querying.

```python
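# Placeholders you supply: process(), rollback_state_to(), REORG_THRESHOLD,
# and client.get_block_hash() (a helper that re-fetches one block's hash).
history = []  # list of (block_number, hash)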

while True:
    res = client.get(query)
    guard = res.rollback_guard
    if guard is None:
        process(res.data)
        query.from_block = res.next_block
        continue

    if history and guard.first_parent_hash != history[-1][1]:
        # Walk back to find the last block still on chain.
        while history:
            block_num, stored_hash = history[-1]
            if client.get_block_hash(block_num) == stored_hash:
                break
            history.pop()

        rewind_to = history[-1][0] + 1 if history else query.from_block
        rollback_state_to(rewind_to)
        query.from_block = rewind_to
        continue

    process(res.data)
    history.append((guard.block_number, guard.hash))

    cutoff = guard.block_number - REORG_THRESHOLD
    history = [(b, h) for b, h in history if b >= cutoff]

    query.from_block = res.next_block
```

:::tip Consider HyperIndex
[HyperIndex](/docs/HyperIndex/overview) handles all of this for you: it tracks recent block hashes, locates the reorg point, and rolls back database state automatically. See [Reorgs Support](/docs/HyperIndex/reorgs-support) for details.
:::

## Stream and Collect Functions

For continuous data processing or building data pipelines, client libraries provide `stream` and `collect` functions that wrap the base query functionality.

:::caution Tip of Chain Warning
These functions are not designed for use at the blockchain tip where rollbacks may occur. For real-time data near the chain tip, implement a custom loop using the `get` functions and handle rollbacks manually.
:::

### Stream Function

The `stream` function:

- Runs multiple queries concurrently
- Returns a stream handle that yields results as they're available
- Optimizes performance through pipelined decoding/decompression
- Continues until reaching either `to_block` or the chain height at stream start

### Collect Functions

The `collect` functions:

- Call `stream` internally and aggregate results
- Offer different output formats (JSON, Parquet)
- Handle data that may not fit in memory

:::warning Resource Management
Always call `close()` on stream handles when finished to prevent resource leaks, especially if creating multiple streams.
:::
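
One way to make sure a handle is always closed, even when you stop early or an error is raised, is a `try`/`finally` block. The sketch below uses a synchronous stand-in (`FakeStream`) so that it runs on its own; the real clients are asynchronous, as in the JavaScript example above, and their exact method names may differ.

```python
class FakeStream:
    """Stand-in for a stream handle: recv() returns pages, then None."""

    def __init__(self, pages):
        self._pages = list(pages)
        self.closed = False

    def recv(self):
        return self._pages.pop(0) if self._pages else None

    def close(self):
        self.closed = True


def first_n_items(stream, n):
    """Read pages until n items are collected, always closing the stream."""
    items = []
    try:
        while len(items) < n:
            page = stream.recv()
            if page is None:
                break  # stream exhausted
            items.extend(page)
    finally:
        stream.close()  # runs on early exit and on exceptions
    return items[:n]


stream = FakeStream([[1, 2], [3, 4], [5]])
print(first_n_items(stream, 3), stream.closed)
```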

## Working with Join Modes

HyperSync "joins" connect related blockchain data automatically. Unlike SQL joins that combine rows from different tables, HyperSync joins determine which related records to include in the response.

### Default Join Mode (logs → transactions → traces → blocks)

With the default join mode:

1. When you query logs, you automatically get their associated transactions
2. Those transactions' traces are also included
3. The blocks containing these transactions are included

```
┌───────┐     ┌───────────────┐     ┌───────┐     ┌───────┐
│  Logs │ ──> │ Transactions  │ ──> │ Traces│ ──> │ Blocks│
└───────┘     └───────────────┘     └───────┘     └───────┘
```

### JoinAll Mode

JoinAll creates a more comprehensive network of related data:

```
                 ┌─────────────────────────────┐
                 │                             │
                 ▼                             │
┌───────┐ <──> ┌───────────────┐ <──> ┌───────┐ <──> ┌───────┐
│  Logs │      │ Transactions  │      │ Traces│      │ Blocks│
└───────┘      └───────────────┘      └───────┘      └───────┘
```

For example, if you query a trace:

1. You get the transaction that created it
2. You get ALL logs from that transaction (not just the ones matching your criteria)
3. You get ALL traces from that transaction
4. You get the block containing the transaction

### JoinNothing Mode

JoinNothing is the most restrictive:

```
┌───────┐     ┌───────────────┐     ┌───────┐     ┌───────┐
│  Logs │     │ Transactions  │     │ Traces│     │ Blocks│
└───────┘     └───────────────┘     └───────┘     └───────┘
```

Only data directly matching your selection criteria is returned, with no related records included.
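
The three modes can be compared with a small in-memory model of a query that selects logs. This is a toy illustration of the behaviour described in this section, not the server's implementation; the record shapes and the `apply_join` function are assumptions.

```python
def apply_join(mode, matched_logs, all_logs, all_traces):
    """Return which records a log query would include under each join mode."""
    if mode == "JoinNothing":
        return {"logs": matched_logs, "transactions": [], "traces": [], "blocks": []}

    tx_hashes = sorted({log["transaction_hash"] for log in matched_logs})
    if mode == "JoinAll":
        # Every log of the joined transactions, not only the matched ones.
        logs = [log for log in all_logs if log["transaction_hash"] in tx_hashes]
    else:  # "Default"
        logs = matched_logs
    traces = [t for t in all_traces if t["transaction_hash"] in tx_hashes]
    blocks = sorted({log["block_number"] for log in logs})
    return {"logs": logs, "transactions": tx_hashes, "traces": traces, "blocks": blocks}


all_logs = [
    {"id": "log-a", "transaction_hash": "0xt1", "block_number": 10},
    {"id": "log-b", "transaction_hash": "0xt1", "block_number": 10},
    {"id": "log-c", "transaction_hash": "0xt2", "block_number": 11},
]
all_traces = [
    {"id": "trace-a", "transaction_hash": "0xt1"},
    {"id": "trace-b", "transaction_hash": "0xt2"},
]
matched = [all_logs[0]]  # pretend only log-a matches the selection

for mode in ("Default", "JoinAll", "JoinNothing"):
    result = apply_join(mode, matched, all_logs, all_traces)
    print(mode, {name: len(records) for name, records in result.items()})
```

With one matching log, the default mode adds its transaction, that transaction's trace, and its block; `JoinAll` additionally pulls in the second log from the same transaction; `JoinNothing` returns the matched log alone.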

## Best Practices

To get the most out of HyperSync queries:

1. **Minimize field selection** - Only request fields you actually need to improve performance
2. **Use appropriate limits** - Set `max_num_*` parameters to control response size
3. **Choose the right join mode** - Use `JoinNothing` for minimal data, `JoinAll` for complete context
4. **Process in chunks** - For large datasets, use pagination or the `stream` function
5. **Consider Parquet** - For analytical workloads, use `collect_parquet` for efficient storage
6. **Handle chain tip carefully** - Near the chain tip, implement custom rollback handling
