Benchmarking Your Indexer Performance

Why Benchmark Your Indexer?

Benchmarking is a critical tool for understanding and optimizing your indexer's performance. By collecting and analyzing performance metrics, you can:

Identify bottlenecks in your indexing pipeline
Determine if performance issues are due to data fetching, processing, or database operations
Measure the impact of code optimizations
Set realistic expectations for indexing speed
Plan infrastructure requirements for production deployments

Running Benchmarks

Capturing Benchmark Data

To collect performance metrics while your indexer is running:

pnpm envio start --bench

Note: Benchmarking adds some memory and processing overhead. It should not be enabled in production environments, as it holds benchmark data points in memory and periodically writes them to disk.

Viewing Benchmark Results

After running your indexer with benchmarking enabled, you can generate a performance summary:

pnpm envio benchmark-summary

This command processes the collected benchmark data and displays a comprehensive performance report.

Understanding Benchmark Output

The benchmark output is divided into several sections, each providing insights into different aspects of your indexer's performance:

Time Breakdown

Time breakdown
┌─────────────────────────────────────────┬─────────┐
│ (index)                                 │ seconds │
├─────────────────────────────────────────┼─────────┤
│ Total Runtime                           │ 45      │
│ Total Time Fetching Chain 1 Partition 0 │ 44      │
│ Total Time Processing                   │ 9       │
└─────────────────────────────────────────┴─────────┘

What This Tells You:

Total Runtime: Overall time the indexer has been running
Total Time Fetching: Time spent retrieving data from the blockchain
Total Time Processing: Time spent in event handlers and database operations

How to Interpret:

If fetching time dominates (as in this example), your bottleneck is data retrieval, not processing
If processing time is high relative to fetching, your handlers may need optimization
Note that fetching and processing can overlap, so the sum may exceed total runtime

General Performance Metrics

General
┌─────────────────────┬─────────┐
│ (index)             │ Values  │
├─────────────────────┼─────────┤
│ batch sizes sum     │ 158205  │
│ total runtime (sec) │ 45.801  │
│ events per second   │ 3454.18 │
└─────────────────────┴─────────┘

What This Tells You:

Batch Sizes Sum: Total number of events processed
Total Runtime: Precise runtime in seconds
Events Per Second: Overall processing throughput

How to Interpret:

Events per second is your key performance indicator
Over 10,000 events/second represents excellent performance
1,000-5,000 events/second indicates good performance
Under 500 events/second may indicate optimization opportunities

Block Fetching Performance

BlockRangeFetched Summary for Chain 1 Root Register
┌───────────────────────────┬────┬───────────┬────────────┬──────┬──────────┬──────────┐
│ (index)                   │ n  │ mean      │ std-dev    │ min  │ max      │ sum      │
├───────────────────────────┼────┼───────────┼────────────┼──────┼──────────┼──────────┤
│ Total Time Elapsed (ms)   │ 12 │ 3675.17   │ 1147.69    │ 2329 │ 5972     │ 44102    │
│ Parsing Time Elapsed (ms) │ 12 │ 142.17    │ 40.15      │ 80   │ 235      │ 1706     │
│ Page Fetch Time (ms)      │ 12 │ 3481.58   │ 1042.93    │ 2249 │ 5737     │ 41779    │
│ Num Events                │ 12 │ 13183.75  │ 3858.92    │ 7579 │ 22426    │ 158205   │
│ Block Range Size          │ 12 │ 906593.17 │ 3006127.15 │ 149  │ 10876789 │ 10879118 │
└───────────────────────────┴────┴───────────┴────────────┴──────┴──────────┴──────────┘

What This Tells You:

Total Time Elapsed: Time spent fetching and parsing each batch of blocks
Parsing Time: Time spent decoding and preparing event data
Page Fetch Time: Time spent retrieving data from the blockchain
Num Events: Number of events in each batch
Block Range Size: Number of blocks in each fetch operation

How to Interpret:

Compare Page Fetch Time to Total Time to see if data retrieval is your bottleneck
Large standard deviations (std-dev) indicate inconsistent performance
If Block Range Size varies significantly, your indexer may be adjusting batch sizes dynamically

Event Processing Performance

EventProcessing Summary
┌─────────────────────────────────┬────┬─────────┬─────────┬─────┬──────┬────────┐
│ (index)                         │ n  │ mean    │ std-dev │ min │ max  │ sum    │
├─────────────────────────────────┼────┼─────────┼─────────┼─────┼──────┼────────┤
│ Batch Size                      │ 38 │ 4163.29 │ 1424.85 │ 89  │ 5000 │ 158205 │
│ Contract Register Duration (ms) │ 38 │ 0.11    │ 0.38    │ 0   │ 2    │ 4      │
│ Load Duration (ms)              │ 38 │ 80.79   │ 32.58   │ 5   │ 149  │ 3070   │
│ Handler Duration (ms)           │ 38 │ 22.18   │ 9.07    │ 0   │ 47   │ 843    │
│ DB Write Duration (ms)          │ 38 │ 135.92  │ 52.09   │ 8   │ 220  │ 5165   │
│ Total Time Elapsed (ms)         │ 38 │ 239     │ 83.24   │ 13  │ 370  │ 9082   │
└─────────────────────────────────┴────┴─────────┴─────────┴─────┴──────┴────────┘

What This Tells You:

Batch Size: Number of events in each processing batch
Contract Register Duration: Time spent preparing contract data
Load Duration: Time spent loading entities from the database
Handler Duration: Time spent executing your event handler logic
DB Write Duration: Time spent writing updated entities to the database
Total Time Elapsed: Overall time for the processing phase

How to Interpret:

Compare Load, Handler, and DB Write durations to identify bottlenecks
In this example, DB Write (135ms) and Load (80ms) operations dominate processing time
If Load Duration is high, consider implementing entity loaders
If DB Write Duration is high, check if you're updating too many entities per event

Per-Handler Performance

Handlers Per Event
┌─────────────────────────────┬────────┬────────┬─────────┬────────┬────────┬──────────┐
│ (index)                     │ n      │ mean   │ std-dev │ min    │ max    │ sum      │
├─────────────────────────────┼────────┼────────┼─────────┼────────┼────────┼──────────┤
│ ERC20 Transfer Handler (ms) │ 158205 │ 0.0021 │ 0.0364  │ 0.0007 │ 4.6752 │ 329.7264 │
└─────────────────────────────┴────────┴────────┴─────────┴────────┴────────┴──────────┘

What This Tells You:

Detailed timing for each specific event handler
Shows average and total execution time across all events

How to Interpret:

Compare different handlers to identify which ones are most expensive
Look for handlers with high maximum values (max column), which may indicate inconsistent performance
Handlers averaging above 1ms per event may benefit from optimization

Interpreting Results and Taking Action

Identifying Your Bottleneck

Based on the benchmark data, determine your primary performance bottleneck:

Data Fetching Bottleneck
- Symptoms: Most time spent in "Total Time Fetching"
- Solutions:
  - Use HyperSync if available for your network
  - If using RPC, consider a more performant provider
  - Adjust block batch sizes in your configuration
Data Loading Bottleneck
- Symptoms: High "Load Duration" in Event Processing
- Solutions:
  - Implement entity loaders to batch database operations
  - Add appropriate database indices for frequently queried fields
  - Optimize your entity relationships to reduce join complexity
Handler Logic Bottleneck
- Symptoms: High "Handler Duration" relative to other metrics
- Solutions:
  - Simplify complex calculations in your handlers
  - Move complex operations to a post-processing step
  - Consider caching frequently accessed values
Database Write Bottleneck
- Symptoms: High "DB Write Duration"
- Solutions:
  - Reduce the number of entities updated per event
  - Batch related updates where possible
  - Check if you're updating the same entity multiple times unnecessarily

Benchmarking Best Practices

Benchmark Before and After Optimizations
- Run benchmarks before making changes to establish a baseline
- Run again after each optimization to measure impact
Focus on the Largest Bottleneck First
- Prioritize optimizations based on where time is being spent
- Small improvements to the critical path yield the greatest results
Watch for Memory Usage
- Monitor memory consumption alongside performance metrics
- High memory usage can lead to degraded performance over time
Consider Real-World Conditions
- Test with realistic data volumes and event patterns
- Include periods of high activity in your benchmark tests

Advanced Performance Tuning

For cases where standard optimizations aren't sufficient:

Custom Database Indices
- Create indices tailored to your specific query patterns
- Add composite indices for multi-field filters
Handler Specialization
- Create specialized handlers for high-volume events
- Simplify logic for the most common paths
Speak to the Envio Team
- We can help!

By regularly benchmarking your indexer and methodically addressing performance bottlenecks, you can achieve significant improvements in indexing speed and efficiency.

Benchmarking Your Indexer Performance

Why Benchmark Your Indexer?​

Running Benchmarks​

Capturing Benchmark Data​

Viewing Benchmark Results​

Understanding Benchmark Output​

Time Breakdown​

General Performance Metrics​

Block Fetching Performance​

Event Processing Performance​

Per-Handler Performance​

Interpreting Results and Taking Action​

Identifying Your Bottleneck​

Benchmarking Best Practices​

Advanced Performance Tuning​

Why Benchmark Your Indexer?

Running Benchmarks

Capturing Benchmark Data

Viewing Benchmark Results

Understanding Benchmark Output

Time Breakdown

General Performance Metrics

Block Fetching Performance

Event Processing Performance

Per-Handler Performance

Interpreting Results and Taking Action

Identifying Your Bottleneck

Benchmarking Best Practices

Advanced Performance Tuning