Skip to main content

Existing methods to query blockchain data and their trade-offs

· 9 min read
future of blockchain indexing

Existing methods to query blockchain data and their trade-offs

Blockchain technology has transformed the data storage landscape by offering decentralized transparency and immutability. This innovative technology has sparked the creation of countless blockchain projects, each aiming to develop something unique with groundbreaking innovation.

Developers building blockchain-powered applications often encounter challenges related to the retrieval and reading of data stored on the blockchain, leaving data mostly under-utilized. The data retrieval process is inherently complex, computationally expensive, and hampers efficient querying, particularly in terms of speed, reliability, scalability, customizability, and for some protocols, multi-chain data aggregation.

These difficulties pose significant obstacles for developers, diverting their attention towards infrastructure and maintenance tasks instead of focusing on the core objective of building brilliant dApps. Moreover, optimal performance and reliability are essential for providing users with a frictionless experience when interacting with their favourite dApp.

Whether it's GameFi, where real-time game state updates are crucial as players submit their moves; NFTs, which require immediate drop status updates; DeFi, which demands real-time price and liquidity information; or Web3 Social, which strives to create a smooth user experience with instant updates, the need for efficient data querying directly impacts user satisfaction and overall UX.

Existing methods to interpret blockchain data:

The most common methods or services used to query data from the blockchain

  • Hosting your own dedicated node (e.g. locally, co-location, or cloud service provider)
  • Using an RPC node provider (e.g. public or private)
  • Blockchain indexing solution providers (backend as a service)

Hosting your own dedicated node Hosting a node yourself and then querying that node directly can be done by using an Ethereum client. An Ethereum client is the respective blockchain platform's “client” software on a machine, which will download, verify, and propagate new blocks across a blockchain network. It uses a JSON-RPC specification to provide a uniform set of methods for accessing blockchain data commonly known as an RPC node.

Web3 developers have the choice of running an RPC node to read and write data on the blockchain. However, some developers may opt to manage their own nodes, allowing for personalized node configurations, increased security, and the implementation of system-level optimizations that are otherwise unattainable when relying on shared or dedicated nodes provided by an RPC service provider.

Trade-offs There are three main trade-offs with running a node on your own compared to using an RPC node provider, including maintenance, time, and reliability costs.

  • Running your own full node requires dedicated hardware (e.g. RAM, storage, etc.) to download, validate, and store transaction information. Maintaining hardware to support changing levels of product usage is important to balance capacity and fault tolerance for your customers without overspending.
  • Running and maintaining your own blockchain nodes can involve lots of technical issues, which can be challenging and time-consuming for blockchain application developers. For Web3 startups with limited funding and engineering time, dedicating a non-trivial amount of engineering resources to managing their own infrastructure comes at the cost of not focusing on building out the core functionality of their product.
  • The maintenance costs of running a node will be highly dependent on whether you use a cloud service provider like AWS, run your own bare-metal server, engineering time, and the amount of hardware and bandwidth resources you need for your specific application.

In today's fast-paced Web3 environment, time is of the essence to stand out in a crowded space. With an endless stream of innovative products being released daily, reducing time-to-market is critical to success. - Sven, BD at Envio.

Unreliable nodes not only take time away from blockchain developers that could be building the core functionality of their product, but it directly impacts the end-user experience. When nodes are down, users cannot use your product and will experience friction, which has potential downstream implications such as user trust and retention, where users churn to alternative products.

Using an RPC node provider

RPC node providers handle all IT infrastructure setup, management, and maintenance of hosting a node and expose an endpoint for developers to make requests for blockchain data. By choosing a node provider, all node setup and maintenance responsibilities are relieved from the developer. Node providers are available for most leading blockchains such as Ethereum and Solana and also Layer-2 scaling solutions like Polygon, Avalanche, and Arbitrum.

Node RPC endpoints are classified into two primary offerings: Public and Private endpoints.

  • Public RPC endpoints are shared, rate-limited APIs available for anybody to send and receive data from the blockchain (e.g., make a transaction).
  • Private RPC endpoints are dedicated APIs that operate in isolation, in order to service the demand needs of a high-throughput application and provide a more consistent performance. Private RPC endpoints often maintain explicit service-level agreements (SLAs), guaranteeing both performance and availability.

Trade-offs

Public endpoints are free and ready to use at any time, and are often rate-limited, making them unsuitable for supporting production-grade applications. Further, public RPC endpoints have limited customer support, lack active developer infrastructure, and do not scale to the demands of running dApps.

Private RPC endpoints predominantly focus only on solving two of the said challenges for blockchain developers to query data efficiently and effectively from the blockchain:

❌ Speed

✅ Reliability

✅ Scalability

❌ Customisability

❌ Aggregation (e.g. multi-chain app, full tx history)

However, even the challenges RPC node providers aim to solve are a good debate about whether they are using the best tech and most efficient methods to solve these user problems. RPC nodes are typically base-level tech and form one of the simplest building blocks of blockchain technology.

For one, RPC nodes are request-heavy, which results in a lot of back-and-forth communication of the network and more logic built into your dApp. e.g if a user has one hundred tokens, the user may need to make one hundred requests to get the balance of every token. Moreover, you have to keep in mind that checking a balance is a base-level task. Imagine the number of requests you would require to do more advanced queries and computations. Applications built around RPC nodes are “heavy” and also difficult to maintain.

Another major limitation here is the inability to filter and aggregate data, and as mentioned above, RPC nodes are only the first step in making an expansive and functional application work as it should. Public nodes are commonly not connected to long-term transaction history storage, so you will also have to find workarounds to get a full transaction history

Blockchain indexing solution providers

Most blockchain-powered applications are making use of some kind of indexing solution, whether it’s in-house developed, are using a third-party blockchain indexing solution. In practice, a blockchain developer should never speak with an RPC node directly unless absolutely necessary. For instance, when you need to deploy a smart contract, you have to communicate directly with the RPC node. However, for most of the Web3 development process (especially to fetch data from the blockchain), this is not necessary when using blockchain indexing solutions.

Imagine your backend is also pre-built, RPC requests are optimized to your specific requirements (e.g. real-time web3 events, NFT events, etc.), and you’ve got a tool to present information within your application in just a few commands. This is where another form of querying and storing blockchain data is emerging: Blockchain data indexing solutions.

A blockchain indexer is a hosted backend that indexes and organizes blockchain data, and typically makes this data readily available for your application in an instant query-able API, such as GraphQL. Blockchain indexing solutions abstract away a lot of the complexity away from the developer by prioritizing the developer experience and offering full-stack Web3 SDKs with all the materials and tools required to help developers focus on building brilliant dApps.

It is important to note, that some blockchain indexers also allow developers to aggregate event data from multiple data sources, into a unified database, which eradicates the need to deploy and maintain multiple API instances for each blockchain for their multi-chain dApp.

Trade-offs

  • Customizability: Some solutions only offer pre-built APIs to “plug-and-play”. These often follow a pre-configured API standard according to the underlying smart contract or use case. Examples include but are not limited to an NFT API, Token API, Balance API, etc. Other solutions are indexing frameworks, that offer more customisability for application-specific needs, such as novel applications or protocols creating innovative solutions that require custom event handling.
  • Centralization: This may be a consideration point for some protocols looking to decentralize more than their underlying smart contracts. The business model for decentralized indexing solutions varies compared to centralized indexing providers. Decentralized indexers require you to participate in the network in exchange for token rewards. The token in most cases is the work utility token that coordinates data providers and consumers and incentivizes protocol participants to organize data effectively. Centralized indexers usually follow a pay-per-use or subscription model, and the centralization risk is mitigated due to the use of compliant cloud providers with best-in-class redundancy and no single points of failure.

In conclusion, effectively and efficiently querying blockchain data is crucial for developers and companies operating in the Web3 space. While hosting your own dedicated node provides customization and control, it comes with maintenance, time, and reliability costs that can distract from building core functionality. Using RPC node providers alleviates some of these challenges, but they have limitations in terms of speed, reliability, scalability, customizability, and data aggregation.

A new paradigm is emerging with blockchain indexing solution providers, which offer full-stack Web3 SDKs and abstract the complexity of backend development. These solutions prioritize developer experience, performance, and customizability, allowing for the creation of innovative applications in a shorter period of time. However, it's important to consider the trade-offs, such as potential centralization risks. Overall, leveraging blockchain indexing solutions can greatly improve the efficiency and effectiveness of querying blockchain data, enabling developers to focus on building user-friendly and impactful applications in the fast-paced Web3 environment.

Ship with us.

If you're a blockchain developer looking to enhance your development process and unlock the true potential of Web3 infrastructure, look no further. Build with Envio, and sail into the future of Web3 applications and dApps. Join our community of elite shippers today and ship with us!