How to Implement a Caching Layer in Web3 Products?

A web3 cache is a high-speed data storage layer that holds a subset of data so that future requests for it are served faster than they would be from the primary storage location. A caching layer in a web3 product makes it possible to reuse data that has already been retrieved or computed. The data is typically kept in fast-access hardware such as RAM, and the cache may be used together with a software component. Caching improves data retrieval performance by reducing the need to access the slower storage layer underneath, which in a web3 product is the blockchain network. It trades capacity for speed: a cache temporarily stores a small subset of data, whereas a database stores complete and permanent data. A familiar example is the web browser, which caches HTML files, JavaScript, and images in order to load websites more quickly.

Why is caching important in web applications?

RAM and in-memory engines support very high request rates, measured in IOPS (input/output operations per second). Supporting the same scale with traditional databases and disk-based hardware requires additional resources, which increase cost and still do not achieve the low latency of an in-memory cache. As a result, caching improves data retrieval performance and lowers cost at scale.
Caches can be applied across many layers of technology, including operating systems, networking layers such as CDNs and DNS, web applications, and databases. Caching can reduce latency and improve IOPS for many read-heavy application workloads, such as gaming, social networking, media sharing, and Q&A portals. Cached information can include the results of database queries, API requests and responses, computationally intensive calculations, and web artifacts such as HTML, JavaScript, and image files. An in-memory data layer acting as a cache also benefits compute-intensive workloads such as recommendation engines and high-performance computing simulations. These applications need real-time access to very large data sets across clusters that can span hundreds of nodes, and manipulating that data in a disk-based store is a significant bottleneck because of the speed of the underlying hardware.
In a distributed computing environment, a dedicated caching layer lets systems and applications run independently of the cache, each with its own lifecycle, without the risk of affecting the cache. The cache serves as a central layer that disparate systems can access, with its own lifecycle and topology. This is especially relevant where application nodes are dynamically scaled in and out: if the cache lives on the same node as the application that uses it, scaling can affect the cache's integrity and availability. In addition, a local cache benefits only the local application consuming the data. In a distributed caching environment, the data can span multiple cache servers and still act as one central store for all consumers of that data.

Best practices in caching

When implementing a caching layer, it is important to understand how long the cached data remains valid. A successful cache produces a high hit rate, meaning the data was present when it was requested. A cache miss occurs when the requested data is not in the cache. Controls such as TTLs (time-to-live) can be applied to expire data accordingly. We may also need to consider whether the cache environment must be highly available, which in-memory engines such as Redis can provide. In some cases, an in-memory layer is used as a standalone data store rather than as a cache of data from a primary location. To determine whether this is suitable, we need to define an RTO (Recovery Time Objective, the time it takes to recover from an outage) and an RPO (Recovery Point Objective, the last point or transaction captured in the recovery) for the data in the in-memory engine. Design strategies and features of the different in-memory engines can be applied to meet most RTO and RPO requirements.
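The hit, miss, and TTL terms above can be made concrete with a small in-process cache. This is a minimal sketch, not a production cache: the class name `TtlCache`, its methods, and the injectable clock are all assumptions made for illustration.

```javascript
// Minimal in-process TTL cache with hit/miss counters (illustrative sketch).
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which makes expiry easy to test
    this.entries = new Map(); // key -> { value, expiresAt }
    this.hits = 0;
    this.misses = 0;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns the cached value, or undefined on a miss or an expired entry.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= this.now()) {
      this.entries.delete(key); // drop expired data so it is never served
      this.misses += 1;
      return undefined;
    }
    this.hits += 1;
    return entry.value;
  }

  // Fraction of lookups that were served from the cache.
  hitRate() {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

Tracking the hit rate over time is one way to judge whether the chosen TTL is too short (many misses) or whether the cached keys are the ones clients actually request.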

Why do we need web3 caching?

Blockchains are not efficient for real-time data extraction. Traditionally, clients read from the blockchain network through RPC calls made via a web3 library, which is slow. For example, if we need one year of transaction data for a web3 product, there is no suitable single query the blockchain can answer; the data typically has to be collected block by block. To resolve this, we can add a web3 caching layer to the architecture.

Advantages of caching layer in web3

Fewer queries to the actual blockchain network.
Faster access to data.
More complex queries can be run against the caching layer using any DB, whereas queries against the blockchain are limited.
A replica of the blockchain data is created as a backup.
More user-friendly APIs can be exposed, giving better access to the data.

How to implement web3 caching?

As the architecture diagram shows, the web3 caching layer is implemented in a Docker-based container. The components of the web3 caching layer and their implementations are described below:

Blockchain follower

The blockchain follower is a microservice that processes blockchain data.

To sync blockchain data, we can create a blockchain follower service that uses a web3 listener, which fires for every new block (and therefore for every new transaction) on the blockchain network.

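const newBlocks = web3.eth.subscribe("newBlockHeaders"); // web3.js 1.x style; in 4.x, subscribe() returns a promise to await
newBlocks.on("data", (blockHeader) => processBlockHeader(blockHeader, web3));
newBlocks.on("error", (err) => console.error("subscription error:", err));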

This way, we receive data from the blockchain nodes. The data arrives in raw form, structured according to the blockchain's own format. We can process it and transform it into a more structured format that can be saved to the database. We can also index the data for faster access.
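A minimal sketch of that processing step follows. The helper name `indexBlock` and the `store` object with its `saveTransaction` method are hypothetical; `getBlock` and `getTransactionReceipt` are standard web3.js calls. A handler for the block header subscription could delegate to a function like this.

```javascript
// Hypothetical helper: turns one new block into structured rows.
// `web3` is a web3.js instance; `store` is any object exposing an async
// saveTransaction(row) method (an assumption made for this sketch).
async function indexBlock(blockHeader, web3, store) {
  // Passing `true` asks the node to return full transaction objects.
  const block = await web3.eth.getBlock(blockHeader.number, true);
  if (!block || !block.transactions) return 0;

  for (const tx of block.transactions) {
    const receipt = await web3.eth.getTransactionReceipt(tx.hash);
    await store.saveTransaction({
      hash: tx.hash.toLowerCase(),
      blockNumber: block.number,
      from: tx.from.toLowerCase(),
      to: tx.to || "", // `to` is null for contract-creation transactions
      value: String(tx.value),
      gasUsed: receipt ? receipt.gasUsed : 0,
      status: receipt ? Boolean(receipt.status) : false,
      timestamp: Number(block.timestamp),
    });
  }
  return block.transactions.length; // number of transactions indexed
}
```

Fetching receipts one by one keeps the sketch simple; a real follower would likely batch these calls and handle chain reorganizations, neither of which is shown here.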

Alongside this syncing service, we need to consider the components described in the following sections in order to make the caching layer fast, reliable, and well structured.

Database

To store the incoming blockchain data, we need a database that holds it in structured form so that it can be queried. The database helps in the following ways:

Complex queries can be executed for the required result.
Aggregated data can be fetched in order to show graphical info.
Data can be indexed on one of the DB keys, which will improve the data fetch timing.

We have used PostgreSQL here, also known as Postgres, which is a free and open-source relational database management system that emphasizes extensibility and SQL compliance. The data coming from the blockchain can be stored in the database with proper structuring so that it’s easy to query as per the need of the web3 products. Features like indexing make the queries much faster.

A record in the DB can be structured like this:

{
  blockHash: txData.blockHash || "",
  blockNumber: txData.blockNumber || 0,
  // Default before lowercasing so a missing field cannot throw
  hash: (txData.hash || "").toLowerCase(),
  from: (txData.from || "").toLowerCase(),
  to: txData.to || "",
  gas: txData.gas || "",
  // String(undefined) would yield "undefined", so check for null first
  gasPrice: txData.gasPrice != null ? String(txData.gasPrice) : "",
  gasUsed: receipt.gasUsed || 0,
  input: txData.input || "",
  nonce: txData.nonce || 0,
  transactionIndex: txData.transactionIndex || 0,
  value: txData.value || "",
  contractAddress: contractAddress || "",
  cumulativeGasUsed: cumulativeGasUsed || 0,
  logs: logs || [],
  status: status || false,
  timestamp: timestamp || 0,
  modifiedOn: Date.now(),
  createdOn: Date.now()
}
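As a sketch of how such a record might be written to Postgres, the helper below builds a parameterized INSERT in the `{ text, values }` shape that the `pg` client's `query` method accepts. The table name `transactions`, the snake_case column names, the subset of columns, and the unique constraint on `hash` implied by `ON CONFLICT` are all assumptions for illustration.

```javascript
// Illustrative column list; `from` and `to` are renamed because they are
// reserved words in SQL.
const TX_COLUMNS = [
  "hash", "block_number", "from_address", "to_address",
  "value", "status", "block_timestamp",
];

// Example index for per-account lookups (assumed table and column names).
const CREATE_INDEX_SQL =
  "CREATE INDEX IF NOT EXISTS idx_tx_from ON transactions (from_address)";

// Builds a parameterized INSERT; placeholders ($1, $2, ...) keep values
// out of the SQL text, which avoids SQL injection.
function buildInsert(record) {
  const values = [
    record.hash, record.blockNumber, record.from, record.to,
    record.value, record.status, record.timestamp,
  ];
  const placeholders = values.map((_, i) => `$${i + 1}`).join(", ");
  return {
    // ON CONFLICT makes re-processing the same block idempotent,
    // assuming a unique constraint exists on `hash`.
    text: `INSERT INTO transactions (${TX_COLUMNS.join(", ")}) ` +
      `VALUES (${placeholders}) ON CONFLICT (hash) DO NOTHING`,
    values,
  };
}
```

With a connected `pg` client, the result could be passed along as `client.query(buildInsert(record))`.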

In-Memory Database

Because we are building a caching layer, it must be fast. A web3 system usually has information that changes rarely but is requested often by the client side, and such information can be kept in an in-memory database.

Redis is an in-memory data structure store used as a distributed key-value database, cache, and message broker, with optional durability. Data that is fetched frequently and changes rarely can therefore be stored in Redis, where reads are much faster than a database query.
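One common way to combine Redis with the database is the cache-aside pattern, sketched below. The function name `getCached` and the `loadFromDb` callback are hypothetical; the `cache` argument is assumed to expose `get(key)` and `set(key, value, { EX: seconds })`, which is the shape of the node-redis v4 client.

```javascript
// Cache-aside read: try the cache first, fall back to the database on a
// miss, then populate the cache with a TTL so stale data eventually expires.
async function getCached(cache, key, ttlSeconds, loadFromDb) {
  const cached = await cache.get(key);
  if (cached !== null && cached !== undefined) {
    return JSON.parse(cached); // cache hit
  }
  const fresh = await loadFromDb(); // cache miss: go to the slower store
  await cache.set(key, JSON.stringify(fresh), { EX: ttlSeconds });
  return fresh;
}
```

A call might look like `getCached(redisClient, "token:meta:0xabc", 300, () => db.loadTokenMeta("0xabc"))`, where the key format and `db.loadTokenMeta` are again only placeholders.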

API Endpoints

Now that the caching layer is built, we need to expose APIs to the client side so that data can be retrieved from the DB. There are two approaches to defining these APIs:

One is the conventional way of defining each required API and validating every request manually.
The other, better suited to a system of this size, is GraphQL, which lets clients request exactly the data they need through a single endpoint.

GraphQL is an open-source query and manipulation language for APIs, together with a runtime for executing those queries against existing data. Instead of exposing multiple API endpoints, GraphQL can be used to give clients the flexibility to query the caching layer’s database as needed.
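A minimal sketch of what a schema and resolvers for the cached transactions might look like is shown below. The type names, field names, the result cap of 100, and the `context.db` data-access object with `findByHash` and `findByAddress` are all assumptions. The `typeDefs` plus `resolvers` pair is the shape accepted by common GraphQL servers such as Apollo Server.

```javascript
// Schema in GraphQL SDL; type and field names are illustrative.
const typeDefs = `
  type Transaction {
    hash: String!
    from: String!
    to: String
    value: String!
    blockNumber: Int!
  }

  type Query {
    transaction(hash: String!): Transaction
    transactionsByAddress(address: String!, limit: Int = 10): [Transaction!]!
  }
`;

// Resolvers use the usual (parent, args, context) signature.
// `context.db` is a hypothetical data-access object over the cache database.
const resolvers = {
  Query: {
    transaction: (_parent, { hash }, { db }) =>
      db.findByHash(hash.toLowerCase()),
    // Cap the page size so one query cannot pull an unbounded result set.
    transactionsByAddress: (_parent, { address, limit = 10 }, { db }) =>
      db.findByAddress(address.toLowerCase(), Math.min(limit, 100)),
  },
};
```

A client could then ask for only `hash` and `value` of an address's latest transactions in a single request, instead of calling several fixed endpoints.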

Socket Connections

Consider a case where the user is on the web3 product screen while, in the background, a transaction has just increased their balance. It would be ideal if the user saw the updated balance on the same screen instantly, without refreshing the web page. This gives the web3 product a more synchronized, live view of the blockchain state, and it can be achieved with Socket.IO.

Socket.IO is an event-driven library for real-time web applications. It enables bi-directional, real-time communication between web clients and servers. It can be used to deliver updates from the blockchain to the client as soon as they are processed, so that the system pushes live updates from the network to the user. As a result, users feel that they are looking at live data coming directly from the blockchain.
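The push step could be sketched as follows. `io.to(room).emit(event, payload)` is the standard Socket.IO server call; the function name `notifyBalanceChange`, the event name `balance:changed`, and the convention that each client has joined a room named after its lowercase address are assumptions for this example.

```javascript
// `io` is assumed to be a Socket.IO server instance. Each connected client
// is assumed to have joined a room named after its lowercase address.
function notifyBalanceChange(io, tx) {
  const payload = {
    hash: tx.hash,
    value: tx.value,
    blockNumber: tx.blockNumber,
  };
  // Notify both sender and receiver; Set removes the duplicate when a
  // transaction is sent to oneself, filter drops a missing `to`.
  const rooms = [...new Set(
    [tx.from, tx.to].filter(Boolean).map((address) => address.toLowerCase())
  )];
  for (const room of rooms) {
    io.to(room).emit("balance:changed", payload);
  }
  return rooms; // rooms that were notified
}
```

The blockchain follower would call this right after saving a transaction, and the client would refresh the displayed balance when the event arrives.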

B2B APIs

Apart from the GraphQL APIs, the web3 caching layer can expose a few B2B APIs, which can be public and are defined for a specific scope of work, unlike GraphQL. These APIs are pre-defined, i.e., callers do not have the freedom to query data directly from the database. Instead, each endpoint runs business logic written for its specific requirement. These APIs are focused on business needs, and the caching layer we just implemented makes them much faster than web3 calls to the actual blockchain.

How to ensure security in the web3 products caching layer?

All exposed APIs and connections to the web3 caching layer should be secured. We can secure the caching layer in web3 products in the following ways:

API Gateway

An API gateway serves as an organization’s secure access point to its APIs. API gateways use industry-standard encryption and access control, and the best ones are designed to provide robust security. An API gateway usually performs the following functions:

Acts as an inline proxy point of control over APIs.
Verifies the identity behind API requests through credential and token validation.
Determines which traffic is allowed to pass through the API to backend services.
Meters traffic through the APIs using rate limiting and throttling.
Logs all transactions and applies runtime policies to enforce governance.
Provides last-mile security for the backend services that run the APIs.
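Of the functions listed above, rate limiting is the easiest to illustrate in a few lines. The sketch below uses a fixed-window counter per client; real gateways often use token-bucket or sliding-window algorithms instead. The class name `FixedWindowLimiter` and its parameters are assumptions for illustration.

```javascript
// Fixed-window rate limiter: at most `limit` requests per client in each
// window of `windowMs` milliseconds (illustrative sketch).
class FixedWindowLimiter {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock for testing
    this.counters = new Map(); // clientId -> { windowStart, count }
  }

  // Returns true if the request may proceed, false if it should be
  // rejected (for example with HTTP 429).
  allow(clientId) {
    const windowStart =
      Math.floor(this.now() / this.windowMs) * this.windowMs;
    const entry = this.counters.get(clientId);
    if (!entry || entry.windowStart !== windowStart) {
      // First request of a new window: reset the counter.
      this.counters.set(clientId, { windowStart, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

The client identifier would typically be an API key or the subject of a validated token, so that one heavy consumer cannot exhaust the caching layer for everyone else.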

JWT Authentication

Security is crucial when protecting web resources, and integrating JSON Web Tokens (JWT) into the web application is one way to provide it. JWT is an open standard (RFC 7519) for transmitting claims between two parties in a signed, and therefore tamper-evident, form. JWTs also allow stateless session management: the backend server can verify a token’s signature locally, so it does not need to communicate with an authorization server on every request. JWT is independent of the underlying blockchain, so the same scheme can protect the caching layer’s APIs whichever chain it follows.
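To show what local verification involves, here is a minimal HS256 sketch built only on Node's `crypto` module. It is for illustration; a production system would more likely rely on a maintained JWT library. The function names `signJwt` and `verifyJwt` and the shared-secret setup are assumptions.

```javascript
const crypto = require("node:crypto");

// Base64url-encode a string, as the JWT format requires.
const b64url = (input) => Buffer.from(input).toString("base64url");

// HMAC-SHA256 over "header.body", returned as a raw Buffer.
const hmac = (data, secret) =>
  crypto.createHmac("sha256", secret).update(data).digest();

function signJwt(payload, secret) {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = b64url(JSON.stringify(payload));
  const signature = hmac(`${header}.${body}`, secret).toString("base64url");
  return `${header}.${body}.${signature}`;
}

// Returns the payload if the token is valid and unexpired, otherwise null.
function verifyJwt(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = String(token).split(".");
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts;

  const expected = hmac(`${header}.${body}`, secret);
  const actual = Buffer.from(signature, "base64url");
  // Constant-time comparison; lengths must match before timingSafeEqual.
  if (actual.length !== expected.length ||
      !crypto.timingSafeEqual(actual, expected)) {
    return null;
  }

  const { alg } = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
  if (alg !== "HS256") return null; // accept only the algorithm we issue
  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
  if (typeof payload.exp === "number" && payload.exp <= nowSeconds) return null;
  return payload;
}
```

Because verification needs only the token and the secret, any API node in front of the caching layer can authenticate a request without a shared session store.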

Session validation (if required)

There are many ways to manage user sessions. We can either store them locally on the node that responded to the request, or we can designate a layer within the architecture that stores these sessions in a robust and scalable manner. A key/value store is a common choice for holding sessions. Many application frameworks offer libraries that abstract some of the integration plumbing needed to store these sessions in memory. We can also write our own session handlers to keep the session alive.

End note

From a business perspective, building a web3 caching layer is essential for web3 products. Web3 caching allows for more efficient network utilization by reducing the number of round trips needed to request and deliver content. This can reduce the need for duplicated infrastructure, resulting in substantial cost savings and economic benefits to the entire internet ecosystem. It is also energy efficient. Moreover, commercial caching providers are able to operate at large scale and make extensive use of infrastructure shared among multiple customers.

If you plan to build a web3 caching product, feel free to contact LeewayHertz blockchain experts!

Author’s Bio

Akash Takyar

CEO LeewayHertz

Akash Takyar is the founder and CEO of LeewayHertz. With a proven track record of conceptualizing and architecting 100+ user-centric and scalable solutions for startups and enterprises, he brings a deep understanding of both technical and user experience aspects.
Akash's ability to build enterprise-grade technology solutions has garnered the trust of over 30 Fortune 500 companies, including Siemens, 3M, P&G, and Hershey's. Akash is an early adopter of new technology, a passionate technology enthusiast, and an investor in AI and IoT startups.
