Understanding the Web3 data track: market structure, representative projects and future trends


Source: SevenX Ventures

Author: FC@SevenX Ventures

If the tech buzzword of 2021 was the metaverse, then this year's spotlight is reserved for "Web3". For a while now, explainers, analyses, forecasts and skeptical takes have poured in, and the term has become a well-deserved traffic magnet.

Although people define Web3 differently, there is one point of consensus: Web3 gives users ownership of and autonomy over their own data, which is also a key factor driving the evolution from Web2 to Web3. As our lives and work are digitized more thoroughly, that is, as human activities come to be represented as data streams, this transfer of data rights becomes particularly critical.

We therefore have reason to believe that the Web3 data track will become the most important part of the new order, with broad room for development. From an entrepreneur's perspective, a decentralized network driven by blockchain technology is essentially an open, permissionless distributed database. Many scenarios in the data direction need to be served, and projects here are highly likely to evolve and grow on the right technology tree. In today's article, I will sort out the market structure and typical players of the existing Web3 data track, briefly interpret its future trends, and share some of SevenX's investment judgments.

The core points of this article:

  1. Web3 breaks down data silos and returns data rights to individual users. Users can take their data with them at any time and freely combine it with, and use it across, applications.
  2. The Web3 data track can be divided into four layers: data sources, data acquisition, data query and indexing, and data analysis and application. The main dimensions we use to judge a project are its degree of decentralization, its scalability, the speed and accuracy of its services, and how irreplaceable its scenario is.
  3. As participants in the data market become more diverse and data itself accumulates, the value of data will rise significantly. How to use data to generate greater value while better protecting privacy, in keeping with the ethos of blockchain, is another important question.
  4. Building a decentralized reputation system from multidimensional data vectors is one of the most important use cases in the Web3 data market. A reputation system makes it possible to unlock financial scenarios such as credit lending.

When I talk about Web3 data, what am I talking about?

Human civilization generates a vast amount of data as it develops. That data may be forgotten and vanish in the long river of time, or it may be recorded and settle into known history. The Internet allowed people to share data records more efficiently and more widely. The value of data has been explored further, and its importance has gradually become a consensus across society. In its May 2017 cover story, The Economist described data as "the most valuable resource in the world".

However, as more and more data accumulated on the Internet, a fundamental problem emerged: data generated by individuals created value, but the data did not belong to those individuals, and the value created was not distributed to them. People came to yearn for a new order, and Web3 emerged in response.

How does Web3 reshape the value of data? There are three main aspects:

Make data transparent and tamper-proof.

In the Web2 world, applications obtain user data by providing free services, then monopolize that data to make profits and build their own commercial moats. The data sits on their centralized servers, inaccessible to the outside world, and there is no way to know which data is stored, in what form, or at what granularity. Moreover, if these applications are attacked or decide to shut down, users' data can vanish overnight. Under the Web3 framework, with blockchain technology as the underlying layer, on-chain data is open, transparent and tamper-proof, which is the precondition for putting it to better use.

Break down data silos and improve interoperability.

Whenever you use a new application, you have to go through the registration process all over again. This is probably the most intuitive way users feel the negative impact of Web2 data silos. Because each application has its own database, independent of and disconnected from the others, the same information is collected repeatedly. At the same time, user behavior data is fragmented across different applications and can be neither reused across platforms nor integrated. In the Web3 world, broadly speaking, users need only one address to access and use all kinds of decentralized applications, and the data from each on-chain interaction at that address can be combined without permission from any application.

Better value distribution through the token economy.

How the value created by data can be distributed to the individuals who generate it is an important question Web3 needs to answer, and the evolving token economy may be the core means of achieving this redistribution. Any user who has benefited from an airdrop will have an intuitive feel for this. In the Web3 context, the data accumulated and generated through a user's interactions with any application is the carrier of value capture.

In fact, the evolution of the crypto market itself has driven much of the development of the Web3 data track. On the supply side, the formation of the multi-chain universe, the explosion of applications, the rapid growth of NFTs and the influx of new users have led to exponential growth in the types and volume of data. On the demand side, multidimensional and complex requirements have created numerous imaginative scenarios and opportunities around data acquisition, collation, access, query, processing and analysis.

Web3 data track structure diagram

The structure of the Web3 data track can be divided into four levels: the lowest level of data sources, the second level of data acquisition, the third level of data query and index, and the top level of data analysis and application.

First layer, data source

Data sources fall into on-chain and off-chain data. On-chain data mainly includes chain-level data (such as hashes and timestamps), transfer transactions, wallet addresses, smart contract events, and some cached data (such as queued transactions in the Ethereum mempool). This data is maintained by a decentralized database, and its reliability is guaranteed by blockchain consensus. Decentralized storage is another major source of on-chain data, currently concentrated in protocols such as IPFS, Arweave and Storj. Off-chain data mainly includes centralized exchange data, social media data, GitHub data, and typical Web2 metrics such as page views (PV), unique visitors (UV), daily and monthly active users, downloads and search indexes.

Over the past two years, the types and volume of data have grown exponentially. However, there are currently three problems at the data source layer:

  1. Some public chains, such as Solana, adopt a light node mode, which results in incomplete on-chain data.
  2. The storage layer is congested because of the large volume of data. My good friend Reva once uploaded her NFT works to IPFS, but when she tried to retrieve them, a file of a few hundred megabytes failed to download in two hours (imagine the frustration of failing to download a standard-definition movie in two hours). There are already projects in the market working on this problem, such as Meson Network, a SevenX portfolio company. It is a decentralized CDN network that aggregates idle servers through mining, schedules bandwidth resources and serves the file and streaming media acceleration market. Its customers include traditional websites, video, live streaming and blockchain storage solutions. At present, it supports Arweave (AR), IPFS and others.
  3. Off-chain data lacks methods for guaranteeing its authenticity, and the dimensions of data available need to be expanded.

The second layer, data acquisition

The main players in this layer are node service providers. Building your own nodes to obtain on-chain data demands a lot of time, money and technical effort, and you may also run into problems such as memory leaks and insufficient disk space. Node service providers have greatly streamlined this process. As the infrastructure of the entire data track, they were the first players to enter the game, and unicorns valued at around ten billion dollars have already emerged.

At present, well-known providers include Infura, QuickNode, Alchemy and Pocket. When choosing among them, developers and entrepreneurs mainly consider the number of chains covered, the business model and the range of additional services (is there a CDN-like service? can you access mempool data? can you get private nodes?). Given that Infura has suffered node outages more than once, whether a provider is decentralized is also one of our selection criteria. (In November 2020, Infura was not running the latest version of the Geth client, and certain special transactions triggered a bug in the version it was running. Infura went down and set off a chain reaction: major trading platforms could not fully support ERC-20 tokens, MetaMask became unusable, and so on.)
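To make "obtaining on-chain data through a node service provider" concrete, here is a minimal sketch of the JSON-RPC requests an application would send to a hosted Ethereum node. The endpoint URL is a made-up placeholder (each provider issues its own URL and API key), and the sketch only builds and decodes messages rather than calling a real service; `eth_blockNumber` and `eth_getBlockByNumber` are standard Ethereum JSON-RPC methods.

```python
import json

# Hypothetical endpoint for illustration; real providers issue their own URLs and keys.
NODE_URL = "https://example-node-provider.invalid/v1/YOUR_API_KEY"


def build_rpc_request(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request body for an Ethereum node."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def parse_hex_quantity(value):
    """Ethereum JSON-RPC encodes quantities as 0x-prefixed hex strings."""
    return int(value, 16)


# Ask for the latest block number, then for the latest block without full transactions.
latest_block_req = build_rpc_request("eth_blockNumber", [])
block_req = build_rpc_request("eth_getBlockByNumber", ["latest", False], request_id=2)

# These bodies would be POSTed to NODE_URL with Content-Type: application/json.
print(json.dumps(latest_block_req))
print(json.dumps(block_req))

# A response shaped like {"jsonrpc": "2.0", "id": 1, "result": "0x10d4f"} is decoded with:
print(parse_hex_quantity("0x10d4f"))
```

Because every provider speaks the same JSON-RPC interface, switching providers is mostly a matter of changing the endpoint, which is why coverage, pricing and extra services become the deciding factors.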

A simple comparison of the four node service providers is as follows:

On February 8 this year, Alchemy completed a USD 200 million financing round at a valuation of USD 10.2 billion. Last year, ConsenSys, the parent company of Infura, also completed a USD 200 million round at a valuation of USD 3.2 billion. By March 2022, Pocket's circulating market capitalization had reached USD 3.28 billion.

The third layer, data query and index

Sitting directly above the node service providers, which interact with the various public chains, are the market participants that provide data query and indexing services. By parsing and formatting raw data, they make it easier to access and use.

The Graph

The Graph is a decentralized on-chain data indexing protocol. Its mainnet launched in December 2020. So far, it supports indexing data from more than 30 different networks, including Ethereum, NEAR, Arbitrum, Optimism, Polygon, Avalanche, Celo, Fantom, Moonbeam and Arweave.

It resembles a traditional cloud-based API. The difference is that a traditional API is operated by a centralized company, whereas on-chain data indexing is carried out by decentralized index nodes. Through the GraphQL API, users can access information directly via subgraphs, which is fast and saves resources. The Graph has designed a GRT token mechanism to encourage multiple parties to participate in its network, including delegators, indexers, curators and developers. The business flow can be summarized as follows: users submit query requests, indexers run Graph nodes, delegators stake GRT tokens with indexers, and curators use GRT to signal which subgraphs are worth querying.
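As a rough illustration of what querying a subgraph through a GraphQL API looks like, the sketch below builds the request body a client would POST. The endpoint, the `swaps` entity and its fields are hypothetical (every subgraph defines its own schema); `first`, `orderBy` and `orderDirection` are the usual pagination and sorting arguments on subgraph collections.

```python
import json

# Hypothetical endpoint and schema: entity and field names depend on the subgraph.
SUBGRAPH_URL = "https://example-gateway.invalid/subgraphs/name/example/dex"

QUERY = """
query RecentSwaps($first: Int!) {
  swaps(first: $first, orderBy: timestamp, orderDirection: desc) {
    id
    timestamp
    amountUSD
  }
}
"""


def build_graphql_body(query, variables):
    """Serialize a GraphQL query and its variables into a JSON request body."""
    return json.dumps({"query": query, "variables": variables})


# The body would be POSTed to SUBGRAPH_URL with Content-Type: application/json.
body = build_graphql_body(QUERY, {"first": 5})
print(body)
```

The client names exactly the fields it wants, so the indexer returns only that slice of data instead of whole blocks or raw logs.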


Covalent

Covalent provides a data query layer that lets its users quickly retrieve data through an API. Currently, it supports Ethereum, BNB Chain, Avalanche, Ronin, Fantom, Moonbeam, Klaytn, HECO, Shiden and the mainstream layer-2 networks.

Covalent supports queries not only for all blockchain data types, such as transactions, balances and log events, but also for data from a specific protocol. Its most prominent feature is cross-chain querying. Unlike The Graph's subgraphs, it does not require building a new index for each chain: switching chains is just a matter of changing the chain ID. The project also has its own token, CQT, which holders can stake and use to vote on changes to the database.
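The "change the chain ID" idea can be sketched as a URL builder in which the chain is just one path parameter. The base URL and path layout below are assumptions made for illustration and are not taken from Covalent's documentation; the numeric IDs are the well-known EVM chain IDs for these networks.

```python
from urllib.parse import urlencode

# Assumed base URL and path layout, for illustration only.
BASE_URL = "https://api.example-indexer.invalid/v1"

# Well-known EVM chain IDs.
CHAIN_IDS = {"ethereum": 1, "bnb-chain": 56, "avalanche": 43114}


def balances_url(chain_id, address, api_key):
    """Build a token-balance query URL; only chain_id differs between chains."""
    query = urlencode({"key": api_key})
    return f"{BASE_URL}/{chain_id}/address/{address}/balances/?{query}"


address = "0x0000000000000000000000000000000000000000"
urls = {name: balances_url(cid, address, "DEMO_KEY") for name, cid in CHAIN_IDS.items()}

for name, url in urls.items():
    print(name, url)
```

The same client code then serves every supported chain, which is the practical appeal of a unified cross-chain query layer.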

SubQuery

SubQuery provides data query services for Polkadot and Substrate projects, allowing developers to focus on their core use cases and front ends without spending time building custom back ends for data processing. Inspired by The Graph, SubQuery also uses GraphQL, and its token economics is similar: there are three roles in the SubQuery system, namely consumers, indexers and delegators. Consumers post tasks, indexers provide the data, and delegators delegate their idle SQT tokens to indexers, which encourages indexers to do their work honestly.


Blocknative

Blocknative focuses on retrieving real-time transaction data and provides a mempool data explorer with features such as address tracking, internal transaction tracking, and information on failed transactions and on replaced transactions (sped up or cancelled). Because mempool data is not the same as final block data, it demands a high degree of real-time performance. Blocknative provides more immediate and accurate field-level queries.
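For readers unfamiliar with "replaced transactions", the sketch below shows one simple way to classify a replacement seen in the mempool. It relies on the common Ethereum convention that a pending transaction is replaced by a new one from the same sender with the same nonce and a higher fee, and that wallets usually cancel by sending a zero-value transaction to themselves. The `PendingTx` type and the rules are illustrative assumptions, not Blocknative's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingTx:
    """Simplified view of a pending transaction (hypothetical structure)."""
    tx_hash: str
    sender: str
    to: str
    nonce: int
    value_wei: int
    gas_price_wei: int


def classify_replacement(old, new):
    """Return 'speedup', 'cancel', 'other', or None if new does not replace old."""
    if old.sender != new.sender or old.nonce != new.nonce:
        return None  # different sender or nonce: unrelated transactions
    if new.gas_price_wei <= old.gas_price_wei:
        return None  # nodes generally require a higher fee to accept a replacement
    if new.to == new.sender and new.value_wei == 0:
        return "cancel"  # zero-value self-transfer that consumes the nonce
    if new.to == old.to and new.value_wei == old.value_wei:
        return "speedup"  # same payment, higher fee
    return "other"


original = PendingTx("0xaaa", "0xalice", "0xbob", 7, 10**18, 20 * 10**9)
faster = PendingTx("0xbbb", "0xalice", "0xbob", 7, 10**18, 30 * 10**9)
cancelled = PendingTx("0xccc", "0xalice", "0xalice", 7, 0, 30 * 10**9)

print(classify_replacement(original, faster))
print(classify_replacement(original, cancelled))
```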

Koii Network

Koii is a decentralized ecosystem for creators that helps them permanently own their content and earn from its value. Anyone can earn token rewards in the Koii system by deploying tasks, running nodes, or creating or registering content. The system rewards participants based on data processed through proof of real traffic, thereby closing the loop of the "attention economy". In addition, the Atomic NFT developed by the Koii team stores and licenses an NFT and its meta info (that is, the actual digital content the NFT represents) on the same chain, so all content on Koii platforms can be generated to the same standard. If this scalability succeeds well enough to push content accumulation to a certain order of magnitude, Koii will also become an important content data indexing platform.

The following projects not only provide data query and indexing services but also offer products that belong to the data analysis and application layer. For convenience, they are described here.

Dune Analytics

Dune Analytics is a comprehensive Web3 data platform for querying, analyzing and visualizing massive amounts of on-chain data. It parses on-chain data held in key-value databases and loads it into a PostgreSQL relational database. Users do not need to write scripts; anyone who can write simple SQL statements can run queries. Dune Analytics provides three types of data tables: raw transaction tables, project-level tables and aggregate tables.

Dune Analytics encourages data sharing. By default, all queries and datasets are public, and users can copy other people's dashboards and use them as references. Some of the best data analysts in the Web3 field have gathered here. Dune Analytics currently supports data queries for Ethereum, Polygon, Binance Smart Chain, Optimism and Gnosis Chain. In February this year, the company completed a Series B round of USD 69.42 million at a valuation of USD 1 billion, officially joining the ranks of unicorns.
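To give a feel for what "simple SQL statements" over decoded chain data look like, here is a self-contained sketch that uses Python's built-in `sqlite3` as a stand-in for a PostgreSQL-style warehouse. The `transactions` table, its columns and the rows are invented for illustration and do not reflect Dune's actual schema.

```python
import sqlite3

# In-memory database standing in for a relational store of decoded on-chain data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "tx_hash TEXT, block_date TEXT, from_address TEXT, value_eth REAL)"
)

# Made-up sample rows.
rows = [
    ("0xa1", "2022-03-01", "0xabc", 1.5),
    ("0xa2", "2022-03-01", "0xdef", 0.5),
    ("0xa3", "2022-03-02", "0xabc", 2.0),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

# Daily transaction count and transferred volume: the kind of query behind a dashboard chart.
DAILY_VOLUME_SQL = """
SELECT block_date, COUNT(*) AS tx_count, SUM(value_eth) AS volume_eth
FROM transactions
GROUP BY block_date
ORDER BY block_date
"""

daily = conn.execute(DAILY_VOLUME_SQL).fetchall()
for block_date, tx_count, volume_eth in daily:
    print(block_date, tx_count, volume_eth)
```

Once raw chain data has been flattened into tables like this, analysis is a matter of ordinary `GROUP BY` and `JOIN` queries rather than custom node scripts.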

Flipside Crypto

Like Dune Analytics, Flipside uses visualization tools and automatically generated API interfaces to let users query complex data with simple SQL statements, as well as copy and edit SQL queries written by others. Flipside works closely with leading crypto projects, incentivizes on-demand analysis through structured bounty programs and guidance, and helps projects quickly obtain the data insights they need to grow.

Currently, Flipside supports Ethereum, Solana, Terra, Algorand and other public chains. On April 19, Flipside announced that it had completed a USD 50 million financing round.


DeBank

DeBank is a DeFi portfolio tracker. Users can track and manage all the DeFi applications they interact with in one place, and view address balances and changes, asset distribution, token approvals, claimable rewards, lending positions and more. It currently supports 1,147 protocols on 27 networks.

In April last year, DeBank officially launched its OpenAPI program, which comprises 28 APIs, including ones for retrieving all protocols on a chain, retrieving all chains supported by a protocol together with their contract address lists, and retrieving the real-time portfolio for a protocol. Any institution or individual developer can apply to become an official partner and access DeBank's DeFi analytics data in real time. At present, imToken, TokenPocket, Math Wallet, Mask, HashKey Me, OneKey and Zerion all use DeBank's APIs. DeBank has thus successfully extended its market from data applications into data query and indexing.


CyberConnect

CyberConnect is a decentralized social graph protocol. Its approach is to build an extensible, standardized social graph module that developers can bring into new applications with a small amount of code, saving time and money. For end users, social data becomes a portable personal asset that can easily be carried over to new applications, breaking down the barriers between platforms in the Web2 world.


RSS3

RSS3 is a next-generation data indexing and distribution protocol derived from the RSS protocol. It lets users generate RSS3 files based on their addresses and link their Twitter, Mirror, instant and other social platform accounts to those files. The files synchronize users' assets, content and behavior data (such as transactions, comments and reposts) in real time, and this information is stored in RSS3's decentralized network. With users' permission, developers can retrieve the content users have published on different platforms through various API interfaces, and filter and display it according to the characteristics of their applications.


Go+

Building on its own "security engine", Go+ is committed to creating a "secure data layer" for the Web3 world. Its token security detection feature has already been released for C-end (consumer) users. By entering a token contract address, users can obtain nearly 30 security checks on that token across contract security, transaction security and information security, covering ETH, BSC, Polygon, Avalanche, Arbitrum, HECO and other public chains. The Go+ security APIs can also be integrated by other developers and downstream applications to build a more secure crypto ecosystem for their own projects. These security APIs cover token detection, NFT detection, real-time risk warnings, DApp contract security, interaction security and more.

The emergence of Go+ reflects a trend in the Web3 data track: the verticalization of data indexing. In its research, SevenX has found that as protocols and projects proliferate and user behavior grows more complex, the data market contains more and more vertical data scenarios. These scenarios are characterized by non-generic data, high-frequency user demand, and users who are both consumers and providers of data. In the future, there will be more and more data indexing, query and analysis services for these vertical scenarios, and with their clear positioning these services are likely to become breakout players in the market.

The fourth layer, data analysis and application

This layer faces C-end users directly (C-end in a broad sense, not just individual users) and delivers ready-to-use data products. These products do all the heavy, complicated work for users and present the value of the data directly, each through the lens of its own data methodology. By type of data, participants at this layer can be roughly divided into those focused on on-chain transactions, token prices, DeFi protocols, DAOs, NFTs, security, social networking and so on. Of course, more and more projects no longer confine themselves to a single type of data and instead aim to become more comprehensive data analysis platforms.

Blockchain browser

This may be the earliest data application layer product, allowing users to directly search information on the chain through web pages, including chain data, block data, transaction data, smart contract data, address data, etc.

Glassnode & Messari & CoinMetrics.io

Blockchain data and information providers that offer investors on-chain data and trading intelligence from different perspectives and through different indicators, and publish market analysis, insights and research reports.

CoinGecko & CoinMarketCap

Token analysis tools used to observe and track token prices, trading volume, market capitalization and more.

Token Terminal

Uses traditional financial indicators, such as P/S ratio, P/E ratio and protocol revenue, to analyze DeFi projects. It now also supports analysis of NFT marketplaces.
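As a worked example of applying a traditional valuation metric to a protocol, the sketch below computes a price-to-sales ratio from market capitalization and annualized revenue. The 30-day annualization and the sample figures are illustrative assumptions; individual data platforms may define and annualize these metrics differently.

```python
def annualize(revenue_30d):
    """Scale the last 30 days of revenue to a full year (a simple assumed convention)."""
    return revenue_30d * 365 / 30


def price_to_sales(market_cap, annualized_revenue):
    """P/S ratio: market capitalization divided by annualized revenue."""
    if annualized_revenue <= 0:
        raise ValueError("annualized revenue must be positive")
    return market_cap / annualized_revenue


# Made-up figures: a protocol with a USD 1 billion market cap and USD 50 million yearly revenue.
ratio = price_to_sales(1_000_000_000, 50_000_000)
print(ratio)
```

A lower ratio means the market is paying less per dollar of revenue, which makes protocols with very different token prices comparable on one scale.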


DefiLlama

A data analysis platform that specializes in DeFi TVL. It tracks the TVL of nearly a thousand DeFi protocols across 107 layer-1 and layer-2 networks, which can be categorized, compared and viewed by different indicators and time ranges. DefiLlama also supports NFT analysis, focusing on trading volume and collection types across marketplaces on different chains.
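The kind of classification and comparison a TVL dashboard performs can be sketched as a simple aggregation over per-protocol, per-chain records. The protocol names and dollar figures below are invented, and the structure is an assumption for illustration rather than any platform's real data model.

```python
from collections import defaultdict

# Made-up TVL records: one row per protocol per chain.
records = [
    {"protocol": "LendingA", "chain": "Ethereum", "tvl_usd": 500.0},
    {"protocol": "LendingA", "chain": "Polygon", "tvl_usd": 100.0},
    {"protocol": "DexB", "chain": "Ethereum", "tvl_usd": 300.0},
    {"protocol": "DexB", "chain": "Avalanche", "tvl_usd": 50.0},
]


def tvl_by(rows, key):
    """Sum TVL grouped by the given field, sorted from largest to smallest."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["tvl_usd"]
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


by_chain = tvl_by(records, "chain")
by_protocol = tvl_by(records, "protocol")

print(by_chain)
print(by_protocol)
```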


A data platform focused on the NFT market that provides services such as data analysis and whale wallet monitoring, aiming to help users better track and evaluate NFT projects and assets and make wise investment decisions.


Nansen

If you had to sum up Nansen in one word, it would be "labels". Nansen has collected and analyzed more than 50 million Ethereum wallet addresses and their activity, and combines on-chain data with a database of millions of labels to help users find signals and new investment opportunities. Nansen is currently one of the most prominent projects in the Web3 data analysis and application layer. Last December, it completed a USD 75 million financing round at a valuation of USD 750 million.


Chainalysis

Founded in 2014 and known as the "FBI of the blockchain", Chainalysis is an enterprise data solutions company. Through on-chain data monitoring and analysis, it helps customers such as governments, cryptocurrency exchanges, international law enforcement agencies and banks meet compliance requirements, assess risk and identify illegal activity. Last June, Chainalysis announced a USD 100 million Series E round at a valuation of USD 4.2 billion.

Footprint Analytics

Footprint is a comprehensive data analysis platform for discovering and visualizing blockchain data. Compared with other applications, Footprint has a lower barrier to entry and is very friendly to novice users. The platform provides a rich set of data analysis templates, supports one-click forking, and helps users easily create and manage personalized dashboards. Footprint also labels wallet addresses and their on-chain activity, so users can make investment decisions using indicators across many dimensions.

Zerion & Zapper

Among the earliest DeFi portfolio trackers and managers; both have since added support for NFT assets.


DeepDAO

DeepDAO is a comprehensive data platform focused on DAOs. Users can easily view a DAO's treasury size and changes, the distribution of treasury tokens, governance token holdings, active members, proposals and votes. DeepDAO also lists dozens of tools for creating and managing DAOs.

There are many applications in this layer, so I won't list them one by one here.

In fact, SevenX has been following the data track since very early on and has invested in DeBank, Zerion, Footprint, Koii, DeepDAO, RSS3, CyberConnect and Go+. In the course of screening projects, we have built up some experience and judgments, which we share briefly here.

Overall, application-layer traffic is no longer the core moat. Users may migrate quickly at any time because another product is easier to use or updates faster. Products that can supply data themselves and form a closed data loop with their users will be more competitive. That said, before such a moat has formed, traffic-driven products still have a chance to feed that traffic back into building one.

How do we evaluate? There are five dimensions:

1. Scenario selection:

(1) Does the demand exist, and is it mature enough now or will it emerge in the future?

When looking for demand, a project should judge how mature that demand is and what stage it is at. Take GoPlus as an example. In the world of DeFi, "security" has become a necessity and is a need shared by almost everyone. This demand was activated and gradually matured after an endless stream of security incidents that ordinary users find hard to identify and prevent. As a result, users are now willing to pay more, or spend a reasonable amount, for a safer experience.

(2) Should the team build the C-end product or the protocol first?

We believe that when demand in a scenario has not yet been fully stimulated, a team should first build C-end products to find users' pain points; otherwise it is easy to end up holding a hammer and looking for nails. For example, GoPlus built the Go Pocket wallet early on, which works like a show home: with it, partners can better understand what problem the product solves, which greatly helps when extending the protocol to B-end customers.

Going forward, SevenX will focus on GameFi, DeFi, DAO, NFT, social networking, security and other scenarios.

2. Data capability:

Data acquisition and structuring are basic skills; the key is whether the team has data capabilities grounded in an understanding of the industry.

3. C-end product capability:

C-end product capability mainly depends on whether the team can identify an urgent need of its audience to use as a cold-start wedge, and whether the product is easy to use.

4. To-B expansion capability:

Expanding to B-end customers involves a complex decision-making process. We consider whether the project can win benchmark customers, or whether it can efficiently acquire long-tail users in line with its product positioning.

5. Team background:

  1. A Web2 background in the relevant vertical, with experience of independently running a project
  2. Open source community experience
  3. The ability to learn quickly and without prejudice

Web3 data possibilities

As on-chain analysis grows, the anonymity of blockchains is gradually being eroded. For example, we can track the trading addresses and behavior of large investors using Nansen's labels, and we can identify the activities, organizations and on-chain behavior associated with an address from the address alone. This exposes our data to full view and takes away our right to choose privacy. Nansen recently said it has labeled more than 100 million wallets, which makes the need for privacy ever more pressing.

The current privacy solutions mainly include privacy coins, privacy computing protocols, privacy transaction networks, privacy applications, etc.

If we want selective disclosure of our on-chain transactions or activities, or want the process to be invisible while the results remain visible, we can turn to privacy computing protocols such as Oasis Network. Common technologies include zero-knowledge proofs, secure multi-party computation, federated learning based on modern cryptography, and trusted execution environments (TEE).

However, the protocols currently available are relatively limited, and most are still under development. Secret Network is the most popular one. The public chain has launched applications such as the cross-chain bridge Secret Bridge, the privacy DeFi protocol Sienna Network, the privacy trading protocol SecretSwap, and Shinobi Protocol, a trustless privacy solution for Bitcoin.

Since the second half of 2021, top VCs and developers have been pouring into the privacy track in large numbers. I believe that as this market develops, people will find a balance between using data to generate greater value and better protecting privacy in keeping with the ethos of blockchain.

Finally, let's briefly talk about our judgment on the market trend: building a decentralized reputation system through multidimensional data vectors is one of the most important use cases in the Web3 data market. Based on the reputation system, it is possible to unlock various financial scenarios such as credit lending.

Lending has always been an important part of the DeFi ecosystem. At present, the products on the market are mainly collateralized lending (usually over-collateralized) and flash loans. Credit lending that does not rely (or does not fully rely) on collateral has long been considered the most important direction of evolution, because credit creates a freer exchange market.

However, the biggest obstacle to introducing credit lending in DeFi is that the lender faces nothing but an address and cannot effectively verify the repayment ability of the borrower behind it, or whether that borrower has a bad credit record. Some solutions try to achieve this by bringing off-chain credit data on-chain, but the question of how to guarantee the authenticity of the off-chain data itself, and of the process of putting it on-chain, has not been answered well.

Now, as on-chain identity systems gradually improve and the data available for analysis grows in step with data analysis tools, what users create, contribute, earn and own on-chain can gradually accumulate into their reputation, making it possible for one address to assess the creditworthiness of another. Lens Protocol, backed by Aave, is doing exactly this kind of thing, using NFTs to manage data and lay the foundation for unsecured on-chain credit loans.
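To make the idea of a reputation built from multidimensional on-chain data more tangible, here is a toy scoring sketch. The features, caps and weights are all assumptions chosen for illustration; they do not describe Lens Protocol or any existing credit model, and a real system would need far more careful design and resistance to Sybil attacks.

```python
from dataclasses import dataclass


@dataclass
class AddressActivity:
    """Hypothetical on-chain activity summary for a single address."""
    wallet_age_days: int
    loans_repaid: int
    loans_defaulted: int
    dao_votes: int
    nfts_held: int


# Assumed weights in points; they sum to 100, so the score ranges from 0 to 100.
WEIGHTS = {"age": 20, "repayment": 50, "governance": 20, "holdings": 10}


def reputation_score(activity):
    """Combine normalized features (each in [0, 1]) into a 0-100 score."""
    total_loans = activity.loans_repaid + activity.loans_defaulted
    repay_rate = activity.loans_repaid / total_loans if total_loans else 0.0
    features = {
        "age": min(activity.wallet_age_days / 730, 1.0),   # capped at two years
        "repayment": repay_rate,                           # share of loans repaid
        "governance": min(activity.dao_votes / 20, 1.0),   # capped at 20 votes
        "holdings": min(activity.nfts_held / 10, 1.0),     # capped at 10 NFTs
    }
    return round(sum(WEIGHTS[name] * value for name, value in features.items()), 1)


veteran = AddressActivity(wallet_age_days=900, loans_repaid=8, loans_defaulted=0,
                          dao_votes=25, nfts_held=12)
defaulter = AddressActivity(wallet_age_days=900, loans_repaid=4, loans_defaulted=4,
                            dao_votes=25, nfts_held=12)
newcomer = AddressActivity(wallet_age_days=0, loans_repaid=0, loans_defaulted=0,
                           dao_votes=0, nfts_held=0)

print(reputation_score(veteran), reputation_score(defaulter), reputation_score(newcomer))
```

A lender could then map score bands to collateral requirements, which is the sense in which reputation unlocks under-collateralized credit.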

Closing thoughts

Although ten-billion-dollar unicorns have already emerged, the Web3 data track has only just begun. As we stand in the torrent of the on-chain application explosion, every bit and byte defines what kind of Web3 citizen you are. We need to find new orders and paradigms to jointly resist the rising entropy of the new world.
