Developer onboarding

Developing Subgraphs

As a subgraph developer, you define which blockchain data The Graph indexes and how it is stored. A subgraph definition consists of three files:

  • subgraph.yaml: The central YAML file in which the subgraph manifest is stored.
  • schema.graphql: Defines what data is stored and how it can be queried via GraphQL.
  • AssemblyScript Mappings: Translate blockchain event data into the entities defined in your schema (mapping.ts in this tutorial).

As a prerequisite, you have to install The Graph CLI, which allows you to create and deploy a subgraph. Let’s have a look at how it all works.

1. Installing the Graph CLI.

Before installing The Graph CLI, install yarn (installation instructions are in the graph-cli repository). You can also use npm, but for the sake of brevity, we will use yarn in this tutorial.

Once you have installed yarn, you can run the following command to install the Graph CLI:


2. Creating a Subgraph Project.

In the following, we will show you two ways of creating a subgraph. You can either create a subgraph from an existing contract or use an example subgraph.

A) Creating a subgraph from existing contract

You can use an already existing smart contract to bootstrap your new subgraph. Follow this part of the instruction in case you already have deployed a smart contract to Ethereum or a testnet.

First of all, we will create a subgraph that indexes all events of an existing smart contract. The CLI will attempt to fetch the contract ABI from Etherscan. If unsuccessful, it will fall back to requesting a local file path. An interactive form will guide you through the process in case any of the optional arguments are missing.

  • GITHUB_USER: The name of your organization or your GitHub user
  • SUBGRAPH_NAME: The name you want to give your subgraph
  • DIRECTORY: (Optional) – To define the directory where graph init stores your subgraph manifest.

These are the supported networks on the Hosted Service:

goerli
kovan
mainnet
matic
mumbai
poa-core
poa-sokol
rinkeby
ropsten
xdai

B) Creating a subgraph by using an example subgraph

If you do not have an existing smart contract, you can use the sample subgraph of The Graph team. To initialize the example subgraph, use this command:


In this tutorial, we’re going to use a subgraph based on the Gravity contract. The smart contract manages user avatars. Whenever avatars are created or updated, the contract emits NewGravatar or UpdatedGravatar events.

These events are then written to the Graph Node store as Gravatar entities by the subgraph.

3. Creating a Subgraph Manifest

To create a subgraph manifest, we use the example of the Gravity contract mentioned above.

The subgraph manifest is stored in subgraph.yaml. It defines which smart contracts your subgraph indexes, which events from these contracts to monitor and how to map the event data to entities that The Graph Node stores. Detailed specifications for the subgraph manifest can be found in the documentation.

This is the subgraph.yaml for the example subgraph that indexes the Gravity contract:


Here’s a detailed explanation of the most important entries of the manifest:

  • description: This entry allows you to describe what your subgraph is all about. It is human-readable and will be displayed by the Graph Explorer when the subgraph is deployed to the Hosted Service.
  • repository: Include the repository’s URL where the subgraph manifest is located. The URL is also displayed by the Graph Explorer once deployed.
  • dataSources.source: Defines the smart contract address to be sourced by the subgraph and which abi to use. Note: address is optional. By omitting it, the subgraph is allowed to index matching events from all contracts.
  • dataSources.source.startBlock: (Optional). Defines the block number that the data source starts indexing from. It is suggested to use the block in which the smart contract was created.
  • dataSources.mapping.entities: This entry defines which entities are written to the store by the data source. schema.graphql defines the schema for each entity.
  • dataSources.mapping.abis: The ABI file(s) for the source contract and any other smart contract you interact with from within the mappings.
  • dataSources.mapping.eventHandlers: Use this entry to define the events of a smart contract that your subgraph reacts to, along with the handlers in the mapping that transform these events into entities in the store. In our sample case, the mapping is ./src/mapping.ts.
  • dataSources.mapping.callHandlers: This entry allows you to list all the functions of the smart contract your subgraph reacts to, along with the handlers in the mapping that transform the inputs and outputs of function calls into entities in the store.
  • dataSources.mapping.blockHandlers: Defines the blocks your subgraph reacts to and which handlers in the mapping to run when a block is appended to the chain. Please note: the block handler will be run for every block if you do not specify a filter. You can provide an optional filter of the following kind: call. With the call filter, the handler runs only if the block contains at least one call to the data source contract.

You can use one subgraph to index the data originating from multiple smart contracts. To do so, simply add an entry to the dataSources array for each contract from which you want to index data.

Ordering rules

The following process is used to order the triggers for a data source within a block:

  1. The transaction index within the block orders event and call triggers
  2. Should there be event and call triggers within the same transaction, the following convention is used for ordering: event triggers first then call triggers. Each type respects the particular order they are defined in the manifest.
  3. After event and call triggers, block triggers are run in the order as defined in the manifest.
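The three ordering rules above can be sketched as a comparison function. This is only an illustration: the Trigger shape and the names orderTriggers, transactionIndex and manifestIndex are assumptions made for this sketch and are not Graph Node types.

```typescript
// Simplified stand-in for a trigger; field names are illustrative only.
type TriggerKind = "event" | "call" | "block";

interface Trigger {
  kind: TriggerKind;
  // Position of the transaction within the block; ignored for block triggers.
  transactionIndex: number;
  // Position of the handler in the manifest for this trigger kind.
  manifestIndex: number;
}

const kindRank: Record<TriggerKind, number> = { event: 0, call: 1, block: 2 };

function orderTriggers(triggers: Trigger[]): Trigger[] {
  return triggers.slice().sort((a, b) => {
    const aIsBlock = a.kind === "block";
    const bIsBlock = b.kind === "block";
    // Rule 3: block triggers run after all event and call triggers.
    if (aIsBlock !== bIsBlock) return aIsBlock ? 1 : -1;
    // Rule 1: event and call triggers are ordered by transaction index.
    if (!aIsBlock && a.transactionIndex !== b.transactionIndex) {
      return a.transactionIndex - b.transactionIndex;
    }
    // Rule 2: within one transaction, event triggers come before call triggers.
    if (a.kind !== b.kind) return kindRank[a.kind] - kindRank[b.kind];
    // Within one kind, keep the order defined in the manifest.
    return a.manifestIndex - b.manifestIndex;
  });
}
```

Sorting a copy keeps the input untouched, which makes the function easy to check against a handful of hand-written triggers.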

Getting the ABIs

Please note that the ABI file(s) will have to match your contract(s). You can obtain ABI files in the following ways:

  • Use your most current ABIs if you are building your own project
  • If you are building a subgraph for a public project, download that project to your computer and use truffle compile or solc to generate the ABI
  • The third alternative is to use Etherscan. However, this is not always reliable, as the ABI found on Etherscan may not be the latest one. If this is the case, your subgraph will fail to run.

4. The GraphQL Schema.

You can find the schema for your subgraph in schema.graphql. Use the GraphQL interface definition language to define the GraphQL schema.

If you want to learn more about writing a GraphQL schema, have a look at the primer on the GraphQL type system. Edge & Node also provides reference documentation for GraphQL schemas.

5. Defining Entities.

In this section, we are going to look at how you can structure and link your data. This is a prerequisite for defining entities, because every query is made against the data model defined in the subgraph schema and the entities indexed by your subgraph. Make sure to define the subgraph schema to match your dApp’s needs. Think of entities not as events or functions but as objects containing data.

Defining entity types with The Graph is straightforward: define them in schema.graphql, and Graph Node will do the rest by generating top-level fields for querying single instances and collections of that entity type. Just make sure to annotate each type that should be an entity with an @entity directive.

A) Good example

To come back to our sample Gravatar contract, let’s have a look at a good example of defining entities correctly. You can see that the Gravatar entity shown below is structured around a Gravatar object.


B) Bad example

In contrast to the example above, defining entities based around events (like the GravatarAccepted and GravatarDeclined entities) is not the way an entity should be defined. Mapping events or function calls 1:1 to entities is not recommended.


C) Optional and Required Fields

Please note that entity fields can be defined as either required or optional. A ! in the schema indicates that a field is required. An error message will be shown if a required field is not set in the mapping:


It is necessary for each entity to have an id field, which is of type ID! (string). Note that the id field needs to be unique among all entities of the same type as it serves as the primary key.

D) Built-In Scalar Types

GraphQL Supported Scalars

The following scalars are supported in the GraphQL API:

Type Description
Bytes Byte array, represented as a hexadecimal string. Commonly used for Ethereum hashes and addresses.
ID Stored as a string.
String Scalar for string values. Null characters are not supported and are automatically removed.
Boolean Scalar for boolean values.
Int The GraphQL spec defines Int to have a size of 32 bits.
BigInt Large integers. Used for Ethereum’s uint32, int64, uint64, …, uint256 types. Note: Everything below uint32, such as int32, uint24 or int8 is represented as i32.
BigDecimal High precision decimals represented as a significand and an exponent. The exponent range is from −6143 to +6144. Rounded to 34 significant digits.
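
The note in the BigInt row can be turned into a small rule of thumb. The sketch below assumes that a Solidity integer type is represented as i32 exactly when all of its values fit into a signed 32-bit integer, which matches the examples in the table (int8, uint24 and int32 as i32; uint32 and larger as BigInt). The helper name mappedIntegerType is made up for this illustration and is not part of any library.

```typescript
// Which representation a Solidity integer type gets, following the table above.
type MappedIntegerType = "i32" | "BigInt";

function mappedIntegerType(solidityType: string): MappedIntegerType {
  const match = /^(u?)int(\d+)$/.exec(solidityType);
  if (match === null) {
    throw new Error("not a sized integer type: " + solidityType);
  }
  const unsigned = match[1] === "u";
  const bits = Number(match[2]);
  // A signed 32-bit integer can hold every intN with N <= 32,
  // but only those uintN with N < 32.
  const fitsInI32 = unsigned ? bits < 32 : bits <= 32;
  return fitsInI32 ? "i32" : "BigInt";
}
```
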
Enums

The syntax for creating enums within a schema is as follows:


After you have defined the enum in the schema, set an enum field on an entity by using the string representation of the enum value.

Let’s have a look at how this all works with an example. Let’s say you want to set the tokenStatus to SecondOwner. You would do this by defining your entity first and setting the field with entity.tokenStatus = "SecondOwner".
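
To illustrate setting an enum field through its string representation, here is a minimal sketch in plain TypeScript. The enum values other than SecondOwner, the TokenEntity class and the isTokenStatus helper are assumptions for this example. The real entity class would be generated from your schema.

```typescript
// Assumed enum values; the text above only names "SecondOwner".
type TokenStatus = "OriginalOwner" | "SecondOwner" | "ThirdOwner";

// Simplified stand-in for a generated entity class.
class TokenEntity {
  tokenStatus: TokenStatus = "OriginalOwner";
  constructor(public id: string) {}
}

// Check that an arbitrary string is one of the enum's string representations.
function isTokenStatus(value: string): value is TokenStatus {
  return value === "OriginalOwner" || value === "SecondOwner" || value === "ThirdOwner";
}

// The enum field is set with the string representation of the value.
const exampleToken = new TokenEntity("1");
exampleToken.tokenStatus = "SecondOwner";
```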

Here’s an example of what the Token entity with an enum field would look like:


If you want to learn more about writing enums, have a look at the GraphQL documentation.

E) Entity Relationships

Let’s have a look at entity relationships next. In your schema, an entity may have a relationship to one or more entities. The relationships are unidirectional in The Graph and may be traversed in your queries. Bidirectional relationships can be simulated by defining a unidirectional relationship on either “end” of the relationship.

You can define relationships in the same manner as with other fields. However, it is important to note that the type specified needs to be that of another entity.

There are two relationship types: one-to-one relationships and one-to-many relationships. We will discuss both in the following and give you examples for each.

One-To-One Relationships

Here is how you can define an optional one-to-one relationship between a Transaction entity type and a TransactionReceipt entity type:

One-To-Many Relationships

If you want to define a required one-to-many relationship for a TokenBalance entity type with a Token entity, use the following:


F) Reverse Lookups

Defining reverse lookups on an entity is accomplished through the @derivedFrom field. This creates a virtual field on the entity that can be queried but cannot be set manually through the mappings API. Instead, the field is derived from the relationship defined on the other entity.

You can increase the indexing and querying performance by only storing one side of the relationship and deriving the other. In almost all cases, it is not recommended to store both sides of the relationship.

When it comes to storing one-to-many relationships, it is recommended to always store the relationship on the ‘one’ side, leaving the ‘many’ side derived. By not storing an array of entities on the ‘many’ side, you can dramatically increase both indexing and querying performance of your subgraph. As a general rule, avoid storing arrays of entities as much as practically possible.
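
Conceptually, a derived field is computed from the stored side of the relationship. The following sketch models this in plain TypeScript. The record shape and the deriveTokenBalances function are illustrative assumptions and do not show how Graph Node resolves derived fields internally.

```typescript
// Each balance stores a single reference to its token (the stored side).
interface TokenBalanceRecord {
  id: string;
  token: string; // ID of the Token this balance belongs to
  amount: number;
}

// The token keeps no array of balances. The list is derived on demand
// by looking at the `token` reference stored on each balance.
function deriveTokenBalances(
  tokenId: string,
  allBalances: TokenBalanceRecord[]
): TokenBalanceRecord[] {
  return allBalances.filter((balance) => balance.token === tokenId);
}
```

Because nothing is stored on the token itself, adding a new balance never requires rewriting a growing array on the token.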

Example for a Reverse Lookup

Let’s have a look at an example that illustrates what we’ve learned so far. By deriving a tokenBalances field, token balances for a specific token can be made accessible:


G) Many-To-Many Relationships

Let’s say you want to define a many-to-many relationship for any number of users all belonging to any number of organizations. Modeling the relationship as an array in each of the two entities involved (users and organizations) would be the most straightforward way. However, this is not the most performant option for a symmetric relationship.

If you want to define a symmetric relationship, store only one side of the relationship and derive the other side for the highest performance.

Example for Defining a Symmetric Relationship

Let’s return to our example of defining relationships between users and organizations. In the following, we’ll have a look at how you can define a reverse lookup (from a User entity type to an Organization entity type).

One way to achieve this is to look up the members attribute from within the Organization entity:


When it comes to querying, the organizations field on User will be resolved by finding all Organization entities that include the user’s ID.

To increase performance, you can create a mapping table to store the relationship. For each User / Organization pair, there is one entry. Here’s an example schema for a mapping table:


By using this strategy, queries will be required to descend into one additional level in order to retrieve a user’s organization or the organization ID:


With this approach, your subgraph will often be dramatically faster to index and query, because this more elaborate way of storing many-to-many relationships results in less data being stored for the subgraph.
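
The mapping-table idea can be modeled in a few lines of plain TypeScript. The record shape, the ID convention (user ID and organization ID joined by a dash) and the function names are assumptions for this sketch.

```typescript
// One record per User / Organization pair.
interface UserOrganizationRecord {
  id: string;
  user: string; // ID of the User
  organization: string; // ID of the Organization
}

// Assumed ID convention for this sketch: "<userId>-<organizationId>".
function linkUserToOrganization(
  userId: string,
  organizationId: string
): UserOrganizationRecord {
  return { id: userId + "-" + organizationId, user: userId, organization: organizationId };
}

// Retrieving a user's organizations goes through the mapping table,
// which is the additional level mentioned above.
function organizationsOfUser(userId: string, links: UserOrganizationRecord[]): string[] {
  return links.filter((link) => link.user === userId).map((link) => link.organization);
}
```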

H) Adding comments to the schema

You can add comments above schema entities for readers by using double quotations "". Here’s an example of how you can add comments:


6. Defining Fulltext Search Fields.

In The Graph, subgraph developers can use text search inputs to filter and rank entities by enabling fulltext search queries. Another functionality of fulltext queries is to process the query text input into stems before comparing it to the indexed text data. This allows developers to return matches for similar words.

Here are the elements of a fulltext query definition:

  • Fields included in the search
  • Ranking algorithm used to order the results
  • Query name
  • Language dictionary used to process text fields

Please note: All fields included in the definition must be from a single entity type, while each fulltext query may span multiple fields.

You can add a fulltext query by including a _Schema_ type with a fulltext directive in the GraphQL schema.


To filter Band entities in queries based on the text documents in the name, description, and bio fields, use the example bandSearch field. For a detailed description of the fulltext search API, have a look at GraphQL API – Queries.


A) Languages supported

The fulltext search API supports a variety of languages. Please take into consideration that choosing a language other than English has definite, though sometimes very subtle, effects on the fulltext search API.

The chosen language defines the context in which fields covered by a fulltext query field will be examined. This results in varying lexemes that are being produced by analysis and search queries depending on the language used.

As an example, choosing the supported Turkish dictionary will cause the word “token” to be stemmed to “toke”, while the English dictionary will stem it to “token”.

These are the supported language dictionaries:

Code Dictionary
simple General
da Danish
nl Dutch
en English
fi Finnish
fr French
de German
hu Hungarian
it Italian
no Norwegian
pt Portuguese
ro Romanian
ru Russian
es Spanish
sv Swedish
tr Turkish

B) Ranking Algorithms

You can choose between two supported algorithms that can be used to order results:

Algorithm Description
rank Use the match quality (0-1) of the fulltext query to order the results.
proximityRank Similar to rank but also includes the proximity of the matches.

7. Writing Mappings.

Mappings are used to transform the sourced blockchain data into entities defined in your schema, so that the data can be stored in Graph Node. You write mappings in AssemblyScript, a subset of TypeScript that can be compiled to WASM (WebAssembly).

Note that despite providing a familiar syntax, AssemblyScript is stricter than normal TypeScript.

When writing mappings, make sure to create an exported function of the same name for each event handler that is defined in subgraph.yaml under mapping.eventHandlers. Each event handler has to accept a single parameter called event. The type of the event needs to correspond to the event name which is being handled.

To return to our example subgraph with the Gravatar contract, the src/mapping.ts includes handlers for the NewGravatar and UpdatedGravatar events of the smart contract:


Now, let’s have a look at what the mapping above does exactly.

The first handler in the mapping takes the NewGravatar event and transforms the sourced data from the Ethereum blockchain. To do this, the mapping creates a new Gravatar entity with new Gravatar(event.params.id.toHex()) and populates the entity fields using the corresponding event parameters. The variable gravatar represents this entity instance and has an id value of event.params.id.toHex().

The second handler tries to load an already existing Gravatar from the Graph Node store. If it does not exist yet, the Gravatar is created on demand. The entity is then updated to match the new event parameters before it is saved back to the store with gravatar.save().
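
The two handlers described above can be sketched in plain TypeScript with an in-memory Map standing in for the Graph Node store. This is not the graph-ts API: the GravatarRecord class, the GravatarEventParams shape and the gravatarStore variable are assumptions made so that the sketch runs on its own.

```typescript
// Simplified event parameters; the id is assumed to be a hex string already.
interface GravatarEventParams {
  id: string;
  owner: string;
  displayName: string;
  imageUrl: string;
}

// In-memory stand-in for the Graph Node store.
const gravatarStore = new Map<string, GravatarRecord>();

class GravatarRecord {
  owner = "";
  displayName = "";
  imageUrl = "";
  constructor(public id: string) {}

  static load(id: string): GravatarRecord | null {
    const found = gravatarStore.get(id);
    return found === undefined ? null : found;
  }

  save(): void {
    gravatarStore.set(this.id, this);
  }
}

// First handler: always creates a new entity from the event parameters.
function handleNewGravatar(params: GravatarEventParams): void {
  const gravatar = new GravatarRecord(params.id);
  gravatar.owner = params.owner;
  gravatar.displayName = params.displayName;
  gravatar.imageUrl = params.imageUrl;
  gravatar.save();
}

// Second handler: loads an existing entity, or creates it on demand.
function handleUpdatedGravatar(params: GravatarEventParams): void {
  let gravatar = GravatarRecord.load(params.id);
  if (gravatar === null) {
    gravatar = new GravatarRecord(params.id);
  }
  gravatar.owner = params.owner;
  gravatar.displayName = params.displayName;
  gravatar.imageUrl = params.imageUrl;
  gravatar.save();
}
```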

C) Recommended IDs for Creating New Entities

When creating new entities, it’s important to note that for every entity there needs to be a unique id among all entities of the same type. When the entity is created, the value of an entity’s id is set.

Please note: The value of id must be a string.

These are the recommended id values you can use when creating new entities:

  • event.params.id.toHex()
  • event.transaction.from.toHex()
  • event.transaction.hash.toHex() + "-" + event.logIndex.toString()
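
The third recommendation combines the transaction hash with the log index, so that several events emitted in the same transaction still get distinct ids. A minimal sketch of this idea, using plain strings and numbers instead of the graph-ts types (the function name eventEntityId is made up for this illustration):

```typescript
// Build an id from a transaction hash (hex string) and the event's log index.
function eventEntityId(transactionHash: string, logIndex: number): string {
  return transactionHash + "-" + logIndex.toString();
}
```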

In The Graph TypeScript Library, you can find utilities for interacting with the Graph Node store. The library also contains conveniences for handling smart contract data and entities.

The library can be used in the mappings for your subgraph by importing @graphprotocol/graph-ts in mapping.ts.

8. Code Generation.

To make working with smart contracts, events and entities easy and type-safe, you can use The Graph CLI to generate AssemblyScript types from the subgraph’s GraphQL schema and the contract ABIs included in the data sources.

Code generation is accomplished via:


If a subgraph is already preconfigured via package.json, you can make use of the following command to generate AssemblyScript types:


For every smart contract in the ABI files mentioned in subgraph.yaml, the command generates an AssemblyScript class. This allows you to bind the contracts to specific addresses in the mappings and to call read-only contract methods against the block that is being processed. The command also generates a class for every contract event, which gives you easy access to the event parameters as well as the block and transaction the event originated from.

The AssemblyScript types are written to:


In our Gravatar example, the type is written to generated/Gravity/Gravity.ts. This in turn allows the mappings to import these types with the following:


For each entity type in the subgraph’s GraphQL schema, the above generates one class. These classes provide:

  • Type-safe entity loading
  • Read and write access to entity fields
  • A save() method to write entities to store

Since these classes are written to <OUTPUT_DIR>/schema.ts, mappings can import all entity classes with the following:


Important: Before building or deploying your subgraph, you must perform the code generation at least once. After every change to the ABIs included in the manifest or the GraphQL schema, the code generation will have to be performed again.

The mapping code in your src/mapping.ts is not checked by the code generation. Before deploying your subgraph to The Graph Explorer, you can check the mapping code. To do so, run yarn build. Should the TypeScript compiler find any syntax errors, they will be highlighted so that you can fix these.

If the read-only methods of your contract may revert, you should handle that by calling the generated contract method prefixed with try_. To return to the Gravatar example, the contract exposes the gravatarToOwner method. To handle a revert in that method, use this code:


Should you decide to rely on this method, it is recommended to use a Graph Node connected to a Parity client, as Graph Nodes connected to a Geth or Infura client may not detect all reverts.
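
The general pattern behind a try_ call is a result object that you check for a revert before reading its value. The sketch below models that pattern in plain TypeScript. It assumes a result with a reverted flag and a value, and the SketchCallResult class and ownerOrFallback function are stand-ins written for this example, not the generated contract classes.

```typescript
// Stand-in for the result of a call that may revert.
class SketchCallResult<T> {
  private constructor(
    public readonly reverted: boolean,
    private readonly inner: T | null
  ) {}

  static fromValue<T>(value: T): SketchCallResult<T> {
    return new SketchCallResult<T>(false, value);
  }

  static fromRevert<T>(): SketchCallResult<T> {
    return new SketchCallResult<T>(true, null);
  }

  // Reading the value of a reverted call is a programming error.
  get value(): T {
    if (this.reverted) {
      throw new Error("call reverted; check reverted before reading value");
    }
    return this.inner as T;
  }
}

// Check for a revert first, and only then read the value.
function ownerOrFallback(result: SketchCallResult<string>, fallback: string): string {
  return result.reverted ? fallback : result.value;
}
```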

9. Data Source Templates.

Registry or factory contracts are a common pattern in Ethereum smart contracts: one contract creates, manages or references an arbitrary number of other contracts, each with its own state and events.

Whenever a smart contract uses registry/factory contracts, it is not possible to define a single or fixed number of data sources. This is because many of these contracts may be created and/or added throughout time. Even more so, the addresses of the sub-contracts may not always be known upfront.

A more dynamic approach to solve this issue is described in the following: data source templates.

A) Data Source for the Main Contract

By using data source templates, you can avoid the problems that come with registry or factory contracts. To define a data source for your main contract, you can use the simplified example data source for the Uniswap exchange factory contract below as a reference. Have a close look at the handler for the NewExchange(address,address) event, which is emitted whenever a new exchange contract is created on chain by the factory contract:


B) Data Source Templates for Dynamically Created Contracts

After defining the data source, you can now add the data source template to your subgraph’s manifest. The only difference to regular data sources is that your template lacks a predefined contract address under source. For each type of sub-contract that the parent contract manages, you would define one template.


C) Instantiating a Data Source Template

After you have defined all the data source templates, you can update your main contract mapping. Doing so will lead to the creation of a dynamic data source instance from one of the templates.

Let’s have a look at the Uniswap example and how this is accomplished. First, import the Exchange template in the main contract mapping. Then call the Exchange.create(address) method with the address of the new exchange contract so that it starts to get indexed.


Important: Historical data (i.e., data contained in prior blocks) will not be processed. Only the calls and events for the block in which the data source was created and all following blocks will be processed.

If you would like to include historical data that is relevant to the new data source, it is recommended to index that data by reading the current state of the contract and creating entities that represent that state at the time the new data source is created.
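
The rule about historical data can be made concrete with a small model. The sketch below keeps track of the block in which each dynamic data source was created and only accepts triggers from that block onwards. The DynamicDataSources class and its method names are assumptions for this illustration and do not reflect how Graph Node is implemented.

```typescript
// Tracks dynamically created data sources and the block they were created in.
class DynamicDataSources {
  private createdAtBlock = new Map<string, number>();

  // Called when the factory announces a new contract, for example on NewExchange.
  create(address: string, blockNumber: number): void {
    const key = address.toLowerCase();
    if (!this.createdAtBlock.has(key)) {
      this.createdAtBlock.set(key, blockNumber);
    }
  }

  // Calls and events are processed for the creation block and all following blocks.
  shouldProcess(address: string, blockNumber: number): boolean {
    const createdAt = this.createdAtBlock.get(address.toLowerCase());
    return createdAt !== undefined && blockNumber >= createdAt;
  }
}
```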

D) Data Source Context

When instantiating a template, you can pass extra configuration by using data source contexts. In the Uniswap example, we could assume that exchanges are associated with a particular trading pair (included in the NewExchange event). In this example case, you can pass the information into the instantiated data source in the following way:


The context can then be accessed inside a mapping of the Exchange template:


Setters and getters such as setString and getString are available for all value types.
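
As an illustration of how a context carries configuration from the place where a template is instantiated to the mappings of that template, here is a minimal stand-in in plain TypeScript. The SketchContext class, the tradingPair key and both helper functions are assumptions for this sketch. Only the setString and getString names are taken from the text above.

```typescript
// Minimal key-value context limited to string values.
class SketchContext {
  private values = new Map<string, string>();

  setString(key: string, value: string): void {
    this.values.set(key, value);
  }

  getString(key: string): string {
    const value = this.values.get(key);
    if (value === undefined) {
      throw new Error("missing context key: " + key);
    }
    return value;
  }
}

// Where the template is instantiated, for example in a NewExchange handler.
function buildExchangeContext(tradingPair: string): SketchContext {
  const context = new SketchContext();
  context.setString("tradingPair", tradingPair);
  return context;
}

// Inside a mapping of the instantiated template.
function tradingPairFrom(context: SketchContext): string {
  return context.getString("tradingPair");
}
```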

10. Start Blocks.

To define from which block the data source starts indexing, you can use the optional setting of the startBlock. Defining a start block comes with the benefit that you are able to skip a large number of irrelevant blocks. As a general rule, it is recommended for subgraph developers to set the startBlock to the block in which the smart contract of the data source was created.


Important: If you want to find the block in which a contract was created, head over to Etherscan and do the following:

  1. Find out the contract address.
  2. Enter the address in the search bar.
  3. Load the transaction details page, in which you can find the start block.

11. Call Handlers.

There are numerous contracts that avoid generating logs so that gas costs can be reduced. In these instances, events are not an effective way to collect relevant changes to the state of a contract. Instead of using events, your subgraph can subscribe to calls that are being made to the data source contract.

To do this, you will have to define call handlers that reference the function signature and the mapping handler that will process calls to this function. To process these calls, the mapping handler receives an ethereum.Call as its argument, with the typed inputs to and outputs from the call. The mapping is triggered by calls made at any depth in a transaction’s call chain, so activity with the data source contract through proxy contracts is captured as well.

There are only two cases in which call handlers are triggered. The first case is when the function specified gets called by an account other than the contract itself. The second case is when the function is marked as external (in Solidity) and gets called as part of another function in the same contract.

Note: Ganache and Rinkeby do not support call handlers. This is because Ganache and Rinkeby do not support the Parity tracing API, which call handlers depend on.

A) Defining a Call Handler

You can define a call handler in your manifest by adding a callHandlers array under the data source you would like to subscribe to.


The function property is the normalized function signature by which calls are filtered.

The handler property is the name of the function in your mapping you would like to execute when the target function is called in the data source contract.

B) Mapping Function

Each call handler takes a single parameter that has a type corresponding to the name of the called function. In the example subgraph, there is a handler in the mapping when the createGravatar function is called. Consequently, it receives a CreateGravatarCall parameter as an argument:


The handleCreateGravatar function takes a new CreateGravatarCall, which is a subclass of ethereum.Call (provided by @graphprotocol/graph-ts) and includes the typed inputs and outputs of the call. The CreateGravatarCall type is generated for you when you run graph codegen.

12. Block Handlers.

In addition to subscribing to contract events or function calls, your subgraph may need to update its data whenever new blocks are appended to the chain. You can achieve this by letting the subgraph run a function after every block or only after blocks matching a predefined filter.

A) Supported Filters

Here is a quick overview of the filters that are supported:


For every block which contains a call to the contract (data source) the handler is defined under, the defined handler will be called once.

If there is no filter for a block handler, the handler will be called with every block. Only one block handler for each filter type can be contained in a data source.
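
The filter behavior described above can be summarized in a short decision function. This sketch uses simplified stand-in types: BlockHandlerConfig, ChainBlock, the calledAddresses field and the blockHandlersToRun function are assumptions for this illustration.

```typescript
// A block handler as configured in the manifest, with an optional call filter.
interface BlockHandlerConfig {
  handler: string;
  filter?: { kind: "call" };
}

// Simplified block: only the information needed for the filter decision.
interface ChainBlock {
  number: number;
  // Addresses that received at least one call in this block.
  calledAddresses: string[];
}

// Returns the names of the handlers that run for the given block.
function blockHandlersToRun(
  handlers: BlockHandlerConfig[],
  block: ChainBlock,
  dataSourceAddress: string
): string[] {
  const target = dataSourceAddress.toLowerCase();
  const hasCall = block.calledAddresses.some(
    (address) => address.toLowerCase() === target
  );
  return handlers
    // No filter: run for every block. Call filter: run only if the contract was called.
    .filter((config) => config.filter === undefined || hasCall)
    .map((config) => config.handler);
}
```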


B) Mapping Function

An ethereum.Block is the only argument the mapping function receives. Similar to mapping functions for events, the function is able to call smart contracts, create or update entities and can access existing subgraph entities in the store.


13. Anonymous Events.

By providing the topic0 of the event, you can process anonymous events in Solidity, as in the example:


topic0 is equal to the hash of the event signature by default. If the signature and topic0 do not match, no event will be triggered.

14. IPFS Pinning.

Storing data on IPFS that would otherwise be too expensive to maintain on the blockchain is a common use case for combining Ethereum with IPFS. This can be accomplished by referencing the IPFS hash in Ethereum contracts.

By using IPFS hashes, your subgraph will use ipfs.cat and ipfs.map to read the corresponding files from IPFS. However, make sure that the files are pinned on the IPFS node that the Graph Node indexing the subgraph connects to so that this is done reliably. If you’re using the Hosted Service, you can find it here: https://api.thegraph.com/ipfs/

If you are developing a subgraph, there is a tool from Edge & Node that allows you to transfer files from one IPFS node to another. The tool is called ipfs-sync.

Summary

Fabulous! This was a huge tutorial. Great to see that you’ve made it to the end. You’ve learned how you can define which blockchain data is being indexed by The Graph and how it is stored.
