Developer onboarding
Developing Subgraphs
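As a subgraph developer, you define which blockchain data The Graph indexes and how it is stored. The subgraph definition consists of three files: the subgraph manifest (subgraph.yaml), the GraphQL schema (schema.graphql), and the AssemblyScript mappings (such as src/mapping.ts).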
As a prerequisite, you have to install The Graph CLI, which allows you to create and deploy a subgraph. Let’s have a look at how it all works.
1. Installing the Graph CLI.
Before installing The Graph CLI, install yarn (installation instructions are in the graph-cli repository). You can also use npm, but for the sake of brevity, we will use yarn in this tutorial.
Once you have installed yarn, you can run the following command to install the Graph CLI:
2. Creating a Subgraph Project.
In the following, we will show you two ways of creating a subgraph. You can either create a subgraph from an existing contract or use an example subgraph.
A) Creating a subgraph from an existing contract
You can use an existing smart contract to bootstrap your new subgraph. Follow this part of the instructions if you have already deployed a smart contract to Ethereum or a testnet.
First, we will create a subgraph that indexes all events of an existing smart contract. The Graph CLI will attempt to fetch the contract ABI from Etherscan. If unsuccessful, it will fall back to requesting a local file path. An interactive form will guide you through the process in case any of the optional arguments are missing.
These are the supported networks on the Hosted Service:
B) Creating a subgraph by using an example subgraph
If you do not have an existing smart contract, you can use the sample subgraph of The Graph team. To initialize the example subgraph, use this command:
In this tutorial, we’re going to use a subgraph based on the Gravity contract. The smart contract manages user avatars. Whenever avatars are created or updated, the contract emits NewGravatar or UpdatedGravatar events. These events are then written to the Graph Node store as Gravatar entities by the subgraph.
3. Creating a Subgraph Manifest
To create a subgraph manifest, we use the example of the Gravity contract mentioned above.
This is the subgraph.yaml for the example subgraph that indexes the Gravity contract:
Here’s a detailed explanation of the most important entries of the manifest:
You can use one subgraph to index the data originating from multiple smart contracts. To do so, add an entry to the dataSources array for each contract from which you want to index data.
Ordering rules
The following rules are used to order the triggers for a data source within a block:
- The transaction index within the block orders event and call triggers
- Should there be event and call triggers within the same transaction, the following convention is used for ordering: event triggers first then call triggers. Each type respects the particular order they are defined in the manifest.
- After event and call triggers, block triggers are run in the order as defined in the manifest.
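The ordering rules above can be sketched as a comparator. This is a minimal TypeScript model for illustration only: the `Trigger` type, its field names, and `orderTriggers` are hypothetical and are not part of Graph Node or graph-ts.

```typescript
// Hypothetical model of a trigger; type and field names are illustrative only.
type TriggerKind = "event" | "call" | "block";

interface Trigger {
  kind: TriggerKind;
  // Index of the transaction within the block (ignored for block triggers).
  txIndex: number;
  // Position of the handler in the manifest, within its own kind.
  manifestIndex: number;
  label: string;
}

// Inside one transaction, event triggers come before call triggers.
function kindRank(kind: TriggerKind): number {
  return kind === "event" ? 0 : 1;
}

function orderTriggers(triggers: Trigger[]): Trigger[] {
  const txTriggers = triggers.filter((t) => t.kind !== "block");
  const blockTriggers = triggers.filter((t) => t.kind === "block");

  txTriggers.sort((a, b) => {
    // 1. Transaction index within the block.
    if (a.txIndex !== b.txIndex) return a.txIndex - b.txIndex;
    // 2. Events before calls within the same transaction.
    if (a.kind !== b.kind) return kindRank(a.kind) - kindRank(b.kind);
    // 3. Manifest order within the same kind.
    return a.manifestIndex - b.manifestIndex;
  });

  // Block triggers run last, in manifest order.
  blockTriggers.sort((a, b) => a.manifestIndex - b.manifestIndex);
  return txTriggers.concat(blockTriggers);
}

const ordered = orderTriggers([
  { kind: "block", txIndex: 0, manifestIndex: 0, label: "block-0" },
  { kind: "call", txIndex: 1, manifestIndex: 0, label: "call-tx1" },
  { kind: "event", txIndex: 1, manifestIndex: 1, label: "event-tx1-b" },
  { kind: "event", txIndex: 1, manifestIndex: 0, label: "event-tx1-a" },
  { kind: "event", txIndex: 0, manifestIndex: 0, label: "event-tx0" },
]);
const orderedLabels = ordered.map((t) => t.label);
console.log(orderedLabels.join(", "));
```

In this sample, the event from transaction 0 runs first, then both events and the call from transaction 1, and the block trigger runs last.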
Getting the ABIs
Please note that the ABI file(s) have to match your contract(s). You can obtain ABI files in the following ways:
- Use your most current ABIs if you are building your own project
- If you are creating a subgraph for a public project, download the project to your computer and use truffle compile or solc to compile the ABI
- The third alternative is to use Etherscan. However, this is not recommended, as it is not always reliable: the ABI found on Etherscan may be out of date. If the ABI does not match your contract, your subgraph will not run.
4. The GraphQL Schema.
You can find the schema for your subgraph in schema.graphql. Use the GraphQL interface definition language to define the GraphQL schema.
5. Defining Entities.
In this section we are going to take a look at how you can structure and link your data. This is a prerequisite for defining entities, and it matters because every query will be made against the data model defined in the subgraph schema and the entities indexed by your subgraph. Make sure to define the subgraph schema to match your dApp’s needs. Think of entities not as events or functions but as objects containing data.
Defining entity types with The Graph is straightforward with schema.graphql. The Graph Node will do the rest by generating top-level fields for querying single instances and collections of that entity type. Just make sure to annotate each type that should be an entity with an @entity directive.
A) Good example
To come back to our sample Gravatar contract, let’s have a look at a good example of defining entities correctly. You can see that the Gravatar entity shown below is structured around a Gravatar object.
B) Bad example
In contrast to the example above, defining entities based around events (like the GravatarAccepted and GravatarDeclined entities) is not the way an entity should be defined. Mapping events or function calls 1:1 to entities is not recommended.
C) Optional and Required Fields
Please note that entity fields can be defined as either required or optional. A ! in the schema indicates that a field is required. An error message will be shown if a required field is not set in the mapping:
It is necessary for each entity to have an id field, which is of type ID! (string). Note that the id field needs to be unique among all entities of the same type, as it serves as the primary key.
D) Built-In Scalar Types
GraphQL Supported Scalars
The following scalars are supported in the GraphQL API:
Type | Description |
---|---|
Bytes | Byte array, represented as a hexadecimal string. Commonly used for Ethereum hashes and addresses. |
ID | Stored as a string. |
String | Scalar for string values. Null characters are not supported and are automatically removed. |
Boolean | Scalar for boolean values. |
Int | The GraphQL spec defines Int to have a size of 32 bits (a signed 32-bit integer). |
BigInt | Large integers. Used for Ethereum’s uint32, int64, uint64, …, uint256 types. Note: Everything below uint32, such as int32, uint24 or int8 is represented as i32. |
BigDecimal | High-precision decimals represented as a significand and an exponent. The exponent range is from −6143 to +6144. Rounded to 34 significant digits. |
Enums
The syntax for creating enums within a schema is as follows:
After you have defined the enum in the schema, set an enum field on an entity by using the string representation of the enum value.
Let’s have a look at how this works with an example. Let’s say you want to set the tokenStatus to SecondOwner. You would do this by defining your entity first and setting the field with entity.tokenStatus = "SecondOwner".
Here’s an example of what the Token entity with an enum field would look like:
If you want to learn more about writing enums, have a look at the GraphQL documentation.
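As a rough illustration of setting an enum field through its string representation, here is a minimal TypeScript sketch. It is a standalone model, not graph-ts code: `TokenEntity` and `parseTokenStatus` are hypothetical, and every enum value other than SecondOwner is an assumption made for the example.

```typescript
// Allowed values; only "SecondOwner" comes from the text, the rest are assumed.
type TokenStatus = "OriginalOwner" | "SecondOwner" | "ThirdOwner";
const TOKEN_STATUSES: TokenStatus[] = ["OriginalOwner", "SecondOwner", "ThirdOwner"];

// Simplified stand-in for an entity with an enum field.
interface TokenEntity {
  id: string;
  tokenStatus: TokenStatus;
}

// Accepts a raw string only if it matches one of the declared enum values.
function parseTokenStatus(raw: string): TokenStatus {
  for (const candidate of TOKEN_STATUSES) {
    if (candidate === raw) return candidate;
  }
  throw new Error("Unknown TokenStatus: " + raw);
}

const sampleToken: TokenEntity = { id: "1", tokenStatus: parseTokenStatus("OriginalOwner") };
// The field is set with the string representation of the enum value.
sampleToken.tokenStatus = parseTokenStatus("SecondOwner");
console.log(sampleToken.tokenStatus);
```

The design point is that a value outside the declared set is rejected instead of being stored silently.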
E) Entity Relationships
Let’s have a look at entity relationships next. In your schema, an entity may have a relationship to one or more entities. The relationships are unidirectional in The Graph and may be traversed in your queries. Bidirectional relationships can be simulated by defining a unidirectional relationship on either “end” of the relationship.
You can define relationships in the same manner as with other fields. However, it is important to note that the type specified needs to be that of another entity.
There are two relationship types: one-to-one relationships and one-to-many relationships. We will discuss both in the following and give you examples for each.
One-To-One Relationships
Here is how you can define an optional one-to-one relationship between a Transaction entity type and a TransactionReceipt entity type:
One-To-Many Relationships
If you want to define a required one-to-many relationship for a TokenBalance entity type with a Token entity, use the following:
F) Reverse Lookups
Reverse lookups are defined on an entity through the @derivedFrom field. Doing so creates a virtual field on the entity that can be queried but cannot be set manually through the mappings API. Instead, it is derived from the relationship defined on the other entity.
When it comes to storing one-to-many relationships, it is recommended to always store the relationship on the ‘one’ side, leaving the ‘many’ side derived. By not storing an array of entities on the ‘many’ side, you can dramatically increase both indexing and querying performance of your subgraph. As a general rule, avoid storing arrays of entities as much as practically possible.
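To make the idea of a derived field more concrete, the TypeScript sketch below models it outside of The Graph. The relationship is stored once, on each balance, and the list of balances for a token is computed on demand. The record types and `deriveTokenBalances` are hypothetical names chosen for this illustration.

```typescript
// Simplified stand-ins for the Token and TokenBalance entity types.
interface TokenRecord {
  id: string;
}

interface TokenBalanceRecord {
  id: string;
  amount: number;
  // The relationship is stored here: each balance points at exactly one token.
  tokenId: string;
}

// Resolves the derived list at query time instead of storing an array on the token.
function deriveTokenBalances(
  token: TokenRecord,
  balances: TokenBalanceRecord[]
): TokenBalanceRecord[] {
  return balances.filter((b) => b.tokenId === token.id);
}

const daiToken: TokenRecord = { id: "dai" };
const sampleBalances: TokenBalanceRecord[] = [
  { id: "b1", amount: 10, tokenId: "dai" },
  { id: "b2", amount: 5, tokenId: "usdc" },
  { id: "b3", amount: 7, tokenId: "dai" },
];

const daiBalanceIds = deriveTokenBalances(daiToken, sampleBalances).map((b) => b.id);
console.log(daiBalanceIds.join(", "));
```

Because nothing is written to the token when a balance is added, adding a balance touches a single record.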
Example for a Reverse Lookup
Let’s have a look at an example that illustrates what we’ve learned so far. By deriving a tokenBalances field, token balances for a specific token can be made accessible:
G) Many-To-Many Relationships
Let’s say you want to define a many-to-many relationship for any number of users all belonging to any number of organizations. Modeling the relationship as an array in each of the two entities involved (users and organizations) would be the most straightforward way. However, this is not the most performant option for a symmetric relationship.
Example for Defining a Symmetric Relationship
Let’s return to our example of defining relationships for users of organizations. In the following, we’ll have a look at how you can define a reverse lookup (from a User entity type to an Organization entity type).
One way to achieve this is to look up the members attribute from within the Organization entity:
When it comes to querying, the organizations field on User will be resolved by finding all Organization entities that include the user’s ID.
To increase performance, you can create a mapping table to store the relationship. For each User / Organization pair, there is one entry. Here’s an example schema for a mapping table:
By using this strategy, queries are required to descend one additional level in order to retrieve a user’s organizations or the organization IDs:
By using this approach, your subgraph will be faster to index and query, in many cases dramatically so. This is because this more elaborate way of storing many-to-many relationships results in less data being stored for the subgraph.
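The mapping-table strategy can be sketched in TypeScript as follows. This is a standalone model under stated assumptions: the row types, their field names, and `organizationsOfUser` are hypothetical, and the two-step lookup stands in for the extra query level mentioned above.

```typescript
interface UserRow {
  id: string;
}

interface OrganizationRow {
  id: string;
  orgName: string;
}

// One row per User / Organization pair.
interface UserOrganizationRow {
  id: string;
  userId: string;
  organizationId: string;
}

// Step 1: find the pair rows for the user. Step 2: resolve each organization.
function organizationsOfUser(
  userId: string,
  links: UserOrganizationRow[],
  organizations: OrganizationRow[]
): OrganizationRow[] {
  const result: OrganizationRow[] = [];
  for (const link of links) {
    if (link.userId !== userId) continue;
    for (const org of organizations) {
      if (org.id === link.organizationId) result.push(org);
    }
  }
  return result;
}

const alice: UserRow = { id: "alice" };
const sampleOrgs: OrganizationRow[] = [
  { id: "org-a", orgName: "Acme" },
  { id: "org-b", orgName: "Beta" },
];
const sampleLinks: UserOrganizationRow[] = [
  { id: "alice-org-a", userId: "alice", organizationId: "org-a" },
  { id: "alice-org-b", userId: "alice", organizationId: "org-b" },
  { id: "bob-org-a", userId: "bob", organizationId: "org-a" },
];

const aliceOrgNames = organizationsOfUser(alice.id, sampleLinks, sampleOrgs).map((o) => o.orgName);
console.log(aliceOrgNames.join(", "));
```

Adding a membership here means writing one small pair row, rather than rewriting an array on both the user and the organization.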
H) Adding comments to the schema
You can add comments above schema entities for readers by using double quotation marks (""). Here’s an example of how you can add comments:
6. Defining Fulltext Search Fields.
In The Graph, subgraph developers can use text search inputs to filter and rank entities by enabling fulltext search queries. Another functionality of fulltext queries is to process the query text input into stems before comparing it to the indexed text data. This allows developers to return matches for similar words.
You can add a fulltext query by including a _Schema_ type with a fulltext directive in the GraphQL schema. Here are the elements of a fulltext query definition:
To filter Band entities in queries based on the text documents in the name, description, and bio fields, use the example bandSearch field. For a detailed description of the fulltext search API, have a look at GraphQL API – Queries.
A) Languages supported
The fulltext search API supports a variety of different languages. Please take into consideration that there are definitive and sometimes very subtle effects on the fulltext search API when choosing a different language than English.
The chosen language defines the context in which fields covered by a fulltext query field will be examined. This results in varying lexemes that are being produced by analysis and search queries depending on the language used.
As an example, choosing the supported Turkish dictionary will cause the word “token” to be stemmed to “toke”, while the English dictionary will stem it to “token”.
These are the supported language dictionaries:
Code | Dictionary |
---|---|
simple | General |
da | Danish |
nl | Dutch |
en | English |
fi | Finnish |
fr | French |
de | German |
hu | Hungarian |
it | Italian |
no | Norwegian |
pt | Portuguese |
ro | Romanian |
ru | Russian |
es | Spanish |
sv | Swedish |
tr | Turkish |
B) Ranking Algorithms
You can choose between two supported algorithms for ordering results:
Algorithm | Description |
---|---|
rank | Use the match quality (0-1) of the fulltext query to order the results. |
proximityRank | Similar to rank but also includes the proximity of the matches. |
7. Writing Mappings.
Mappings are used to transform the sourced blockchain data into entities defined in your schema. This is done so that the sourced data can be stored in The Graph Node. You write mappings in a subset of TypeScript called AssemblyScript, which can be compiled to WASM (WebAssembly).
When writing mappings, make sure to create an exported function of the same name for each event handler that is defined in subgraph.yaml under mapping.eventHandlers. Each event handler has to accept a single parameter called event. The type of the event needs to correspond to the name of the event being handled.
To return to our example subgraph with the Gravatar contract, the src/mapping.ts includes handlers for the NewGravatar and UpdatedGravatar events of the smart contract:
Now, let’s have a look at what the mapping above does exactly.
The first handler in the mapping takes the NewGravatar event and transforms the sourced data from the Ethereum blockchain. To do this, the mapping creates a new Gravatar entity with new Gravatar(event.params.id.toHex()). The entity fields are then populated using the corresponding event parameters. The variable gravatar represents this entity instance and has a corresponding id value of event.params.id.toHex().
The second handler in the mapping accesses The Graph Node store and tries to load a potentially already existing Gravatar. If it does not yet exist, the Gravatar is created on demand. Before the entity is saved back to the node’s store by using gravatar.save(), it is updated to match the new event parameters.
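The create and load-or-create behaviour described above can be modelled in plain TypeScript. The sketch below is not the graph-ts API: a plain object stands in for the Graph Node store, and the parameter fields owner, displayName and imageUrl are assumptions made for the example.

```typescript
// Simplified stand-in for the generated Gravatar entity.
interface GravatarRecord {
  id: string;
  owner: string;
  displayName: string;
  imageUrl: string;
}

// Simplified stand-in for the parameters carried by both events (field names assumed).
interface GravatarEventParams {
  id: string;
  owner: string;
  displayName: string;
  imageUrl: string;
}

// Plain object standing in for the Graph Node store.
const gravatarStore: { [id: string]: GravatarRecord } = {};

function handleNewGravatar(params: GravatarEventParams): void {
  // Create a new entity keyed by the event id and populate it from the event.
  const gravatar: GravatarRecord = {
    id: params.id,
    owner: params.owner,
    displayName: params.displayName,
    imageUrl: params.imageUrl,
  };
  gravatarStore[gravatar.id] = gravatar; // plays the role of gravatar.save()
}

function handleUpdatedGravatar(params: GravatarEventParams): void {
  // Try to load an existing entity and create it on demand if it is missing.
  let gravatar: GravatarRecord | undefined = gravatarStore[params.id];
  if (gravatar === undefined) {
    gravatar = { id: params.id, owner: "", displayName: "", imageUrl: "" };
  }
  // Update the fields to match the new event parameters, then save.
  gravatar.owner = params.owner;
  gravatar.displayName = params.displayName;
  gravatar.imageUrl = params.imageUrl;
  gravatarStore[gravatar.id] = gravatar;
}

handleNewGravatar({ id: "0x1", owner: "0xaaa", displayName: "Carl", imageUrl: "https://example.com/carl.png" });
handleUpdatedGravatar({ id: "0x1", owner: "0xaaa", displayName: "Carl II", imageUrl: "https://example.com/carl2.png" });
handleUpdatedGravatar({ id: "0x2", owner: "0xbbb", displayName: "Lucas", imageUrl: "https://example.com/lucas.png" });
console.log(gravatarStore["0x1"].displayName, gravatarStore["0x2"].displayName);
```

The third call shows the on-demand path: an update for an id that was never created still results in a stored entity.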
C) Recommended IDs for Creating New Entities
When creating new entities, it’s important to note that every entity needs an id that is unique among all entities of the same type. The value of an entity’s id is set when the entity is created.
These are the recommended id values you can use when creating new entities:
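One pattern that satisfies the uniqueness requirement is to combine the transaction hash with the log index of the event. The TypeScript sketch below is an illustration under that assumption rather than a statement of the recommended list, and the helper name `eventScopedId` is hypothetical.

```typescript
// Builds an id that is unique per emitted event, assuming the pair
// (transaction hash, log index) identifies exactly one log.
function eventScopedId(txHashHex: string, logIndex: number): string {
  return txHashHex + "-" + logIndex.toString();
}

// Two events from the same transaction still get distinct ids.
const firstId = eventScopedId("0xabc", 0);
const secondId = eventScopedId("0xabc", 1);
console.log(firstId, secondId);
```

When an entity corresponds to an on-chain object that already has an identifier, such as the Gravatar id used earlier, that identifier is the simpler choice.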
In The Graph TypeScript Library, you can find utilities for interacting with The Graph Node store. The library also contains conveniences for handling smart contract data and entities.
The library can be used in the mappings for your subgraph by importing @graphprotocol/graph-ts in mapping.ts.
8. Code Generation.
To make working with smart contracts, events and entities easy and type-safe, you can use The Graph CLI to generate AssemblyScript types from the subgraph’s GraphQL schema and the contract ABIs included in the data sources.
Code generation is accomplished via:
If a subgraph is already preconfigured via package.json, you can make use of the following command to generate AssemblyScript types:
For every smart contract in the ABI files mentioned in subgraph.yaml, the command generates an AssemblyScript class. Doing so allows you to bind the contracts to specific addresses in the mappings. It also allows you to call read-only contract methods against the block that is being processed. Aside from that, the command gives you easy access to event parameters (and also the block and transaction the event originated from) by generating a class for every contract event.
The AssemblyScript types are written to:
In our Gravatar example, the types are written to generated/Gravity/Gravity.ts. This in turn allows the mappings to import these types with the following:
For each entity type in the subgraph’s GraphQL schema, the above generates one class. These classes provide:
Because they are written to <OUTPUT_DIR>/schema.ts, mappings can import all entity classes with the following:
The mapping code in your src/mapping.ts is not checked by the code generation. Before deploying your subgraph to The Graph Explorer, you can check the mapping code. To do so, run yarn build. Should the TypeScript compiler find any syntax errors, they will be highlighted so that you can fix them.
If the read-only methods of your contract may revert, you should handle this by calling the generated contract method prefixed with try_. To return to the Gravatar example, you will notice that the contract exposes the gravatarToOwner method. To handle a revert in that method, use this code:
Should you decide to rely on this method, it is recommended to use a Graph Node connected to a Parity client, as Graph Nodes connected to a Geth or Infura client may not detect all reverts.
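The idea behind a try_ call is that a revert comes back as data that the handler can inspect, instead of aborting the handler. The TypeScript sketch below models that shape in isolation. `SketchCallResult`, `tryGravatarToOwner` and the lookup table are hypothetical stand-ins and not the classes generated by the CLI.

```typescript
// Simplified stand-in for the value returned by a try_-prefixed call.
interface SketchCallResult<T> {
  reverted: boolean;
  value: T | null;
}

// Hypothetical contract state: gravatar id -> owner address.
const ownersByGravatar: { [gravatarId: string]: string } = { "1": "0xowner1" };

// Models a try_ call: a failed lookup is reported as reverted rather than thrown.
function tryGravatarToOwner(gravatarId: string): SketchCallResult<string> {
  if (!Object.prototype.hasOwnProperty.call(ownersByGravatar, gravatarId)) {
    return { reverted: true, value: null };
  }
  return { reverted: false, value: ownersByGravatar[gravatarId] };
}

// Always check `reverted` before reading `value`.
function describeOwner(gravatarId: string): string {
  const result = tryGravatarToOwner(gravatarId);
  if (result.reverted || result.value === null) {
    return "gravatarToOwner reverted";
  }
  return "owner is " + result.value;
}

console.log(describeOwner("1"));
console.log(describeOwner("2"));
```

The handler keeps running in both cases, so a single reverting call does not stop the rest of the mapping.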
9. Data Source Templates.
Registry or factory contracts are a common pattern in Ethereum smart contracts: one contract is responsible for creating, managing or referencing an arbitrary number of other contracts, each with its own state and events.
Whenever a smart contract uses registry/factory contracts, it is not possible to define a single or fixed number of data sources. This is because many of these contracts may be created and/or added over time, and the addresses of the sub-contracts may not be known upfront.
A more dynamic approach to solve this issue is described in the following: data source templates.
A) Data Source for the Main Contract
By using data source templates, you can avoid the problems that come with registry or factory contracts. To define a data source for your main contract, you can use the simplified example data source for the Uniswap exchange factory contract below as a reference. Have a close look at the handler for the NewExchange(address,address) event, which is emitted whenever a new exchange contract is created on chain by the factory contract:
B) Data Source Templates for Dynamically Created Contracts
After defining the data source, you can now add the data source template to your subgraph’s manifest. The only difference to regular data sources is that your template lacks a predefined contract address under source. For each type of sub-contract that the parent contract manages, you would define one template.
C) Instantiating a Data Source Template
After you have defined all the data source templates, you can update your main contract mapping so that it creates a dynamic data source instance from one of the templates.
Let’s have a look at how this is accomplished in our Uniswap example. First, change the main contract mapping to import the Exchange template. Then call the Exchange.create(address) method on the template so that the new exchange contract starts to get indexed.
If you would like to include historical data that is relevant to the new data source, it is recommended to index that data by reading the current state of the contract and creating entities that represent that state at the creation time of the new data source.
D) Data Source Context
When instantiating a template, you can pass extra configuration by using data source contexts. In the Uniswap example, we could assume that exchanges are associated with a particular trading pair (included in the NewExchange event). In this case, you can pass the information into the instantiated data source in the following way:
The context can then be accessed inside a mapping of the Exchange template:
Setters and getters such as setString and getString are available for all value types.
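The flow of passing a context into a newly instantiated data source and reading it back can be sketched as below. This is a self-contained TypeScript model, not the graph-ts implementation: `SketchContext`, `createExchangeWithContext`, the `tradingPair` key and the sample addresses are all hypothetical.

```typescript
// Minimal stand-in for a data source context that holds string values only.
class SketchContext {
  private entries: { [key: string]: string } = {};

  setString(key: string, value: string): void {
    this.entries[key] = value;
  }

  getString(key: string): string {
    if (!Object.prototype.hasOwnProperty.call(this.entries, key)) {
      throw new Error("No context value for key: " + key);
    }
    return this.entries[key];
  }
}

// A dynamically created data source: an address plus the context it was created with.
interface ExchangeInstance {
  address: string;
  context: SketchContext;
}

const exchangeInstances: ExchangeInstance[] = [];

// Plays the role of instantiating the Exchange template with a context.
function createExchangeWithContext(address: string, context: SketchContext): ExchangeInstance {
  const instance: ExchangeInstance = { address: address, context: context };
  exchangeInstances.push(instance);
  return instance;
}

// Factory-side handler: attach the trading pair carried by the NewExchange event.
function handleNewExchange(exchangeAddress: string, tradingPair: string): void {
  const context = new SketchContext();
  context.setString("tradingPair", tradingPair);
  createExchangeWithContext(exchangeAddress, context);
}

// Template-side mapping: read the value back from the instance's own context.
function tradingPairOf(instance: ExchangeInstance): string {
  return instance.context.getString("tradingPair");
}

handleNewExchange("0xexchange1", "ETH/DAI");
console.log(tradingPairOf(exchangeInstances[0]));
```

Each instance carries its own context, so two exchanges created from the same template can be configured differently.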
10. Start Blocks.
To define the block from which the data source starts indexing, you can use the optional startBlock setting. Defining a start block allows you to skip a potentially large number of irrelevant blocks. As a general rule, it is recommended for subgraph developers to set startBlock to the block in which the smart contract of the data source was created.
11. Call Handlers.
There are numerous contracts that avoid generating logs so that gas costs can be reduced. In these instances, events are not an effective way to collect relevant changes to the state of a contract. Instead of using events, your subgraph can subscribe to calls that are being made to the data source contract.
To do this, you have to define call handlers. These need to reference the function signature and the mapping handler that will process calls to this function. To process these calls, the mapping handler receives an ethereum.Call as its argument, with the typed inputs to and outputs from the call. The mapping is triggered by calls made at any depth in a transaction’s call chain, which makes it possible to capture activity with the data source contract that goes through proxy contracts.
There are only two cases in which call handlers are triggered. The first case is when the function specified gets called by an account other than the contract itself. The second case is when the function is marked as external (in Solidity) and gets called as part of another function in the same contract.
A) Defining a Call Handler
You can define a call handler in your manifest by adding a callHandlers array under the data source you would like to subscribe to.
The function property is the normalized function signature by which calls are filtered.
The handler property is the name of the function in your mapping you would like to execute when the target function is called in the data source contract.
B) Mapping Function
Each call handler takes a single parameter that has a type corresponding to the name of the called function. In the example subgraph, the mapping contains a handler for calls to the createGravatar function. Consequently, it receives a CreateGravatarCall parameter as an argument:
The handleCreateGravatar function takes a new CreateGravatarCall, which is a subclass of ethereum.Call (provided by @graphprotocol/graph-ts) and includes the typed inputs and outputs of the call. The CreateGravatarCall type is generated for you when you run graph codegen.
12. Block Handlers.
In addition to subscribing to contract events or function calls, your subgraph may need to update its data as new blocks are appended to the chain. You can achieve this by letting the subgraph run a function after every block or only after blocks matching a predefined filter.
A) Supported Filters
Here is a quick overview of the filters that are supported:
For every block which contains a call to the contract (data source) the handler is defined under, the defined handler will be called once.
If there is no filter for a block handler, the handler will be called with every block. Only one block handler for each filter type can be contained in a data source.
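The difference between an unfiltered block handler and one restricted to blocks containing a call to the contract can be sketched as below. This is a standalone TypeScript model: the filter names "none" and "call", the `SketchBlock` type and `blocksTriggering` are assumptions made for this illustration.

```typescript
// Hypothetical filter names: "none" runs on every block,
// "call" only on blocks that contain a call to the data source contract.
type BlockFilter = "none" | "call";

interface SketchBlock {
  blockNumber: number;
  hasCallToContract: boolean;
}

// Returns the numbers of the blocks for which the handler would run, once per block.
function blocksTriggering(filter: BlockFilter, blocks: SketchBlock[]): number[] {
  return blocks
    .filter((b) => filter === "none" || b.hasCallToContract)
    .map((b) => b.blockNumber);
}

const sampleBlocks: SketchBlock[] = [
  { blockNumber: 100, hasCallToContract: false },
  { blockNumber: 101, hasCallToContract: true },
  { blockNumber: 102, hasCallToContract: false },
  { blockNumber: 103, hasCallToContract: true },
];

const everyBlock = blocksTriggering("none", sampleBlocks);
const callBlocksOnly = blocksTriggering("call", sampleBlocks);
console.log(everyBlock.join(", "));
console.log(callBlocksOnly.join(", "));
```

A filtered handler does far less work on contracts that are called rarely, which is the main reason to prefer it over an unfiltered one.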
B) Mapping Function
An ethereum.Block is the only argument the mapping function receives. Similar to mapping functions for events, the function is able to call smart contracts, create or update entities and access existing subgraph entities in the store.
13. Anonymous Events.
By providing the topic0 of the event, you can process anonymous events in Solidity, as in the example:
topic0 is equal to the hash of the event signature by default. If the signature and topic0 do not match, no event will be triggered.
14. IPFS Pinning.
Storing data on IPFS that would otherwise be too expensive to maintain on the blockchain is a common use case for combining Ethereum with IPFS. This can be accomplished by referencing the IPFS hash in Ethereum contracts.
Given such IPFS hashes, your subgraph can use ipfs.cat and ipfs.map to read the corresponding files from IPFS. To do this reliably, however, make sure that the files are pinned on the IPFS node that the Graph Node indexing the subgraph connects to. If you’re using the Hosted Service, you can find it here: https://api.thegraph.com/ipfs/
If you are developing a subgraph, there is a tool from Edge & Node that allows you to transfer files from one IPFS node to another. The tool is called ipfs-sync.