docs(substreams): miscellaneous docs improvements and update

2024-08-16 15:22:27 +02:00
parent bc2cd6bab2
commit 37f1fbfe04
6 changed files with 137 additions and 103 deletions
--- a/docs/indexing/substreams-integration.md
+++ b/docs/indexing/substreams-integration.md
@@ -18,29 +18,15 @@ So basically when processing a block we need to emit the block itself, all trans

 **The data model that encodes changes, transaction and blocks in messages, can be found** [**here**](https://github.com/propeller-heads/propeller-protocol-lib/tree/main/proto/tycho/evm/v1)**.**&#x20;

-#### Common Models
+#### Models

-The following models are shared for both vm and native integrations.
+The models below are used for communication between Substreams and our indexer, as well as between Substreams modules.

-{% @github-files/github-code-block url="https://github.com/propeller-heads/propeller-protocol-lib/blob/main/proto/tycho/evm/v1/common.proto" %}
+Our indexer expects to receive a `BlockChanges` output from your Substreams package.

-#### VM Specific Models
+{% @github-files/github-code-block url="https://github.com/propeller-heads/propeller-protocol-lib/blob/main/proto/tycho/evm/v1/" %}

-The models shown below are specific to vm integrations:
-
-{% @github-files/github-code-block url="https://github.com/propeller-heads/propeller-protocol-lib/blob/main/proto/tycho/evm/v1/vm.proto" %}
-
-Please be aware that changes need to be aggregated on the transaction level, it is considered an error to emit `BlockContractChanges` with duplicated transactions present in the `changes` attributes.
-
-All attributes are expected to be set in the final message unless the docs (in the comments) indicate otherwise.
-
-#### Native Integration Models
-
-The models below are very similar to the vm integration models but have a few modifications necessary to support native integrations.
-
-{% @github-files/github-code-block url="https://github.com/propeller-heads/propeller-protocol-lib/blob/main/proto/tycho/evm/v1/entity.proto" %}
-
-Once again changes must be aggregated on a transaction level, emitting these models with duplicated transaction as the final output would be considered an error.
+Please be aware that changes need to be aggregated at the transaction level: it is considered an error to emit `BlockChanges` with duplicated transactions present in the `changes` attribute.
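
The aggregation rule above could be sketched roughly as follows. This is a minimal illustration using only the Rust standard library; `TxChanges` and `aggregate_by_tx` are hypothetical stand-ins and not the generated protobuf types or an SDK function. Changes that belong to the same transaction hash are merged into one entry, so no transaction appears twice in the output.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a per-transaction change set.
#[derive(Debug, PartialEq)]
struct TxChanges {
    tx_hash: String,
    changes: Vec<String>,
}

/// Merges entries that share a transaction hash, keeping first-seen order.
fn aggregate_by_tx(raw: Vec<TxChanges>) -> Vec<TxChanges> {
    // Maps a transaction hash to its position in `out`.
    let mut index: HashMap<String, usize> = HashMap::new();
    let mut out: Vec<TxChanges> = Vec::new();
    for item in raw {
        match index.get(&item.tx_hash) {
            // Transaction already seen: append to the existing entry.
            Some(&i) => out[i].changes.extend(item.changes),
            // First occurrence: remember its position and keep it.
            None => {
                index.insert(item.tx_hash.clone(), out.len());
                out.push(item);
            }
        }
    }
    out
}

fn main() {
    let raw = vec![
        TxChanges { tx_hash: "0xaa".to_string(), changes: vec!["slot_1".to_string()] },
        TxChanges { tx_hash: "0xaa".to_string(), changes: vec!["slot_2".to_string()] },
        TxChanges { tx_hash: "0xbb".to_string(), changes: vec!["slot_3".to_string()] },
    ];
    for tx in aggregate_by_tx(raw) {
        println!("{} -> {:?}", tx.tx_hash, tx.changes);
    }
}
```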

 #### Integer Byte encoding

@@ -50,17 +36,15 @@ Many of the types above are variable length bytes. This allows for flexibility a

 **Strings**: If you need to store strings, please use utf-8 encoding to store them as bytes.

-**Attributes:** the value encoding for attributes in the native implementation messages is variable. It depends on the use case. Since the attributes are highly dynamic they are only used by the corresponding logic components, so the encoding can be tailored to the logic implementation: E.g. since Rust uses little endian one may choose to use little endian encoding for integers if the native logic module is written in Rust.
-
-
+**Attributes:** the value encoding for attributes is variable and depends on the use case. Since attributes are highly dynamic and only used by the corresponding logic components, the encoding can be tailored to the logic implementation. For example, since Rust commonly targets little-endian platforms, one may choose little-endian encoding for integers if the native logic module is written in Rust.
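
As a rough illustration of the string and attribute encodings described above, the sketch below stores a string as UTF-8 bytes and an integer attribute as little-endian bytes. The helper names are hypothetical and the fixed `u64` width is an assumption made for brevity; real attribute values may need a different width or encoding.

```rust
/// Encodes a string as UTF-8 bytes (Rust strings are already UTF-8).
fn encode_string(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}

/// Decodes UTF-8 bytes back into a string, returning None on invalid input.
fn decode_string(bytes: &[u8]) -> Option<String> {
    String::from_utf8(bytes.to_vec()).ok()
}

/// Encodes an integer attribute as 8 little-endian bytes (assumed width).
fn encode_u64_le(value: u64) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}

/// Decodes 8 little-endian bytes, returning None if the length is wrong.
fn decode_u64_le(bytes: &[u8]) -> Option<u64> {
    if bytes.len() != 8 {
        return None;
    }
    let mut arr = [0u8; 8];
    arr.copy_from_slice(bytes);
    Some(u64::from_le_bytes(arr))
}

fn main() {
    let name = encode_string("pool");
    let fee = encode_u64_le(1000);
    println!("{:?} {:?}", decode_string(&name), decode_u64_le(&fee));
}
```

Whichever encoding is chosen, the component that reads the attribute has to decode it with the same convention.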

 ### Changes of interest

 PropellerHeads integration should at least communicate the following changes:

-* Any changes to the protocol state, for VM integrations that usually means contract storage changes of all contracts whose state may be accessed during a swap operation.
-* Any newly added protocol component such as a pool, pair, market, etc. Basically anything that signifies that a new operation can be executed now using the protocol.
-* ERC20 Balances, whenever the balances of one contracts involved with the protocol change, this change should be communicated in terms of absolute balances.
+- Any changes to the protocol state. For VM integrations, that usually means contract storage changes of all contracts whose state may be accessed during a swap operation.
+- Any newly added protocol component such as a pool, pair, market, etc. Basically, anything that signifies that a new operation can now be executed using the protocol.
+- ERC20 balances. Whenever the balance of one of the contracts involved with the protocol changes, this change should be communicated in terms of absolute balances.

 In the next section we will show a few common techniques that can be leveraged to quickly implement an integration.

@@ -70,9 +54,9 @@ Before starting, it is important to be aware of the protocol we are aiming to in

 It is especially important to know:

-* Which contracts are involved in the protocol and what functions do they serve. How do they affect the behaviour of the component being integrated?
-* What conditions (e.g. oracle update) or what kind of method calls can lead to a relevant state change on the protocol, which ultimately changes the protocols behaviour if observed externally.
-* Are there components added or removed, and how are they added. Most protocols use either a factory contract, which can be used to deploy new components, or they use a method call that provisiona a new component within the overall system.
+- Which contracts are involved in the protocol and what functions do they serve? How do they affect the behaviour of the component being integrated?
+- What conditions (e.g. an oracle update) or what kind of method calls can lead to a relevant state change on the protocol, which ultimately changes the protocol's behaviour if observed externally?
+- Are there components added or removed, and how are they added? Most protocols use either a factory contract, which can be used to deploy new components, or a method call that provisions a new component within the overall system.

 Once the workings of the protocol are clear the implementation can start.

@@ -104,34 +88,9 @@ Newly created components are detected by mapping over the `sf.ethereum.type.v2.B

 The output message should usually contain as much information about the component available at that time as well as the transaction that created the protocol component.

-We have found that using the final model prefilled with only component changes is usually good enough since it holds all the information that will be necessary at the end.&#x20;
+We have found that using the final model (`BlockChanges`) prefilled with only component changes is usually good enough since it holds all the information that will be necessary at the end.&#x20;

-For VM Integrations the final model is `BlockContractChanges`:
-
-```protobuf
-// A set of changes aggregated by transaction.
-message TransactionContractChanges {
-  // The transaction instance that results in the changes.
-  Transaction tx = 1;
-  // Contains the changes induced by the above transaction, aggregated on a per-contract basis.
-  // Must include changes to every contract that is tracked by all ProtocolComponents.
-  repeated ContractChange contract_changes = 2;
-  // An array of any component changes.
-  repeated ProtocolComponent component_changes = 3;
-  // An array of balance changes to components.
-  repeated BalanceChange balance_changes = 4;
-}
-
-// A set of transaction changes within a single block.
-message BlockContractChanges {
-  // The block for which these changes are collectively computed.
-  Block block = 1;
-  // The set of transaction changes observed in the specified block.
-  repeated TransactionContractChanges changes = 2;
-}
-```
-
-Note that a single transaction may emit multiple newly created components. In this case it is expected that the `TransactionContractChanges.component_changes`, contains multiple `ProtocolComponents`.
+Note that a single transaction may emit multiple newly created components. In this case it is expected that `TransactionChanges.component_changes` contains multiple `ProtocolComponents`.

 Once emitted, the protocol components should be stored in a Store, since we will later have to use this store to decide whether a contract is interesting to us or not.

@@ -143,11 +102,12 @@ This means the relative values have to be aggregated by component, to arrive at

 Since this is challenging the following approach is recommended:

-* Use a handler to process a block and emit the `BalanceDeltas` struct. Make sure to sort the balance deltas by `component_id, token_address`
-* Aggregate the BalanceDelta messages using a `BigIntAddStore`.
-* In a final handler, use as inputs: A `DeltaStore` input from step 2 and the `BalanceDeltas` from step 1. You can now zip the deltas from the store with the balance deltas from step 1. The store deltas contains the aggregated (absolute) balance at each version and the balance deltas contain the corresponding transaction.
+- Use a handler to process a block and emit the `BlockBalanceDeltas` struct. Make sure to sort the balance deltas by `component_id, token_address`.
+- Aggregate the `BalanceDelta` messages using a `BigIntAddStore`.
+- In a final handler, use as inputs: a `DeltaStore` input from the second step and the `BlockBalanceDeltas` from the first step. You can now zip the deltas from the store with the balance deltas from the first step. The store deltas contain the aggregated (absolute) balance at each version and the balance deltas contain the corresponding transaction.
+
+Our Substreams SDK provides the `extract_balance_deltas_from_tx` function, which extracts all relevant `BalanceDelta` messages from ERC20 `Transfer` events in a given transaction (see the Curve implementation).
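
To make the recommended approach more concrete, here is a simplified sketch of how relative deltas can be turned into absolute balances. It uses only the Rust standard library: the `HashMap` plays the role of the additive store, `i128` stands in for a big integer, and `Delta`, `AbsoluteBalance` and `to_absolute_balances` are hypothetical names. It does not use the actual Substreams store API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a relative balance change within a transaction.
#[derive(Debug, Clone)]
struct Delta {
    tx_index: u64,
    component_id: String,
    token: String,
    delta: i128, // assumption: the real messages carry big integers as bytes
}

/// Absolute balance of a (component, token) pair after a given transaction.
#[derive(Debug, PartialEq)]
struct AbsoluteBalance {
    tx_index: u64,
    component_id: String,
    token: String,
    balance: i128,
}

/// Adds each delta to the running total for its (component, token) key and
/// pairs the resulting absolute balance with the transaction that caused it.
fn to_absolute_balances(
    store: &mut HashMap<(String, String), i128>,
    deltas: &[Delta],
) -> Vec<AbsoluteBalance> {
    deltas
        .iter()
        .map(|d| {
            let total = store
                .entry((d.component_id.clone(), d.token.clone()))
                .or_insert(0);
            *total += d.delta;
            AbsoluteBalance {
                tx_index: d.tx_index,
                component_id: d.component_id.clone(),
                token: d.token.clone(),
                balance: *total,
            }
        })
        .collect()
}

fn main() {
    let mut store = HashMap::new();
    let deltas = vec![
        Delta { tx_index: 0, component_id: "pool".to_string(), token: "weth".to_string(), delta: 100 },
        Delta { tx_index: 1, component_id: "pool".to_string(), token: "weth".to_string(), delta: -30 },
    ];
    for balance in to_absolute_balances(&mut store, &deltas) {
        println!("{:?}", balance);
    }
}
```

In an actual Substreams package the running totals live in a store module rather than in memory, which is why the final handler zips store deltas with the emitted balance deltas instead of recomputing sums.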

 #### Tracking State Changes

 To track contract changes, you can simply use the `extract_contract_changes` function (see balancer implementation). This function will extract all relevant contract storage changes given the full block model and a store that flags contract addresses as relevant.
-
--- a/docs/indexing/vm-integration/README.md
+++ b/docs/indexing/vm-integration/README.md
@@ -4,7 +4,7 @@ Our indexing integrations use the Substreams library to transform raw blockchain

 ## Example

-We have integrated the **Ambient** protocol as a reference, see `/substreams/ethereum-ambient` for more information.
+We have integrated the **Balancer** protocol as a reference, see `/substreams/ethereum-balancer` for more information.

 ## Step by step

@@ -51,7 +51,6 @@ If you are unfamiliar with ProtoBuf at all, you can start with the [official doc

 First get familiar with the raw ProtoBuf definitions provided by us:
 - [common.proto](../../../proto/tycho/evm/v1/common.proto) - Common types used by all integration types
-- [vm.proto](../../../proto/tycho/evm/v1/vm.proto) - Types specific to the VM integration

 You can also create your own intermediate ProtoBufs. These files should reside in your own substreams package, e.g. `./substreams/ethereum-template/proto/custom-messages.proto`. You have to link these files in the `substreams.yaml` file, see the [manifest docs](https://substreams.streamingfast.io/developers-guide/creating-your-manifest) for more information or you can look at the official substreams example integration of [UniswapV2](https://github.com/messari/substreams/blob/master/uniswap-v2/substreams.yaml#L20-L22).

@@ -63,7 +62,7 @@ The goal of the rust module is to implement the logic that will transform the ra

 *This is the actual integration code that you will be writing!*

-The module is a Rust library that is compiled into a SPKG (`.spkg`) file using the Substreams CLI and then loaded by the Substreams server. It is defined by the `lib.rs` file (see the [Ambient reference example](../../../substreams/ethereum-ambient/src/lib.rs)).
+The module is a Rust library that is compiled into a SPKG (`.spkg`) file using the Substreams CLI and then loaded by the Substreams server. It is defined by the `lib.rs` file (see the [Balancer reference example](../../../substreams/ethereum-balancer/src/lib.rs)).

 Read our [Substreams README.md](../../../substreams/README.md) for more information on how to write the Rust module.

@@ -74,26 +73,27 @@ Read our [Substreams README.md](../../../substreams/README.md) for more informat
    ```bash
    cp -r ./substreams/ethereum-template ./substreams/[CHAIN]-[PROTOCOL_SYSTEM]
    ```
-1. Implement the logic in the Rust module `lib.rs`. The main function to implement is the `map_changes` function, which is called for every block. 
+1. Implement the logic in the Rust module `lib.rs`. The main function to implement is the `map_protocol_changes` function, which is called for every block. 
    
    ```rust
    #[substreams::handlers::map]
-    fn map_changes(
+    fn map_protocol_changes(
        block: eth::v2::Block,
-    ) -> Result<tycho::BlockContractChanges, substreams::errors::Error> {}
+    ) -> Result<tycho::BlockChanges, substreams::errors::Error> {}
    ```
-    The `map_changes` function takes a raw block as input and returns a `BlockContractChanges` struct, which is derived from the `BlockContractChanges` protobuf message in [vm.proto](../../../proto/tycho/evm/v1/vm.proto). 
+    The `map_protocol_changes` function takes a raw block as input and returns a `BlockChanges` struct, which is derived from the `BlockChanges` protobuf message in [common.proto](../../../proto/tycho/evm/v1/common.proto).


-1. The `BlockContractChanges` is a list of `TransactionContractChanges`, which includes these main fields:
+1. `BlockChanges` contains a list of `TransactionChanges`, each of which includes these main fields:
    - list of `ContractChange` - All storage slots that have changed in the transaction for every contract tracked by any ProtocolComponent
+    - list of `EntityChanges` - All the attribute changes in the transaction
    - list of `ProtocolComponent` - All the protocol component changes in the transaction
-    - list of `BalanceChange` - All the contract component changes in the transaction
+    - list of `BalanceChange` - All the token balance changes in the transaction

-    See the [Ambient reference example](../../../substreams/ethereum-ambient/src/lib.rs) for more information.
+    See the [Balancer reference example](../../../substreams/ethereum-balancer/src/lib.rs) for more information.

 1. If you are more advanced with Substreams, you can define more steps than a single "map" step, including defining your own protobuf files. Add these protobuf files in your `pb` folder and update the manifest accordingly. This allows for better parallelization of the indexing process. See the official documentation of [modules](https://substreams.streamingfast.io/concepts-and-fundamentals/modules#modules-basics-overview).

 ### Testing

-Read the [Substreams testing docs](../../../substreams/README.md#testing-your-implementation) for more information on how to test your integration.
+Read the [Substreams testing docs](../../../substreams/README.md#test-your-implementation) for more information on how to test your integration.
--- a/proto/tycho/evm/v1/common.proto
+++ b/proto/tycho/evm/v1/common.proto
@@ -83,7 +83,8 @@ message ProtocolComponent {
  // Addresses of the contracts used by the component.
  // Usually it is a single contract, but some protocols use multiple contracts.
  repeated bytes contracts = 3;
-  // Attributes of the component. Used mainly be the native integration.
+  // Static attributes of the component.
+  // These attributes MUST be immutable. If it can ever change, it should be given as an EntityChanges for this component id.
  // The inner ChangeType of the attribute has to match the ChangeType of the ProtocolComponent.
  repeated Attribute static_att = 4;
  // Type of change the component underwent.
@@ -160,6 +161,7 @@ message TransactionChanges {
 }

 // A set of transaction changes within a single block.
+// This message must be the output of your substreams module.
 message BlockChanges {
  // The block for which these changes are collectively computed.
  Block block = 1;
--- a/substreams/Readme.md
+++ b/substreams/Readme.md
@@ -22,3 +22,6 @@ the package you'd like to pre release. This will create a
 `[package]-[semver].pre-[commit-sha]` release in our spkg repository which you can use 
 to run the substream´.

+## Test your implementation
+
+To run a full end-to-end integration test, refer to the [testing script documentation](../testing/README.md).
--- a/substreams/ethereum-template/integration_test.tycho.yaml
+++ b/substreams/ethereum-template/integration_test.tycho.yaml
@@ -1,19 +1,36 @@
+# Name of the substreams config file in your substreams module. Usually "./substreams.yaml"
 substreams_yaml_path: ./substreams.yaml
-adapter_contract: "SwapAdapter.evm.runtime"
+# Name of the adapter contract, usually "ProtocolSwapAdapter"
+adapter_contract: "SwapAdapter"
+# Constructor signature of the Adapter contract
 adapter_build_signature: "constructor(address)"
+# A comma-separated list of args to be passed to the constructor of the Adapter contract
 adapter_build_args: "0x0000000000000000000000000000000000000000"
+# Whether or not the testing script should skip checking balances of the protocol components.
+# If set to `true` please always add a reason why it's skipped.
 skip_balance_check: false
+# A list of accounts that need to be indexed to run the tests properly.
+# Usually used when there is a global component required by all pools and created before the tested range of blocks. For example a factory or a vault.
+# Please note that this component needs to be indexed by your Substreams module; this feature is only for testing purposes.
+# Also please always add a reason why this account is needed for your tests.
+# This will be applied to each test.
 initialized_accounts:
  - "0xae7ab96520DE3A18E5e111B5EaAb095312D7fE84" # Needed for ....
+# A list of protocol types names created by your Substreams module.
 protocol_type_names:
  - "type_name_1"
  - "type_name_2"
+# A list of tests.
 tests:
+  # Name of the test
  - name: test_pool_creation
+    # Indexed block range
    start_block: 123
    stop_block: 456
+    # Same as global `initialized_accounts` but only scoped to this test.
    initialized_accounts:
      - "0x0c0e5f2fF0ff18a3be9b835635039256dC4B4963" # Needed for ....
+    # A list of expected components indexed in the block range. Each component must exactly match the `ProtocolComponent` indexed by your Substreams module.
    expected_components:
      - id: "0xbebc44782c7db0a1a60cb6fe97d0b483032ff1c7"
        tokens:
@@ -24,6 +41,8 @@ tests:
          attr_1: "value"
          attr_2: "value"
        creation_tx: "0x20793bbf260912aae189d5d261ff003c9b9166da8191d8f9d63ff1c7722f3ac6"
+        # Whether or not the script should skip trying to simulate a swap on this component.
+        # If set to `true` please always add a reason why it's skipped.
        skip_simulation: false
  - name: test_something_else
    start_block: 123
--- a/testing/README.md
+++ b/testing/README.md
@@ -1,66 +1,116 @@
 # Substreams Testing

-This package provides a comprehensive testing suite for Substreams modules. The testing suite is designed to facilitate end-to-end testing, ensuring that your Substreams modules function as expected.
+This package provides a comprehensive testing suite for Substreams modules. The testing suite is designed to facilitate
+end-to-end testing, ensuring that your Substreams modules function as expected.

 ## Overview

-The testing suite builds the `.spkg` for your Substreams module, indexes a specified block range, and verifies that the expected state has been correctly indexed in PostgreSQL.
+The testing suite builds the `.spkg` for your Substreams module, indexes a specified block range, and verifies that the
+expected state has been correctly indexed in PostgreSQL.
+It also tries to simulate some transactions using the `SwapAdapter` interface.

 ## Prerequisites

-- Latest version of our indexer, Tycho. Please contact us to obtain the latest version. Once acquired, place it in the `/testing/` directory.
+- Latest version of our indexer, Tycho. Please contact us to obtain the latest version. Once acquired, place it in the
+  `/usr/local/bin/` directory.
 - Access to PropellerHeads' private PyPI repository. Please contact us to obtain access.
 - Docker installed on your machine.
+- [Conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)
+  and the [AWS CLI](https://aws.amazon.com/cli/) installed.

 ## Test Configuration

-Tests are defined in a `yaml` file. A template can be found at
+Tests are defined in a `yaml` file. A documented template can be found at
 `substreams/ethereum-template/integration_test.tycho.yaml`. The configuration file should include:

 - The target Substreams config file.
+- The corresponding SwapAdapter and args to build it.
 - The expected protocol types.
 - The tests to be run.

-Each test will index all blocks between `start-block` and `stop-block` and verify that the indexed state matches the expected state.
+Each test will index all blocks between `start_block` and `stop_block`, verify that the indexed state matches the
+expected state, and optionally simulate transactions using the `SwapAdapter` interface.

-You will also need the EVM Runtime file for the adapter contract. 
-The script to generate this file is available under `evm/scripts/buildRuntime.sh`.
-Please place this Runtime file under the respective `substream` directory inside the `evm` folder.
+You will also need the VM Runtime file for the adapter contract.
+Our testing script should be able to build it using your test config.
+The script to generate this file manually is available under `evm/scripts/buildRuntime.sh`.

-## Running Tests
+## Setup testing environment

 ### Step 1: Export Environment Variables

-Export the required environment variables for the execution. You can find the available environment variables in the `.env.default` file.
+**DOMAIN_OWNER**
+
+- **Description**: The domain owner identifier for PropellerHeads' AWS account, used for authenticating on the private
+  PyPI repository.
+- **Example**: `export DOMAIN_OWNER=123456789`
+
+### Step 2: Create python virtual environment for testing
+
+Run the environment setup script. It will create a conda virtual environment and install all dependencies.
+
+Please note that some dependencies require access to our private PyPI repository.
+
+```
+setup_env.sh
+```
+
+## Running Tests
+
+### Prerequisites
+
+This section requires a testing environment setup. If you don’t have it yet, please refer to the [setup testing
+environment section](#setup-testing-environment).
+
+### Step 1: Export Environment Variables
+
+Export the required environment variables for the execution. You can find the available environment variables in the
+`.env.default` file.
 Please create a `.env` file in the `testing` directory and set the required environment variables.

 #### Environment Variables

-**SUBSTREAMS_PACKAGE**
-- **Description**: Specifies the Substreams module that you want to test
-- **Example**: `export SUBSTREAMS_PACKAGE=ethereum-balancer`
-
-**DATABASE_URL**
-- **Description**: The connection string for the PostgreSQL database. It includes the username, password, host, port, and database name. It's already set to the default for the Docker container.
-- **Example**: `export DATABASE_URL="postgres://postgres:mypassword@localhost:5431/tycho_indexer_0`
-
 **RPC_URL**
-- **Description**: The URL for the Ethereum RPC endpoint. This is used to fetch the storage data. The node needs to be an archive node, and support [debug_storageRangeAt](https://www.quicknode.com/docs/ethereum/debug_storageRangeAt).
+
+- **Description**: The URL for the Ethereum RPC endpoint. This is used to fetch the storage data. The node needs to be
+  an archive node, and support [debug_storageRangeAt](https://www.quicknode.com/docs/ethereum/debug_storageRangeAt).
 - **Example**: `export RPC_URL="https://ethereum-mainnet.core.chainstack.com/123123123123"`

 **SUBSTREAMS_API_TOKEN**
+
 - **Description**: The API token for accessing Substreams services. This token is required for authentication.
 - **Example**: `export SUBSTREAMS_API_TOKEN=eyJhbGci...`

-**DOMAIN_OWNER**
-- **Description**: The domain owner identifier for Propellerhead's AWS account, used for authenticating on the private PyPI repository.
-- **Example**: `export DOMAIN_OWNER=123456789`
+### Step 2: Run tests

-### Step 2: Build and the Testing Script
+Run a local PostgreSQL database using Docker Compose:

-To build the testing script, run the following commands:
 ```bash
-source pre_build.sh
-docker compose build
-docker compose run app
+docker compose up -d db
+```
+
+Run tests for your package.
+
+```bash
+python ./testing/src/runner/cli.py --package "your-package-name"
+```
+
+#### Example
+
+If you want to run tests for `ethereum-balancer`, use:
+
+```bash
+conda activate propeller-protocol-lib-testing
+export RPC_URL="https://ethereum-mainnet.core.chainstack.com/123123123123"
+export SUBSTREAMS_API_TOKEN=eyJhbGci...
+docker compose up -d db
+python ./testing/src/runner/cli.py --package "ethereum-balancer"
+```
+
+#### Testing CLI args
+
+A list and description of all available CLI args can be found using:
+
+```
+python ./testing/src/runner/cli.py --help
 ```