Production Network Deployment Guide
This page illustrates the process of deploying a new production network comprising multiple independent nodes.
Overview
The Tutorial page illustrates the process of creating a new network of nodes by running all of them on a single development machine. While that is a quick and concise demonstration of how nodes interact and of the validator management process, a real production network requires deploying nodes on different host machines that typically operate in remote locations. This page demonstrates the deployment of such a network, with attention to the settings and considerations that apply in that environment.
Concepts
There are several important concepts to understand when creating a network of independent remote nodes.
Peers and Node IDs
Every Kwil node has a "node ID" that is derived from the node's public key. This plays an important role in creating secure authenticated connections between nodes over the internet.
A peer is always specified in the format <nodeID>@<IP address>:<port>. For example, given a node with ID f664bfc6060231b12421aafbad971a58a27611f8 that is reachable at IP address 10.1.2.3 on TCP port 26656, the notation used to specify this node as a peer is f664bfc6060231b12421aafbad971a58a27611f8@10.1.2.3:26656.
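As a rough illustration of the notation, the following shell function assembles a peer string from its three parts and rejects IDs that are not 40 lowercase hex characters, which is the form used in this page's examples. The function name peer_addr is hypothetical and is not part of any Kwil tool.

```shell
# Hypothetical helper (not a kwil-admin command): build <nodeID>@<IP address>:<port>.
peer_addr() {
  node_id=$1; ip=$2; port=$3
  # Reject empty IDs and IDs containing anything other than lowercase hex digits.
  case $node_id in
    ''|*[!0-9a-f]*) echo "invalid node ID: $node_id" >&2; return 1 ;;
  esac
  # Assumption: node IDs are 40 hex characters, as in the examples on this page.
  if [ "${#node_id}" -ne 40 ]; then
    echo "node ID must be 40 hex characters" >&2
    return 1
  fi
  printf '%s@%s:%s\n' "$node_id" "$ip" "$port"
}

peer_addr f664bfc6060231b12421aafbad971a58a27611f8 10.1.2.3 26656
```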
An operator can determine their node ID in different ways:
- Use the kwil-admin key info command, which derives and displays the node ID given the private key.
- Use the kwil-admin node status command, which requests the node ID from the running node's admin RPC service.
- Locate the node ID in the node logs. An info-level log entry with the message "P2P Node ID" contains the ID.
The node ID is also displayed when the private key is initially generated with the key gen subcommand.
Exposing Listening TCP Ports
For a node to be reachable, incoming TCP traffic on the P2P port must be allowed by firewall rules, both on the host machine and any network security policies enforced by the provider.
The node's P2P server must also be listening on a routable network interface that may be reached from remote peers. The default chain.p2p.listen_address setting of 0.0.0.0:26656 permits this. If changing this value, be sure it uses a non-loopback interface that is reachable by other peers.
If a node is not configured to permit incoming P2P connections, it may still operate by connecting to the network with outgoing connections, but other peers will not be able to bootstrap from the node.
Finally, depending on the host machine's IP address assignment, it is often necessary to specify the actual external IP address that routes to the host machine's network interface using the chain.p2p.external_address setting. For instance, if the host machine is assigned a public IP address of 53.35.53.35 that transparently routes traffic to a VPC address of 10.1.2.3 on the host OS, it is helpful to set chain.p2p.external_address = 53.35.53.35:26656 so that the Kwil node advertises itself on a reachable network address that is otherwise undiscoverable to kwild itself.
Peer Exchange
In a healthy network, nodes have many peers to prevent single points of failure or unintended network partitions, and to support efficient network gossip. The feature that facilitates this is called peer exchange (PEX). It is enabled by default with the chain.p2p.pex = true setting. Disable this setting only if a node must not share its peers with other nodes.
Network Bootstrapping
There are two supported ways of bootstrapping a node's connection to a network: specifying "seed" nodes and specifying "persistent" peers. Both may be used in conjunction with PEX. These are described below.
Persistent Peers
The simplest method, particularly for a small new network, is to list known peers to which the node should always attempt to connect. This is done using the peer notation described above.
For example, to specify two remote peers to always connect to on startup, list them as follows:
[chain.p2p]
persistent_peers = "<nodeID1>@<IP address 1>:26656,<nodeID2>@<IP address 2>:26656"
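Since the setting is a single comma-separated string, a small shell helper can assemble it from individual peer addresses. This is only a sketch. The function name join_peers is hypothetical, and the two peers are the example nodes used later on this page.

```shell
# Hypothetical helper: join peer addresses into one comma-separated string.
join_peers() {
  out=
  for p in "$@"; do
    out=${out:+$out,}$p   # add a comma only between entries
  done
  printf '%s\n' "$out"
}

# Print a line suitable for the [chain.p2p] section of config.toml.
printf 'persistent_peers = "%s"\n' "$(join_peers \
  2a8b9eb0d94be766ad34cb2591f2924477fc397c@23.45.67.89:26656 \
  2164057e1d8248f42aadb8ef00afcc123180571d@98.76.54.32:26656)"
```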
Seeder and Seed nodes
For larger networks, and in particular organically growing networks, special "seed" nodes may be set up to facilitate node onboarding. To connect to a seed node, configure the chain.p2p.seeds setting.
Like the persistent peers setting, this is a comma-separated list of nodes using the same <nodeID>@<IP address>:<port> notation. Unlike persistent peers, the connections to these nodes are transient, disconnecting after obtaining a list of their peers.
The chain.p2p.seeds setting should be used with chain.p2p.pex = true.
A seed may be thought of as a network crawler, which performs no functions other than to continually connect to other peers, discover new peer addresses, and share discovered addresses with peers making incoming connections. The recommended way to create a seed is with the cometseed tool. A mature network should have one or more seeds that all nodes may use for bootstrapping.
Validators
While not a networking concept, it is important to understand the role of validators as it pertains to block production. The consensus engine used by Kwil requires greater than 2/3 of validators to approve proposed blocks. Thus, if the network is not well connected and validators are unable to communicate efficiently, block production may be slowed or even stopped in the case of a network partition.
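To make the threshold concrete, the sketch below computes the smallest number of validators that is strictly greater than 2/3 of the set. It assumes every validator has the same voting power, which holds for the examples on this page but not necessarily for every network. The function name is hypothetical.

```shell
# Smallest validator count strictly greater than 2/3 of n.
# Assumption: all validators have equal voting power.
min_validators_for_block() {
  n=$1
  echo $(( (2 * n) / 3 + 1 ))
}

min_validators_for_block 2   # prints 2
min_validators_for_block 3   # prints 3
min_validators_for_block 4   # prints 3
```

With two validators the result is 2, so both must be online and connected for blocks to be produced.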
Not all nodes in a network must be validators. Non-validator nodes called "sentry" nodes are typically created to function as RPC service providers or just to support network health by relaying transactions, blocks, and misbehavior evidence.
Compatible Versions
A final important concept is version compatibility. On a production network, it is important to only run nodes built from the same release branch, such as the release-v0.8 branch, or a release tag with the same MAJOR.MINOR version.
Different patch versions are compatible, but large releases that change the application's state machine logic cannot be run on the same network.
For example, it is not possible to run v0.8.4 and v0.9.1 on the same network. To update a network across such releases, it is necessary to perform a migration.
Walkthrough
This section illustrates the process for launching a small network of independent remote nodes. It will begin with two validators at genesis, followed by the addition of a sentry node using block sync, and finally the new node will become a validator.
Deploying Genesis Nodes and Validators
Starting a network with a single validator is possible; however, we will demonstrate starting with two validators from genesis since this also illustrates the consensus considerations described above in Validators.
Generating Node Private Keys
Securely generate and store private keys. Anyone in possession of a private key can imitate and act on behalf of a node. In the case of a validator, this impacts the security of the network.
Every node has a unique identity, which is determined by its private key. We will generate a private key using the kwil-admin key gen command:
kwil-admin key gen --key-file ./private_key
This will write a file named private_key in the current directory that contains the ed25519 private key for a new node. Each node must have a different private key.
We can inspect the key using the kwil-admin key info command as follows:
$ kwil-admin key info --key-file ./private_key
Private key (hex): d058de92183058efee63512d7997be7040cfbbfbf6d37c6b9f1157ba421dc2699020d218f830106fad8ce378a606531719bec4f4efee6127a371f25d393ed39e
Private key (base64): 0FjekhgwWO/uY1EteZe+cEDPu/v203xrnxFXukIdwmmQINIY+DAQb62M43imBlMXGb7E9O/uYSejcfJdOT7Tng==
Public key (base64): kCDSGPgwEG+tjON4pgZTFxm+xPTv7mEno3HyXTk+054=
Public key (cometized hex): PubKeyEd25519{9020D218F830106FAD8CE378A606531719BEC4F4EFEE6127A371F25D393ED39E}
Public key (plain hex): 9020d218f830106fad8ce378a606531719bec4f4efee6127a371f25d393ed39e
Address (string): 317E0230F8897D38157743D470731EE98FCC1326
Node ID: 317e0230f8897d38157743d470731ee98fcc1326
This displays different encodings of the private key, along with the corresponding public key and node ID. The "Public key (plain hex)" value is used in the genesis.json file to specify the initial validator set as described in the next section.
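When scripting the genesis setup, the plain hex public key can be pulled out of the key info output with standard text tools. The filter below is a sketch that assumes the output format shown above. The function name plain_hex_pubkey is hypothetical.

```shell
# Read `kwil-admin key info` output on stdin and print only the plain hex public key.
# Assumption: the label is exactly "Public key (plain hex): " as shown on this page.
plain_hex_pubkey() {
  sed -n 's/^Public key (plain hex): //p'
}
```

It could be used as `kwil-admin key info --key-file ./private_key | plain_hex_pubkey`.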
Creating the Genesis Document
Before starting the first nodes, it is necessary to create a genesis file for the new network. The file specifies the validator set at genesis, initial account funding allocations, and other network parameters. All nodes must use the exact same genesis.json file since its parameters affect the consensus rules for the network.
Unlike the node quick start guide where all genesis validator keys are generated with the kwil-admin setup testnet command run once on one machine, we will create a genesis file with the kwil-admin setup genesis command. This takes as input:
- the name, plain hex public key, and power of each genesis validator (formatted as name:pubkey:power)
- the "chain ID" string that is the new network's name
- any other default parameter overrides
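The name:pubkey:power value for each validator can be assembled and checked before it is passed to the command. The helper below is a hypothetical sketch. It relies on the fact that an ed25519 public key is 32 bytes (64 hex characters) and assumes lowercase hex, as in this page's examples.

```shell
# Hypothetical helper: build a name:pubkey:power string for a genesis validator.
validator_arg() {
  name=$1; pubkey=$2; power=${3:-1}   # power defaults to 1
  case $pubkey in
    ''|*[!0-9a-f]*) echo "public key must be plain lowercase hex" >&2; return 1 ;;
  esac
  if [ "${#pubkey}" -ne 64 ]; then
    echo "expected 64 hex characters for an ed25519 public key" >&2
    return 1
  fi
  printf '%s:%s:%s\n' "$name" "$pubkey" "$power"
}

validator_arg node0 8e11018570091d79b6105d5954cbbc4f1570cd3a02df2214b362e7c3f22c0459 1
```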
Suppose we want to start a new network called "prod-chain" with two genesis validators:
$ kwil-admin setup genesis \
--validator "node0:8e11018570091d79b6105d5954cbbc4f1570cd3a02df2214b362e7c3f22c0459:1" \
--validator "node1:1e3ab63c62ad165d01627d53c6c4c61e59626ceeeb26653527cf3df18eff7a23:1" \
--chain-id "prod-chain" --out ./prod-chain
Created genesis.json file at /home/user/prod-chain/genesis.json
The resulting genesis file has the chain ID and validators customized, with default values for the rest of the document:
{
  "genesis_time": "2024-10-03T19:00:42.192646924Z",
  "chain_id": "prod-chain",
  "initial_height": 1,
  "app_hash": null,
  "activations": null,
  "consensus_params": {
    "block": {
      "max_bytes": 6291456,
      "max_gas": -1,
      "abci_max_bytes": false
    },
    "evidence": {
      "max_age_num_blocks": 100000,
      "max_age_duration": 172800000000000,
      "max_bytes": 1048576
    },
    "validator": {
      "pub_key_types": [
        "ed25519"
      ],
      "join_expiry": 100800
    },
    "votes": {
      "vote_expiry": 14400,
      "max_votes_per_tx": 100
    },
    "abci": {
      "vote_extensions_enable_height": 0
    },
    "migration": {},
    "without_gas_costs": true
  },
  "validators": [
    {
      "pub_key": "8e11018570091d79b6105d5954cbbc4f1570cd3a02df2214b362e7c3f22c0459",
      "power": 1,
      "name": "node0"
    },
    {
      "pub_key": "1e3ab63c62ad165d01627d53c6c4c61e59626ceeeb26653527cf3df18eff7a23",
      "power": 1,
      "name": "node1"
    }
  ]
}
Provide this same genesis.json file to all node operators. The genesis file with the initial validators will remain the same for the entire lifetime of the network, even as the network adds new validators.
Node Configuration
Unlike genesis.json, which contains network-wide settings, the config.toml file is used to configure a Kwil node for its own purpose and network configuration. Create the config.toml file for each node separately, setting the listen addresses, external address, persistent peers, and seeds as appropriate for the node.
The kwil-admin setup init command is used to initialize a node's "root" directory, which by default is ~/.kwild. We will supply the agreed-upon genesis file using the --genesis flag. Since these are genesis validators, we must also copy the private_key file that we manually generated for each node above into the created root directory (in place of the private key that is generated by the setup init command). Other nodes that are not validators specified in genesis.json may use the private key that is generated by the setup init command.
Suppose we have prepared the keys and addresses for the two nodes described above as follows:
| | node0 | node1 |
|---|---|---|
| external IP | 23.45.67.89 | 98.76.54.32 |
| node ID | 2a8b9eb0d94be766ad34cb2591f2924477fc397c | 2164057e1d8248f42aadb8ef00afcc123180571d |
To have them connect to each other, we will set their persistent_peers settings appropriately.
On node0, we can initialize the root directory and the config.toml there with the following command, which specifies its external IP address and node1 in the persistent peers:
[node0]:~$ kwil-admin setup init --genesis prod-chain/genesis.json \
--chain.moniker "node0" \
--chain.p2p.external-address "23.45.67.89:26656" --chain.p2p.persistent-peers \
"2164057e1d8248f42aadb8ef00afcc123180571d@98.76.54.32:26656" \
--root-dir /data/kwil-node0
Initialized node in /data/kwil-node0
Then copy the private_key file created previously for this node into the root directory.
On node1, do the same, but with its external IP address and node0 in the persistent peers:
[node1]:~$ kwil-admin setup init --genesis prod-chain/genesis.json \
--chain.moniker "node1" \
--chain.p2p.external-address "98.76.54.32:26656" --chain.p2p.persistent-peers \
"2a8b9eb0d94be766ad34cb2591f2924477fc397c@23.45.67.89:26656" \
--root-dir /data/kwil-node1
Initialized node in /data/kwil-node1
Finally, copy this node's private_key file into the root directory.
Initial Startup and Block Production
Now that both of the validators have been prepared with the same genesis.json file and their persistent peer configurations pointing to each other, it is time to start the nodes.
Depending on how the nodes are deployed, kwild may be run as a service, in a container, or in a terminal multiplexer.
Postgres Database
Install Postgres and create a kwild database on each node. Refer to the Postgres setup guide for more information and quick start instructions.
Start Kwild
Start the nodes on each of the remote machines, specifying the path to the root directory if it is not at the default location (~/.kwild):
[node0]:~$ kwild -r /data/kwil-node0
[node1]:~$ kwild -r /data/kwil-node1
Once the nodes connect, the first block will be produced shortly after:
2024-09-19T12:01:23.443-05:00 info kwild.cometbft finalizing commit of block {"module": "consensus", "height": 1, "hash": "010806C959A3CB3F186C1216745E9C9C22F38F98A8D6C8673848BFF79FBD6C81", "root": "B4D2AE49E40746CB1C5C1F1B89D3B30F4927A566E0E0D7F7EDBC6C5A1F540B18", "num_txs": 0}
2024-09-19T12:01:23.465-05:00 info kwild.cometbft finalized block {"module": "state", "height": 1, "num_txs_res": 0, "num_val_updates": 0, "block_app_hash": "5DF6E0E2761359D30A8275058E299FCC0381534545F55CF43E41983F5D4C9456"}
Recall that both of the validators need to be started and able to communicate before blocks can be created, as discussed in the Validators section. If the nodes are unable to connect, look for peer-related errors in the logs, and check your persistent peers settings and firewall rules.
Adding a New Node
Now that our network of two validators is running and producing blocks, we will create and synchronize a new node to the network, and once it is synchronized, promote the node to a validator.
Starting as a Sentry Node
Next, we will add a third node to the network as a sentry node. Since this node is not a genesis validator, its public key is not included in the genesis file, which is now immutable. The kwil-admin setup init command can be used to simultaneously initialize a new node's root directory and generate a new private key.
When adding a node to the network, start it as a non-validator / sentry node. Always completely synchronize the node to the network before attempting to promote it to a validator.
As before, we will specify the location of the network's genesis.json file, the host's external IP address, and one or more persistent peers. However, we will use the generated private key:
[node2]:~$ kwil-admin setup init --genesis prod-chain/genesis.json \
--chain.moniker "node2" \
--chain.p2p.external-address "2.3.4.5:26656" --chain.p2p.persistent-peers \
"<nodeID>@<IP address>:26656" \
--root-dir /data/kwil-node2
Initialized node in /data/kwil-node2
We can see the ID of node2 before starting it:
[node2]:~$ kwil-admin key info --key-file /data/kwil-node2/private_key | grep "Node ID"
Node ID: bd56ce5a93c47b532be71531ff6ed469b341e61f
Verifying Successful Node Synchronization
Before adding the node as a validator, ensure that node2 has completely synchronized with the network. You can do this with either the kwil-admin tool, or with the public user service using kwil-cli.
Using kwil-admin node status, inspect the .sync.best_block_height and .sync.syncing values:
[node2]:~$ kwil-admin node status
{
  "node": {
    "chain_id": "prod-chain",
    "name": "node2",
    "node_id": "bd56ce5a93c47b532be71531ff6ed469b341e61f",
    "proto_ver": 8,
    "app_ver": 1,
    "block_ver": 11,
    "listen_addr": "tcp://127.0.0.2:27656",
    "rpc_addr": "tcp://127.0.0.2:26657"
  },
  "sync": {
    "app_hash": "0e0674b0b88abc26cc60da6e64883dfef802f3431f19296b88c8bc1db8f4f4ae",
    "best_block_hash": "1d5a4ef85927dce3c90168e89fdfb32219c5c95bb5ed3332eb69e20bbe7346da",
    "best_block_height": 109,
    "best_block_time": "2024-10-03T18:01:38.585-05:00",
    "syncing": false
  },
  "validator": {
    ...
  },
  "app": null
}
Ensure that the node is no longer syncing (catching up) with the network and that the best block height is within a couple of blocks of the network's known height.
Use kwil-cli utils chain-info to query the height of any node with its RPC service exposed:
$ kwil-cli utils chain-info --provider "http://98.76.54.32:8484"
Chain ID: prod-chain
Height: 109
Hash: 1d5a4ef85927dce3c90168e89fdfb32219c5c95bb5ed3332eb69e20bbe7346da
Compare the output of that command with the result from node2 to ensure the new node stays in sync with the network as new blocks are produced.
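The height comparison can be scripted. The sketch below assumes the chain-info text output looks like the sample above. Both function names are hypothetical, and the default tolerance of 2 blocks is an illustrative choice.

```shell
# Read `kwil-cli utils chain-info` text output on stdin and print the height.
# Assumption: the output contains a line of the form "Height: <number>".
height_from_chain_info() {
  awk '$1 == "Height:" { print $2 }'
}

# Succeed if two heights differ by at most $3 blocks (default 2).
heights_close() {
  diff=$(( $1 - $2 ))
  if [ "$diff" -lt 0 ]; then
    diff=$(( 0 - diff ))
  fi
  [ "$diff" -le "${3:-2}" ]
}
```

For example, `heights_close 109 108` succeeds, while `heights_close 109 100` fails.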
Alternatively, check the new node's health check REST endpoint, which will return an HTTP status code of 200 and a JSON body with "healthy":true if the node is ready:
[node2]:~$ curl -s --fail http://127.0.0.1:8484/api/v1/health
{"kwil_alive":true,"healthy":true, ...}
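For automation, the health check can be wrapped in a small polling loop. This is a sketch only. The function names are hypothetical, the 5-second interval is arbitrary, and the pattern match assumes a response body like the sample above (a JSON parser would be more robust).

```shell
# Succeed if the JSON body on stdin reports "healthy": true.
# Assumption: the body resembles the sample response shown above.
is_healthy() {
  grep -Eq '"healthy"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical wait loop: poll the health endpoint until the node is ready.
# The default URL and the 5-second interval are illustrative choices.
wait_until_healthy() {
  url=${1:-http://127.0.0.1:8484/api/v1/health}
  until curl -s --fail "$url" | is_healthy; do
    sleep 5
  done
}
```

Calling `wait_until_healthy` with no arguments blocks until the local node reports that it is healthy.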
Joining the Validator Set
Unlike node0 and node1, node2 is not required to participate in consensus; it just executes the blocks that are produced and approved by the validators. In this section we will add node2 to the validator set.
Start by submitting a validator "join" request from node2:
[node2]:~$ kwil-admin validators join
TxHash: 0722630c4296ea27d77f4ee8db6129c1ca042372811187f7ba787c75464400ed
We have now submitted a request to become a validator, which the existing validators must approve for this node to join the validator set. The validators list-join-requests subcommand, run from any node, shows this request:
$ kwil-admin validators list-join-requests
Pending join requests (1 approval needed):
Candidate | Power | Approvals | Expiration
------------------------------------------------------------------+-------+-----------+------------
7c02692ed4e1f165791b43742274edb4f9587f6da02d54a0c111386b7110bb64 | 1 | 0 | 14627
Both validators are required to approve this request since 1/2 does not meet the threshold of greater than 2/3 of the validators. This is done by running the following on each validator:
$ kwil-admin validators approve 7c02692ed4e1f165791b43742274edb4f9587f6da02d54a0c111386b7110bb64
TxHash: cab4b66d12cd6ecee256ba73dd9500a3ea7b21ba03bd38d5db7701268a0f8ecb
Once the approval transactions are mined, node2 will become part of the validator set:
$ kwil-admin validators list
Current validator set:
0. {pubkey = 8e11018570091d79b6105d5954cbbc4f1570cd3a02df2214b362e7c3f22c0459, power = 1}
1. {pubkey = 1e3ab63c62ad165d01627d53c6c4c61e59626ceeeb26653527cf3df18eff7a23, power = 1}
2. {pubkey = 7c02692ed4e1f165791b43742274edb4f9587f6da02d54a0c111386b7110bb64, power = 1}
On the block where the final approval transaction was mined, the kwild logs for node2 will display:
{"level":"info","ts":1727997606.6760225,"logger":"kwild.cometbft","msg":"finalized block","module":"state","height":291,"num_txs_res":1,"num_val_updates":1,"block_app_hash":"F67E090714D70095DB4DF0EF3BF592A8E60FEC0CEEE58B5FB4D55CBD92E8BC40"}
{"level":"info","ts":1727997606.6823015,"logger":"kwild.cometbft","msg":"updates to validators","module":"state","updates":"2EA4B9CA4A6FCBAF5102E4C8AB335D785F5D4772:1"}
{"level":"info","ts":1727997606.6865957,"logger":"kwild.listener-manager","msg":"Node is a validator and caught up with the network, starting listeners"}
The node is now a validator. Keep it running and monitor its health.