Transact Merkle State in SQL
Overview
The current Database abstraction acts as a simple key-value store, where both the keys and the values are arbitrary bytes. This does not allow for various optimizations that can be made via more complicated SQL queries. Transact has been updated to provide an implementation using several tables which will accurately reflect the data and allow for those optimizations, where possible.
One such operation is made by representing the merkle radix tree nodes with an array type for the children. This allows for the use of a recursive query to find the complete node path of a given address/data pair.
Insertion, while still requiring the same number of records as the key-value abstraction, can be done with bulk executions, thereby limiting the round trips to and from the database.
Database Tables
Each set of state roots are related by a tree name. This is stored in a
merkle_radix_tree
table:
CREATE TABLE IF NOT EXISTS merkle_radix_tree (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(512),
UNIQUE(name)
);
Leaf data is maintained in its own table:
CREATE TABLE IF NOT EXISTS merkle_radix_leaf (
id BIGSERIAL PRIMARY KEY,
tree_id BIGINT NOT NULL,
address VARCHAR(70) NOT NULL,
data BYTEA,
FOREIGN KEY(tree_id) REFERENCES merkle_radix_tree (id)
);
While this stores all the revisions of data for a given address, the order of
the data is maintained by the merkle_radix_tree_node
. This table maintains the
structure of the tree, where each record is a node in the tree. Each state root
is also included in this table.
This table contains an array of the children for the given node, where the contents of the array are the hash id of a child, or NULL if that branch does not have any children.
Its complete structure is as follows:
CREATE TABLE IF NOT EXISTS merkle_radix_tree_node (
hash VARCHAR(64) NOT NULL,
tree_id BIGINT NOT NULL,
leaf_id BIGINT,
children VARCHAR(64)[],
PRIMARY KEY (hash, tree_id),
FOREIGN KEY(tree_id) REFERENCES merkle_radix_tree(id),
FOREIGN KEY(leaf_id) REFERENCES merkle_radix_leaf(id)
);
Like the key-value implementation of MerkleState, additions and deletions are recorded to support state pruning:
create TABLE IF NOT EXISTS merkle_radix_change_log_addition (
id BIGSERIAL PRIMARY KEY,
tree_id BIGINT NOT NULL,
state_root VARCHAR(64) NOT NULL,
parent_state_root VARCHAR(64),
addition VARCHAR(64),
FOREIGN KEY(state_root, tree_id)
REFERENCES merkle_radix_tree_node(hash, tree_id),
FOREIGN KEY(parent_state_root, tree_id)
REFERENCES merkle_radix_tree_node(hash, tree_id),
FOREIGN KEY(addition, tree_id)
REFERENCES merkle_radix_tree_node(hash, tree_id)
);
create TABLE IF NOT EXISTS merkle_radix_change_log_deletion (
id BIGSERIAL PRIMARY KEY,
tree_id BIGINT NOT NULL,
successor_state_root VARCHAR(64) NOT NULL,
state_root VARCHAR(64) NOT NULL,
deletion VARCHAR(64),
FOREIGN KEY(successor_state_root, tree_id)
REFERENCES merkle_radix_tree_node(hash, tree_id),
FOREIGN KEY(state_root, tree_id)
REFERENCES merkle_radix_tree_node(hash, tree_id),
FOREIGN KEY(deletion, tree_id)
REFERENCES merkle_radix_tree_node(hash, tree_id)
);
All the tables listed here are specific to Postgres. The SQLite-specific tables are slightly different, in that any arrays are represented by JSON values.
Querying a Tree
In the key-value implementation, an entry at an address is found by walking the tree from the state root to the leaf node a single node at a time. Each intermediate node is read from the database for each node transition.
In the SQL implementation, we can do this lookup in a single query. Using a recursive query like so:
WITH RECURSIVE tree_path AS
(
-- This is the initial node
SELECT hash, tree_id, leaf_id, children, 1 as depth
FROM merkle_radix_tree_node
WHERE hash = $STATE_ROOT_HASH AND tree_id = $TREE_ID
UNION ALL
-- Recurse through the tree
SELECT c.hash, c.tree_id, c.leaf_id, c.children, p.depth + 1
FROM merkle_radix_tree_node c, tree_path p
WHERE c.hash = p.children[$ADDRESS_BYTES[p.depth]]
AND c.tree_id = $TREE_ID
)
SELECT l.data
FROM tree_path t, merkle_radix_leaf l
WHERE t.tree_id = $TREE_ID AND t.leaf_id = l.id
Given an address’s bytes, where each byte is branch position, and a starting state root hash, the leaf (or lack thereof) can be found with the above query.
Similar queries can be used to list all leaves for a given state root, or find all the nodes along a given address’s path.
Traits and Structs
Transact introduced an new version of MerkleState
, SqlMerkleState
to
specifically provide the implementation of the above. This new struct implements
the standard transact state traits Read
, Write
and Prune
.
Each instance of SqlMerkleState
is linked to a specific entry in the
merkle_radix_tree
table.
It also introduced a new trait to cover listing out a merkle tree’s leaves,
MerkleRadixLeafReader
. This trait provides a way to iterate over the leaves
on a given state root hash, with an optional subtree argument. It is applied to
both the key-value MerkleState
and SqlMerkleState
.
References
See the Transact documentation for detailed Rust documentation.