Replace payload hash with Merkle tree root in block header

Hello everyone,

In this thread, I want to propose a new LIP for the roadmap objective “Introduce authenticated data structure”. This LIP proposes to remove the payloadHash from the block header and include instead the Merkle root of the transactions included in the block payload. This would allow for efficient proofs-of-inclusion for transactions included in the block.

As always, I’m looking forward to your feedback.

Here is the complete LIP draft:

LIP: <LIP number>
Title: Replace payload hash with Merkle tree root in block header
Author: Alessandro Ricottone <alessandro.ricottone@lightcurve.io>
Type: Standards Track
Created: <YYYY-MM-DD>
Updated: <YYYY-MM-DD>
Requires: 00xx "Introduce Merkle Trees and Inclusion Proofs"       
          0019

Abstract

This LIP proposes to remove the payloadHash value stored in each block header and replace it with transactionRoot, the root of a Merkle tree built from the transactions included in the payload.

Copyright

This LIP is licensed under the Creative Commons Zero 1.0 Universal.

Motivation

A Merkle tree is an authenticated data structure that allows for efficient proof-of-inclusion, where a Verifier can verify the presence of a certain data block in a dataset by quiring only Log(N) hashes from a Prover, where N is the length of the dataset. To check that the data block is part of the dataset, the Verifier recursively calculates the hash of the original data block with the hashes provided by the Prover, and checks that the final result is equal to the root of the Merkle tree.

In the current protocol, the payloadHash is calculated by taking the SHA-256 hash of the block payload, given as the array of concatenated serialized transactions. Thus, to verify that a transaction has been included in a block, it is necessary to download the full block payload. With the new proposal to change to byte-based limit for block size, this amounts to downloading up to 15kb of data. By storing the root of the transaction Merkle tree in the block header as transactionRoot instead of the payloadHash, it will be possible, using a proof-of-inclusion, to check whether a transaction is included in a block downloading at most 7 different 32-bytes-long hashes, with a corresponding proof of size 232 bytes (see “Proof serialization” in LIP 00xx “Introduce Merkle trees and inclusion proofs”).

In this LIP, we define how the transactionRoot of a block is calculated as the root of the Merkle tree built from the transactions included in it.

Rationale

The general specifications to build a Merkle tree in the Lisk protocol are given in LIP 00xx “Introduce Merkle Trees and Inclusion Proofs”. In this LIP we specify which changes are required to store the root of the transaction Merkle tree as the transactionRoot in a block header and to verify the validity of new blocks after this change.

Block Forging

To forge a new block, a delegate selects a set of transactions to be included in the block payload. Currently, these transactions are sorted using the sortTransactions function and then serialized. The payloadHash is calculated as the SHA-256 of the concatenated serialized transactions.

Once LIP 0026 “Establish block validity by applying transactions sequentially” is implemented, transactions will be sorted by the block forger in the order in which they are applied to the blockchain, and included in the same order in the block payload. The serialized transaction IDs in the transaction application order are used to build a Merkle tree according to the specifications defined in “Introduce Merkle Trees and Inclusion Proofs”. The transactionRoot is set to the resulting Merkle root and included in the block header instead of payloadHash. Notice that transactionRoot has the same type and size as payloadHash.

Block Verification

To verify the validity of a block, first the Merkle root is calculated from the transactions included in the block payload and checked against the transactionRoot stored in the block header. The same set of transactions sorted in a different way would result in a different Merkle root, and consequently an invalid block. In this sense, the transactionRoot contains information about the order in which the transactions were applied to the blockchain. Then, the transactions are verified and applied in the same order in which they have been inserted in the payload.

Transactions Ordering

To reproduce the same Merkle root consistently, the transactions have to be sorted in a consistent way. We consider two viable options:

  1. Sort by transaction ID: transactions are sorted according to their transaction ID in ascending order. Using this sorting method, it is easy to give a proof-of-absence for a certain transaction by giving a proof-of-inclusion for two transactions next to each other in the tree and whose IDs bound the target transaction ID.
  2. Sort by application order: after LIP 0026 comes into effect, transactions will be applied to the blockchain state in a specific order, chosen by the forger. Therefore, transactions will be already sorted in this way in the block payload. Using this sorting method to build the Merkle tree and consequently calculate the transactionRoot allows to implicitly make the application order an immutable part of the block header. This approach does not allow for an efficient proof-of-absence, however this is only a marginal downside as proofs-of-absence do not have relevant use-case scenarios in the current protocol.

As mentioned above, we choose option 2.

Specification

We assume LIP 0019 “Use full SHA-256 hash of transaction as transactionID” is already effective, and transaction IDs are 32 bytes long.

Given a Merkle tree and corresponding Merkle root merkleRoot built from the array transactionIDsArray according to the specifications defined in LIP 00xx, the transactionRoot is set as transactionRoot = merkleRoot. Here transactionIDsArray is an array of bytes containing the serialized transaction ids. The transactionRoot is then stored in the block header instead of payloadHash.

To verify the validity of the block, transactionIDsArray is again built from the serialized transactions stored in the payload hash, keeping the same order, and used to calculate the Merkle root merkleRoot. The transactionRoot is valid if transactionRoot == merkleRoot.

Backwards Compatibility

This proposal introduces a hard fork. Blocks forged with the new transactionRoot are not valid in the current protocol and vice-versa.

2 Likes

The pull request has been merged and the LIP is now drafted: