Creating a Merkle Tree in the shell

Creating a Merkle Tree in the shell

I have known about and used Merkle Trees for a while, but never really bothered to construct one manually (no real need as so many good libraries are out there). Well it’s Saturday night, we are in a lock down, so why not?

For anyone new to Merkle tree’s, they are a binary tree of hashed values concatenated together (one-way hash functions) until an eventual root hash is achieved. The root hash can then be used a sort of integrity snapshot of the entire tree. Merkle trees are currently utilised in three well known applications

  • Blockchains (verify transactions)
  • Bitorrent (verify file parts)
  • Git (bind code revision)
  • Transpareny logs (sigstore/rekor & certificate transparency)

The tree starts at the base with a set of “Data segments”. In a blockchain this would be the transaction. In bitorrent, it would be a file part.

Let's build the tree using concatenation and hashing (sha256)


# Hash the data segments into lead nodes
leafnodes=(HA HB HC HD HE HF HG HH)

for i in "${leafnodes[@]}"
   hash=$(echo -n $i | openssl sha256 | awk '{print $2}')
   eval $i=$hash

# Construct the parent nodes
HAB=$(echo -n ${HA}${HB} | openssl sha256 | awk '{print $2}')
HCD=$(echo -n ${HC}${HD} | openssl sha256 | awk '{print $2}')

HEF=$(echo -n ${HE}${HF} | openssl sha256 | awk '{print $2}')
HGH=$(echo -n ${HG}${HH} | openssl sha256 | awk '{print $2}')

HABCD=$(echo -n ${HAB}${HCD} | openssl sha256 | awk '{print $2}')
HEFGH=$(echo -n ${HEF}${HGH} | openssl sha256 | awk '{print $2}')

# Compute the Merkle Tree Root
HABCDEFGH=$(echo -n ${HABCD}${HEFGH} | openssl sha256 | awk '{print $2}')

# Get the values we need for verification
echo "HC: ${HC}"
echo "HAB: ${HAB}"
echo "HEFGH: ${HEFGH}"


The result of this script will be the merkle root HABCDEFGH and we also spit out some other hashes will we need for verifying later on.

HC: 616e8cc2cc762815bce92ba8e817da87e92c98b3e3e5c42697caaa76e18c6129
HAB: 545f638df3d2ba4d2295cd1cf6506c05b8340afd5a014f704c741245aab86831
HEFGH: 6d109b1eb6c0ae1353ce98977f15091d2e5d28664f24f362e697da8fc13a6617
HABCDEFGH: fe65b452f05c006bee04415be7a53030dbcb16040bfd1eb19b3b02f95b4d44d7

Verify a single Data Segment

Let’s say we now want to verify data segment HD. instead of requiring every data segment, we only need the leafs of HEFGH, HAB and HC and the root HABCDEFGH

Lets start by grabbing HD the segment we wish to verify

HD=$(echo -n HD | openssl sha256 | awk '{print $2}') ; echo $HD

We can now concat and hash HC | HD to get HCD

HCD=$echo -n ${HC}${HD} | openssl sha256 | awk '{print $2}')

Concat and hash HAD to HAB to get HABCD

HABCD=$(echo -n ${HAB}${HCD} | openssl sha256 | awk '{print $2}'); echo $HABCD

Concat and hash HABCD to HEFGH

HABCDEFGH=$(echo -n ${HABCD}${HEFGH} | openssl sha256 | awk '{print $2}'); echo $HABCDEFGH

And there we have our root and have verified HD. We were able to verify a data segment from hashing the object we want verified, by having only three hashes from the tree and the root hash "HEFGH", "HAB", "HC", "HABCDEFGH". To use the technical term, this would be computing an "inclusion proof".