> ## Documentation Index
> Fetch the complete documentation index at: https://docs.starknet.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Merkle Prover

This section explains the prover of the Merkle commitment scheme, focusing on how columns are committed to compute a Merkle root and how the Merkle tree layers are constructed, as well as how to generate Merkle proofs (decommitments).

S-two represents the computation trace using multiple columns of different sizes. Rather than committing to each column in separate Merkle trees, S-two uses a modified Merkle tree structure where all columns of different sizes are committed to using a single Merkle tree. We will describe the structure of this Merkle tree in this section.

## MerkleProver Structure

The `MerkleProver` struct represents a prover for a Merkle commitment scheme. It stores all layers of the Merkle tree, where each layer contains the hash values at that level.

```rust theme={null}
pub struct MerkleProver<B: MerkleOps<H>, H: MerkleHasher> {
    /// Layers of the Merkle tree.
    /// The first layer is the root layer.
    /// The last layer is the largest layer.
    /// See [MerkleOps::commit_on_layer] for more details.
    pub layers: Vec<Col<B, H::Hash>>,
}
```

## The Commit Process

The core of the Merkle prover is the `commit` function, which takes as input a set of columns and outputs a `MerkleProver` containing all layers of the Merkle tree. The columns must be of length that is a power of 2.

Below is the complete implementation of the `commit` function:

```rust theme={null}
    pub fn commit(columns: Vec<&Col<B, BaseField>>) -> Self {
        let _span = span!(Level::TRACE, "Merkle", class = "MerkleCommitment").entered();
        if columns.is_empty() {
            return Self {
                layers: vec![B::commit_on_layer(0, None, &[])],
            };
        }

        let columns = &mut columns
            .into_iter()
            .sorted_by_key(|c| Reverse(c.len()))
            .peekable();

        let mut layers: Vec<Col<B, H::Hash>> = Vec::new();

        let max_log_size = columns.peek().unwrap().len().ilog2();
        for log_size in (0..=max_log_size).rev() {
            // Take columns of the current log_size.
            let layer_columns = columns
                .peek_take_while(|column| column.len().ilog2() == log_size)
                .collect_vec();

            layers.push(B::commit_on_layer(log_size, layers.last(), &layer_columns));
        }
        layers.reverse();
        Self { layers }
    }
```

Let's walk through the function step by step:

1. **Sort Columns by Length:** Columns are sorted in descending order of length (columns with the most elements appear first). This ensures that the largest columns are committed first.

2. **Layer-by-Layer Commitment:**
   * For each layer (from largest to smallest), the function collects all columns of the current size and computes the hashes for that layer using the `commit_on_layer` function.
   * For the largest layer, the previous layer's hashes are empty, so the hash is computed directly from the column values.
   * For subsequent layers, the hash is computed from the previous layer's hashes and the current layer's column values.

3. **Reverse Layers:** After all layers are computed, the list of layers is reversed so that the root layer is at the beginning.

### Example: Commit Process

Suppose the input column data is as shown below:

<div
  id="fig-merkle-tree-data"
  style={{
display: 'flex',
flexDirection: 'column',
alignItems: 'center',
gap: '0.5rem',
}}
>
  <img src="https://mintcdn.com/starkware-9575960b/pocZ0Ad75vxRsKPm/learn/S-two-book/how-it-works/figures/merkle-tree-1.svg?fit=max&auto=format&n=pocZ0Ad75vxRsKPm&q=85&s=784ec1fa9261b143e0110a51ea6635cf" alt="Example column data to commit using a Merkle tree" style={{ maxWidth: '100%', height: 'auto' }} width="427" height="247" data-path="learn/S-two-book/how-it-works/figures/merkle-tree-1.svg" />

  <span>Figure 1: Example column data to commit using a Merkle tree</span>
</div>

For this example, the columns are already in the sorted order: $\textcolor{red}{\text{\textit{Column} 0}}$, $\textcolor{green}{\text{\textit{Column} 1}}$, $\textcolor{blue}{\text{\textit{Column} 2}}$ (from longest to shortest). We will now compute the hashes stored at each layer.

* **First Layer (Leaves):** The hashes are computed directly from the column values:
  $[h_{00}, h_{01}, h_{10}, h_{11}] = [H(\textcolor{red}{a}, \textcolor{green}{p}), H(\textcolor{red}{b}, \textcolor{green}{q}), H(\textcolor{red}{c}, \textcolor{green}{r}), H(\textcolor{red}{d}, \textcolor{green}{s})]$

* **Second Layer:** The next layer uses the previous hashes and the values from $\textcolor{blue}{\text{Column 2}}$:
  $[h_0, h_1] = [H(h_{00}, h_{01}, \textcolor{blue}{u}), H(h_{10}, h_{11}, \textcolor{blue}{v})]$

* **Root:** The root is computed as $root = H(h_0, h_1)$.

The resulting Merkle tree is illustrated below:

<div
  id="fig-merkle-tree-structure"
  style={{
display: 'flex',
flexDirection: 'column',
alignItems: 'center',
gap: '0.5rem',
}}
>
  <img src="https://mintcdn.com/starkware-9575960b/pocZ0Ad75vxRsKPm/learn/S-two-book/how-it-works/figures/merkle-tree-2.svg?fit=max&auto=format&n=pocZ0Ad75vxRsKPm&q=85&s=acf9bb0d68242cece9f8fa52d390cf86" alt="Merkle tree structure after commitment" style={{ maxWidth: '100%', height: 'auto' }} width="502" height="311" data-path="learn/S-two-book/how-it-works/figures/merkle-tree-2.svg" />

  <span>Figure 2: Merkle tree structure after commitment</span>
</div>

## The Decommit Process

The decommitment process enables the prover to generate a Merkle proof for a set of queried indices, allowing the verifier to check that specific elements are included in the committed Merkle tree.

The output is a `MerkleDecommitment` struct, which contains the hash and column values required for the verifier to reconstruct the Merkle root at the queried positions. Its implementation is shown below:

```rust theme={null}
pub struct MerkleDecommitment<H: MerkleHasher> {
    /// Hash values that the verifier needs but cannot deduce from previous computations, in the
    /// order they are needed.
    pub hash_witness: Vec<H::Hash>,
    /// Column values that the verifier needs but cannot deduce from previous computations, in the
    /// order they are needed.
    /// This complements the column values that were queried. These must be supplied directly to
    /// the verifier.
    pub column_witness: Vec<BaseField>,
}
```

The `decommit` function implemented for the `MerkleProver` takes as input:

* `queries_per_log_size`: A map from log size to a vector of query indices for columns of that size.
* `columns`: The column data that was committed to in the Merkle tree.

It returns:

* A vector of queried values, ordered as they are opened (from largest to smallest layer).
* A `MerkleDecommitment` containing the hash and column witnesses needed for verification.

Below is the complete implementation of the `decommit` function:

```rust theme={null}
    pub fn decommit(
        &self,
        queries_per_log_size: &BTreeMap<u32, Vec<usize>>,
        columns: Vec<&Col<B, BaseField>>,
    ) -> (Vec<BaseField>, MerkleDecommitment<H>) {
        // Prepare output buffers.
        let mut queried_values = vec![];
        let mut decommitment = MerkleDecommitment::empty();

        // Sort columns by layer.
        let mut columns_by_layer = columns
            .iter()
            .sorted_by_key(|c| Reverse(c.len()))
            .peekable();

        let mut last_layer_queries = vec![];
        for layer_log_size in (0..self.layers.len() as u32).rev() {
            // Prepare write buffer for queries to the current layer. This will propagate to the
            // next layer.
            let mut layer_total_queries = vec![];

            // Each layer node is a hash of column values as previous layer hashes.
            // Prepare the relevant columns and previous layer hashes to read from.
            let layer_columns = columns_by_layer
                .peek_take_while(|column| column.len().ilog2() == layer_log_size)
                .collect_vec();
            let previous_layer_hashes = self.layers.get(layer_log_size as usize + 1);

            // Queries to this layer come from queried node in the previous layer and queried
            // columns in this one.
            let mut prev_layer_queries = last_layer_queries.into_iter().peekable();
            let mut layer_column_queries =
                option_flatten_peekable(queries_per_log_size.get(&layer_log_size));

            // Merge previous layer queries and column queries.
            while let Some(node_index) =
                next_decommitment_node(&mut prev_layer_queries, &mut layer_column_queries)
            {
                if let Some(previous_layer_hashes) = previous_layer_hashes {
                    // If the left child was not computed, add it to the witness.
                    if prev_layer_queries.next_if_eq(&(2 * node_index)).is_none() {
                        decommitment
                            .hash_witness
                            .push(previous_layer_hashes.at(2 * node_index));
                    }

                    // If the right child was not computed, add it to the witness.
                    if prev_layer_queries
                        .next_if_eq(&(2 * node_index + 1))
                        .is_none()
                    {
                        decommitment
                            .hash_witness
                            .push(previous_layer_hashes.at(2 * node_index + 1));
                    }
                }

                // If the column values were queried, return them.
                let node_values = layer_columns.iter().map(|c| c.at(node_index));
                if layer_column_queries.next_if_eq(&node_index).is_some() {
                    queried_values.extend(node_values);
                } else {
                    // Otherwise, add them to the witness.
                    decommitment.column_witness.extend(node_values);
                }

                layer_total_queries.push(node_index);
            }

            // Propagate queries to the next layer.
            last_layer_queries = layer_total_queries;
        }

        (queried_values, decommitment)
    }
```

Let's break down the function step by step:

1. **Sort Columns by Length:**
   * As in the `commit` function, columns are sorted in descending order of length.

2. **Layer-by-Layer Decommitment:**
   * For each layer (from largest to smallest):
     * Collect all columns of the current size (`layer_columns`) and the previous layer's hashes (`previous_layer_hashes`).
     * Retrieve the queries for the current layer (`layer_column_queries`) and the previous layer's queries (`prev_layer_queries`).
     * For each node index to be decommitted in this layer:
       * Check if the child node hashes of the current node can be computed by the verifier, and if not, add the missing child node hashes to the `hash_witness` in the decommitment.
       * If the node index is queried, fetch the corresponding column values and append them to `queried_values`.
       * If not queried, add the column values to the `column_witness` in the decommitment.
     * The set of node indices decommitted in this layer is propagated as queries to the next layer.

### Example: Decommit Process

For the column data in **Figure 1**, consider the query indices $[(2, [0]), (1, [1])]$, where the query indices are maps from log size to a vector of query indices for columns of that size. This corresponds to querying the following elements:

* The 0th element of columns of size $2^2=4$: $\textcolor{red}{a}$ from $\textcolor{red}{\text{\textit{Column} 0}}$ and $\textcolor{green}{p}$ from $\textcolor{green}{\text{\textit{Column} 1}}$.
* The 1st element of the column of size $2^1=2$: $\textcolor{blue}{v}$ from $\textcolor{blue}{\text{\textit{Column} 2}}$.

Because columns of equal length are committed together, the same indices are opened together in the decommitment. For example, for query $(2, [0])$, both $\textcolor{red}{a}$ and $\textcolor{green}{p}$ are opened together.

Below is a walkthrough of the main loop in the `decommit` function, showing the state of key variables for each `layer_log_size`:

* **layer\_log\_size = 2 (columns of size 4):**
  * `layer_columns`: $[\textcolor{red}{\text{\textit{Column} 0}}, \textcolor{green}{\text{\textit{Column} 1}}]$
  * `previous_layer_hashes`: `None` (first layer)
  * `prev_layer_queries`: empty
  * `layer_column_queries`: $[0]$
  * For `node_index = 0`:
    * `node_values`: $[\textcolor{red}{a}, \textcolor{green}{p}]$
    * Queried, so append to `queried_values`: $[\textcolor{red}{a}, \textcolor{green}{p}]$
    * Add `node_index = 0` to `layer_total_queries` (to propagate to next layer)
  * At this stage: `queried_values` = $[\textcolor{red}{a}, \textcolor{green}{p}]$, `decommitment` is empty.

* **layer\_log\_size = 1 (columns of size 2):**
  * `layer_columns`: $[\textcolor{blue}{\text{\textit{Column} 2}}]$
  * `previous_layer_hashes`: $[h_{00}, h_{01}, h_{10}, h_{11}]$
  * `prev_layer_queries`: $[0]$
  * `layer_column_queries`: $[1]$
  * For `node_index = 0`:
    * Add $h_{01}$ to `hash_witness` in decommitment.
    * `node_values`: $[\textcolor{blue}{u}]$
    * Not queried, so append to `column_witness` in decommitment.
    * Add `node_index = 0` to `layer_total_queries`.
  * At this stage: `queried_values` = $[\textcolor{red}{a}, \textcolor{green}{p}]$, `hash_witness` = $[h_{01}]$, `column_witness` = $[\textcolor{blue}{u}]$, `layer_total_queries` = $[0]$.
  * For `node_index = 1`:
    * Add $h_{10}, h_{11}$ to `hash_witness` in decommitment.
    * `node_values`: $[\textcolor{blue}{v}]$
    * Queried, so append to `queried_values`.
    * Add `node_index = 1` to `layer_total_queries`.
  * At this stage: `queried_values` = $[\textcolor{red}{a}, \textcolor{green}{p}, \textcolor{blue}{v}]$, `hash_witness` = $[h_{01}, h_{10}, h_{11}]$, `column_witness` = $[\textcolor{blue}{u}]$, `layer_total_queries` = $[0, 1]$ (to propagate to next layer).

* **layer\_log\_size = 0 (root):**
  * `layer_columns`: empty
  * `previous_layer_hashes`: $[h_{0}, h_{1}]$
  * `prev_layer_queries`: $[0, 1]$
  * `layer_column_queries`: empty
  * No values are added to `queried_values`, `hash_witness`, or `column_witness` in this layer.

**Final output:**

* `queried_values`: $[\textcolor{red}{a}, \textcolor{green}{p}, \textcolor{blue}{v}]$
* `decommitment`:
  * `hash_witness`: $[h_{01}, h_{10}, h_{11}]$
  * `column_witness`: $[\textcolor{blue}{u}]$

In the next section, we describe the verification process to verify the inclusion of the queried values using the Merkle decommitment.
