# Trees

## Introduction

Trees are specialized graphs with parent-child relationships between nodes. A solid understanding of trees is essential to tackle advanced subjects such as a algorithms.

## Tree

A tree is a connected undirected graph with no simple circuits In other words, an undirected graph is a tree if and only if there is a unique simple path between any two of its vertices.

In many trees, one particular vertex is known as a root and relationships within the tree are assigned with respect to the root vertex. A tree with a root is known as a rooted tree. A rooted tree is a directed graph such that a directed edge indicates a parent-child relationship. If there is a directed edge from a vertex $a$ to vertex $b$ in a rooted tree, then the vertex $a$ is known as the parent of the child $b$. Since all relationships begin at the root, the root has no parents.

Vertices with the same parent are known as siblings. The ancestors of a vertex $a$ are all the vertices in the path from the root to the vertex $a$. Conversely, the descendents of a vertex $b$ are all the vertices that have the vertex $b$ as their ancestor.

A vertex in a rooted tree is called a leaf if it does not have any children. Vertices that have children are known as internal vertices. The root is also an internal vertex, unless it is the only vertex in the graph, in which case, it is a leaf.

If $a$ is a vertex in a tree, then the subtree with $a$ at its root is the subgraph of the tree consisting of $a$ and its descendents and all the edges incident to these descendents.

## $m$-ary tree

A rooted tree is known as an $m$-ary tree if every internal vertex has no more than $m$ children. The tree is called a full $m$-ary tree if every internal vertex has exactly $m$ children. An $m$-ary tree with $m = 2$ is known as a binary tree.

## Number of edges in a tree

A tree with $n$ vertices has exactly $n - 1$ edges.

#### Proof

The above statement can be proven with mathematical induction.

Base step: Easy to see that it is true when $n = 1$, since there are no edges in such a tree. For $n = 2$, there is exactly one edge connecting the two vertices.

Inductive hypothesis: Suppose it is true for $n = k$. This means, there are $k - 1$ edges in a tree with $k$ vertices.

Inductive step: Then, an additional vertex can be added to the tree by introducing one more edge to connect the new vertex to an existing one.

Thus, for a tree with $k + 1$ vertices, there are $k - 1 + 1 = k$ edges.

$$\qed$$

In fact, any connected graph with $n$ vertices and $n-1$ edges is a tree. This is because $n - 1$ edges are the minimum number of edges required for a connected graph with $n$ nodes. Removal of any edge renders the graph disconnected, implying that there are no circuits in a graph with $n - 1$ edges. Thus, it must be a tree.

## Properties of trees

Some useful properties of trees

1. A tree with $n$ vertices has $n - 1$ edges
2. A full $m$-ary tree with $i$ internal vertices contains $n = mi + 1$ vertices
3. There are at most $m^h$ leaves in an $m$-ary tree of height $h$.
4. If an $m$-ary tree of height $h$ has $l$ leaves, then $h \ge \left\lceil \log_m l \right\rceil$. The equality occurs when the tree is full and balanced.

For an ordered rooted tree, the universal address system is labeling procedure that works as follows

1. Label the root as $0$
2. Label the $k$ children of the root from left to right as $1, 2, \ldots, k$.
3. For a vertex that has been labeled $a$, label its children $k_a$ from left to right as $a.1, a.2, \ldots, a.k_v$.

Thus, the universal address system provides a lexicographic ordering of the vertices of a tree in the form of their labels.

## Preorder traversal

Let $T$ be an ordered tree rooted at $r$. If $r$ is the only vertex in the tree, then preorder traversal involves visiting $r$.

Otherwise, if $T_1, T_2, T_n$ are the subtrees of $T$ at $r$ from left to right, then preorder traversal involves visiting $r$ followed by the preorder traversal of $T_1$, then $T_2$, and so on, finally culminating in traversing $T_n$ in preorder.

Note that in a preorder traversal, the root is visited before any of its subtrees are traversed.

In the above figure, we have numbered the order of traversal of each of the nodes, starting at the root.

## Inorder traversal

Let $T$ be an ordered tree rooted at $r$. If $r$ is the only vertex in the tree, then inorder traversal involves visiting $r$.

Otherwise, if $T_1, T_2, T_n$ are the subtrees of $T$ at $r$ from left to right, then inorder traversal involves inorder traversal of $T_1$, followed by visiting $r$, and then the inorder traversal of $T_2$, then $T_3$, and so on, finally culminating in traversing $T_n$ in inorder.

Note that in an inorder traversal, the root is visited after its first subtree has been traversed.

In the above figure, we have numbered the order of traversal of each of the nodes, starting at the root.

## Postorder traversal

Let $T$ be an ordered tree rooted at $r$. If $r$ is the only vertex in the tree, then postorder traversal involves visiting $r$.

Otherwise, if $T_1, T_2, T_n$ are the subtrees of $T$ at $r$ from left to right, then postorder traversal postorder traversal of $T_1$, then the postorder traversal of $T_2$, then $T_3$, and so on until traversing $T_n$ in postorder, finally ending with the visit to $r$.

Note that in a postorder traversal, the root is visited after all of its subtrees are traversed.

In the above figure, we have numbered the order of traversal of each of the nodes, starting at the root.

## Spanning tree

A spanning tree of a simple graph $G$ is a subgraph of $G$ that is a tree containing every vertex of the graph $G$.

A simple graph is connected if and only if it has a spanning tree.

Two algorithms are popular for constructing a spanning tree of a connected graph. The breadth-first search algorithm starts at a node then visits all its children, followed by their children, till the entire graph has been traversed. The depth-first search algorithm starts at a node and completely traverses its first subtree, followed by the next subtree, till the entire graph has been traversed. Since the depth-first search algorithm goes back to traverse another subtree of the same parent, it is also known as backtracking.

## Minimum spanning tree

A minimum spanning tree in a connected weighted graph is a spanning tree that has the smallest possible sum of weights of its edges.

## Prim's algorithm

The Prim's algorithm for finding a minimum spanning tree is a greedy algorithm that starts by adding the edge with the smallest weight to the spanning tree. Then it adds another edge with the lowest weight among the weights that are incident on the vertices already in the spanning that does not lead to a circuit. It stops when $n - 1$ such edges have been added.

It can be proven that the Prim's algorithm produces a minimum spanning tree for any connected weighted graph.

## Kruskal's algorithm

The Kruskal's algorithm for finding a minimum spanning tree is a greedy algorithm that successively adds edges with minimum weight to the spanning tree that do not form a simple circuit with those edges that are not already in the spanning tree, until $n - 1$ edges have been added to the tree.

It can be proven that Kruskal's algorithm produces a minimum spanning tree for any connected weighted graph.

## Next-article

If you haven't already studied graphs, we recommend that you study these generalization of the tree concept to directed and undirected graphs. Else, you can check out other subjects in our curriculum.