The ultimate guide to master tree data structures step-by-step in Python and Javascript

merlox

merlox

Posted on December 19, 2019

The ultimate guide to master tree data structures step-by-step in Python and Javascript

The Tree data structure is one of the most common and efficient form of storage to keep data easily accessible in a descending structure that looks like a pyramid. It is used in databases and all sorts of applications so you need to master it if you want to become a better programmer. Plus, it's one of the most asked data structures in programming interviews.

Besides, it's Christmas soon and we all know trees are extremely important to keep us breathing that fresh, clean air why not learn how they work in the computer world?

If you wondered, these are some of the best use cases for the tree data structures:

  • For hierarchical systems such as an HTML page or a computer where the folder structure is made of folders and files stored inside multiple layers like this:
          C:/ Drive
        /   |    |  \
    Apps Games Music Desktop
  • For implementing indexes inside databases to access content extremely quickly.
  • To predict what you're typing on your phone given that each letter is searchable inside a Tree and give you suggestions to improve your speed.

Definitions

To begin understanding trees, you must familiarize yourself with the following concepts:

  • Node is each tree element containing a value. Think of them as capsules connected to each other going down.
  • Root is the first node of the tree.
  • Edge is the line that connects 2 nodes.
  • Child is a node that has a parent node.
  • Parent node is the one that contains children, lower nodes.
  • Leaves are nodes that don't have a child, the last nodes in the tree.
  • Height is the length (starting at 1) of the longest path to a leaf.
  • Depth is the length of a path to the root. Essentially a smaller height for a specific item so you can say "Item 5 is at the depth 7". We will use it to create algorithms for our trees.

Binary Trees

Now that you understand how trees are made of, let's take a look at a specific type of tree. There are many variations but the binary tree is one of the most used ones because of its simplicity and speed and they look like this:

             Root
            /     \
        Left       Right
       /    \     /     \
   Left   Right Left     Right

These are trees where each node can contain maximum 2 children called left and right nodes. In code, the tree is a bunch of nested objects:

root => {
   value: 1,
   left: {
      value: 3,
      left: {
      value: 2,
      left: null,
      right: null,
      },
      right: {
      value: 8,
      left: null,
      right: null,
      }
   },
   right: {
      value: 3,
      left: null,
      right: null,
   }
}

That particular tree looks like this:

                                   1
                                 /   \
                                3     3
                              /   \
                             2     8

Each node is a javascript object made of the node value and the left and right sub-nodes. The sub-nodes can be null but the value can't be empty.

In this guide you'll see python and javascript code snippets to understand the implementation in 2 different languages with its own differences. So here's the code to generate that particular tree above in python:

class Node:
    value = None
    left = None
    right = None
    def __init__(value, self):
        self.value = value
    def add_left(value, self):
        self.left = Node(value)
    def add_right(value, self):
        self.right = Node(value)

tree = Node(11)
tree.add_left(8)
tree.add_right(16)
print(tree.value) # 11
print(tree.left.value) # 8
print(tree.right.value) # 16

# Let's add the next items
tree.left.add_left(5)
tree.left.add_right(10)
tree.right.add_right(18)

print(tree.value) # 11
print(tree.left.value) # 8
print(tree.right.value) # 16

print(tree.left.left.value) # 5
print(tree.left.right.value) # 10
print(tree.right.right.value) # 18

The class Node contains the state variables value, left and right. The constructor initializes the root node value while the add_left and add_right methods are used to setup the left and right nodes.

Note that the left and right values are another instance of the class Node. They are not plain values, they are Nodes with value, left and right.

Tree algorithms

Here comes the fun part. To navigate the trees and access specific values, we use something called tree traversal algorithms which allow you to move in a specific direction to find the values faster. Let's take a look at some of the most popular ones...

There are 2 main traversal methods:

  1. The Depth-First-Search (DFS) method
  2. The Breadth-First-Search (BFS) method

Then, inside the DFS there are 3 subcategories depending on the order used:

  1. Pre-order
  2. In-order
  3. Post-order

Let's take a quick look to understand each traversal method in < 1 minute each:

Pre-order

The steps to execute this one are the following:

  • First print the value of the current node.
  • Then go to the left node.
  • Then go to the right node.

Before checking the algorithm, here's the implementation of a simple tree data structure in javascript that we'll use for the examples:

class Tree {
  value = null
  left = null
  right = null
  constructor(value) {
    this.value = value
  }
  addLeft(value) {
    this.left = new Tree(value)
  }
  addRight(value) {
    this.right = new Tree(value)
  }
}
const myTree = new Tree(1)

It's done recursively so you don't have to worry about anything else. Here's the code in javascript:

preOrder() {
  console.log(this.value)
  if (this.left) this.left.preOrder()
  if (this.right) this.right.preOrder()
}

And here's the complete implementation using a tree class in modern javascript:

class Tree {
  value = null
  left = null
  right = null
  constructor(value) {
    this.value = value
  }
  addLeft(value) {
    this.left = new Tree(value)
  }
  addRight(value) {
    this.right = new Tree(value)
  }
  preOrder() {
    console.log(this.value)
    if (this.left) this.left.preOrder()
    if (this.right) this.right.preOrder()
  }
}
const myTree = new Tree(1)

In-order

The steps to execute this one are the following:

  • Go to the left node.
  • Then print the value of the current node.
  • Then go to the right node.

Here's the code in js:

inOrder() {
  if (this.left) this.left.inOrder()
  console.log(this.value)
  if (this.right) this.right.inOrder()
}

And the full implementation:

class Tree {
  value = null
  left = null
  right = null
  constructor(value) {
    this.value = value
  }
  addLeft(value) {
    this.left = new Tree(value)
  }
  addRight(value) {
    this.right = new Tree(value)
  }
  inOrder() {
    if (this.left) this.left.inOrder()
    console.log(this.value)
    if (this.right) this.right.inOrder()
  }
}
const myTree = new Tree(1)

Post-order

The steps to execute this one are the following:

  • Go to the left node.
  • Then go to the right node.
  • Finally print the value of the current node.

Here's the code in js:

postOrder() {
  if (this.left) this.left.postOrder()
  if (this.right) this.right.postOrder()
  console.log(this.value)
}

Breadth-First Search (BFS)

The last methods to traverse our trees is using the level-order from the Breadth-First Search which consists on getting the values by layers. You traverse the tree horizontally until you access all values.

This one is a bit more complicated because we need an additional data structure: a Queue. Think of queues as arrays where you can only get the first element. You can add items at the end of the queue and remove them inversely from the beginning. Just like a cinema where people buy their tickets in order, following a queue. This is called FIFO where First In is First Out. For instance:

let queue = []
queue.push(1)
queue.push(2)
queue.push(3) // Now our queue is [1, 2, 3]
queue.get() // Returns "1"
queue.shift() // Removes "1"
console.log(queue) // Returns [2, 3]

The steps are the following:

  • Print the value of the first queue element.
  • Add the entire left node to the queue.
  • Add the entire right node to the queue.

You start by adding the entire root node to the queue. Here's how it looks in javascript:

bfs() {
  let queue = []
  queue.push(this)
  while (queue.length > 0) {
    const active = queue[0]
    console.log(active.value)
    if (active.left) queue.push(active.left)
    if (active.right) queue.push(active.right)
    queue.shift()
  }
}

Here's the full implementation:

class Tree {
  value = null
  left = null
  right = null
  constructor(value) {
    this.value = value
  }
  addLeft(value) {
    this.left = new Tree(value)
  }
  addRight(value) {
    this.right = new Tree(value)
  }
  bfs() {
    let queue = []
    queue.push(this)
    let counter = 0
    while (queue.length > 0 && counter < 100) {
      const active = queue[0]
      console.log(active.value)
      if (active.left) queue.push(active.left)
      if (active.right) queue.push(active.right)
      queue.shift()
      counter++
    }
  }
}
const myTree = new Tree(1)

Now this is the complete implementation of the Depth-First Search and Breadth-First Search tree traversal algorithms that you've just seen for your future reference:

class Tree {
  value = null
  left = null
  right = null
  constructor(value) {
    this.value = value
  }
  addLeft(value) {
    this.left = new Tree(value)
  }
  addRight(value) {
    this.right = new Tree(value)
  }
  preOrder() {
    console.log(this.value)
    if (this.left) this.left.preOrder()
    if (this.right) this.right.preOrder()
  }
  inOrder() {
    if (this.left) this.left.inOrder()
    console.log(this.value)
    if (this.right) this.right.inOrder()
  }
  postOrder() {
    if (this.left) this.left.postOrder()
    if (this.right) this.right.postOrder()
    console.log(this.value)
  }
  bfs() {
    let queue = []
    queue.push(this)
    let counter = 0
    while (queue.length > 0 && counter < 100) {
      const active = queue[0]
      console.log(active.value)
      if (active.left) queue.push(active.left)
      if (active.right) queue.push(active.right)
      queue.shift()
      counter++
    }
  }
}
const myTree = new Tree(1)

And in python:

from Queue import Queue

class Node:
    value = None
    left = None
    right = None
    def __init__(self, value):
        self.value = value
    def add_left(self, value):
        self.left = Node(value)
    def add_right(self, value):
        self.right = Node(value)
    def pre_order(self):
        print(self.value)
        if self.left:
            self.left.pre_order()
        if self.right:
            self.right.pre_order()
    def in_order(self):
        if self.left:
            self.left.in_order()
        print(self.value)
        if self.right:
            self.right.in_order()
    def post_order(self):
        if self.left:
            self.left.post_order()
        if self.right:
            self.right.post_order()
        print(self.value)
    def breadth_first_search(self):
        queue = Queue()
        queue.put(self)
        while not queue.empty():
            active = queue.get()
            print(active.value)
            if active.left:
                queue.put(active.left)
            if active.right:
                queue.put(active.right)
# Some sample data
tree = Node('html')
tree.add_right('body')
tree.right.add_left('h1')
tree.right.add_right('h2')
# A few examples here
tree.pre_order()
print('\n')
tree.in_order()
print('\n')
tree.post_order()
print('\n')
tree.breadth_first_search()

The Binary Search Tree data structure

Now that you've seen how to create and use trees, let's take a look at a specific implementations of the tree data structure called the Binary Search Tree.

These types of trees are known for being ordered, for having all the elements sorted vertically. The way it works is the following:

  • Start with the root element, for instance the number 20.
  • If the next node in the tree is smaller than the root, place it to the left.
  • If it's larger, place it to the right.

Let's suppose you have to create a binary search tree from these inputs: 20, 10, 49, 28, 59, 29

Start with the root node, in this case 20.

The next node, 10, will be placed to the left because 10 is smaller than 20 so we have:

   20
  /
10

Then the next node, 49, goes to the right of 20 because it's larger:

   20
  /  \
10    49

Next, the node 28 goes to the left of the 49 node. Why? because we can't modify existing nodes, all we can do is add elements to the current tree so it goes to the left of 49 like this:

   20
  /  \
10    49
     /
   28

After that we add the node 59 to the right of 49 because it's larger:

    20
   /  \
 10    49
      /  \
    28    59

Finally we add the element 29 to the right of 28 like so:

    20
   /  \
 10    49
      /  \
    28    59
      \
       29

Now you may be asking, why not add 29 to the left of the node 59? After all, 29 is smaller then 59 so it could go there. Good question. The reason why we add it to the right of 28 instead of the left of 59 is because the binary search tree must be ordered vertically, meaning if you go from left to right, you'll see that elements are increasing in value. Take a look at the current structure. First we have then to the left, then 20 slighty to the right and on top of it, then 28 below 20, then 49, 29 below 49 and 59. It wouldn't make sense to go lower after going to 59.

Just to be clear, what we're following is this algorithm:

  • Start at the root.
  • Is the node that we want to insert larger than the root? Go to the right, else to the left.
  • Continue going down the levels checking if the node is larger or smaller until you find the right place.

Here's the breakdown of the last element we added, the node 29:

  1. Is 29 larger or smaller then 20? Larger, go to the right.
  2. Is 29 larger or smaller than 49? Smaller, go to the left.
  3. Is 29 larger or smaller then 28? Larger, go to the right.
  4. Because there aren't any more nodes, save that position.

That's why we didn't place the node 29 to the left of 59 even though it may seem logical at first.

Here's the long awaited code in python and javascript for the binary search tree implementation:

In javascript:

class BinarySearchTree {
  value = null
  left = null
  right = null
  constructor (value) {
    this.value = value
  }
  addNode (value) {
    if (value >= this.value && !this.right) {
      this.right = new BinarySearchTree(value)
    } else if (value < this.value && !this.left) {
      this.left = new BinarySearchTree(value)
    } else if (value >= this.value && this.right) {
      this.right.addNode(value)
    } else if (value < this.value && this.left) {
      this.left.addNode(value)
    }
  }
  bfs() {
    let queue = []
    queue.push(this)
    let counter = 0
    while (queue.length > 0 && counter < 100) {
      const active = queue[0]
      console.log(active.value)
      if (active.left) queue.push(active.left)
      if (active.right) queue.push(active.right)
      queue.shift()
      counter++
    }
  }
}
// BST with 20, 10, 49, 28, 59, 29
let b = new BinarySearchTree(20)
b.addNode(10)
b.addNode(49)
b.addNode(28)
b.addNode(59)
b.addNode(29)
b.bfs()

In python:

import Queue from Queue
class BinarySearchTree:
    value = None
    left = None
    right = None
    def __init__(self, value):
        self.value = value
    def add_node(self, value):
        if value >= self.value and self.right is None:
            self.right = BinarySearchTree(value)
        elif value < self.value and self.left is None:
            self.left = BinarySearchTree(value)
        elif value >= self.value and self.right is not None:
            self.right.add_node(value)
        elif value < self.value and self.left is not None:
            self.left.add_node(value)
    def breadth_first_search(self):
        queue = Queue()
        queue.put(self)
        while not queue.empty():
            active = queue.get()
            print(active.value)
            if active.left:
                queue.put(active.left)
            if active.right:
                queue.put(active.right)
# BST with 20, 10, 49, 28, 59, 29
binarySearchTree = BinarySearchTree(20)
binarySearchTree.add_node(10)
binarySearchTree.add_node(49)
binarySearchTree.add_node(28)
binarySearchTree.add_node(59)
binarySearchTree.add_node(29)
binarySearchTree.breadth_first_search()

As you can see I've added the breadth first level search to check that the values are being added correctly from the nodes that we saw previously to verify that they are being added in the right order.

The add_node method uses recursion to add nodes by calling the same function over and over until the element is placed. That means all the recursive calls will be stored in memory so if you're working in a small memory computer you may run into problems if you create a massive binary search tree. If you're curious about the topic of the memory limits in recursive functions, check this page https://rosettacode.org/wiki/Find_limit_of_recursion where you'll see the actual limits of each language.

For instance, in javascript the maximum recursion depth in chrome is 10473 while in firefox it is 3000. Although it may vary from version to version. In my case I get 11402 levels deep before it reaches the memory limit. You can try it yourself on your computer by running this function in the chrome developer tools (by right clicking and selecting "Inspect" in any page to then open the console tab where you can paste this code):

function recurse(depth)
{
 try
 {
  return recurse(depth + 1);
 }
 catch(ex)
 {
  return depth;
 }
}

let maxRecursion = recurse(1);
console.log("Recursion depth on this system is " + maxRecursion)

Coming back to the BST tree, notice how the add_node method places identical values to the right and smaller values to the left. That's in case you add the same node with the a repeated value. You can setup your own rules for exceptional cases like those where we have repeated values or invalid ones.

The only difference from a binary tree and a binary search tree is the way elements are placed. In a binary tree you add nodes to the left or right depending on your preference while in the binary search tree you let the algorithm decide the right place in an ordered fashion with the add_node method. This allows for very efficient search methods to exist since you don't have to check the entire tree, just a branch to the right value.

Finding values in Binary Search Trees

Now let's take a look at how we find values in our binary search tree. The process to find if a given value exists somewhere goes like this:

  1. Given a value X, check if X is larger or smaller than the root node.
  2. If larger continue checking in the next level to the right or to the left otherwise.
  3. If there are no more nodes to check or if the value is the same, see if the current node value is the one we're looking for.
  4. Return true or false depending on if the value exists in the binary search tree.

Here's the implementation in javascript:

existsNode (value) {
  if (value > this.value && this.right) {
    return this.right.existsNode(value)
  } else if (value < this.value && this.left) {
    return this.left.existsNode(value)
  }
  return value == this.value
}

And in python:

def exists_node(self, value):
  if value > self.value and self.right:
    return self.right.exists_node(value)
  elif value < self.value and self.left:
    return self.left.exists_node(value)
  return value is self.value

Deleting values in Binary Search Trees

When it comes to deleting nodes of a binary search tree, we need to keep in mind the possible differences that can occur. Let's take a look at 3 scenarios to illustrate what happens in different trees with variable nodes:

Deleting a node with no children

Here's how it looks like:

      10                         10
     /  \                          \
    8    20    –Delete 8–>          20

Essentially we are leaving an empty space. So in code it should do the following:

  1. Find the node to delete
  2. Check if the node we are deleting has no children
  3. If so, simply empty it's value by replacing it with null or None depending on your language

For this method we need to pass the value we want to delete and the parent since that's important to complete the deletion. Alternatively you could implement a method that returns the parent of the node you want to delete but in this case we'll keep it simple and just pass the parent.

Also, the method will return true or false depending on whether it finds the node to delete or not. We'll use recursion. Note that you can't delete the root node since it doesn't make sense to do it when you can simply replace the entire tree so we'll not allow that possibility.

So in javascript:

deleteNode (value, parent) {
  // Find the node to delete using recursion
  if (this.value == value) {
    if (parent.left && parent.left.value == value) {
      parent.left = null
      return true
    } else {
      parent.right = null // One of the branches must be the exact value so we remove the right if the left is not the right value
      return true
    }
  } else if (value > this.value && this.right) {
    return this.right.deleteNode(value, this)
  } else if (value < this.value && this.left) {
    return this.left.deleteNode(value, this)
  } else {
    return false // We didn't find the element
  }
}

You can see how it compares the value to the current node to see if the value is equal, larger or smaller to then call the function recursively until we find the node to delete or we run out of nodes. This is only possible because our binary search tree is ordered so we can move across the tree in a logical manner.

Deleting a node with just one children

When you want to delete a node and that node has exactly one child, what we do in that case is replace the current node with the child instead of leaving it empty. Here's a visual:

    10                          10
   /  \                        /  \
  8    20    –Delete 20–>     8    11
      /
     11

And the code in javascript. You'll have the final code in python too at the end of this guide:

deleteNode (value, parent) {
  // Find the node to delete using recursion
  if (this.value == value) {
    if (parent.left && parent.left.value == value) {
      // If the element we are trying to delete has a child and only one we replace it with that child
      if (this.left && !this.right) {
        parent.left = this.left
      } else if (!this.left && this.right) {
        parent.left = this.right
      } else if (this.left && this.right) {
        // The node we want to delete has 2 children
      } else {
        // The node we want to delete has no children
        parent.left = null
      }
    } else {
      // One of the branches must be the exact value so we remove the right if the left is not the right value
      if (this.left && !this.right) {
        parent.right = this.left
      } else if (!this.left && this.right) {
        parent.right = this.right
      } else if (this.left && this.right) {
        // The node we want to delete has 2 children
      } else {
        // The node we want to delete has no children
        parent.right = null
      }
    }
    return true
  } else if (value > this.value && this.right) {
    return this.right.deleteNode(value, this)
  } else if (value < this.value && this.left) {
    return this.left.deleteNode(value, this)
  } else {
    return false // We didn't find the element
  }
}

There we just update the node to be deleted with the appropriate child. If the node to delete has 2 children we don't do anything yet. The remaining logic stays the same given that the traversal process hasn't changed.

Remember that when we replace a node we are moving all the nodes underneath it so they are not affected. Is not removing a branch, is cutting a piece of that branch to put it back together to the same place.

Deleting a node with 2 children

Now when it comes to deleting a node with 2 children we need to setup some rules. You can replace the node with the left child or with the right child knowing that you'll have to place the other side to the leftmost or rightmost of that "upgraded" node. Consider this situation:

     10                          10
    /  \                        /  \
   8    20    –Delete 20–>     8    14
       /  \                        /  \
     14    23                    11    17
    /  \                                \
  11    17                               23

We removed the node 20 then we decided to replace that empty space with the left node, elevating it, knowing that the right side of the deleted node, 23, will be placed at the rightmost place of the elevated node. So the node 23 will be placed at the largest node of the branch 14, the last element on the rightmost place.

Hope that makes sense. If not, remember that all elements on the left are always smaller than the current node and elements on the right are always larger than the current node. That's why when we elevate the left node, we want to place the hanging right node to the last right place of our elevated node. Remember that binary trees can only have 2 children.

Here's how it looks in javascript:

deleteNode (value, parent) {
  // Find the node to delete using recursion
  if (this.value == value) {
    if (parent.left && parent.left.value == value) {
      // If the element we are trying to delete has a child and only one we replace it with that child
      if (this.left && !this.right) {
        parent.left = this.left
      } else if (!this.left && this.right) {
        parent.left = this.right
      } else if (this.left && this.right) {
        // The node we want to delete has 2 children
        // 1\. We will replace the deleted node with the left one so find the largest node on the left and place the hanging right node there
        // 2\. Then replace the node after the hanging one has been taken care of (we can only have 2 children in binary trees)
        parent.left = this.left
        let currentNode = this.left
        let isAdded = false
        while (!isAdded) {
          if (currentNode.right) {
            currentNode = currentNode.right
          } else {
            // Add the hanging right node to the end of the left branch from the right
            currentNode.right = this.right
            isAdded = true
          }
        }
      } else {
        // The node we want to delete has no children
        parent.left = null
      }
    } else {
      // One of the branches must be the exact value so we remove the right if the left is not the right value
      if (this.left && !this.right) {
        parent.right = this.left
      } else if (!this.left && this.right) {
        parent.right = this.right
      } else if (this.left && this.right) {
        // The node we want to delete has 2 children
        parent.right = this.left
        let currentNode = this.left
        let isAdded = false
        while (!isAdded) {
          if (currentNode.right) {
            currentNode = currentNode.right
          } else {
            // Add the hanging right node to the end of the left branch from the right
            currentNode.right = this.right
            isAdded = true
          }
        }
      } else {
        // The node we want to delete has no children
        parent.right = null
      }
    }
    return true
  } else if (value > this.value && this.right) {
    return this.right.deleteNode(value, this)
  } else if (value < this.value && this.left) {
    return this.left.deleteNode(value, this)
  } else {
    return false // We didn't find the element
  }
}

Here's the explanation:

  • When the node to delete has 2 children, we run this code:
    // The node we want to delete has 2 children
    parent.right = this.left
    let currentNode = this.left
    let isAdded = false
    while (!isAdded) {
      if (currentNode.right) {
        currentNode = currentNode.right
      } else {
        // Add the hanging right node to the end of the left branch from the right
        currentNode.right = this.right
        isAdded = true
      }
    }
  • What it does is first replace the node to delete with the left child including all the sub-children since they are connected automatically by references.
  • After that, we need to position the hanging right child to the end of the right branch of the left child. We run a while loop that accesses the left child, the one we elevated right now, and we find the rightmost node. This is the node with the highest value.
  • We know that the last right node doesn't have any children so we use an if statement and set the hanging right node as the child of the last right node from the left child.
  • Now the deleted node has been replaced by the left child and the right child has been placed at the bottom of the left child since it's the largest value because we are using ordered binary trees.

It's confusing at first so be sure to write your own samples to verify the functionality and let me know if you find any errors to fix or improvements to include.

Be sure to join my email list to receive exclusive guides via email that you won't find elsewhere written by me from 5 years of experience working in programming here: http://eepurl.com/dDQ2yX

💖 💪 🙅 🚩
merlox
merlox

Posted on December 19, 2019

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related