Data Structures: Graphs
Tamerlan Gudabayev
Posted on May 23, 2021
Introduction
Graphs.
Everyone talks about them.
They always ask them in interviews.
But you don't know them.
It doesn't have to be this way, that's why in this week we are tackling the graph data structure.
Learning Outcomes
Today, you will learn:
- What is a graph.
- Where are graphs used.
- Types of Graphs
- How to implement a graph
- How to traverse a graph
Table of Contents
- What is a graph?
- Types of Graphs
- Weighted Graphs
- Applications
- Types of Implementations
- Code Implementation
- Traversing a Graph
- Conclusion
- Additional References
What is a graph?
A graph is a non-linear data structure that helps us describe entities and it's relationships to other entities.
For example, you may use the graph data structure to represent friends in a social media website.
Graphs are very useful because they can describe a great deal of problems we have in the real world.
But before we move on, we have to break down the graph even more.
Essential Terminology
A graph (V,E) is made out of two parts:
- Collection of Vertices (V): Your basic node (i.e user, place, event, etc...)
- Collection of Edges (E): They represent the relationships between nodes. In mathematical terms, they can be represented by coordinates of two nodes (u,v).
We can represent the graph above by:
V = {0, 1, 2, 3, 4}
E = {
(0,1), (0,4), (1,0),
(1,2), (1,3), (1,4),
(2,1), (2,3),
(3,1), (3,2), (3,4),
(4,0), (4,1), (4,3),
}
G = {V, E} // A graph is a collection of vertices and edges
Types of Graphs
While researching, I've come across many different types of graphs, but to save you time I've narrowed it down to two basic types:
- Directed Graph
- Simple Graph
Directed Graphs
A graph where all edges are directed from one vertex to another vertex is called a directed graph.
The connections are not bidirectional, meaning that if we have a connection from A to B, that doesn't mean we have a connection from B to A.
But you might ask "Why will we use something like this?"
Well, there are many use cases for directed graphs, here are a few:
- Twitter: When you follow someone, that's a direct connection, the other person doesn't need to follow you.
- Flights: You might be able to fly from Paris to London, but there's no guarantee that you can fly from London to Paris.
- Road Networks: Some roads are one-way, others are two-way, you get the gist.
Simple Graphs
Graphs that are always two-way connections are called simple graphs. Our first example (at the beginning of the post) was a simple graph (also called undirected graphs).
PS. Through out this post, whenever I say graphs, I'm gonna mean simple graphs.
Weighted Graphs
So far we assume that every edge is equal to each other. Unfortunately, life doesn't work that way, the distance between France and Germany is shorter than France and the USA. This is why we add a weight to each edge. A graph with weighted edges is called a weighted graph.
Applications
Graphs are very useful, they can represent many real-life models.
Here are a few notable use cases:
- Social Graphs: This helps showcase the relationship between you and the people, places and thing you interact with online.
- Knowledge Graphs: This is notably used at Google, to connect data to each other. For example, the painting "Mona Lisa" is associated to "Leonardo Davinci", in which he is associated to his other painting, and the rest of his information.
- Recommendation Engines: Graphs helps us build recommendation engines, for example in Netflix, if you like "The Witcher", you will be recommended other shows by people who also liked "The Witcher". Everything is connected to each other.
- Path Optimization Algorithms: What's the best way to get from Tokyo to London, or from your local grocery store to your nearest home. These types of questions can be modelled and solved using graphs. This is already being used in most map applications such as Google Maps.
Types of Implementations
Now that we have the theory covered, let us move into more practical stuff.
There are three main ways to implement graphs:
- Adjacency Matrix
- Adjacency Lists
- Adjacency Sets
Let us cover each implementation with it's pros and cons.
Adjacency Matrix
In adjacency matrix, we represent the graph in a two-dimensional array of V x V vertices.
If any element of graph[i][j]
is 1, this means that there is an edge connecting vertex i with vertex j.
This approach if very fast for look-ups. Because we can instantly get the value of graph[i][j]
But it's also very heavy on memory because we have to reserve a space in memory for every possible edge (V x V).
Adjacency List
Adjacency list represents graphs as an array of linked lists.
The index of the array represent the vertex and each element in the linked list represents the vertices it connects to, making an edge.
This approach saves us a lot of space, but it's slower than adjacency matrix on hookups.
Adjacency Set
This is a new alternative to adjacency lists, instead of having an array of linked lists, have it be an array of sets.
But why?
Well the reason is that sets are ordered and do not allow for duplicates.
We can use binary search, making our search times O(log n).
Code Implementation
For this example, we are gonna use the adjacency list implementation because it's the most common.
We will also use python because I like it.
# A class to represent the adjacency list of the node
class AdjNode:
def __init__(self, data):
self.vertex = data
self.next = None
# A class to represent a graph. A graph
# is the list of the adjacency lists.
# Size of the array will be the no. of the
# vertices "V"
class Graph:
def __init__(self, vertices):
self.V = vertices
self.graph = [None] * self.V
# Function to add an edge in an undirected graph
def add_edge(self, src, dest):
# Adding the node to the source node
node = AdjNode(dest)
node.next = self.graph[src]
self.graph[src] = node
# Adding the source node to the destination as
# it is the undirected graph
node = AdjNode(src)
node.next = self.graph[dest]
self.graph[dest] = node
# Function to print the graph
def print_graph(self):
for i in range(self.V):
print("Adjacency list of vertex {}\n head".format(i), end="")
temp = self.graph[i]
while temp:
print(" -> {}".format(temp.vertex), end="")
temp = temp.next
print(" \n")
# Driver program to the above graph class
if __name__ == "__main__":
V = 5
graph = Graph(V)
graph.add_edge(0, 1)
graph.add_edge(0, 4)
graph.add_edge(1, 2)
graph.add_edge(1, 3)
graph.add_edge(1, 4)
graph.add_edge(2, 3)
graph.add_edge(3, 4)
graph.print_graph()
Traversing a Graph
Most common interview questions that include graph will include some sort of traversal of it's very useful to know how to traverse one.
There are two main algorithms:
- Breadth First Search
- Depth First Search
Breadth First Search
Breadth-First Search (BFS) is a vertex-based technique for finding the shortest path. In a nutshell it everything in the topmost layer before moving down.
Here's the code:
def BFS(self, s):
# Mark all the vertices as not visited
visited = [False] * (max(self.graph) + 1)
# Create a queue for BFS
queue = []
# Mark the source node as
# visited and enqueue it
queue.append(s)
visited[s] = True
while queue:
# Dequeue a vertex from
# queue and print it
s = queue.pop(0)
print (s, end = " ")
# Get all adjacent vertices of the
# dequeued vertex s. If a adjacent
# has not been visited, then mark it
# visited and enqueue it
for i in self.graph[s]:
if visited[i] == False:
queue.append(i)
visited[i] = True
Depth First Search
Depth-First Search (DFS) is an edge-based technique to find the shortest path. It uses the Stack data structure, performs two stages, first visited vertices are pushed into the stack and second if there is no vertices then visited vertices are popped.
# A function used by DFS
def DFSUtil(self, v, visited):
# Mark the current node as visited
# and print it
visited.add(v)
print(v, end=' ')
# Recur for all the vertices
# adjacent to this vertex
for neighbour in self.graph[v]:
if neighbour not in visited:
self.DFSUtil(neighbour, visited)
# The function to do DFS traversal. It uses
# recursive DFSUtil()
def DFS(self, v):
# Create a set to store visited vertices
visited = set()
# Call the recursive helper function
# to print DFS traversal
self.DFSUtil(v, visited)
Conclusion
We covered a lot today, and I hope you got something new out of it. This is the ninth post in our data structures series, time flies by. Anyways, that we have covered most of the main data structures, it's time to move onto algorithms. So stay tuned, and I hope you have a nice day.
Additional References
Posted on May 23, 2021
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
December 6, 2022