Introduction to algorithms: Big O notation, time complexity
Elegberun Olugbenga
Posted on September 27, 2022
We live in a world of algorithms, from the posts we see on Twitter to the people we swipe on Tinder. Algorithms surround us, controlling machines, computers, and robots. But what exactly are they, how do we analyze them, and why do we need to know them to land a job?
This article discusses algorithms, how to represent them using asymptotic notations like Big O, and how to analyze their time complexities.
What is an algorithm?
An algorithm is a step-by-step procedure that details how to solve a problem. A more formal definition is "an unambiguous specification of how to solve a class of problems." Going by these definitions, the recipe for baking a cake, the method for solving long division problems, and the process of planning your route for a trip are all algorithms.
Algorithmic programming is all about writing a set of rules that instruct the computer on how to perform a task. Algorithms are written using a particular syntax, depending on the programming language being used.
Suppose we have an array of numbers in ascending order, and we want to check whether it contains the number 9.
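For concreteness, let's use this sorted array (an illustrative choice) with 9 as our target:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]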
There are several ways we can tackle this problem. The first that comes to mind is:
Approach 1: Linear Search
- Start from the beginning and iterate through the entire array.
- Compare the value at the current index of the array to the target value.
- Repeat the process until the target value is found.
- Return the target value.
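Here's a minimal Python sketch of this approach, using our illustrative array and target:

```python
def linear_search(arr, target):
    # Walk the array from the start, comparing each value to the target.
    for index, value in enumerate(arr):
        if value == target:
            return index  # target found at this position
    return -1  # target is not in the array

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(linear_search(numbers, 9))  # 8 — found on the ninth comparison
```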
Even though this approach solves our problem, we should ask whether it is the optimal solution. Let's consider another method.
Approach 2: Binary Search
a. Find the midpoint of the array and split it into two halves.
b. Compare the value at the midpoint of the array to the target value.
c. Since the target value is greater than the midpoint value, the value we are searching for must be in the right half of the array, so we can discard the left half.
d. Repeat the process on the remaining half, splitting it and comparing its midpoint value to the target, until the target value is found.
e. Return the target value.
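A minimal Python sketch of this approach, on the same illustrative array:

```python
def binary_search(arr, target):
    # arr must be sorted in ascending order for this to work.
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2  # midpoint of the current search area
        if arr[mid] == target:
            return mid           # target found
        elif arr[mid] < target:
            low = mid + 1        # discard the left half
        else:
            high = mid - 1       # discard the right half
    return -1  # target is not in the array

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(binary_search(numbers, 9))  # 8 — found on the third midpoint check
```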
In this algorithm, we reduce the search area of our array by half each time we iterate through it. Since we are searching only half of the array in each iteration, we can find the target element faster.
Comparing the two approaches: while it took nine comparisons to find the target element with the first approach, the second took only three. This strategy of solving a problem by repeatedly dividing it into smaller subproblems is called divide and conquer. The algorithm used in approach 2 powers many search engines and databases today and is known as binary search.
We cannot use the same algorithm for every problem; knowing when to use a particular algorithmic approach is essential. For example, even though the binary search was more efficient for solving our example above, it can only work if the input array is sorted.
How do we solve this same problem if we are given an array of unsorted numbers and asked to find a target element? If we use a binary search, we will have to sort the array first before performing the binary search, which will be less efficient than the first linear search approach. Therefore, we must constantly analyze our algorithmic approach to ensure we are considering the optimal solution.
Why do we analyze algorithms?
We analyze algorithms to evaluate their performance and suitability for the problem we want to solve. When we design algorithms, we are not just interested in the algorithm's correctness but also its efficiency and scalability as the input size grows.
In the previous example, even though the linear search approach was slower, the speed difference was negligible because the input size was small. However, think about this problem from the perspective of companies like Facebook and Google, with billions of users. Searching a sorted dataset of a billion entries could take up to a billion comparisons with linear search, but at most about 30 with binary search (since log₂ of one billion is roughly 30). We can see how much more efficient the binary search approach is at that scale.
How to analyze algorithms
We analyze an algorithm's complexity in two ways: time and space.
Time complexity is crucial because we need our programs to run as quickly as possible to deliver the results. Space complexity is essential because machines have only a limited amount of memory to spare.
A good algorithm takes less time to execute and uses less memory in the process. Ideally, we want to optimize for both time and space, but sometimes that isn't possible, and we settle for a middle ground. The method we use to analyze algorithms must:
a. Be independent of the machine and configuration it runs on: This is important because we rarely know the exact specifications of the computer our algorithm will run on, so we cannot factor hardware or CPU specs into the analysis.
b. Show a direct correlation between the algorithm and the input size: How does the algorithm perform as the input size approaches infinity (an asymptote)?
When analyzing the complexity of any algorithm in terms of time and space, we can never provide an exact number to define the time required and the space needed by the algorithm. So instead, we express them using standard notations, known as Asymptotic Notations.
What are asymptotic notations?
Asymptotic notations are the mathematical notations used to describe the running time of an algorithm as the input tends towards a particular value or infinity.
The complexity of algorithms can be represented using three asymptotic notations: Big O, Big Theta (Θ), and Big Omega (Ω).
Big O notation
Big O notation represents the upper-bound running time complexity of an algorithm and "can" be used to describe the worst-case scenario. A common misconception is that Big O means the worst case. That's not quite right: it doesn't mean the worst case, but it "can" be used to describe the worst case. We can just as well use it to describe the best and average cases.
Look at it this way: Big O notation answers the question, "In the worst case, what's the longest time this algorithm will take to run?"
It can also answer the question, "In the best case, what's the longest time our algorithm will take to run?" And for the average case, "In the average case, what's the longest time our algorithm will take to run?" The common factor here is the longest time (upper bound) our algorithm will take to run. Best, worst, and average are scenarios, and we are upper-bounding those scenarios using Big O.
Because we're usually most interested in the longest running time in the worst-case scenario, we mainly represent our algorithms using Big O.
Big Omega (Ω)
Big Omega (Ω) notation represents the lower-bound running time complexity of an algorithm and "can" be used to describe the best-case scenario. Like Big O, Big Ω can be used to represent the best, average, and worst-case scenarios. Big Omega (Ω) answers the question, "In the best/worst/average case, what's the fastest time this algorithm will take to run?" We use Big Ω to lower-bound our best, worst, and average running-time scenarios.
Big Theta (Θ)
Big Theta (Θ) represents both the upper and lower bounds of an algorithm's running time complexity. Like the first two, it "can" describe the best, average, and worst cases. It answers the question, "In the average/best/worst-case scenario, what's the exact order of time this algorithm will take to run?"
As stated earlier, we mainly use Big O notation because we're always interested in measuring the longest time an algorithm runs in the worst-case scenario. For the rest of the series, we will measure our algorithms' complexity in Big O.
How to find the time complexity of an algorithm
We typically consult Big O because we must always plan for the worst case. In our linear search example, the best case is that the number we are searching for is first in the list, and the worst case is that it is at the end of the array. So, for an array of size 10, we iterate up to 10 times, and for size 1,000, up to 1,000 times. The running time of our linear search algorithm grows linearly as the input grows. We represent this as O(n).
In the binary search approach, the best case is when the target element is at the exact center of the list, and the worst case is when it is at either end of the list. Since we halve the search area on every iteration, the size of the problem shrinks by half each time: n, n/2, n/4, n/8, and so on. After k halvings only n/2^k elements remain, so the search finishes after roughly log₂ n steps.
We call this type of growth logarithmic, written O(log n). O(log n) means that as the input grows exponentially, the running time grows linearly. So if binary search takes 1 second to search 10 elements, it will take roughly 2 seconds for 100 elements, 3 seconds for 1,000 elements, and so on.
The common algorithmic runtimes, from fastest to slowest, are:
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n² log n) < O(n³) < O(2ⁿ) < O(n!)
Plotting them on a graph of operation time vs. input size shows that the running times of the best algorithms grow much more slowly than the input size.
Let's look at some Big O notation examples.
Big O Notation examples
O(1)
Print the first element in an array.
[apple, orange, cherry, mango, grapes]
apple
This will always take constant time, irrespective of whether the input size is 100 or 10,000.
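A quick Python illustration, mirroring the fruit array above:

```python
fruits = ["apple", "orange", "cherry", "mango", "grapes"]
print(fruits[0])  # "apple" — a single operation, regardless of the array's size
```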
O(n)
Iterate through an array and print all items.
[apple, orange, cherry, mango, grapes]
apple
orange
cherry
mango
grapes
The running time of this algorithm grows in direct proportion to the input size.
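In Python, this might look like:

```python
fruits = ["apple", "orange", "cherry", "mango", "grapes"]
for fruit in fruits:  # one print per element: n elements mean n operations
    print(fruit)
```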
O(log n)
Using binary search to find a given element in a sorted array, as in our example above.
O(n²)
Print all possible pairs of elements in an array.
[a, b, c] => [aa, ab, ac, ba, bb, bc, ca, cb, cc]
The time complexity of this algorithm grows quadratically as the input size grows. This is because, for each item in the array, you iterate through the array again, so the number of operations your program performs is the number of inputs times itself (n squared).
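A Python sketch of the nested-loop pattern that produces this quadratic growth:

```python
letters = ["a", "b", "c"]
for first in letters:          # outer loop: n passes
    for second in letters:     # inner loop: n passes for each outer pass
        print(first + second)  # n * n = n² prints in total
```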
General tips for asymptotic analysis:
- When an entire array gets iterated over, it is most likely in O(n) time.
- When you see an algorithm where the number of elements in the problem space gets halved after each iteration, it will probably run in O(log n) time.
- Whenever you have a nested loop, the problem is most likely in quadratic time.
Improve your algorithmic thinking with practice
Algorithms are behind most of the impressive applications we use every day. Learning common algorithms is helpful, but what's even better is improving our algorithmic thinking. Algorithmic thinking allows us to break down problems, analyze them, and implement solutions. Like all skills, it is learnable, and with enough practice, we can train our brains to become better at it.