Introduction to algorithm and the big O notation
Fakorede Damilola
Posted on November 27, 2020
Algorithms are very important in programming, every programmer will definitely end up writing an algorithm one way or another. In this article, I am going to explain
- The term algorithm
- The best possible solution for an algorithm
- The big-O notation
- Measuring performance (asymoptoic function)
The term algorithm
An algorithm is a sequence of steps (instructions) used to solve a clearly defined problem. There are two things you should note here, a sequence of steps and a clearly defined problem. So basically, an algorithm is any piece of code that you write (a line, 20 lines etc), that solves a problem. In as much has the problem follows the same pattern, that program you write should be able to solve it.
Let's look at an example.
write a program that sums up all the numbers from 1 to 10.
This can be easily done with a for loop. I will use JavaScript to solve this question
function sumNumber(){
let sum=0
for (let i=0;i<=10;i++){
sum =sum+i
}
return sum
}
console.log(sumNumber())
This function above sumNumber should be an algorithm. Why, because it solves a clearly defined problem (i.e it summed all the number from 1 to 10) which is what I asked for. But notice I said should be an algorithm and not it is an algorithm. This is not because it is a five line code and the bosses in programming write a single line, or because it is a simple for loop and real nerds with use reduce to solve it. But this is not an algorithm simply because this function is solving an exact question. In the definition above, I said it is a sequence of step that solves a clearly defined problem. We got the sequence of steps part (our five liner was awesome and we did not have to move through the whole planet or write 500 lines to get it done), but our algorithm solved the question for a clearly defined and exact problem. As programmers, we solve problems. We write code that helps solve the difficulties our users might have. So imagine a user wants to sum up all the numbers from one to 100 and comes to our code. Luckily our code won't break up, but it will give a devastating result which we don't want. So how can we write this so that it can solve a clearly defined but not exact problem, which is what all algorithms should do. What if instead of assuming that every number a user wants to add up will always be 10, why don't we take the longer route and assume it is unknown and only becomes known when our user inputs that number. That way our function will always loop to the number the user inputs and not 10.
function sumNumber(n){
let sum=0
for (let i=0;i<=n;i++){
sum =sum+i
}
return sum
}
console.log(sumNumber(100))
So by changing the number in the parenthesis, we can change the value of n, and therefore change our result. Therefore adhering to the definition altogether i.e sequence of steps (This five line is awesome) and clearly defined (no matter the number inputted, has long as the problem is to add up all the numbers from 1 to that number, our algorithm is more than able to solve it.)
The best possible solution for an algorithm
There is a popular saying in programming that you can solve one thing in 1000 different ways. A developer could decide to use the reduce higher order array or even a math formula etc. The fact is different people solve the same problem different ways. But then one method of solving a problem should to an extent be considered better than another (i.e the best possible solution). So the question now is what is the criteria for picking the best possible solution. Is it the
- Minimum amount of code (whoa one-liner, this is the best algorithm programmer :) )
- Best performance (the algorithm that takes the least amount of time to finish)
- Least amount of memory (the algorithm that does not take a lot of resources on the computer) or
- Personal preference (you like code A over code B)
Oftentimes, the best solution is the best performance (takes the least amount of time). So with the problem above, let's try to measure the best performance. In the browser, there is a performance object on which we can call the now method and this gives the current time stamp. So, we can easily get the timestamp before and after our program runs and also their differences to see how long the program ran.
function sumNumber(n){
let sum=0 for (let i=0;i<=n;i++){
sum =sum+i
}
return sum
}
let start,end
start =performance.now()
sumNumber(3)
end =performance.now()
console.log(end-start) //1.0576668876
I am guessing your result is not the same has mine, and that is OK. The problem with getting the performance this way is that it depends on a lot of factors such has the speed of your laptop, the amount of programs running in the background etc. There are too many variables that can affect your result and this can definitely lead to errors if performance was judged this way. But for now let's stick with this. If you try increasing the number to 5, then 10 you will see that the result is literally the same thing. Same thing goes with 100 and even 1000. But If you try 100000, 1000000 and 10000000 (try not to go too large), you will begin to notice a trend
start =performance.now()
sumNumber(100000)
end =performance.now()
console.log(end-start) //20.1
start =performance.now()
sumNumber(1000000)
end =performance.now()
console.log(end-start) //20.8
start =performance.now()
sumNumber(10000000)
end =performance.now()
console.log(end-start) //198.2
It tends to take a longer time for bigger numbers and that is the way it should be, but it is at the rate of ×10 of the previous number when we add an additional zero to that number. E.g if we double the number, the performance is also doubled and if we multiply the number by 10, the performance is also multiplied by 10.
But for performance, we should not really care about the values gotten, cause like I said earlier, this values depends on a number of factors. But the general trend should be observed, especially for larger numbers.
Considering this on a graph, we would draw a linear graph i.e has the values get larger so does the time and with the same factor. The varying values is generally due to other factors and that is how you judge algorithms, not with numbers but with the general trend. So with this trend, we can write a statement about the performance of an algorithm based on the time it take in relation to the input. This is called time complexity. The function above can be said to have a linear time complexity( has the value increases, the time increases at the same rate i.e linearly).
The big O Notation
From above, we have seen that has our performance (time) increases by the same factor which our value increases which we called linear time complexity. But that is not the only time complexity we have. There is also the constant time complexity. Can you think of a way the algorithm above can have a constant time complexity??
What if instead of looping everytime we want to get the sum of numbers we use a simple math formula. So basically instead of our algorithm stopping at the for loop and running that same line for e.g 10,000 times which might take 5s, it just simply uses our formula and run once i.e
function sumNumber(n){
return (n/2)*(n+1)
}
Now when you try this for all n and calculate the performance with performance.now(), you will get almost the same values and remember, we don't really care about the values because they are polluted by other activities running on our conputer, but instead we care about the general trend which is has the values (n) increases, the time remains the same. No matter the value of n i.e constant time complexity. So from this two algorithms, which one do you feel is the best method to use and solve this problem i.e Is it the linear time complexity (has the value increases the time increases by the same value) or the constant time complexity (has the value increases the time remains the same). I feel it is the constant time complexity. But the issue is, it is not for every algorithm you can find a math formula or a way to get the trend to be a constant time complexity. Sometimes you just have to stick With the linear time complexity. But there are other time complexity such as quadratic Time complexity (has the value increases, the time doubles by that factor), the cubic Time complexity etc.
But when talking to other people, developers especially, there is a way to describe this time complexity using the big O notation. For example, the linear time complexity can be written has o(n) pronounced has (o of n). This is written in terms of the performance that is has n values increase, the time increases by the same value (n). Can you guess constant time complexity :). This will be o(1) has the value increases, the performance remains constant i.e 1. quadratic Time complexity o(n^2), cubic Time complexity o(n^3), logarithmic time complexity o(log n) (i.e has the values increase, the performance increases by a value of log of that number).
Measuring performance (asymoptoic function)
Now that we understand big O Notation and how to get the performance, the next question is how can we know the time complexity of a given algorithm. We could follow the route above and calculate for specific values and take note of the general trend (but that will take a while and some trends are not so straight forward e.g logarithmic trend ), or we could try to memorize it for each algorithm (that sounds fine but then we will have to start cramming and look for all possible algorithm etc).
But there is a way we can get the big O via asymoptoic analysis. We can do this through three steps.
- Define the function (not the algorithm function but the math function. I Will explain this)
- Find the fastest growing term
- Remove the coefficients
Let's take for example the sumNumber algorithm above and talk about this three things in details.
Define the function
From above, I said the function I am talking about is not the algorithm sumNumber but the mathematical time complexity function. Now how do we get the mathematical time complexity function? In this case, that of function sumNumber. We need to find the number of expression execution i.e each expression. Basically, each line of code and we will count how many times it takes that line to run. So let's test for n=1 and n=n.
function sumNumber(n){
let sum=0
for (let i=0;i<=n;i++){
sum =sum+i
}
return sum
}
So for n=1,
The first line let sum=0 runs once. That is this algorithm defines that line just once and that is all.
The second line for (let i=0;i<=n;i++){ also runs once. This defines the condition for the loop.
The third line sum =sum+i is inside the loops and this will keep on running based on the value of n, i.e it runs from 1 to n which in our case is one, so it runs once.
The fifth line return sum also run once. It returns the answer once.
For n=n,
This is quite similar to n=1 above, the first and second line run once each like above.
The third line sum =sum+i will run from i=1 all through n, and in this case the n is actually n, so it will run n times.
Now, we will add all the values together.
For n=1
That is 1+1+1+1 = 4.
For n=n
The sum will be 1+1+n+1 =3+n.
Now remember, since in algorithm we are not solving for an exact problem but for unknown values, it will only make sense to use the result gotten from n=n. I used n=1 just to help you understand.
For n values, the math function = 3+n. We can rewrite this has 1*n + 3 (remember 1*n is still n). Like I said earlier, we don't really care about numbers but trends because number tends to be polluted. So we could easily turn that math function into a trend T =a*n + b, i.e the performance(T) for n no of values is this.
Fastest growing term
So from above, we already have this function T =a*n + b , now the next thing is to find the fastest growing term.
From the function, it is pretty obvious that b will remain the same no matter the value of n, it is a constant. But not a. As the value of n increases so those the value of a. Therefore a is the fastest growing term and we can reduce our function to T= a*n.
Remove the coefficients
We are left with T=a*n, removing the coefficients (a), T=n. Which is our finally statement i.e T increases has n increases with the same factor o(n).
Now can you try this method for the constant time complexity. Let me know your answer in the comment section below.
Thank you for reading to this point, you can follow me on twitter @fakoredeDami
Posted on November 27, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.