Sieve of Eratosthenes
Rusu Emanuel
Posted on June 10, 2022
Heeelloooo everyone!🔥😎
As I told you in the previous post, I am going to write about the Sieve of Eratosthenes. Commonly, this algorithm comes together with the primality of a number problem. It is used in contests, websites with algorithmic problems, competitive programming, Olympiads, and maybe at university admission competitions. Sieve of Eratosthenes finds all prime numbers between an interval. It is a time-efficient algorithm because instead of checking all numbers between a given interval, we can mark some numbers as not primes
without even checking them. This way, we avoid unnecessary analyses.
On the other hand, it is not a space-efficient algorithm. We have to keep track of numbers marked as not primes
and numbers marked as primes
if we don't store this information (number, marked/unmarked)
we can't implement the Sieve of Eratosthenes idea. Enough introduction. Let's dig in.
Problem to solve: Print all prime numbers between 2
and n
.
Idea🤔
We have to analyze numbers from 2
to n
. Firstly, we write them down:
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 ... n
Next, we will do something which is called marking
. To illustrate this, I will use strikethrough digits
. A strikethrough digit
means a marked number
, so not a prime number
; a simple number
means an unmarked number
, so a prime number
.
Then we begin from 2
till n
, and for every number, we mark all its multiples till n
inclusive.
We are at 2
, so we mark 4
, 6
, 8
, 10
, 12
, 14
, 16
, 18
, 20
, 22
, 24
, 26
and all numbers 2 * k
, with k > 1
and stop when 2 * k > n
.
2 3 ̷4 5 ̷6 7 ̷8 9 ̷1̷0 11 ̷1̷2 13 ̷1̷4 15 ̷1̷6 17 ̷1̷8 19 ̷2̷0 21 ̷2̷2 23 ̷2̷4 25 2̷6 ... n
We move at 3
so we mark 6
, 9
, 12
, 15
, 18
, 21
, 24
and all numbers 3 * k
, with k > 1
and stop when 3 * k > n
.
2 ̷3 ̷4 5 ̷6 7 ̷8 ̷9̷ ̷1̷0 11 ̷1̷2 13 ̷1̷4 ̷1̷5 ̷1̷6 17 ̷1̷8 19 ̷2̷0 ̷2̷1 ̷2̷2 23 ̷2̷4 25 2̷6 ... n
We continue this process by taking every number till n
and marking all its multiples till n
inclusive.
By the end of this algorithm, the numbers that remained unmarked
are only prime numbers
.
Let's take an example:
Let's find all prime numbers until 22(n)
.
List all numbers till n = 22
:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Start from 2
and mark all its multiples smaller or equal than 22
.
2 3 ̷4 5 ̷6 7 ̷8 9 ̷1̷0 11 ̷1̷2 13 ̷1̷4 15 ̷1̷6 17 ̷1̷8 19 ̷2̷0 21 ̷2̷2
Move to 3
and mark all its multiples smaller or equal than 22.
2 3 ̷4 5 ̷6 7 ̷8 ̷9̷ ̷1̷0 11 ̷1̷2 13 ̷1̷4 ̷1̷5 ̷1̷6 17 ̷1̷8 19 ̷2̷0 ̷2̷1 ̷2̷2
If we want to move to 4
, we can see that it's multiples smaller
than 22
are already marked, similarly with 5
, 6
, 7
, ...
,
22
.
2 3 ̷4 5 ̷6 7 ̷8 ̷9̷ ̷1̷0 11 ̷1̷2 13 ̷1̷4 ̷1̷5 ̷1̷6 17 ̷1̷8 19 ̷2̷0 ̷2̷1 ̷2̷2
Write down unmarked numbers: 2 3 5 7 11 13 17 19
Done!
Congratulations!🥳 You found all primes numbers using the Sieve of Eratosthenes algorithm.
Maybe you want to see the code but let me explain it before.👨🏫
First, I will use a straightforward algorithm and then optimize it as we go deeper and deeper.
As I said in the beginning, we need to write down the numbers from 2
to n.
We do that by using a boolean array like this:
#define MAX 1000000001
bool sieve[MAX];
Why do we use such a big number? Because we don't know how big the interval is, we give as much freedom as possible; note that this number may not work for your computer, and you will need to make it smaller.
Then I said that we needed to mark numbers somehow. In Sieve of Eratosthenes, we use a global array
, and global variables
are automatically initialized with 0
. We use that to our advantage instead of a for
to set all memory spaces manually with 0
.
We are doing marking like this:
0 - prime
1 - not prime
Why? I know that using 1
for primes and 0
for not primes would have been more intuitive but declaring an array in global scope, it got initialized with 0
, so it is just a waste to overwrite all memory spaces with 1
.
Then we mark in advance 0
and 1
because we know these numbers are not prime.
sieve[0] = sieve[1] = 1;
Now we start to iterate from 2
till n
.
for (unsigned long long i = 2; i <= n; ++i)
For every i
, we mark all its multiples smaller or equal than n
.
for (unsigned long long j = 2; j <= n / i; ++j) sieve[i * j] = 1;
We iterate till n / i
because this division gave us the biggest number, which can be multiplied by j
, which is smaller or equals to n
, so iterating till this number, we can guarantee that we marked all multiples of i
smaller than n
. sieve[i * j]
is the multiple
of i
.
The last thing we have to do is to iterate the interval and print all unmarked values
. We do that by asking if sieve[i] == 0
in a for
loop like this:
for (unsigned long long i = 2; i <= n; ++i)
{
if (sieve[i] == 0) cout << i << ' ';
}
The first version of our algorithm is done:
#include <iostream>
using namespace std;
#define MAX 1000000001
bool sieve[MAX];
void PrintSieveofEratosthenes(unsigned long long n)
{
for (unsigned long long i = 2; i <= n; ++i)
{
if (sieve[i] == 0) cout << i << ' ';
}
}
void SieveofEratosthenes(unsigned long long n)
{
sieve[0] = sieve[1] = 1;
for (unsigned long long i = 2; i <= n; ++i)
{
for (unsigned long long j = 2; j <= n / i; ++j) sieve[i * j] = 1;
}
}
int main()
{
unsigned long long n; cin >> n;
SieveofEratosthenes(n);
PrintSieveofEratosthenes(n);
return 0;
}
I like to refactor this a bit. I consider that the function call for printing is a bit unnecessary because we can print numbers inside the first loop. We know that if sieve[i] == 0
we found a prime number. Here is another version that I like more:
#include <iostream>
using namespace std;
#define MAX 1000000001
bool sieve[MAX];
void SieveofEratosthenes(unsigned long long n)
{
sieve[0] = sieve[1] = 1;
for (unsigned long long i = 2; i <= n; ++i)
{
for (unsigned long long j = 2; j <= n / i; ++j) sieve[i * j] = 1;
if (sieve[i] == 0) cout << i << ' ';
}
}
int main()
{
unsigned long long n; cin >> n;
SieveofEratosthenes(n);
return 0;
}
Now we get into optimizations. Sieve of Eratosthenes is more than two loops. If you read till now, congratulations!🥳
We know that every even number
greater than 2
is not prime. We can mark in advance these numbers by beginning from 4
and iterating from 2
to 2
.
for(unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;
Because we marked even numbers before
, we don't need to analyse them again so we can start from i = 3
and increment it by 2
. This way we take into consideration only odd
numbers.
Then we can iterate only till √n
using i * i <= n
.
sieve[0] = sieve[1] = 1;
for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;
for (unsigned long long i = 3; i * i <= n; i += 2)
{
for (unsigned long long j = 2; j <= n/i; ++j) sieve[i*j] = 1;
}
We can optimize how we iterate through multiples of every number. We can begin from i * i
because all numbers till i * i
have been already marked; and increment j
with i
.
sieve[0] = sieve[1] = 1;
for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;
for (unsigned long long i = 3; i * i <= n; i += 2)
{
for (unsigned long long j = i * i; j <= n; j+=i) sieve[j] = 1;
}
We can iterate with j += 2 * i
to ignore odd numbers.
sieve[0] = sieve[1] = 1;
for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;
for (unsigned long long i = 3; i * i <= n; i += 2)
{
for (unsigned long long j = i * i; j <= n; j += 2 * i) sieve[j] = 1;
}
Here is the final code:
#include <iostream>
using namespace std;
#define MAX 1000000001
bool sieve[MAX];
void PrintSieveofEratosthenes(unsigned long long n)
{
for (int i = 2; i <= n; ++i)
{
if (sieve[i] == 0) cout << i << ' ';
}
}
void SieveofEratosthenes(unsigned long long n)
{
sieve[0] = sieve[1] = 1;
for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;
for (unsigned long long i = 3; i * i <= n; i += 2)
{
for (unsigned long long j = i * i; j <= n; j += 2* i) sieve[j] = 1;
}
}
int main()
{
unsigned long long n; cin >> n;
SieveofEratosthenes(n);
PrintSieveofEratosthenes(n);
return 0;
}
This block of code gives time complexity:
for (unsigned long long i = 3; i * i <= n; i += 2)
{
for (unsigned long long j = i * i; j <= n; j += 2 * i) sieve[j] = 1;
}
The first loop performs √n
operations because we iterate till i * i <= n
. The idea of the second loop is to mark all multiples of i
. So, it will perform n / i
steps. It is a known formula that if you want to find out how many divisors of a number till n
are, you divide n / your_number
. So for every i
we will perform n / i
steps. (technically, we perform fewer steps because I increased j
with 2 * i
to avoid iterating through even multiples of i
, but ignore that. What I did is not enough to decrease the order of complexity anyway).
For every i
till √n
we perform n / i
operations, where i is prime
.
The ecuation is:
√n * (n/3 + n/4 + n/5 + n/6 + n/7 + ...)
= √n * n * (1/3 + 1/4 + 1/5 + 1/6 + 1/7 + ...)
The third factor, according to Euler grows the same as ln(ln(n))
So the time complexity is: O(√n * n * ln(ln(n)))
.
Observations❗:
Note that I used
unsigned long long
for variables, and operations likeraising to power
can easily exceedunsigned long long
, so be careful with input numbers.You can play around with
#define MAX 1000000001
and#define MAX_PRIMES 100000001
. Try to find which is the maximum limit accepted by your computer.To make this algorithm space-efficient, you can use a
vector container
from theSTL
library like thisvector <bool> vector_name
. Before that, includevector library
like#include <vector>
. That container is not a regular one. It is a memory-optimization oftemplate classes
likevector <T>
, which uses onlyn/8
bytes of memory. Many modern processors work faster with bytes because they can be accessed directly.vector <T>
template class works directly with bits. But these things require knowledge abouttemplate classes
, which are very powerful inC++
.This algorithm can be optimized even more using
bits operations
to achieve linear time. Butln(ln(n))
can be approximated toO(1)
.
And here is the bonus algorithm for testing primality I spoke about in the post before(probably the fastest for checking primality)🏎️:
#include <iostream>
using namespace std;
#define MAX 1000000001
#define MAX_PRIMES 100000001
bool sieve[MAX];
unsigned long long primeNumbers[MAX_PRIMES];
void SieveofEratosthenes(unsigned long long n)
{
sieve[0] = sieve[1] = 1;
for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;
for (unsigned long long i = 3; i * i <= n; i += 2)
{
for (unsigned long long j = i * i; j <= n; j += 2 * i) sieve[j] = 1;
}
unsigned long long k = 0;
for (int i = 2; i <= n; ++i)
{
if (sieve[i] == 0) primeNumbers[k++] = i;
}
}
bool IsPrime(unsigned long long Number)
{
unsigned long long k = 0;
while (Number > 1)
{
unsigned long long power = 0;
if (Number % primeNumbers[k] == 0)
{
Number /= primeNumbers[k];
++power;
}
if (power) return false;
++k;
if (primeNumbers[k] * primeNumbers[k] > Number) return true;
}
}
int main()
{
unsigned long long number; cout << "Enter number: "; cin >> number; cout << '\n';
SieveofEratosthenes(number);
IsPrime(number) == 1 ? cout << "Prime\n" : cout << "Not prime\n";
return 0;
}
Emanuel Rusu
You can visit my GitHub
-
Or you can find me on [Linkedin]
Rusu Emanuel Next topic: Binary search trees🌳
See you next time ! 👋
Posted on June 10, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.