Replacing For Loops with Vectorization in Python
chamodperera
Posted on January 31, 2023
Loops are an integral part of programming languages, as they allow for the repetition of a block of code a specific number of times or until a certain condition is met. However, loops can be slow and resource-intensive, especially when the operations inside the loop are repeated many times. This can be a disadvantage when working with large datasets, as loops can limit performance.
One way to improve the performance of these types of operations is through a technique called Vectorization
. With this approach, operations can be performed on entire arrays or datasets at once, rather than looping through each element individually. This can be much more efficient, particularly when working with large datasets or when the operations are repeated many times. Vectorization is a useful tool for tasks such as data analysis, machine learning, and financial modeling. In this article, let's see how this technique can enhance the performance of these types of operations.
What is Vectorization
Vectorization is a technique that allows for faster computation by storing and manipulating data in an array or vector format rather than as individual units. The fundamental structure for storing and manipulating data in a vectorized manner is the N-dimensional array, or ND array, which is a data structure that represents a multi-dimensional grid of values.
Vectorized operations enable element-wise operations to be performed on entire ND arrays using a single function call, rather than iterating over the array and applying the operation to each element individually. When working with ND arrays that have more than two dimensions, vectorized operations can still be used by processing each element in the array one at a time. For example, with a 2D array, you can iterate over the rows and process one row at a time.
for i in range(len(A)):
B[i] = A[i] * 2
This loop multiplies each element in the array A by 2 and stores the result in the corresponding element of B.
# Vectorized implementation
B = A * 2
This vectorized operation performs the same operation as the loop, but the operation is performed on the entire array at once.
Getting Started with NumPy
NumPy
is the most used library for working with vectorized operations in Python. It provides N-dimensional arrays, a wide variety of vectorized operations, and an easy-to-use API. NumPy also provides functions for performing vectorized operations on ND arrays, including mathematical operations, statistical operations, and operations related to sorting, searching, and indexing. These features make NumPy well-suited for working with vectorized operations on multi-dimensional arrays.
Installation :
pip install numpy
import numpy as np #import package
Create an ND array
a = np.array([[1,2,3],[4,5,6]]) #2-Dimentional array
In the above example, you've created a 2D array. The dimensions of the array are represented by sets of square brackets, []
. The dimensions of a ND array are called axes
. The number of axes is called the rank
. You can check the no of dimensions, and shape
( A tuple having the no of elements in each dimension ) by calling the following methods.
a.ndim
This returns the no of dimensions
>> 2
a.shape
This returns a tuple containing the no of elements in each dimension
>> (2,3)
Why Vectorization
When you perform a vectorized operation on an ND array using NumPy, the operation is applied to each element in the array simultaneously, rather than iterating over the array and performing the operation on each element individually. This can be much faster than using a loop to perform the same operation, particularly when working with large datasets or when the operation is computationally expensive. Numpy vectorized operations are implemented using low-level, highly optimized code that can take advantage of the specialized architecture of modern processors, resulting in significant performance improvements NumPy. Learn more. Let’s see an example,
b = np.random.rand(10000,1)
Here b
is a two-dimensional array of shape (10000,1)
. random.rand()
is a numpy method that assigns a random value (between 0,1) to every element in the array.
Now Let's get the sum of all the values in
using *For loops* and *Numpy*.
![](https://i.imgur.com/H72n7eg.png)
Calculate sum using For loops
```python
sum = 0
for i in range(b.shape[0]):
sum += i
sum = np.sum(b)
Performance comparision,
import timeit #timeit is a package used for tracking elapsed time for both operations
Measure the execution time of the for loop implementation
for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
sum = 0
for i in range(b.shape[0]):
sum += i
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"b = np.random.rand(10000,1)",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
>> For loop elapsed time: 0.61476 seconds
Measure the execution time of the vectorized implementation
vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="sumt = np.sum(b)",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"b = np.random.rand(10000,1)",
# Execute the code 1000 times
number=1000
)
print("vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
>> vectorized elapsed time: 0.01074 seconds
The vectorized implementation is nearly 50 times faster than the non-vectorized version. This performance difference becomes more significant for larger calculations.
From now on, we'll get into some important methods of NumPy that will be essential for your project if you use vectorization. By understanding how these operations work, you can make more informed decisions about how to structure your code and choose the most efficient approaches for your specific use case. In some cases, working with multi-dimensional arrays can be challenging. The reshape()
function in NumPy is a useful tool for converting an array to a different shape. It allows you to change the dimensions of an array by specifying the new shape as an argument. Remember you can always use this to convert your multidimensional array into a 2D or 1D array.
b = np.random.rand(10,5,8) #3 dimensional array
a = np.reshape(b,(400)) #convert to 1D array
a.shape
>> (400,)
Note that here, the total number of elements in an array should match the new shape when reshaping the array. For example, think that the original dimensions of the array are d1, d2, d3,..., dN
and the new reshaped dimensions are r1, r2,..,rM
. The condition that must be satisfied in order to reshape the array is d1 * d2 * d3 * ... * dN = r1 * r2 * r3 * ... * rM
.
Operations on Vectors
Now let's look at some common vectorized operations using NumPy
and compare the performance of for loop implementations to vectorized implementations.
Element-wise operations
Element-wise operations are performed on each element of an array individually. Some examples of element-wise operations include addition, subtraction, multiplication, and division.
Addition
To add arrays using a for loop, we can iterate over their elements and add them element-wise. For example, consider the following 2D arrays.:
Define two 2D arrays
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
Add the arrays using a nested for loop
result = np.empty(a.shape) # Initialize an empty result array
for i in range(a.shape[0]):
for j in range(a.shape[1]):
result[i, j] = a[i, j] + b[i, j]
print(result)
>> [[ 6 8] [10 12]]
To add two vectors using vectorization, we can simply use the +
operator.
Add the arrays using vectorization
result = a + b
print(result)
>> [[ 6 8] [10 12]]
To compare the performance of these two approaches, we can use the %timeit
magic command in IPython or Jupyter notebooks, or the timeit
module in a script.
Measure the execution time of the for loop implementation
for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
for i in range(a.shape[0]):
for j in range(a.shape[1]):
result[i, j] = a[i, j] + b[i, j]
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"b = np.array([[5, 6], [7, 8]])\n" +
"result = np.empty(a.shape)",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
For loop elapsed time: 0.00324 seconds
Measure the execution time of the vectorized implementation
vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="result = a + b",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"b = np.array([[5, 6], [7, 8]])",
# Execute the code 1000 times
number=1000
)
print("Vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
Vectorized elapsed time: 0.00051 seconds
Similarly, subtraction, multiplication, and division can be performed using the same method.
Broadcasting
is a powerful feature in element-wise operations that allows operations to be performed between arrays of different shapes. It allows the smaller array to be "broadcast"
or resized
to match the shape of the larger array so that the element-wise operation can be performed.
Multiplying by a scalar
To multiply the array a by a scalar using a for loop, we can iterate over its components and multiply them by the scalar. For example,
Define a 2D array
a = np.array([[1, 2], [3, 4]])
Define a scalar
s = 2
Multiply the array by the scalar using a nested for loop
result = np.empty(a.shape) # Initialize an empty result array
for i in range(a.shape[0]):
for j in range(a.shape[1]):
result[i, j] = a[i, j] * s
print(result)
>> [[2 4] [6 8]]
To multiply a vector by a scalar using vectorization, we can simply use the *
operator. Here, the scalar value s
is broadcast
to a temporary array of the same shape as a
, and the multiplication is performed element-wise. This allows us to easily multiply a vector by a scalar value without having to write a loop.
Multiply the array by the scalar using vectorization
result = a * s
print(result)
>> [[2 4] [6 8]]
Performance comparition,
Measure the execution time of the for loop implementation
for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
for i in range(a.shape[0]):
for j in range(a.shape[1]):
result[i, j] = a[i, j] * s
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"s = 2\n" +
"result = np.empty(a.shape)",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
For loop elapsed time: 0.00320 seconds
Measure the execution time of the vectorized implementation
vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="result = a * s",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"s = 2",
# Execute the code 1000 times
number=1000
)
print("Vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
Vectorized elapsed time: 0.00106 seconds
Dot product
The dot product
, also known as the inner product or scalar product, is a binary operation that takes two arrays and returns a scalar. It is originally defined as the sum of the element-wise products of the two 1D arrays. However, when applied to multidimensional arrays, the output can be different than a scalar. It may return an array of higher dimensions, depending on the implementation.
Vector dot product
The vector dot product
takes two 1D arrays
(vectors) and returns a scalar. For example, given two vectors a and b with n elements each, the vector dot product is calculated as a[0]*b[0] + a[1]*b[1] + ... + a[n-1]*b[n-1]
.
To calculate the vector dot product of two 1D arrays (vectors) using a for loop, we can iterate over the elements of each array and multiply them element-wise, then sum the results.
Define two vectors
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = np.array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])
dot_product = 0
for i in range(len(a)):
dot_product += a[i] * b[i]
print(dot_product)
>> 385
To calculate the vector dot product using vectorization, use the np.dot()
function.
dot_product = np.dot(a, b)
print(dot_product)
>> 385
Performance comparision,
Measure the execution time of the for loop implementation
for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
dot_product = 0
for i in range(a.shape[0]):
dot_product += a[i] * b[i]
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\n" +
"b = np.array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
For loop elapsed time: 0.00366 seconds
Measure the execution time of the vectorized implementation
vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="dot_product = np.dot(a, b)",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\n" +
"b = np.array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])",
# Execute the code 1000 times
number=1000
)
print("vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
vectorized elapsed time: 0.00193 seconds
Matrix multipication
The matrix multipication
is the dot product applied to two 2D arrays
(matrices) which returns another 2D array. It is calculated by taking the dot product of each row of the first matrix with each column of the second matrix. For example, given two matrices A with m rows and n columns, and B with p rows and q columns, the matrix multipication is poerformed as follows:
result[i, j] = A[i, 0]*B[0, j] + A[i, 1]*B[1, j] + ... + A[i, n-1]*B[n-1, j]
This results in a 2D array with m
rows and q
columns.
For example, given the following matrices:
A = [[1, 2, 3], [4, 5, 6]] # A has 2 rows and 3 columns
B = [[7, 8], [9, 10], [11, 12]] # B has 3 rows and 2 columns
result = [[58 64], [139 154]] # Result has 2 rows and 2 columns
To calculate the dot product of two matrices using a for loop, we can iterate over their elements and multiply them element-wise, then sum the products. For example:
Define two 2D arrays
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
result = np.zeros((a.shape[0], b.shape[1]))
for i in range(a.shape[0]):
for j in range(b.shape[1]):
for k in range(a.shape[1]):
result[i, j] += a[i, k] * b[k, j]
print(result)
>> [[19 22] [43 50]]
To calculate the dot product of two matrices using the numpy library, we can use the np.matmul()
function.
result = np.matmul(a, b)
print(result)
>> [[19 22] [43 50]]
Performace comparision,
Measure the execution time of the for loop implementation
for_loop_time = timeit.timeit(
# Define the for loop code as a string
stmt="""
for i in range(a.shape[0]):
for j in range(b.shape[1]):
for k in range(a.shape[1]):
result[i, j] += a[i, k] * b[k, j]
""",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"b = np.array([[5, 6], [7, 8]])\n" +
"result = np.zeros((a.shape[0], b.shape[1]))",
# Execute the code 1000 times
number=1000
)
print("For loop elapsed time: {:.5f} seconds".format(for_loop_time))
For loop elapsed time: 0.00811 seconds
Measure the execution time of the vectorized implementation
vectorized_time = timeit.timeit(
# Define the vectorized code as a string
stmt="result = np.matmul(a, b)",
# Define the variables used in the code as a string
setup="import numpy as np\n" +
"a = np.array([[1, 2], [3, 4]])\n" +
"b = np.array([[5, 6], [7, 8]])",
# Execute the code 1000 times
number=1000
)
print("Vectorized elapsed time: {:.5f} seconds".format(vectorized_time))
Vectorized elapsed time: 0.00131 seconds
However, both np.dot()
and np.matmul()
functions can be used to perform a matrix-matrix dot product, but they may return different results when applied to 1D arrays. When applied to 1D arrays, np.dot()
performs the vector dot product, while np.matmul()
performs element-wise multiplication. For example,
Define two 1D arrays
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
Calculate the dot product using np.dot()
result = np.dot(u, v)
print(result)
>> 32
Calculate the dot product using np.matmul()
result = np.matmul(u, v)
print(result)
>> [4 10 18]
Using np.matmul()
is the recomended way to perform matrix multipication.
Advanced techniques for manipulating elements in NumPy arrays
NumPy provides a set of built-in methods for working with multidimensional arrays that are similar in functionality to those provided by Python for working with lists. Some examples of these methods include np.sort()
, np.append()
, etc.
concatenate()
The concatenate()
function allows you to combine multiple arrays into a single array.Here is an example of concatenating two 3D arrays using a for loop,
Define two 3D arrays
array_1 = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
array_2 = [[[13, 14, 15], [16, 17, 18]], [[19, 20, 21], [22, 23, 24]]]
Iterate over the elements in each array and append them to the combined list
combined_array = [] # Initialize an empty list to store the combined arrays
for array in array_1:
combined_array.append(array)
for array in array_2:
combined_array.append(array)
# Convert the list to a numpy array
combined_array = np.array(combined_array)
print(combined_array)
>> [[[ 1 2 3]
[ 4 5 6]]
[[ 7 8 9]
[10 11 12]]
[[13 14 15]
[16 17 18]]
[[19 20 21]
[22 23 24]]]
The default behavior of the concatenate()
function is to concatenate the arrays along the first axis
(axis 0). This means that the arrays are joined together one after the other along the first axis, resulting in a new array with the same number of dimensions as the original arrays but with an increased size along the first axis. You can change the axis along which the arrays are concatenated by providing the axis
parameter.
Use the concatenate()
function to combine the two arrays
combined_array = np.concatenate((array_1, array_2))
print(combined_array)
>> [[[ 1 2 3]
[ 4 5 6]]
[[ 7 8 9]
[10 11 12]]
[[13 14 15]
[16 17 18]]
[[19 20 21]
[22 23 24]]]
On the other hand hstack()
and vstack()
functions can also be used to combine arrays horizontally or vertically, respectively. The hstack()
function is used to stack arrays in sequence horizontally
(along the second axis), whereas the vstack()
function is used to stack arrays in sequence vertically
(first axis).
Using hstack to stack the arrays horizontally,
result_hstack = np.hstack((array_1, array_2))
print(result_hstack)
>> [[[ 1 2 3] [ 4 5 6] [13 14 15] [16 17 18]]
[[ 7 8 9] [10 11 12] [19 20 21] [22 23 24]]]
Using vstack to stack the arrays vertically,
result_vstack = np.vstack((array_1, array_2))
print(result_vstack)
>> [[[ 1 2 3] [ 4 5 6]]
[[ 7 8 9] [10 11 12]]
[[13 14 15] [16 17 18]]
[[19 20 21] [22 23 24]]]
sort()
The sort()
function sorts the elements in a NumPy array in ascending order. Here is an example of sorting an array using a for loop,
Define a 3D array
data = [[[3, 1, 4, 3], [2, 1, 4, 3]], [[4, 5, 6, 4], [5, 6, 7, 5]]]
Iterate over the elements in the array and add them to the list
sorted_elements = [] # Initialize an empty list to store the sorted elements
for array in data:
for subarray in array:
for element in subarray:
sorted_elements.append(element)
# Sort the list in ascending order
sorted_elements.sort()
print(sorted_elements)
>> [1, 1, 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7]
The same example using the sort()
function,
sorted_elements = np.sort(data)
print(sorted_elements)
>> [1, 1, 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7]
where()
The where()
function in NumPy
allows to return elements from one of two possible arrays depending on whether a given condition is met. The elements where the condition is true will come from the first array, while the elements where the condition is false will come from the second array.Here is an example,
Define arrays
x = np.array([1, 2, 3, 4, 5])
y = np.array([10, 20, 30, 40, 50])
condition = x > 2
The condition
variable is a boolean array that indicates which elements of x are greater than 2.
Iterate over the elements of the input arrays and the condition array, and use an if statement to decide which element to add to the output array.
result = []
for i in range(len(x)):
if condition[i]:
result.append(x[i])
else:
result.append(y[i])
result = np.array(result)
print(result)
>>> [10 20 3 4 5]
The np.where()
function uses the condition to select elements from x
where the condition is True
and elements from y
where the condition is False
.
result = np.where(condition, x, y)
print(result)
>>> [10 20 3 4 5]
Working with a dataset
Working with a dataset can often involve calculating statistical measures such as the mean, standard deviation, and variance. This can be a tedious process if you have to do it manually using for loops, especially for large datasets. Fortunately, NumPy provides vectorized functions that can perform these calculations efficiently on an entire array of data at once.
In the other hand Python's built-in math
module also provides functions for these calculations like the mean, standard deviation, and variance. The real advantage of using NumPy
for statistical calculations comes with the axis argument, which allows you to calculate statistical measures along a specific dimension of a multidimensional array. This cannot be done with Python's built-in math
module. Let's see some examples,
Define a 2D array
data = np.array([[1, 2, 3], [4, 5, 6]])
Median
The median of a data set is the middle value when the data is arranged in ascending or descending order. If the data set has an odd number of values, the median is the middle value. If the data set has an even number of values, the median is the average of the two middle values. Learn more
Calculate the median of each row using a for loop
medians = np.empty(data.shape[0])
for i in range(data.shape[0]):
medians[i] = np.median(data[i, :])
print(medians)
>> [2. 5.]
To calculate the median using Numpy
library, you can use the np.median()
function.
medians = np.median(data, axis=1)
print(medians)
>> [2. 5.]
In these examples, data is a 2D NumPy array and the axis=1
argument tells the function to compute the median for each row
. If you want to compute the median for each column
you can change the axis argument to axis=0
.
Mean
The mean (average) of a data set is found by adding all numbers in the data set and then dividing by the number of values in the set. Learn more
Calculate the mean of each column using a for loop
means = np.empty(data.shape[1])
for i in range(data.shape[1]):
means[i] = np.mean(data[:, i])
print(means)
>> [2.5 3.5 4.5]
To calculate the mean using Numpy
library, you can use the np.mean()
function.
means = np.mean(data, axis=0)
print(means)
>> [2.5 3.5 4.5]
Standard Deviation
The standard deviation is a measure of how spread out the data is from the mean
. It is the square root of the average of the squared deviations from the mean. std = sqrt(mean(x)), where x = abs(a - a.mean())**2
. A low standard deviation indicates that the data is tightly grouped around the mean, while a high standard deviation indicates that the data is more spread out. Learn more
# Calculate the standard deviation of each column using a for loop
stds = np.empty(data.shape[1])
for i in range(data.shape[1]):
stds[i] = np.std(data[:, i])
print(stds)
>> [1.5 1.5 1.5]
To calculate the mean using Numpy
library, you can use the np.std()
function.
# Calculate the standard deviation of each column using np.std() with the axis argument
stds = np.std(data, axis=0)
print(stds)
>> [1.5 1.5 1.5]
Can For loops be completely replaced?
Now you know how to use vectorized operations and Numpy functions to optimize the performance and readability of your code. However, in some cases, the sequential nature of for loops may still be necessary. For example, when iteratively updating values in a function, where the output at time t depends on the output at time t-1, a for loop may be the only option. This is because vectorized operations are typically performed in parallel, so the output at t-1 cannot be used to calculate the output at t in a vectorized manner.
def update_values(prev):
new = prev + np.log(prev + 2)
return new
x = 0
y = 0
iterations = 10
for i in range(iterations):
x += update_values(x)
y += x + 10
print(x)
print(y)
>> 1150.00069088284
>> 2367.2066419361945
Conclusion
Considering vectorization other than explicit For Loops where necessary can make your next project more robust since it is an accelerated way to carry out calculations. But as above there can be times when using Loops is the only option. So choose wisely by analyzing the problem you are working on. If it can be solved in a parallel nature, moving to vectorization is the best option. Here I've shown you some important methods of NumPy to implement vectorization efficiently. You can also find the documentation here . Make sure to check that out to find some more important details.
Posted on January 31, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.