Boost Python performance

rohilvarma

Rohil Varma

Posted on January 14, 2023

Boost Python performance

Despite being one of the most popular scripting language. The primary disadvantage at hand with python is always performance. Owning to its dynamic nature and versatility, it takes a toll in the area of performance. Here a few tips that when incorporated in your code might help you improve the performance a bit:

1. Use the latest version of Python

Using the latest version of python is always the better choice (as long as it doesn't break your existing application 😂) since it's updated and upgraded regularly, and every release is faster and more optimized.

2. Use List comprehension

Being an easier syntax for appending in a list as compared to for loop and append(), it performs better as it doesn't needs to load the append attribute of the list and call it as a function in each iteration as it happens in the traditional case, thereby slowing as compared to dynamic list creation.

3. Use pyarrow

Whenever you are loading a csv file in your code, always use the pyarrow engine. Here are some

from time import time
from pandas import read_csv

print('Without pyarrow engine')
start = time()

df = read_csv('./custom_2020.csv')
print(df.shape)
end = time()

print("Execution", end-start)

print('With pyarrow engine')

start = time()
new_df = read_csv('./custom_2020.csv', engine='pyarrow')
print(df.shape)
end = time()
print('Execution', end-start)
Enter fullscreen mode Exit fullscreen mode

Performance comparision for a 218MB csv file

4. Always use in-built functions

In-built functions are always superior in terms of performance as compared to the custom functions because they are highly optimized. Moreover the custom functions are more prone to errors as compared to the in-built ones.

5. Always import the functions not the entire module

Instead of using
import module

using
from module import xyz

offers slightly better performance as compared to the prior. Moreover its a good idea to distribute the imported packages all across the program instead of at the top. Though it may look like a good idea but for large scale projects, there might be cases of memory peaks when all the modules are loaded at once.

6. Concatenate strings with join

Whenever we declare a string, a value is stored in the heap. So anytime we append to a respective string, the program has to first copy the entire string all over to the other place and then perform the append. using ' '.join(str). This may look trivial but for huge applications it contains the potential to spike the memory usage.

7. Use appropriate Data Structure

Whenever working with any code it is extremely important to use the right Data Structure and the optimal algorithm that tailors around the use of the application. For instance, it would be stupid to use a Stack instead of a Queue when building a feature for queueing songs on a music app.


Wrapping Up

While Python is an incredible language with a plethora of tools at our disposal, it is crucial to know the tools and the methods that would help us achieve higher performance. There are a lot of packages that use C bindings that make our code faster by leaps and bounds. I have listed all the methods that I feel are the most important and (should do the job) but that does not mean that this it. One must always explore all the options/alternatives available to them before fixing on one of the approaches.

Thanks for reading!

💖 💪 🙅 🚩
rohilvarma
Rohil Varma

Posted on January 14, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related