Automate Excel with Python

stokry

Posted on September 30, 2020

With Python, you can merge multiple Excel workbooks into one, which is handy when you have several workbooks to analyze.

Retrieving the data manually can be painful: you have to open every single file, and you may end up working in a confusing environment. We can automate this in Python in fewer than 15 lines of code.

We have two files, data1 and data2, and we will merge them into one.

First of all, we import the pandas module.
For those of you who don't know it, pandas is a module used mainly to manipulate data in Python.

import pandas as pd

After that, we specify the locations of the Excel files that we want to merge.
This is the list we will loop through later to go over each file.
We also create a blank DataFrame and store it in a variable called merge.

excel_files = ['location_of_your_first_excel_file', 'location_of_your_second_excel_file']
merge = pd.DataFrame()
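
If there are more than a couple of workbooks, typing every location by hand gets tedious. As a rough sketch, the list could be built from a folder instead. The helper name find_excel_files and the "." folder are made up for illustration; they are not part of the original script.

```python
from pathlib import Path


def find_excel_files(folder):
    """Return the .xlsx files in `folder` as a sorted list of path strings."""
    paths = sorted(Path(folder).glob("*.xlsx"))
    # Excel creates "~$name.xlsx" lock files while a workbook is open; skip them.
    return [str(path) for path in paths if not path.name.startswith("~$")]


# "." (the current folder) is a placeholder; point it at your own data folder.
excel_files = find_excel_files(".")
print(excel_files)
```

Keep in mind that if the merged output is saved into the same folder, it will be picked up as an input the next time the script runs, so a separate output folder is probably safer.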

It's now time to loop through our list of Excel files and read each one into a DataFrame. pandas already treats the first row of each sheet as the column names, so the header of the second workbook is not copied into the merged data; we just get the rows. (Adding skiprows=1 would skip that header row and turn the first data row into the column names, so it is only useful when a workbook has an extra title row above its header.) We add each DataFrame to merge with pd.concat, since DataFrame.append was removed in pandas 2.0.

for file in excel_files:
    df = pd.read_excel(file)
    merge = pd.concat([merge, df], ignore_index=True)

merge.to_excel('Merged_Files.xlsx')
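
The same loop can also be written by collecting the DataFrames in a list and combining them once at the end, which avoids rebuilding the merged table on every pass. The sketch below is only one way to do it: the function names and the source_file column are hypothetical additions, and the two small in-memory DataFrames stand in for data1 and data2 so the example runs without any files. Reading real .xlsx files with pd.read_excel usually also needs the openpyxl package installed.

```python
import pandas as pd


def combine_frames(frames):
    """Stack DataFrames on top of each other with a fresh 0..n-1 index."""
    return pd.concat(frames, ignore_index=True)


def merge_workbooks(paths):
    """Read each workbook and combine them, tagging rows with their source file."""
    frames = []
    for path in paths:
        df = pd.read_excel(path)     # the first row becomes the column names
        df["source_file"] = path     # hypothetical extra column for traceability
        frames.append(df)
    return combine_frames(frames)


# In-memory stand-ins for two workbooks, so the sketch runs without any files.
data1 = pd.DataFrame({"name": ["Ann", "Bob"], "amount": [10, 20]})
data2 = pd.DataFrame({"name": ["Cy"], "amount": [30]})
merged = combine_frames([data1, data2])
print(merged)
```

With real files, merge_workbooks(excel_files) would return the combined DataFrame, with one extra column showing which workbook each row came from.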

Inside the loop, each DataFrame is appended to merge, and that completes the logic.
At the end, we write everything to an output file, which we call "Merged_Files.xlsx".
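
One detail worth knowing about the output step: by default, to_excel also writes the DataFrame index (0, 1, 2, ...) as an extra first column. If that column is not wanted, index=False leaves it out. This is a small sketch; the helper name save_merged is made up here, and writing .xlsx files assumes the openpyxl package is installed.

```python
import pandas as pd


def save_merged(merged, output_path="Merged_Files.xlsx"):
    """Write the merged DataFrame to an Excel file without the index column."""
    # index=False leaves out the 0, 1, 2, ... row labels that pandas
    # would otherwise write as an extra first column.
    merged.to_excel(output_path, index=False)
```

Calling save_merged(merge) after the loop would then produce a workbook containing only the original columns.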

The whole code looks like this:

import pandas as pd

excel_files = ['location_of_your_first_excel_file', 'location_of_your_second_excel_file']
merge = pd.DataFrame()

for file in excel_files:
    # The first row of each sheet is used as the column names
    df = pd.read_excel(file)
    # DataFrame.append was removed in pandas 2.0, so use pd.concat
    merge = pd.concat([merge, df], ignore_index=True)

merge.to_excel('Merged_Files.xlsx')

That is all for today.
Have a nice day.
