How to specify the data format when saving Excel via Python?

luca1iu

Luca Liu

Posted on January 4, 2024

How to specify the data format when saving Excel via Python?

In this article we will solve the following questions:

  1. how to use xlsxwriter to save Excel?

  2. how to save multiple dataframes to worksheets using Python?

  3. how do use Python to stylize the Excel formatting?

  4. how to ensure that specific data types such as datetime, integer, string, and floating data are saved accurately when using Python and xlsxwriter to save a dataframe to Excel?

Question Description:

please imagine that you have the following 2 dataframes and you want to save them into one Excel file with different sheets and stylize the Excel formatting

Image description

Solution: Save Excel through xlsxwriter

1. import xlsxwriter package and create Excel file

use Workbook to create a new Excel file.



# import package
import xlsxwriter
# create excel file
workbook = xlsxwriter.Workbook("new_excel.xlsx")


Enter fullscreen mode Exit fullscreen mode

2. create two sheets

use .add_worksheet to create a sheet because we have two dataframes that need to be saved, we have to create two sheets.



worksheet1 = workbook.add_worksheet('df1_sheet')
worksheet2 = workbook.add_worksheet('df2_sheet')


Enter fullscreen mode Exit fullscreen mode

3. set header format and save the header

first, we set header format and use .write_row to save our header.

.write_row has four arguments, they are: worksheet.write_row(row, col, data, cell_format).



header_format = workbook.add_format({
 'valign': 'top',
 'fg_color': '#002060',
 'border': 1,
 'font_color': 'white'})
worksheet1.write_row(0, 0, df1.columns, header_format)
worksheet2.write_row(0, 0, df2.columns, header_format)


Enter fullscreen mode Exit fullscreen mode

4. create some format

format.set_num_format():This method is used to define the numerical format of a number in Excel. It controls whether a number is displayed as an integer, a floating point number, a date, a currency value, or some other user-defined format.

The numerical format of a cell can be specified by using a format string or an index to one of Excelโ€™s built-in formats, for more information you can check xlsxwriter-format

for example: 14 represents m/dd/yyyy, so format.set_num_format(14) = format.set_num_format('m/dd/yyyy')



# set datetime format "m/d/yy"
format_datetime = workbook.add_format({'border': 1})
format_datetime.set_num_format(14) # based on above table, 14 means "m/d/yy"
format_datetime.set_font_size(12) # set font size

# set General format
format_general = workbook.add_format({'border': 1})
format_general.set_num_format(0) # 0 means general
format_general.set_font_size(12)

# set integer format "0"
format_integer = workbook.add_format({'border': 1})
format_integer.set_num_format(1)
format_integer.set_font_size(12)

# set float format "0.00"
format_float = workbook.add_format({'border':1})
format_float.set_num_format(2)
format_float.set_font_size(12)

# set integer format with thousands separators "#,##0"
format_integer_separator = workbook.add_format({'border': 1})
format_integer_separator.set_num_format(3)
format_integer_separator.set_font_size(12)

# set percent format "0.00%"
format_percent = workbook.add_format({'border':1})
format_percent.set_num_format(10)
format_percent.set_font_size(12)


Enter fullscreen mode Exit fullscreen mode

5. save the data into Excel with cell_format

the zero index row already saved our header, so we start from one index row(set the first argument to 1)



# .write_column(row, column, data, cell_format)

worksheet1.write_column(1, 0, df1.iloc[:, 0], format_datetime)
worksheet1.write_column(1, 1, df1.iloc[:, 1], format_integer)
worksheet1.write_column(1, 2, df1.iloc[:, 2], format_integer_separator)
worksheet1.write_column(1, 3, df1.iloc[:, 3], format_general)
worksheet1.write_column(1, 4, df1.iloc[:, 4], format_float)
worksheet1.write_column(1, 5, df1.iloc[:, 5], format_percent)
worksheet2.write_column(1, 0, df2.iloc[:, 0], format_datetime)
worksheet2.write_column(1, 1, df2.iloc[:, 1], format_integer)
worksheet2.write_column(1, 2, df2.iloc[:, 2], format_integer_separator)
worksheet2.write_column(1, 3, df2.iloc[:, 3], format_general)
worksheet2.write_column(1, 4, df2.iloc[:, 4], format_float)
worksheet2.write_column(1, 5, df2.iloc[:, 5], format_percent)


Enter fullscreen mode Exit fullscreen mode

6. set column width

for a better display of the data, we can also use .set_column to set the column width.



worksheet1.set_column('A:A', 15)
worksheet1.set_column('B:B', 15)
worksheet1.set_column('C:C', 15)
worksheet1.set_column('D:D', 15)
worksheet1.set_column('E:E', 15)
worksheet1.set_column('F:F', 15)
worksheet2.set_column('A:A', 15)
worksheet2.set_column('B:B', 15)
worksheet2.set_column('C:C', 15)
worksheet2.set_column('D:D', 15)
worksheet2.set_column('E:E', 15)
worksheet2.set_column('F:F', 15)


Enter fullscreen mode Exit fullscreen mode

*7. finish*



workbook.close()


Enter fullscreen mode Exit fullscreen mode

Below is the full Python code that utilizes the xlsxwriter package to save two dataframes into an Excel file with different sheets and stylized formatting. The code begins by importing the necessary pandas and xlsxwriter packages. Then, it defines the two dataframes, df1 and df2. It proceeds to create a new Excel file and two sheets using xlsxwriter.Workbook and add_worksheet functions, respectively. Following this, the code sets the header format, defines various cell formats for different data types, and writes the data into the Excel file with the specified cell formats. Finally, it sets the column widths, closes the workbook, and thus completes the process of saving the dataframes into an Excel file.



# generate two dataframes 
import pandas as pd

df1 = pd.DataFrame(
    {'Date': ['2022/12/1', '2022/12/1', '2022/12/1', '2022/12/1'], 'Int': [116382, 227393, 3274984, 438164],
     'Int_with_seperator': [1845132, 298145, 336278, 443816], 'String': ['Tom', 'Grace', 'Luca', 'Tessa'],
     'Float': [98.45, 65.24, 30, 80.88], 'Percent': [0.8878, 0.9523, 0.4545, 0.9921]})
df2 = pd.DataFrame({'Date': ['2022/11/1', '2022/11/1', '2022/11/1', '2022/11/1'], 'Int': [233211, 24321, 35345, 23223],
                    'Int_with_seperator': [925478, 23484, 123249, 2345675],
                    'String': ['Apple', 'Huawei', 'Xiaomi', 'Oppo'], 'Float': [98.45, 65.24, 30, 80.88],
                    'Percent': [0.4234, 0.9434, 0.6512, 0.6133]})
print(df1)
print(df2)

# import package
import xlsxwriter
# create excel file
workbook = xlsxwriter.Workbook("new_excel.xlsx")

# create two sheets
worksheet1 = workbook.add_worksheet('df1_sheet')
worksheet2 = workbook.add_worksheet('df2_sheet')

# set header format and save the header
header_format = workbook.add_format({
 'valign': 'top',
 'fg_color': '#002060',
 'border': 1,
 'font_color': 'white'})
worksheet1.write_row(0, 0, df1.columns, header_format)
worksheet2.write_row(0, 0, df2.columns, header_format)

# set datetime format "m/d/yy"
format_datetime = workbook.add_format({'border': 1})
format_datetime.set_num_format(14) # based on above table, 14 means "m/d/yy"
format_datetime.set_font_size(12) # set font size

# set General format
format_general = workbook.add_format({'border': 1})
format_general.set_num_format(0) # 0 means general
format_general.set_font_size(12)

# set integer format "0"
format_integer = workbook.add_format({'border': 1})
format_integer.set_num_format(1)
format_integer.set_font_size(12)

# set float format "0.00"
format_float = workbook.add_format({'border':1})
format_float.set_num_format(2)
format_float.set_font_size(12)

# set integer format with thousands separators "#,##0"
format_integer_separator = workbook.add_format({'border': 1})
format_integer_separator.set_num_format(3)
format_integer_separator.set_font_size(12)

# set percent format "0.00%"
format_percent = workbook.add_format({'border':1})
format_percent.set_num_format(10)
format_percent.set_font_size(12)

worksheet1.write_column(1, 0, df1.iloc[:, 0], format_datetime)
worksheet1.write_column(1, 1, df1.iloc[:, 1], format_integer)
worksheet1.write_column(1, 2, df1.iloc[:, 2], format_integer_separator)
worksheet1.write_column(1, 3, df1.iloc[:, 3], format_general)
worksheet1.write_column(1, 4, df1.iloc[:, 4], format_float)
worksheet1.write_column(1, 5, df1.iloc[:, 5], format_percent)
worksheet2.write_column(1, 0, df2.iloc[:, 0], format_datetime)
worksheet2.write_column(1, 1, df2.iloc[:, 1], format_integer)
worksheet2.write_column(1, 2, df2.iloc[:, 2], format_integer_separator)
worksheet2.write_column(1, 3, df2.iloc[:, 3], format_general)
worksheet2.write_column(1, 4, df2.iloc[:, 4], format_float)
worksheet2.write_column(1, 5, df2.iloc[:, 5], format_percent)

worksheet1.set_column('A:A', 15)
worksheet1.set_column('B:B', 15)
worksheet1.set_column('C:C', 15)
worksheet1.set_column('D:D', 15)
worksheet1.set_column('E:E', 15)
worksheet1.set_column('F:F', 15)
worksheet2.set_column('A:A', 15)
worksheet2.set_column('B:B', 15)
worksheet2.set_column('C:C', 15)
worksheet2.set_column('D:D', 15)
worksheet2.set_column('E:E', 15)
worksheet2.set_column('F:F', 15)

workbook.close()


Enter fullscreen mode Exit fullscreen mode

Explore more

Thank you for taking the time to explore data-related insights with me. I appreciate your engagement.

๐Ÿš€ Connect with me on LinkedIn

๐ŸŽƒ Connect with me on X

๐ŸŒ Connect with me on Instagram

๐Ÿ’– ๐Ÿ’ช ๐Ÿ™… ๐Ÿšฉ
luca1iu
Luca Liu

Posted on January 4, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

ยฉ TheLazy.dev

About