Django and Openpyxl: Extracting and Sending Django model data as excel file (.xlsx)
John Owolabi Idogun
Posted on May 18, 2022
Motivation
As business requirements evolve, certain features are expected to evolve as well. That was the case for me a few days ago when I had to implement a feature where Django model's data are converted to .xlsx excel format and sent, as an attachment, to a provided email address literally. It was a big problem for me to figure out how to do this with that file extension as most available solutions were using the old .xls Excel file extension and none really talked about sending Excel files as attachments of an email sent via Django. To solve this problem, I looked at a similar solution by Vitor Freitas who used xlwt to export the Django model's data to .xls Excel file format and served it as response data to the user for download. Though a great companion, I needed a different file extension and was solving a slightly different problem. From further research, I came across Working with Excel Files in Python, from Working with Spreadsheets using Python, which recommended openpyxl for "reading and writing Excel 2010 files (ie: .xlsx)" so I decided to try the package out. This experience birthed this article where I utilized coingecko's API(if you consider using the paid services of coingecko, use can use my referral CGSIRNEIJ) to periodically fetch cryptocurrency coin's data, store the data in a database, and then display as well as allow users to extract the data to Excel files sent to any email address provided. It's my deepest hope that you will find it helpful.
Assumptions
It is assumed that readers are pretty familiar with Django and JavaScript as well as the typed extension of Python using mypy, the typing built-in module, and the rules by PEP8.
You should also be familiar with writing tests for Django models, methods, views, celery tasks, and functions. You don't have to be militant at that though.
Knowledge of HTML, and CSS (and its frameworks — Bootstrap for this project) is also assumed.
Source code
The entire source code for this article can be accessed via:
Exporting Django model data as excel file (.xlsx) using openpyxl library and Google Spreadsheet API
django_excel
This repository accompanies this tutorial on dev.to. It has been deployed to Heroku and can be accessed live via this link.
NOTE: If you use Coingecko's API, when you use my code, CGSIRNEIJ, I get some commissions. That can be a good way to help me.
Run locally
It can be run locally by creating a virtual environment using any of venv, poetry, virtualenv, and pipenv. I used virtualenv while developing the app. Having created the virtual environment, activate it and install the project's dependencies by issuing the following command in your terminal:
Step 1: Start a project, an application, and configure settings
To incept building out the project, create a directory for it — mine was django_excel — start a virtual environment, start a django project — called mine django_excel as well — and create a django application — my app was named core. Then proceed to opening your project's settings in your preferred IDE/text editor. Add the following configurations:
# django_excel -> settings.py
...frompathlibimportPathfromtypingimportAny# add these lines for python-decouple and celery.
fromcelery.schedulesimportcrontabfromdecoupleimportCsv,config...INSTALLED_APPS:list[str]=['django.contrib.admin','django.contrib.auth','django.contrib.contenttypes','django.contrib.sessions','django.contrib.messages','django.contrib.staticfiles','core.apps.CoreConfig',# add your app
]...TEMPLATES:list[dict[str,Any]]=[{'BACKEND':'django.template.backends.django.DjangoTemplates','DIRS':[BASE_DIR/'templates'],# configure your templates
'APP_DIRS':True,'OPTIONS':{'context_processors':['django.template.context_processors.debug','django.template.context_processors.request','django.contrib.auth.context_processors.auth','django.contrib.messages.context_processors.messages',],},},]...STATIC_ROOT:str=BASE_DIR/'staticfiles'# add static_root for deployment purposes
...# Configure celery broker_url and other configurations
CELERY_BROKER_URL:str=config('REDIS_URL',default='amqp://localhost')CELERY_RESULT_BACKEND:str=config('REDIS_URL',default='')CELERY_ACCEPT_CONTENT:list[str]=['application/json']CELERY_TASK_SERIALIZER:str='json'CELERY_RESULT_SERIALIZER:str='json'# From celery 5.0, this is how to schedule tasks. Though we have not created them yet it will come in handy
CELERY_BEAT_SCHEDULE:dict[str,dict[str,Any]]={'get_coins_data_from_coingecko_and_store':{'task':'core.tasks.get_coins_data_from_coingecko_and_store','schedule':crontab(minute='*/1'),},}# For testing purposes, we need to set these.
CELERY_ALWAYS_EAGER=TrueCELERY_EAGER_PROPAGATES=TrueBROKER_BACKEND='memory'# Email configuration
ADMINS=(('Admin',config('EMAIL_HOST_USER',default='no-reply@django_excel.herokuapp.com')),)# We use the console to send emails during development and real email in production
ifDEBUG:EMAIL_BACKEND='django.core.mail.backends.console.EmailBackend'else:EMAIL_BACKEND='django.core.mail.backends.smtp.EmailBackend'EMAIL_HOST_USER=config('EMAIL_HOST_USER',default='')EMAIL_HOST='smtp.gmail.com'EMAIL_PORT=587EMAIL_USE_TLS=TrueEMAIL_HOST_PASSWORD=config('EMAIL_HOST_PASSWORD',default='')# Basic security and database settings for production-ready applications
ifnotDEBUG:importdj_database_url# ==============================================================================
# SECURITY SETTINGS
# ==============================================================================
CSRF_COOKIE_SECURE=TrueCSRF_COOKIE_HTTPONLY=TrueSECURE_HSTS_SECONDS=60*60*24*7*52# one year
SECURE_HSTS_INCLUDE_SUBDOMAINS=TrueSECURE_SSL_REDIRECT=TrueSECURE_BROWSER_XSS_FILTER=TrueSECURE_CONTENT_TYPE_NOSNIFF=TrueSECURE_PROXY_SSL_HEADER=('HTTP_X_FORWARDED_PROTO','https')SESSION_COOKIE_SECURE=Truedb_from_env=dj_database_url.config(conn_max_age=500)DATABASES['default'].update(db_from_env)
Though the snippets above tend to be a handful, they are just some basic configurations to set up our system for development and production environments. Comments are appended to shed light on what they do. If you ain't familiar with typed Python, no worries. We specify the type of a variable in Python using the following pattern:
# variable_name: variable_type = default_value, for example
ALLOWED_HOSTS:list[str]=config('ALLOWED_HOSTS',default='127.0.0.1,localhost',cast=Csv())
Since our ALLOWED_HOSTS variable is a list of strings, we specified this by using list[str]. One of the generic types in Python.
Moving forward, let's bring in celery. In the same directory as your settings.py file, create a file named celery.py and make it look like:
# django_excel -> celery.py
importosfromceleryimportCeleryfromdjango.confimportsettings# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE','django_excel.settings')app=Celery('django_excel')app.config_from_object('django.conf:settings',namespace='CELERY')# Load task modules from all registered Django apps.
app.autodiscover_tasks()ifsettings.DEBUG:@app.task(bind=True)defdebug_task(self):"""Debug method."""print(f'Request: {self.request!r}')
That was lifted from django celery documentation with few modifications to suit our project. Then proceed to the __init__.py file in that same directory and turn it into:
# django_excel -> __init__.py
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from.celeryimportappascelery_app__all__=('celery_app',)
Also lifted from the same source. The comments on top say it all.
Before taking a break from configurations, let's visit our root urls.py file:
...# django_excel -> urls.py
fromdjango.contribimportadminfromdjango.urlsimportinclude,path# add include
urlpatterns=[path('admin/',admin.site.urls),path('',include('core.urls',namespace='core')),# include our app's urls.py file
]
Ensure you create a urls.py file in your core app. That's it with configurations. Now to the next step.
Step 2: Define your model, and views and write your tasks.
Let's go on to create our model. Open your app's models.py file in your editor and fill it with:
# core -> models.py
fromdjango.dbimportmodelsclassCoins(models.Model):name=models.CharField(max_length=200,null=True)symbol=models.CharField(max_length=200,null=True)image_url=models.URLField(null=True)current_price=models.DecimalField(decimal_places=2,max_digits=50,null=True)price_change_within_24_hours=models.DecimalField(decimal_places=2,max_digits=50,null=True)rank=models.IntegerField(null=True)market_cap=models.DecimalField(decimal_places=2,max_digits=50,null=True)total_supply=models.DecimalField(decimal_places=2,max_digits=50,null=True)def__str__(self)->str:"""Return model string representation."""returnf'{self.name} - {self.symbol}'
We are interested in the coins's name, symbol, image, current_price, price_change_within_24_hours, rank, market_cap, and total_supply. Coingecko's API(referral code CGSIRNEIJ) provides more fields but those are our current interests. You can now makemigrations and migrate but ensure you install all the dependencies we have introduced. This project's repository has a requirements.txt file with all of them.
Now, create a tasks.py file in your app's directory and make it look like:
# core -> tasks.py
fromioimportBytesIOimportrequestsfromceleryimportshared_taskfromdecoupleimportconfigfromdjango.confimportsettingsfromdjango.core.mailimportEmailMessagefromdjango.utilsimporttimezonefromopenpyxlimportWorkbookfromopenpyxl.stylesimportAlignment,Font,Protectionfromcore.modelsimportCoinsfromcore.templatetags.custom_tagsimportcurrency@shared_taskdefget_coins_data_from_coingecko_and_store()->None:"""Fetch data from coingecko API (referral code `CGSIRNEIJ`) and store."""base_url='https://api.coingecko.com/api/v3/coins/'market_currency_order='markets?vs_currency=ngn&order=market_cap_desc&'per_page='per_page=250&page=1&sparkline=false'final_url=f'{base_url}{market_currency_order}{per_page}'coin_data=requests.get(final_url).json()fordataincoin_data:coin,_=Coins.objects.get_or_create(name=data['name'],symbol=data['symbol'])coin.image_url=data['image']coin.current_price=data['current_price']coin.price_change_within_24_hours=data['price_change_24h']coin.rank=data['market_cap_rank']coin.market_cap=data['market_cap']coin.total_supply=data['total_supply']coin.save()@shared_taskdefexport_data_to_excel(user_email:str)->None:"""Send extracted model data and save in excel and send to email."""excelfile=BytesIO()workbook=Workbook()workbook.remove(workbook.active)worksheet=workbook.create_sheet(title='Latest Cryptocurrency Coins Data',index=1)workbook.security.workbookPassword=config('PASSWORD',default='12345data')workbook.security.lockStructure=config('PROTECT',default=True,cast=bool)workbook.security.revisionsPassword=config('PASSWORD',default='12345data')worksheet.protection.sheet=config('PROTECT',default=True,cast=bool)worksheet.protection.formatCells=config('PROTECT',default=False,cast=bool)worksheet.sheet_properties.tabColor='1072BA'worksheet.freeze_panes='I2'coin_queryset=Coins.objects.all().order_by('rank')columns=['Name','Symbol','Rank','Current price','Price change','Market cap','Total supply']row_num=1# Assign the titles for each cell of the header
forcol_num,column_titleinenumerate(columns,1):cell=worksheet.cell(row=row_num,column=col_num)cell.value=column_titlecell.alignment=Alignment(horizontal='center',vertical='center',wrap_text=True)cell.font=Font(bold=True)# Iterate through all coins
for_,coininenumerate(coin_queryset,1):row_num+=1# Define the data for each cell in the row
row=[coin.name,f'{coin.symbol}'.upper(),coin.rank,currency(coin.current_price),currency(coin.price_change_within_24_hours),currency(coin.market_cap),coin.total_supply,]# Assign the data for each cell of the row
forcol_num,cell_valueinenumerate(row,1):cell=worksheet.cell(row=row_num,column=col_num)cell.value=cell_valuecell.protection=Protection(locked=True)workbook.save(excelfile)now=timezone.now()message=EmailMessage(f'Coin data as of {now.date().isoformat()}',f'Generated at: {now.isoformat()}',settings.DEFAULT_FROM_EMAIL,[user_email],)message.attach('latest-coin-list.xlsx',excelfile.getvalue(),'application/vnd.ms-excel')message.send()
The first task, get_coins_data_from_coingecko_and_store, does exactly what its name implies — get coin's data from coingecko api and store them in our database. It uses the requests library to achieve this. After getting the data, it loops through them and uses Django's get_or_create QuerySet api to get a coin data from the database using it's name and symbol. If such data doesn't exist, it creates it and update the created intance with other data from the api. Since this task is periodic — we want it to run automatically at a specified time — it was appended to our CELERY_BEAT_SCHEDULE earlier:
We gave it a name, which is the normal name we defined it as, and gave the thorough path where it can be found 'core.tasks.get_coins_data_from_coingecko_and_store'. Then it was scheduled to run every 1 minute.
Pretty simple huh?!
The next and major task — export_data_to_excel — does the extraction of data to excel format. First, it instantiated a temporary memory to hold the excel file before sending using BytesIO. You could use StringIO as well but I favour the former. Then, an openpyxl Workbook was created. By default, after this initialization, a woorksheet instance active is created but since I just didn't want to work with that, I removed it and created mine using workbook.create_sheet(title='Latest Cryptocurrency Coins Data', index=1). create_sheet takes a compulsory argument title but I also included the index I want the wooksheet to occupy. Since I am security conscious and don't want anyone to fiddle with my excel data anyhow without my permission, I decided to lock the workbook and provided password incase some changes are inevitable. From experience, immediately worksheet.protection.sheet is set to True, your Microsoft excel toolbar will be grayed out. To prevent this, we set worksheet.protection.formatCells to True. Also, since we will be working with hundreds of coins and scrolling downwards might make us lose focus on the columns being investigated, I decided to make the colum headers fixed so that even when scrolling, you still see them. To achieve this, you go one cell more than the total header colum's row and column. Take for instance, if your headers have four colums, ['A', 'B', 'C', 'D'], to fix them, I will stop at the fifth column, 'E', and second row, '2'. So that my worksheet.freeze_panes will be set to E2. Got it? Cool 😎! Next, we defined our queryset and the colums we are interested in. Then, we start from row number one since openpyxl starts its row number from 1 unlike xlwt. We then iterate through these colums and populate the worksheet cell at each combination of row and column (row 1 and dynamic columns in this case since our row remains at 1). We also align the texts and bold them as well. You can do so much than those. Thanks to openpyxl 👏! After that, we iterated through the entire QuerySet previously defined. In each iteration, we first increment the row number so that previous rows won't be overriden and then populate each cell (a combination of row and colum) with their respective value. Double for loop 😨. There is a more nifty way to do this but I settled for this nonetheless. After populating the cells, I saved the workbook in the artificial memory created earlier and then sent the mail with the excel file as attachment. To get the excel file from the temporary memory, we used the .getvalue() method shipped with BytesIO. It's also available for StringIO. We also gave the file a name, latest-coin-list.xlsx and a type application/vnd.ms-excel. As far as I know, they are important for easy encoding and decoding by your e-mail service provider. That was mouthful 🤤!
Next is our views.py file.
# core -> views.py
importjsonfromdjango.httpimportHttpRequest,HttpResponse,JsonResponsefromdjango.shortcutsimportrenderfromcore.modelsimportCoinsfromcore.tasksimportexport_data_to_exceldefindex(request:HttpRequest)->HttpResponse:"""Index view."""coins=Coins.objects.all().order_by('rank')context:dict[str,str]={'coin_data':coins,}returnrender(request,'coin_data.html',context)defextract_and_send_coin_data_via_email(request:HttpRequest)->JsonResponse:"""Handle the post requests for sending emails."""ifrequest.method=='POST':request_data=json.loads(request.body)email=request_data['userEmail']export_data_to_excel.delay(email)returnJsonResponse({'message':'Coins data successfully extracted 💃!'},status=200)returnJsonResponse({'message':'Coins data failed to be extracted 😔!'},status=500)
Nothing much, just the normal function-based views. The first serves our templates and the other handles any export requests. It's basically an API endpoint that utilizes the export_data_to_excel task we discussed above. Notice that I didn't get my request data from request.POST.get(). This is because I used pure asynchronous fetch API to send post requests and data was passed via body option of the Request object as shown in the template:
Run your server now and navigate to 127.0.0.1:8000 in your browser. Did you see something? Yeah, you must see No coin data currently... That's because you don't have any data currently in your database. To have data, you need to run our celery task which is responsible for fetching data from coingecko API. To run it, open another terminal which has your virtual environment activated and issue this command:
(virtualenv) sirneij@pop-os ~/D/P/T/django_excel (main)> celery -A django_excel worker -l info -B
If your project name isn't django_excel, please modify that command appropriately. Now, you should see some streams of request and response 😊. If you now visit your browser after 1 minute, you should see something 💃!!!
Step 3: Testing, testing, and testing.
For almost a year now, I have been a strong proponent of TDD. That's because I struggled to adjust to that style initially but it's proved kinda awesome now. Try it out, it's worth the stress sometimes! From the foregoing, let's write some tests 😂! In your root directory, create a new folder named tests and make the folder look like:
We will be using pytest and it's django package to run our tests. You can delete core/tests.py. In tests/core/test_models.py, let have something like:
# tests -> core -> test_models.py
fromdjango.testimportTestCasefromcore.modelsimportCoinsclassCoinsModelTests(TestCase):defsetUp(self)->None:"""Create the setup of the test."""self.coin=Coins.objects.create(name='bitcoin',symbol='btc')deftest_unicode(self)->None:"""Test the model's __str__ method"""self.assertEqual(str(self.coin),f'{self.coin.name} - {self.coin.symbol}')
We are just checking if our unicode or string representation is correctly formatted. If we'd had bigger application structure, factory would have been great for this.
In these tests, we introduced the concept of mocking. It's used to 'mock' some behaviour that might otherwise be dependent on network calls or availability of the external API being called. Though this test doesn't guarantee 100% test coverage as I would have wanted but it does well above 80%. And finally, tests/core/test_views.py:
# tests -> core -> test_views.py
fromunittest.mockimportpatchfromdjango.testimportClient,TestCasefromdjango.urlsimportreverseclassIndexViewTests(TestCase):defsetUp(self)->None:"""Create the setup of the test."""self.client=Client()deftest_context(self)->None:"""Test index context and template."""response=self.client.get(reverse('core:index'))self.assertEqual(response.templates[0].name,'coin_data.html')classExtractAndSendCoinDataViaEmailTests(TestCase):defsetUp(self)->None:"""Create the setup of the test."""self.client=Client()self.data={'userEmail':'django_excel@django.com'}deftest_extract_and_send_coin_data_via_email_success(self):"""Test extract and send extracted data."""withpatch('core.views.export_data_to_excel.delay')asmock_export_data_to_excel:response=self.client.post(reverse('core:extract_data'),self.data,content_type='application/json')self.assertEqual(response.status_code,200)mock_export_data_to_excel.assert_called_once()deftest_extract_and_send_coin_data_via_email_failure(self):response=self.client.get(reverse('core:extract_data'),self.data,content_type='application/json')self.assertEqual(response.status_code,500)
We also used mocking to mimick our export_data_to_excel task. It should be noted that setting the following is important to really test celery tasks:
That brings us to the end of this surjourn 😎! Thanks for sticking around.
Bonus point
This article's repository has some other files to provide a more complete project. There are codes for currency custom filter, pytest configurations, some CI/CD using GitHub actions, code inspection to ensure all best practices are taken into consideration. You can go through them when you are opportune to.
Outro
Enjoyed this article? I'm a Software Engineer and Technical Writer actively seeking new opportunities, particularly in areas related to web security, finance, health care, and education. If you think my expertise aligns with your team's needs, let's chat! You can find me on LinkedIn: LinkedIn and Twitter: Twitter.
If you found this article valuable, consider sharing it with your network to help spread the knowledge!