AUTOMATICALLY CREATE RESUME FROM MONGODB USING PYTHON
I wrote an article that explains the code: https://bit.ly/3fMr4hF
Posted on June 21, 2020
In this guide you will learn how to generate a resume from data stored in your MongoDB database. The resume will be produced as a PDF. We will also shorten every link in the resume with bit.ly and the bit.ly API, so we can track how many visitors arrive through the resume. A docx template defines the layout. When you're finished, you will be able to regenerate your resume automatically, even if it lists a lot of projects, so you don't need to type them one by one into a docx file.
Before you begin, you will need the following:
Python 3.6. How you install it depends on your OS; get it from the official Python website.
A docx template. You can use whatever design you want; this article uses the Universitas Ciputra Apple Developer Academy resume template.
A virtual environment: run pip install virtualenv and then python3 -m venv tutorial-env.
A bit.ly account.
An mLab or MongoDB Atlas account. I use mLab, but as of today you can no longer register there because it has been merged into MongoDB Atlas. If you're new to MongoDB and want to try it, register with MongoDB Atlas.
Linux. You can use whatever OS you want, but most of this article uses Linux commands. For other operating systems, Google is your friend.
LibreOffice, which converts the docx to PDF after we generate it.
First of all, we need to install the dependencies we will use later. Don't forget to activate the virtualenv before installing anything (on Linux: source tutorial-env/bin/activate) so the packages are not installed globally. To install the dependencies, use the command below. Note that the decouple module is published on PyPI as python-decouple:
pip install requests Pillow docxtpl pymongo wget python-decouple
Alternatively, you can install them later from the requirements.txt in the project's GitHub repository, like this:
pip install -r requirements.txt
Let's go through the dependencies one by one and see why we need each of them.
requests is a library for making HTTP requests. We need it to call the bit.ly API. I will explain later how to get a bit.ly access token for personal use.
Pillow is a library for manipulating images. We use it to compress the project images. All my images are hosted on Gyazo, but you can use any other image host, such as Imgur, or put the images in the same folder as your main Python file. Sometimes an image is large, around 3.5 MB, which makes the docx file large too. We don't want to upload a file that big, and a reviewer is unlikely to download one, so we make everyone's life easier by compressing the images with Pillow.
docxtpl is one of the most important libraries here, because it lets us fill in the docx template from Python.
pymongo communicates with the MongoDB host. You can use any database you want for the data, but this article uses MongoDB.
wget downloads the images into our folder. Since all my images are hosted on Gyazo, I need a tool to fetch them so they can be used as the project images in the docx template.
decouple (published on PyPI as python-decouple) reads the variables from the .env file.
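Before moving on, it can help to confirm that the packages above are importable from the active virtualenv. The following is a minimal sketch using only the standard library. The helper name missing_modules is hypothetical, and the list holds import names (for example PIL for Pillow), not PyPI package names.

```python
import importlib.util

# Import names (not PyPI names) used by the scripts in this article.
REQUIRED_MODULES = ["requests", "PIL", "docxtpl", "pymongo", "wget", "decouple"]


def missing_modules(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

If the script reports missing modules, the virtualenv is probably not activated or the pip install step was run in a different environment.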
Why do we use a docx template? You can design your own template if you don't like the one listed in the prerequisites. That one is just an example of how to build a docx template that our Python code can fill with data easily. So let's break down the docx template.
In the template you will see syntax like this:
{% for i in projects %}
This is Jinja2 syntax, a template language for Python that is modeled on the Django framework's templates. docxtpl uses it inside the docx file so that Python knows where to substitute our data. In the syntax above we loop over a Python list called projects and insert its items one by one. Let's break down what each Jinja tag in the template does:
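{% for i in projects %}
starts a loop over the projects list; inside the loop, i is the current project.
{{loop.index}}
prints the position of the current project, counting from 1.
{{i.title}}
prints the title of the current project.
{{name}}
prints the name value passed in from Python.
{%if loop.index != loop.length%}{{r i.page_break}} {% endif %}
inserts a page break after every project except the last one. The {{r ...}} form is how docxtpl inserts rich text.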
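To make the loop variables concrete, here is a rough plain-Python equivalent of what the template loop does. It is only an illustration under simplifying assumptions: the function name render_projects and the line format are made up, and the real rendering is done by docxtpl inside the docx file.

```python
PAGE_BREAK = "\f"  # the same form-feed character the script wraps in R('\f')


def render_projects(name, projects):
    """Mimic the template loop: number each project, break pages between them."""
    lines = []
    total = len(projects)  # what Jinja exposes as loop.length
    for index, project in enumerate(projects, start=1):  # loop.index is 1-based
        lines.append(f"{index}. {project['title']} ({name})")
        if index != total:  # {% if loop.index != loop.length %}
            lines.append(PAGE_BREAK)
    return lines


print(render_projects("Nori Roin", [{"title": "First"}, {"title": "Second"}]))
```

The comparison against the total is what keeps the resume from ending with an empty page.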
As I said earlier, you need a token to access the bit.ly API with the requests library. After you register with Bitly, you can get an access token from the dashboard. Click the top right corner of your dashboard to open the menu, then:
Click Profile Settings.
Click Generic Access Token.
Type your password and click Generate Token. You will get the token you need to access the Bitly API. Store it, together with the other settings, in a .env file in the project folder:
MONGO_HOST=
BITLY_TOKEN=
NAME=
FROM=
CONTACT=
Fill in all the environment variables, for example:
MONGO_HOST=mongodb://asdas:asdsa@ds14112.mlab.com:24122/resume
BITLY_TOKEN=12321321311112bubu2ubjjjjjjjjjj
NAME=Nori Roin
FROM=Public (Non Universitas Ciputra)
CONTACT=+6281336226985 noriroin@gmail.com
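To see roughly what happens when the scripts read these settings, here is a simplified stand-in for a .env reader. It is not how decouple is implemented, only a sketch of the KEY=VALUE idea; the function name parse_env is hypothetical.

```python
def parse_env(text):
    """Parse KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split once: values may contain '='
        values[key.strip()] = value.strip()
    return values


sample = """
NAME=Nori Roin
FROM=Public (Non Universitas Ciputra)
BITLY_TOKEN=
"""
settings = parse_env(sample)
print(settings["NAME"])
```

Splitting only on the first equals sign matters for values such as a MongoDB connection string, which can contain more of them.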
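Next, create shorten.py, a small helper that sends a link to the Bitly API and returns the shortened URL:
import requests
from decouple import config

def shorten(link, title):
    # Bitly v4 expects the token as a Bearer credential.
    headers = {'Content-Type': 'application/json', 'Authorization': 'Bearer ' + config('BITLY_TOKEN')}
    payload = {'long_url': link, 'title': title}
    r = requests.post("https://api-ssl.bitly.com/v4/bitlinks", json=payload, headers=headers)
    # Fail on HTTP errors here instead of with a confusing KeyError below.
    r.raise_for_status()
    return r.json()["link"]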
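If the Bitly request fails (no network, an expired token, a rate limit), the whole resume build stops. One possible way to soften that, sketched below, is a wrapper that falls back to the original link. The name safe_shorten and the two fake shorteners are hypothetical; in the real script you would pass the shorten function from shorten.py.

```python
def safe_shorten(link, title, shortener):
    """Return the shortened link, or the original link if shortening fails."""
    try:
        return shortener(link, title)
    except Exception as error:  # network error, bad token, unexpected JSON, ...
        print(f"Could not shorten {link!r}: {error}")
        return link


# Stand-ins for the real shorten() so the sketch runs without network access.
def fake_ok(link, title):
    return "https://bit.ly/example"


def fake_broken(link, title):
    raise KeyError("link")


print(safe_shorten("https://example.com", "Demo", fake_ok))
print(safe_shorten("https://example.com", "Demo", fake_broken))
```

The trade-off is that a resume built during an outage contains long links that are not tracked, so the printed warning is worth checking.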
The PDF conversion lives in pdf.py, which runs LibreOffice in headless mode and reads the output file name from its messages:
import sys
import subprocess
import re

def convert_to(folder, source, timeout=None):
    args = [libreoffice_exec(), '--headless', '--convert-to', 'pdf', '--outdir', folder, source]
    process = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, timeout=timeout)
    # LibreOffice reports the result as "... -> <output file> using filter ..."
    filename = re.search('-> (.*?) using filter', process.stdout.decode())
    if filename is None:
        raise LibreOfficeError(process.stdout.decode())
    else:
        return filename.group(1)

def libreoffice_exec():
    # TODO: Provide support for more platforms
    if sys.platform == 'darwin':
        return '/Applications/LibreOffice.app/Contents/MacOS/soffice'
    return 'libreoffice'

class LibreOfficeError(Exception):
    def __init__(self, output):
        self.output = output
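The regular expression in convert_to is easier to follow in isolation. The sketch below applies the same pattern to a sample message. The sample line is an assumption modeled on what LibreOffice typically prints after a conversion, and parse_converted_path is a hypothetical helper name.

```python
import re


def parse_converted_path(stdout_text):
    """Extract the output path from LibreOffice's conversion message, or None."""
    match = re.search(r'-> (.*?) using filter', stdout_text)
    return match.group(1) if match else None


# Assumed example of a success message; the exact wording may vary by version.
sample = ("convert /tmp/generated_resume.docx -> "
          "/tmp/generated_resume.pdf using filter : writer_pdf_Export")

print(parse_converted_path(sample))
print(parse_converted_path("Error: source file could not be loaded"))
```

When the pattern does not match, convert_to raises LibreOfficeError with the captured output, which is usually the quickest way to see why the conversion failed.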
Finally, index.py ties everything together. Here is the full script:
from shorten import shorten
import wget
import pprint
from pymongo import MongoClient
from docxtpl import DocxTemplate,R, InlineImage
from docx.shared import Mm,Inches
import os
from PIL import Image
from decouple import config
from pdf import convert_to
client = MongoClient(config('MONGO_HOST'))
context = {"name":config('NAME'),"from":config('FROM'),"contact":config('CONTACT'),"projects":[]}
db = client['resume']
projects=db.resumes
images=[]
doc = DocxTemplate("resume.docx")
for post in projects.find().sort("_id",-1):
    # Default to the placeholder image a.jpg when the project has no image.
    imajin=InlineImage(doc, "a.jpg", height=Inches(3.97),width=Inches(6.25))
    if post['i']:
        image_filename = wget.download(post['i'])
        # Re-encode as JPEG at quality 65 to keep the docx small.
        foo = Image.open(image_filename)
        im=foo.convert('RGB')
        im.save(image_filename+".jpg","JPEG",optimize=True,quality=65)
        # Remember both files so they can be deleted after rendering.
        images.append(image_filename)
        images.append(image_filename+".jpg")
        imajin=InlineImage(doc, image_filename+".jpg", height=Inches(3.97),width=Inches(6.25))
    context['projects'].append({
        "page_break":R('\f'),
        'description':post['d'],
        'title':post['t'],
        'year':'2019',
        'role':'programmer',
        # Prefer the preview link, fall back to GitHub, otherwise show '-'.
        'link':shorten(post['p'],post['t']) if post['p'] else shorten(post['g'],post['t']) if post['g'] and post['t'] else '-',
        'image':imajin,
    })
doc.render(context)
doc.save("generated_resume.docx")
for image in images:
    os.remove(image)
convert_to("./","generated_resume.docx")
os.remove("generated_resume.docx")
Okay, let's break down step by step what index.py is doing.
from shorten import shorten
import wget
import pprint
from pymongo import MongoClient
from docxtpl import DocxTemplate,R, InlineImage
from docx.shared import Mm,Inches
import os
from PIL import Image
from decouple import config
from pdf import convert_to
client = MongoClient(config('MONGO_HOST'))
context = {"name":config('NAME'),"from":config('FROM'),"contact":config('CONTACT'),"projects":[]}
db = client['resume']
projects=db.resumes
The documents in my collection use the fields t, g, p, d and i, which stand for title, github, preview, description and image. I name them like this because, if you later build a production MongoDB database, short keys minimize the amount of data you store. Most database hosts, mLab for example, charge based on data size, so the smaller the data, the lower the cost. You may also notice that the i field points to Gyazo, where I host my images. Gyazo is a screen capture tool for Linux (and, I think, other operating systems) that grabs part of the screen, and it is free for public images.
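As an illustration of that naming scheme, the sketch below maps the short keys back to readable names. The example document is hypothetical (its values are invented for this sketch), and expand is a made-up helper that index.py does not use.

```python
# Short key -> meaning, as described in the paragraph above.
FIELD_NAMES = {"t": "title", "g": "github", "p": "preview",
               "d": "description", "i": "image"}


def expand(post):
    """Return a copy of a stored document with readable key names."""
    return {FIELD_NAMES.get(key, key): value for key, value in post.items()}


example_post = {  # hypothetical document, not real data from the article
    "t": "Resume generator",
    "g": "https://github.com/example/resume-generator",
    "p": "",
    "d": "Builds a PDF resume from MongoDB data.",
    "i": "",
}

print(expand(example_post))
```

Keys that are not in the mapping, such as MongoDB's _id, are passed through unchanged.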
images=[]
doc = DocxTemplate("resume.docx")
for post in projects.find().sort("_id",-1):
    imajin=InlineImage(doc, "a.jpg", height=Inches(3.97),width=Inches(6.25))
    if post['i']:
        image_filename = wget.download(post['i'])
        foo = Image.open(image_filename)
        im=foo.convert('RGB')
        im.save(image_filename+".jpg","JPEG",optimize=True,quality=65)
        images.append(image_filename)
        images.append(image_filename+".jpg")
        imajin=InlineImage(doc, image_filename+".jpg", height=Inches(3.97),width=Inches(6.25))
    context['projects'].append({
        "page_break":R('\f'),
        'description':post['d'],
        'title':post['t'],
        'year':'2019',
        'role':'programmer',
        'link':shorten(post['p'],post['t']) if post['p'] else shorten(post['g'],post['t']) if post['g'] and post['t'] else '-',
        'image':imajin,
    })
doc.render(context)
doc.save("generated_resume.docx")
After rendering, we save the docx in the same folder as index.py.
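The link expression inside the loop is dense, so here is the same decision written out as a small function. This is a readability sketch, not a change to the script: pick_link is a hypothetical name, and it returns the URL that would be passed to shorten, or None in the case where the script prints '-'.

```python
def pick_link(post):
    """Return the URL the script would shorten, or None when it would print '-'."""
    if post['p']:                 # a preview link wins when present
        return post['p']
    if post['g'] and post['t']:   # otherwise fall back to GitHub (needs a title)
        return post['g']
    return None


print(pick_link({"t": "Demo", "p": "https://demo.example", "g": "https://github.com/example/demo"}))
print(pick_link({"t": "Demo", "p": "", "g": "https://github.com/example/demo"}))
print(pick_link({"t": "Demo", "p": "", "g": ""}))
```

Like the original expression, this assumes every document has the p, g and t keys, even if their values are empty.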
for image in images:
    os.remove(image)
convert_to("./","generated_resume.docx")
os.remove("generated_resume.docx")
We remove the downloaded images, convert the docx to PDF, and then remove the generated docx.
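If a file in the list is already gone, for example after an interrupted run, os.remove raises an error and the conversion never happens. A slightly more forgiving cleanup could look like the sketch below. The helper name remove_files is hypothetical and this is only one possible approach.

```python
import os
import tempfile


def remove_files(paths):
    """Delete each path once, skipping files that are already gone."""
    removed = []
    for path in dict.fromkeys(paths):  # drop duplicates, keep order
        try:
            os.remove(path)
            removed.append(path)
        except FileNotFoundError:
            pass  # nothing to clean up for this path
    return removed


# Small demonstration with a throwaway temporary file.
handle, temp_path = tempfile.mkstemp()
os.close(handle)
print(remove_files([temp_path, temp_path, temp_path + ".missing"]) == [temp_path])
```

In index.py this would replace the for loop over images, leaving the rest of the script unchanged.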
Now run the script:
python index.py
And voilà! You will get a file named generated_resume.pdf.
In this article I've shown you how to automatically create a resume PDF with Python from data in MongoDB. You can try adapting my code to another database, for example PostgreSQL or any database you want. The source code for this article is available on GitHub.