Convert any .pdf file 📚 into an audio 🔈 book with Python

mustafaanaskh99

Mustafa Anas

Posted on January 7, 2020


(edit: I am glad you all liked this project! It got to be the top Python article of the week!)

A while ago I was messing around with gTTS, a Python library for Google's text-to-speech service.
This library takes any piece of text and converts it to an .mp3 file. Then I started thinking of making something useful out of it.

My installed, saved, and unread pdf books 😕

I like reading books. I really do. I think sharing language and ideas is fascinating. I have a directory where I store pdf books that I plan on reading but never do. So I thought, hey, why don't I make them audio books and listen to them while I do something else 😄!

So I started planning what the script should look like.

  • Allow user to pick a .pdf file
  • Convert the file into one string
  • Output .mp3 file.

Without further ado, let's get to it.

Allow user to pick a .pdf file

Python can read files easily. I just need to call the built-in function open(filelocation, "rb") to open the file in binary read mode. I don't want to be copying and pasting files into the directory of the code every time I want to use it, though. So to make it easier we will use the tkinter library to open a dialog that lets us choose the file.

from tkinter import Tk
from tkinter.filedialog import askopenfilename

Tk().withdraw() # we don't want a full GUI, so keep the root window from appearing
filelocation = askopenfilename() # open the dialog GUI

Great. Now we have the file location stored in a filelocation variable.
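One thing the snippet above does not cover is what happens when the dialog is cancelled: askopenfilename then returns an empty value. The following is only a sketch of one way to handle that and to show only .pdf files in the dialog. The function name pick_pdf and its ask parameter are hypothetical, added here so the logic can be tried without opening a window.

```python
def pick_pdf(ask=None):
    """Return the chosen .pdf path, or None if the dialog was cancelled.

    `ask` is a hypothetical hook for illustration: any callable that accepts
    the same keyword arguments as tkinter's askopenfilename.
    """
    if ask is None:
        # Import lazily so the function can be exercised without a display.
        from tkinter import Tk
        from tkinter.filedialog import askopenfilename

        Tk().withdraw()  # keep the empty root window from appearing
        ask = askopenfilename

    path = ask(
        title="Choose a PDF",
        filetypes=[("PDF files", "*.pdf")],  # only list .pdf files in the dialog
    )
    # askopenfilename returns an empty value when the user cancels.
    return path or None
```

Calling filelocation = pick_pdf() opens the real dialog, and checking for None before going on avoids passing an empty path to open().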

Allow user to pick a .pdf file ✔️

Convert the file into one string

As I said before, to open a file in Python we just need to use the open() function. But we also want to convert the pdf file into regular text, so we might as well do it now.
To do that we will use a library called pdftotext.
Let's install it:

sudo pip install pdftotext

Then:

from tkinter import Tk
from tkinter.filedialog import askopenfilename
import pdftotext

Tk().withdraw() # we don't want a full GUI, so keep the root window from appearing
filelocation = askopenfilename() # open the dialog GUI

with open(filelocation, "rb") as f:  # open the file in reading (rb) mode and call it f
    pdf = pdftotext.PDF(f)  # store a text version of the pdf file f in pdf variable

Great. Now we have the file stored in the variable pdf.
If you loop over this variable, you get a sequence of strings, one for each page of the file. To get them all into one .mp3 file, we have to make sure they are all stored as one string. So let's loop through the pages and add them all to one string.

from tkinter import Tk
from tkinter.filedialog import askopenfilename
import pdftotext

Tk().withdraw() # we don't want a full GUI, so keep the root window from appearing
filelocation = askopenfilename() # open the dialog GUI

with open(filelocation, "rb") as f:  # open the file in reading (rb) mode and call it f
    pdf = pdftotext.PDF(f)  # store a text version of the pdf file f in pdf variable

string_of_text = ''
for text in pdf:
    string_of_text += text

Sweet 😄. Now we have it all as one string.
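The += loop works fine, but the same step can also be written with str.join, which is the more common idiom for gluing many strings together. The sketch below is only an alternative, not the method used in the script above: join_pages is a hypothetical helper, and a plain list of strings stands in for the pdftotext.PDF object (which yields one string per page when iterated) so the example runs without a real PDF.

```python
def join_pages(pages, separator="\n\n"):
    """Join page strings into one string, skipping pages that have no text."""
    return separator.join(page.strip() for page in pages if page.strip())


# Stand-in for a pdftotext.PDF object: one string per page.
sample_pages = [
    "Chapter 1\nIt was a dark night.\n",
    "   ",  # a page with no extractable text
    "Chapter 2\nThe end.\n",
]

string_of_text = join_pages(sample_pages)
print(string_of_text)
```

With the real object you would call join_pages(pdf) after the with block. Dropping blank pages is a design choice made for this sketch.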

Convert the file into one string ✔️

Output .mp3 file 🔈

Now we are ready to use the gTTS (Google Text-to-Speech) library. All we need to do is pass it the string we made, store the result in a variable, then use the save() method to write the file to the computer.
Let's install it:

sudo pip install gtts

Then:

from tkinter import Tk
from tkinter.filedialog import askopenfilename
import pdftotext
from gtts import gTTS

Tk().withdraw() # we don't want a full GUI, so keep the root window from appearing
filelocation = askopenfilename() # open the dialog GUI

with open(filelocation, "rb") as f:  # open the file in reading (rb) mode and call it f
    pdf = pdftotext.PDF(f)  # store a text version of the pdf file f in pdf variable

string_of_text = ''
for text in pdf:
    string_of_text += text

final_file = gTTS(text=string_of_text, lang='en')  # store file in variable
final_file.save("Generated Speech.mp3")  # save file to computer

As simple as that! We are done 🎇
(edit: I am glad you all liked this article! The intention of all my writings is to be as simple as possible so readers of all levels can understand. If you wish to know more about customizing this API, please check this page: https://gtts.readthedocs.io/en/latest/)
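Two small refinements you might want, sketched here under assumptions rather than as part of the original script: naming the .mp3 after the chosen PDF instead of always writing "Generated Speech.mp3", and stopping early when the PDF produced no text (which can happen, for example, when the file contains only scanned images). The helper names mp3_path_for and save_speech are hypothetical. lang and slow are regular gTTS parameters, and gTTS needs an internet connection when save() runs.

```python
from pathlib import Path


def mp3_path_for(pdf_path):
    """Return the path of the chosen PDF with its extension swapped to .mp3."""
    return Path(pdf_path).with_suffix(".mp3")


def save_speech(text, mp3_path, lang="en", slow=False):
    """Convert text to speech with gTTS and write it to mp3_path."""
    if not text.strip():
        # Nothing to read out loud, so fail with a clear message.
        raise ValueError("No text was extracted from the PDF.")

    # Imported here so mp3_path_for can be used without gTTS installed.
    from gtts import gTTS

    gTTS(text=text, lang=lang, slow=slow).save(str(mp3_path))
```

In the script above, the last two lines could then become save_speech(string_of_text, mp3_path_for(filelocation)).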

Buy Me A Coffee

I am on a lifetime mission to support and contribute to the general knowledge of the web community as much as possible. Some of my writings might sound too silly, or too difficult, but no knowledge is ever useless. If you like my articles, feel free to help me keep writing by getting me coffee :)
