Easily create a Python app with Streamlit
Kev Castor
Posted on May 27, 2024
🇫🇷 version here.
For some time now, I have been learning Vietnamese. To do this, like many people these days, I signed up for the Duolingo app. This app is really good; it makes learning quick and fun.
But here’s the thing, like many apps these days, Duolingo offers a Freemium model, and its full potential is only unlocked with a subscription. One of the interesting features offered by the subscription is the ability to take a simple Vietnamese/English translation quiz with all the words learned since the beginning of the lessons. So I wondered, "How can I try to reproduce this exercise, which is quite simple, to practice without having to pay for a Duolingo subscription?"
In this article, I will try to show you how to recreate this exercise with the Streamlit framework and a bit of code.
First, the data
First, I need a dataset to carry out this project. Fortunately, on its website, Duolingo lists all the words I've learned since the beginning of my lessons, along with their English translations (I can only learn Vietnamese from English in the app).
A quick copy/paste of the word list into a text file (sorry Duolingo...), and here I am with a small dataset of 131 Vietnamese words with their corresponding English translations. All that's left is to reformat everything with a quick Python script, and I have a JSON file that contains the basic data for my future game.
{
"translations": [
{
"VN": "ga",
"EN": "station, gas"
},
{
"VN": "trạm",
"EN": "station, stations"
},
{
"VN": "đặt chỗ",
"EN": "make a reservation, reservation"
},
...
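The reformatting script itself is not shown in this article. As a rough idea of what it could look like, here is a minimal sketch. It assumes the pasted list was saved with one pair per line, the Vietnamese word and its English translations separated by a tab. That file layout and the function name build_dataset are assumptions made for illustration, not the script actually used.

```python
import json


def build_dataset(raw_text: str) -> dict:
    """Turn tab-separated 'VN<TAB>EN' lines into the JSON structure shown above."""
    translations = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from the copy/paste
        # Assumed layout: Vietnamese word, a tab, then the English translations
        vn, _, en = line.partition("\t")
        if not en:
            continue  # assumption: a line without a tab is malformed and ignored
        translations.append({"VN": vn.strip(), "EN": en.strip()})
    return {"translations": translations}


if __name__ == "__main__":
    sample = "ga\tstation, gas\ntrạm\tstation, stations\n"
    # ensure_ascii=False keeps the Vietnamese diacritics readable in the output
    print(json.dumps(build_dataset(sample), ensure_ascii=False, indent=2))
```

Writing the result with ensure_ascii=False is a small design choice: the file stays readable when you open it to check a word by hand.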
Streamlit
I had several options for creating my app. What I wanted was something relatively simple to implement, easily deployable so I could access it anywhere, and an opportunity to learn something while working on this project. So, what to do? Create a simple HTML/JS project, try to dive into a JS framework (React or Vue.js), or find another alternative?
The HTML/JS project? Yes, why not, but I'm not sure I would learn anything interesting.
Creating an app with a JS framework? Yes, that would be great, but I know these kinds of projects from having tried them several times before. They are great frameworks, but you still need to learn how to use them properly to avoid doing things incorrectly (I knew how to use React a few years ago, but it evolves so quickly that today I would have to start from scratch). You end up spending 2 weeks learning a framework to create an app with a few lines of code that takes 4 hours to implement.
Moreover, I'm no longer comfortable with JS...
So why not turn to something else? I'm a pythonista in my daily life, and I've heard about the Streamlit framework at work. It seems you can create cool apps with it. So, let's take a look 🕵️
Quick focus on the Framework
Go to the Streamlit website, and you'll see written in front of you: "A faster way to build and share data apps". That seems to be what we're looking for 😄. A few minutes browsing the documentation confirms it. It's simple to install, use, and deploy, and it's all in Python. Let's see if it fits the bill!
Installation
Installing the framework is quite simple and can be done in a few lines:
mkdir translation-exercise-app
cd translation-exercise-app
# make sure you have a version of python >= 3.8
python3 --version
python3 -m venv .venv
source .venv/bin/activate
pip install streamlit
And that's it. Simple, right? For the rest of the article, the only Streamlit command we'll use is streamlit run app.py.
Let's get started with the implementation
With your favorite editor, open the translation-exercise-app folder. Your .venv folder should already be there. Add the JSON file containing the initially generated dataset, then create a file named app.py.
# app.py
import streamlit as st
st.text("Hello World!")
Now, use the command streamlit run app.py, and a page in your browser should open displaying "Hello World!". First easy victory 💪 !
I won't explain each Streamlit component I'll use one by one. I prefer to provide you with links to the official documentation, which is much better than any explanations I could give. Instead, I'll show you how I use them to create my translation exercise app.
Some design first
What exactly do I want to do?
I thought that, for now, I would create a very simple app. Its purpose will be to display a word in Vietnamese taken from the base dataset, along with 4 English translations, one corresponding to the exact translation of the chosen Vietnamese word, and the other 3 being just random English words from the rest of the dataset. The player will have to find the correct English translation among the 4 options. We will display a score on the screen which will count the number of consecutive correct answers from the player. This score will reset to 0 in case of a mistake. We will also display the best score obtained.
Loading data into the app
For the app to function, we need to make the dataset available to it. So here we go:
import json
import streamlit as st

# We create a function that will read and load the data into a Python dictionary
def load_data() -> dict:
    data = {}
    with open("vn_en_words_translations.json") as fd:
        data = json.load(fd)
    return data

# We check if the data is in the Streamlit "session_state" (or cache)
# If it's not there, then we use the function to load the data
if "words_dict" not in st.session_state:
    st.session_state["words_dict"] = load_data()
With this piece of code, I can load my data and store it in the Streamlit session state (I'll explain later what the session state is). It's quite simple and allows me to access my dataset throughout my file via st.session_state["words_dict"], which is quite handy.
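Since the whole game relies on the shape of this file, it can help to check it once at load time. Here is a minimal sketch of that idea. The validate_dataset helper, the path parameter and the explicit UTF-8 encoding are assumptions added for illustration; they are not part of the app described in this article.

```python
import json


def validate_dataset(data: object) -> dict:
    """Return the dataset unchanged, or raise if it lacks the expected shape."""
    entries = data.get("translations") if isinstance(data, dict) else None
    if not isinstance(entries, list) or not entries:
        raise ValueError("expected a non-empty 'translations' list")
    for entry in entries:
        # Every entry needs both a Vietnamese word and its English translation
        if not isinstance(entry, dict) or not entry.keys() >= {"VN", "EN"}:
            raise ValueError(f"entry is missing 'VN' or 'EN': {entry!r}")
    return data


def load_data(path: str = "vn_en_words_translations.json") -> dict:
    # Explicit UTF-8 so the Vietnamese diacritics decode the same way everywhere
    with open(path, encoding="utf-8") as fd:
        return validate_dataset(json.load(fd))
```

With this variant, a malformed file fails on the first run with a readable message instead of a KeyError in the middle of a quiz.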
Creating the Quiz Dataset
We are tackling the most "difficult" part (in reality, it's very simple, don't panic) in terms of logic for our application. The idea here is to create a function that will randomly select a Vietnamese word and 4 English words, one of which will be the translation of our Vietnamese word. I decided to use a simple data structure for this: a dictionary with two keys. One of them is associated with a list that will contain the Vietnamese word (this list will always be of size 1). The other is associated with a list that will contain the 4 English translations, always placing the correct translation at the beginning of the list (index 0).
import random
def select_quiz_words(words_dict: dict) -> dict:
    # Create the dictionary
    selected_words = {
        "VN": [],
        "EN": []
    }
    # Randomly determine an index to use for selecting a Vietnamese word
    # and its English equivalent from our dataset to place at the start of the list.
    selected_word_index = random.randrange(0, len(words_dict["translations"]))
    selected_words["VN"].append(words_dict["translations"][selected_word_index]["VN"])
    selected_words["EN"].append(words_dict["translations"][selected_word_index]["EN"])
    # Thus, selected_words["EN"][0] will always be the correct translation of selected_words["VN"][0]
Next, we continue our function to randomly select the other three English words, ensuring these words meet the following two conditions:
- They must not match selected_words["EN"][0], otherwise the correct answer will appear twice among our 4 options.
- There must be no duplicates among our 4 options.
import random

def select_quiz_words(words_dict: dict) -> dict:
    selected_words = {
        "VN": [],
        "EN": []
    }
    selected_word_index = random.randrange(0, len(words_dict["translations"]))
    selected_words["VN"].append(words_dict["translations"][selected_word_index]["VN"])
    selected_words["EN"].append(words_dict["translations"][selected_word_index]["EN"])
    find_other_words = True
    # Loop until 3 other words have been chosen
    while find_other_words:
        # Randomly determine an index to use for selecting an English word from our dataset
        selected_en_word_index = random.randrange(0, len(words_dict["translations"]))
        # Of course, this word must not be the one chosen above
        if selected_en_word_index != selected_word_index:
            # Nor should it already be in our list
            if words_dict["translations"][selected_en_word_index]["EN"] not in selected_words["EN"]:
                # If both conditions are met, then add it to our list
                selected_words["EN"].append(words_dict["translations"][selected_en_word_index]["EN"])
        if len(selected_words["EN"]) == 4:
            find_other_words = False
    return selected_words
We now have the data for our quiz.
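The same selection could also be written without a loop, using random.choice and random.sample from the standard library. This is only a sketch of an alternative, not the code used in the app. The options parameter and the ValueError are additions made for illustration; the error covers the case where the dataset holds fewer than 4 distinct English translations, in which the loop above would never finish.

```python
import random


def select_quiz_words(words_dict: dict, options: int = 4) -> dict:
    """Pick one Vietnamese word plus `options` distinct English choices.

    As in the version above, index 0 of "EN" holds the correct translation.
    """
    entries = words_dict["translations"]
    correct = random.choice(entries)

    # Distinct English texts, minus the correct one, as candidate wrong answers
    distractors = sorted({entry["EN"] for entry in entries} - {correct["EN"]})
    if len(distractors) < options - 1:
        raise ValueError("not enough distinct English translations in the dataset")

    return {
        "VN": [correct["VN"]],
        # random.sample never returns the same element twice
        "EN": [correct["EN"]] + random.sample(distractors, options - 1),
    }
```

Building the set of wrong answers first makes both conditions (no copy of the correct answer, no duplicates) hold by construction instead of being checked on every draw.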
The Content of the App
Alright, all of this is nice, but at the moment, our app still looks like a blank page that says hello. It's time to add some content! Let's start by adding two or three sentences that will explain to the player what they are doing here.
selected_words_dict = select_quiz_words(words_dict=st.session_state["words_dict"])
st.title("Hello Learners :wave:!")
st.subheader("Let's make a small game. I give you a Vietnamese word, and you try to give me the correct English translation. Let's go?")
st.write(f"What is the English translation of the word **{selected_words_dict['VN'][0]}**?")
It's a start. First, we load our quiz data into a global variable called selected_words_dict using the function we wrote earlier. Then, we quickly explain the rules by displaying text very simply via st.title() and st.subheader(), and we present the Vietnamese word to be translated via st.write(). You can find all the details of the Streamlit functions that allow you to display text on the screen here.
Next, let's propose the 4 English translation options to the user. We need to find a way to allow the player to interact with our app by making a choice. I've decided to use the Streamlit component st.button(). I will display 4 buttons side by side horizontally, each displaying one of the 4 possible answers. The user will then have the choice to click on one of them to select the correct translation. Here is the code to implement these buttons:
# We 'copy' our list of English options and then shuffle it like a good cocktail
selected_en_words_dict_shuffled = selected_words_dict["EN"].copy()
random.shuffle(selected_en_words_dict_shuffled)
# We create a Streamlit layout composed of 4 columns.
col1, col2, col3, col4 = st.columns(4)
# And for each of them, we insert a clickable button containing the option
with col1:
    word = selected_en_words_dict_shuffled[0]
    st.button(label=word, key=word, on_click=check_result, args=[word], use_container_width=True)
with col2:
    word = selected_en_words_dict_shuffled[1]
    st.button(label=word, key=word, on_click=check_result, args=[word], use_container_width=True)
with col3:
    word = selected_en_words_dict_shuffled[2]
    st.button(label=word, key=word, on_click=check_result, args=[word], use_container_width=True)
with col4:
    word = selected_en_words_dict_shuffled[3]
    st.button(label=word, key=word, on_click=check_result, args=[word], use_container_width=True)
I decided to shuffle my array of answers because, remember, otherwise, we would end up with the correct answer always in the first position, which removes quite a bit of suspense from the exercise...
Next, we use the layout system of Streamlit to create 4 columns. For each of them, we add a clickable button representing one of the answers. Note that:
- label is the text of the button.
- key is a key used by Streamlit to have unique buttons.
- on_click is a callback method that we will discuss shortly.
- args is an array of parameters that will be given to the callback method. We pass the button's option to validate or not the click.
- use_container_width is a simple boolean allowing the button to know that it can take the entire width of the container it is in, here the column (and I didn't need to spend 5 hours on CSS to achieve this, which is a miracle 🙏).
This part of the code can probably be refactored, but I haven't found a correct way to do it yet. It might be the subject of a future article.
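One possible direction for that refactoring is a loop over the columns, since each column object returned by st.columns() can create widgets directly (for example column.button(...)). The sketch below is only an idea under that assumption, not the final version of the app. The render_option_buttons helper is hypothetical, and it receives the columns as a parameter so it can be tried outside of Streamlit.

```python
from typing import Callable, Sequence


def render_option_buttons(columns: Sequence, words: Sequence[str], on_click: Callable) -> None:
    """Place one answer button per column instead of repeating four `with` blocks."""
    for column, word in zip(columns, words):
        # Same arguments as the four explicit st.button() calls above
        column.button(
            label=word,
            key=word,
            on_click=on_click,
            args=[word],
            use_container_width=True,
        )
```

In app.py, the four with blocks would then become a single call: render_option_buttons(st.columns(4), selected_en_words_dict_shuffled, check_result).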
Validate the result
Our application is starting to have content. Now we need to validate the quiz result. As mentioned earlier, we will allow the player to see a score. We will also enable them to see their best score. So let's start by defining these two variables:
# ...
if "words_dict" not in st.session_state:
    st.session_state["words_dict"] = load_data()
if "score" not in st.session_state:
    st.session_state["score"] = 0
if "best_score" not in st.session_state:
    st.session_state["best_score"] = 0
# ...
Next, let's focus on the check_result callback method that was passed to our buttons earlier:
def check_result(choice):
    if choice == selected_words_dict["EN"][0]:
        st.session_state["score"] += 1
    else:
        if st.session_state["score"] > st.session_state["best_score"]:
            st.session_state["best_score"] = st.session_state["score"]
        st.session_state["score"] = 0
This method is very simple. If the user's choice matches the first element of our list of English options, then it's a jackpot, and we increment their score. Otherwise, we update the best score if it is strictly less than the current score, and then we reset the current score to 0.
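To try this scoring rule without launching the app, the same logic can be sketched as a plain function working on a dictionary. The update_scores name and its state parameter are assumptions made for this example. Since st.session_state supports dictionary-style access, it could presumably be passed as state, but here a regular dict is used.

```python
def update_scores(state: dict, choice: str, correct: str) -> bool:
    """Apply the scoring rule described above and report whether the answer was right."""
    if choice == correct:
        state["score"] += 1
        return True
    # On a miss, keep the finished streak if it beats the previous best, then reset
    state["best_score"] = max(state["best_score"], state["score"])
    state["score"] = 0
    return False


if __name__ == "__main__":
    state = {"score": 0, "best_score": 0}
    for answer in ["station", "station", "gas"]:
        update_scores(state, answer, "station")
    print(state)  # {'score': 0, 'best_score': 2}
```

Note that, as in check_result, the best score only changes when a streak ends, so a streak in progress is not yet reflected in it.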
We just need to display our scores:
st.subheader(body=f"Your current score is: {st.session_state['score']}")
st.subheader(body=f"Your best score is: {st.session_state['best_score']}")
And there you have it!
It's Time to Test
By using streamlit run app.py, we can quickly launch and test our application.
So yes, it's not very pretty, but it's still presentable, and that without a single line of CSS. We end up with a little game that allows us to practice our Vietnamese. The goal we set seems to be achieved! Yay 🎉!
What's Happening Under the Hood?
At this point, it's important to emphasize a Streamlit mechanism that you need to keep in mind if you want to use this framework. Streamlit reruns the little Python script that we just implemented at EVERY interaction we have with the app. In our case, this means that every time the user clicks a button, the entire script is rerun. This explains several things...
It explains why some of my "variables," like the initial dataset, are loaded in this way:
if "words_dict" not in st.session_state:
st.session_state["words_dict"] = load_data()
Here, I leverage Streamlit's session state, which acts as a per-session "cache", to avoid having this data reloaded at every interaction. This is also what allows me to retain my score despite successive reloads. Indeed, at each rerun of the script, the code checks whether the st.session_state dictionary already contains an element associated with "words_dict" (which it does after the first run) and therefore does not reload it.
This also explains why most variables and Streamlit statements are executed at the global scope of the script. For example, the proposals used for each question are loaded in the global scope:
selected_words_dict = select_quiz_words(words_dict=st.session_state["words_dict"])
I do it this way because I need this dictionary to be recreated each time to change the proposals and make the game, let's say... interesting! The buttons are thus also recreated each time with new values, which is our goal.
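To make this rerun mechanism more concrete, here is a small simulation in plain Python. It is a simplified model, not how Streamlit is implemented: a regular dict stands in for st.session_state, run_script() stands for one top-to-bottom execution of app.py, and the load_calls counter only exists to show how often the loader really runs. All of these names are assumptions made for the example.

```python
load_calls = 0


def load_data() -> dict:
    """Stand-in for the real loader; counts how often it is actually called."""
    global load_calls
    load_calls += 1
    return {"translations": [{"VN": "ga", "EN": "station, gas"}]}


def run_script(session_state: dict) -> None:
    """One top-to-bottom execution of the script, as triggered by each interaction."""
    if "words_dict" not in session_state:
        session_state["words_dict"] = load_data()
    if "score" not in session_state:
        session_state["score"] = 0
    # Anything assigned here without session_state would start over on every rerun
    session_state["score"] += 1  # pretend the player answered correctly


session_state: dict = {}  # plain dict standing in for st.session_state
for _ in range(3):  # three simulated interactions
    run_script(session_state)

print(load_calls, session_state["score"])  # 1 3
```

The data is loaded once while the score keeps growing across the three runs, which is the behavior the if "..." not in st.session_state checks are there to provide.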
In Conclusion
Thanks to Streamlit, I managed to implement this little app in less than 100 lines of code and in just a few hours. So yes, I didn't start from scratch with Python, and the app is far from perfect, both technically and visually, but I'm quite happy with the result. I am very pleasantly surprised by the ease of use of the framework and the result.
The fact that the script is fully reloaded each time makes me think that the framework must have limitations for more complex apps. Moreover, the available components, although numerous, do not cover all use cases of more sophisticated apps. But for small prototypes and simple apps like the one presented in this article, it works just fine.
Regarding my app, I would like to make a few improvements in the future:
- Add a green message in case of a correct answer or a red one in case of an error, explaining the correct answer.
- Allow the user to choose the direction of the exercise: from Vietnamese to English or from English to Vietnamese.
- Allow the user to choose between a word translation exercise or a sentence translation exercise.
- Deploy my app online.
- Refactor some of the code.
- Add other features that I hope will come to mind later 🤔. This will probably be the subject of future articles.
That being said, I will continue to explore this framework that addresses one of my issues: being able to create simple apps in Python without having to spend several days learning to use a complex framework.
See you soon for new adventures in the simple and efficient world of Streamlit. Feel free to check the Gitlab repository to find the code for the article.