How I used Google Cloud Platform to start investing in stocks
Muhammad Syuqri
Posted on October 6, 2018
I got interested in investing after attending a short talk recommended by a friend of mine. I decided to do some research and started reading The Little Book That Still Beats The Market by Joel Greenblatt. From the book, I found some formulas that could be useful to me when making decisions on whether or not to invest in the stocks of companies in Singapore. This post isn't to promote the book or its investment strategies, but more to showcase the following and how I did it:
- Interacting with Firestore through Python
- Running a Python script at specific time intervals on the Compute Engine
- Using Cloud Functions to retrieve data from Firestore
At first, I created a Python script for populating a Google Sheet with the financial details and self-calculated ratios of companies listed on the Singapore Exchange website. I found this a hassle as I had to run the Python script every day to get the updated stock prices. I then decided to move this daily process to the Google Cloud Platform so that I no longer have to do it myself, and can leave it to the cloud to do it for me :D
The following explains how I did it, in hopes of helping anyone else out there who might want to use the Google Cloud Platform in a similar fashion.
Prerequisites
Before proceeding any further, I would like to note that, to keep this post short and simple, I will assume the following have already been done. I have included links to get you started as well.
- Creating a Google Cloud Platform project
- Retrieving service account key
- Creating a Compute Engine VM Instance
- Setting Up Firebase Cloud Functions
Overview
From the diagram above, the only thing I have to do is make a GET request through the Cloud Functions HTTP API, which returns all the pre-calculated ratios and values stored in Firestore. Essentially, steps 1, 2 and 3 involve the Python script I created; steps 1 and 2 are done simply by using the Requests library, as sketched below.
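As a rough illustration of the kind of request involved in steps 1 and 2, here is a minimal sketch using the Requests library. The URL, parameters and response shape are placeholders of my own, not the actual data source I used; substitute whichever source you pull prices from.
import requests

# Hypothetical endpoint, purely for illustration
PRICE_URL = "https://example.com/api/stocks"

def fetch_stock_data(symbol):
    # Request the latest details for one company (assumes a JSON response)
    response = requests.get(PRICE_URL, params={"symbol": symbol}, timeout=10)
    response.raise_for_status()  # fail loudly if the request did not succeed
    return response.json()

if __name__ == "__main__":
    print(fetch_stock_data("D05"))  # D05 is just an example ticker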
Interacting with Firestore through Python
Firestore uses the concept of collections, documents and fields to store your data. For instance, using the analogy of a library, a shelf of books is a collection from Firestore's viewpoint. The books themselves are documents, and each page in a book is a field of its own. Each document can also contain its own subcollections, but I will not get into that here.
shelf [collection]
|--book1 [document]
|-- page1 [field]
|-- page2 [field]
|--book2 [document]
|-- page1 [field]
To interact with and update data on Cloud Firestore from your Python script, you first have to install the Google Cloud Firestore library via pip install google-cloud-firestore. The following code snippet initializes Firestore with the service account key you retrieved earlier.
from google.cloud import firestore
db = firestore.Client.from_service_account_json('/path/to/service/key')
Well that is it actually! To write data to Firestore, simply do the following:
doc_ref = db.collection(u'name_of_collection').document(u'name_of_document')
doc_ref.set(data_to_update)
data_to_update is a Python dictionary holding the keys and respective values you want the Firestore document to hold. The .set() method allows you to update or insert new fields into the document. For myself, I was putting the company name, stock prices, financial ratios and other fields here.
A point to note here is that even if the document or collection does not exist yet, the .set() function automatically creates the collection and document for you and populates the document with the fields mentioned above.
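To make this concrete, here is a small sketch of what such a write could look like. The collection, document and field names below are placeholders of my own, not a required schema.
from google.cloud import firestore

db = firestore.Client.from_service_account_json('/path/to/service/key')

# Illustrative fields only -- store whatever prices and ratios you calculate
data_to_update = {
    u'company_name': u'Example Company Ltd',
    u'last_price': 1.23,
    u'earnings_yield': 0.08,
    u'return_on_capital': 0.15,
}

# Creates the collection and document if they do not exist yet
db.collection(u'companies').document(u'EXAMPLE').set(data_to_update)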
Running a Python script on Compute Engine
There are a few ways of pushing your Python script to your VM Instance. How I did it was to create a repository in my Google Cloud project and push my code there. I created the repository because I still wanted some form of version control; knowing myself, I like to make changes and explore different ways of doing things in my code, and I end up confusing myself. Even though this is a small project, I felt it was good practice for me personally. I then remotely accessed the VM Instance via SSH and cloned the repository into the instance.
Now for the scheduling of the Python script. Initially, I thought calling the Python script every 30 minutes was a good idea. However, after some consideration, I felt scheduling the script to run at 6pm (GMT +0800) was the ideal case because the Singapore Exchange opens at 9am and closes at 5pm, and I really only have time to view the stock prices after work anyway.
To schedule your Python script to run either at certain time intervals or at specific timings, you can use cron jobs as I did. In an SSH session on your VM Instance, edit your user's crontab using the crontab -e command. At the end of the file, add your schedules in the following format:
# m h dom mon dow command
0 10 * * 1-5 cd /path/to/python/folder && python main.py
The above snippet runs the Python script at 10am UTC (i.e. 6pm SGT) every weekday, as indicated by the 1-5 segment. If you would like your script to run at a regular time interval instead, you can do the following:
# Runs the command every hour at the 0th minute
0 */1 * * * <some command>
# Runs the command at midnight every day
0 0 * * * <some command>
Note: A mistake that I made during my first few times using Crontab in the VM Instance is the following:
# Runs the command every minute of every hour
* */1 * * * <some command>
My intention was to run it every hour, but I missed the 0 at the minute mark of the cron job, so it was running the script EVERY MINUTE of every hour. My script was taking around 3 minutes to run each time it was called. I did not mind the relatively long run time. However, since the script was being run every minute, and each run takes 3 minutes to complete... well, you can do the math. And silly me was trying to figure out why the CPU usage on my VM Instance was constantly at 150-200% and why I could not even access it via SSH. That was a funny lesson :P
Using Cloud Functions to retrieve data from Firestore
For this step, I linked the Google Cloud project to Firebase. I did this with possible future versions in mind, in which I could host a website on Firebase Hosting that taps on the data in Cloud Firestore, allowing anyone to view the financial details at a glance. Another reason is that I am much more familiar with Firebase and the requirements for Cloud Functions there.
I installed Express.js into my Cloud Functions folder via npm install --save express. Express.js lets me easily create web APIs, and I needed multiple end-points for retrieving various pieces of company information from my Firestore.
// Standard Cloud Functions for Firebase setup
const functions = require("firebase-functions");
const admin = require("firebase-admin");
admin.initializeApp();

var db = admin.firestore();
const express = require("express");
const app = express();

app.get('/:nameOfDocument', (req, res) => {
    const nameOfDocument = req.params.nameOfDocument;
    var firestoreRef = db.collection("name_of_collection").doc(nameOfDocument);
    res.setHeader('Content-Type', 'application/json');
    firestoreRef.get().then((snapshot) => {
        if (snapshot.exists) {
            var returnObj = snapshot.data();
            return res.status(200).json(returnObj);
        }
        else {
            return res.status(422).json({error: "Invalid document name"});
        }
    }).catch(errorObject => {
        return res.status(500).json({error: "Internal Server Error"});
    });
});

exports.api = functions.https.onRequest(app);
Here is a step-by-step explanation of what is happening in the snippet above. Firstly, access to Firestore is initialized by var db = admin.firestore();
app.get('/:nameOfDocument', (req, res) => {
...
})
The above tells Express that we want to handle GET requests at the '/:nameOfDocument' end-point, where :nameOfDocument is a parameter in the URL. req and res are the incoming request and outgoing response objects respectively; req is used to read the URL parameter, and res is used to send the data back.
const nameOfDocument = req.params.nameOfDocument;
This line takes the parameter from the URL, that is :nameOfDocument in this case, and stores it in a variable called nameOfDocument, which will be used in the next line.
var firestoreRef = db.collection("name_of_collection").doc(nameOfDocument);
This line essentially creates a reference to the document nameOfDocument. The collection name is currently hard-coded rather than a variable, but you can also include the name of the collection as a parameter, like so:
app.get('/:nameOfCollection/:nameOfDocument', (req, res) => {
    const nameOfDocument = req.params.nameOfDocument;
    const nameOfCollection = req.params.nameOfCollection;
    var firestoreRef = db.collection(nameOfCollection).doc(nameOfDocument);
    ...
})
This way, you can specify it in the URL without having to alter the code.
firestoreRef.get().then((snapshot) => {
    if (snapshot.exists) {
        var returnObj = snapshot.data();
        return res.status(200).json(returnObj);
    }
    ...
})
The above segment fetches the document at the reference mentioned earlier and checks whether it exists. This is essential, as a user might accidentally type a wrong document or collection name, and we want to return the appropriate response. snapshot.data() retrieves all the field key-value pairs and puts them in the object called returnObj. We then return this as a JSON object with a status code of 200.
exports.api = functions.https.onRequest(app);
This line tells Cloud Functions that any request made to <cloudfunctions.net url>/api should be passed to the Express object called app and handled according to the end-points specified in the app object itself.
And that is it! You can now call your Cloud Function using the link provided on the Firebase Cloud Functions page to retrieve the relevant data you want to work with from your Firestore.
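For example, here is roughly how you could test an end-point from Python. The base URL below is a placeholder; copy the actual URL shown on your Firebase Cloud Functions page.
import requests

# Placeholder -- replace with the URL from your Firebase console
BASE_URL = "https://<region>-<project-id>.cloudfunctions.net/api"

response = requests.get(BASE_URL + "/name_of_document", timeout=10)
if response.ok:
    print(response.json())  # the fields stored in the Firestore document
else:
    print(response.status_code, response.json().get("error"))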
P.S. This is my first tutorial/personal experience post. Kindly do let me know what can be improved and how I can become a better programmer as well. All constructive feedback is welcome. Thank you for reading through my post! :D