Jorge
Posted on November 24, 2019
This is a simple 'price scraping' application that checks a list of product URLs and notifies you via Telegram when any of them goes above or below a given price
This post is an example of how we can mix different technologies, without spending any money, to get a daily alert when some prices change. Be aware this is not a "professional" way to do it; consider it a "toy" or a "lab" to learn from
Requirements
In order to have a fully functional example you’ll need:
a Telegram bot
a Google Sheet, plus
credentials from a Google project
a Gitlab account (you can use Github or similar, but this example uses a Gitlab pipeline)
Telegram
We’ll use Telegram as the channel that notifies us about price changes. You’ll need two things:
Telegram installed on your mobile phone (you can also access it via a web browser)
a Telegram bot
The first step is easy, and similar to installing any other messenger application.
To create our bot we’ll use the Telegram application to talk with 'BotFather', a Telegram bot that is able to create bots (https://core.telegram.org/bots)
Basically we’ll ask it to create a new bot by writing "/newbot" and following its instructions (a name, a username and so on) to obtain a token similar to 12312312:AAAAAAAAAAAAAAAAAAAaaY . DON’T SHARE THIS TOKEN AND DON’T STORE IT IN YOUR REPO
To allow our bot to talk to you, you need to start the conversation, so find your bot with Telegram’s search button and say hello to it with the '/start' command
We’ll also need to know your Telegram client id. You can use the existing bot '@userinfobot', which replies to every message with information about your account, to obtain your client id. YOU CAN SHARE THIS ID, IT’S NOT SO IMPORTANT, BUT AS WITH THE TOKEN WE’LL KEEP IT SECRET
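To check that the token and the client id work together before wiring anything else, a minimal Python sketch like the following may help. It is not the code of the application described in this post; it only uses the public Bot API 'sendMessage' method, and the function names are hypothetical.

```python
import json
import urllib.request

TELEGRAM_API = "https://api.telegram.org"


def build_send_message_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Bot API sendMessage method."""
    url = f"{TELEGRAM_API}/bot{token}/sendMessage"
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_message(token: str, chat_id: str, text: str) -> dict:
    """Send the message and return Telegram's decoded JSON answer."""
    request = build_send_message_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling send_message with your real token and client id should deliver a message to your chat, provided you have already sent '/start' to the bot. Read both values from the environment rather than typing them into a file that ends up in the repo.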
Google Service
This is probably the most obscure part of the process. If you have a Google account, you can create projects and deploy to Google App Engine, Kubernetes and many other Google services.
So open https://console.cloud.google.com/ and follow the instructions to create your first project (for this tutorial you don’t need to deploy anything, only create the project)
Once the project is created we’ll need to enable the Google Sheets API:
search for Sheets and enable it
We’ll also need to create a service account and generate a credentials file from it:
create SERVICE credentials
After you create the service account, Google will automatically download a JSON file. KEEP IT SECRET AND DON’T STORE IT IN YOUR REPO
We’ll need the email of the service account (something similar to your-awesome-service@your-awesome-project.iam.gserviceaccount.com) in the next step
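If you don’t remember the email of the service account, it is stored in the downloaded JSON file under the 'client_email' key. A small Python sketch to read it (the function name is hypothetical, and the path is whatever location you saved the file to):

```python
import json
from pathlib import Path


def service_account_email(credentials_path: str) -> str:
    """Return the client_email stored in a service account key file."""
    data = json.loads(Path(credentials_path).read_text(encoding="utf-8"))
    # Service account key files declare their type explicitly.
    if data.get("type") != "service_account":
        raise ValueError("This does not look like a service account key file")
    return data["client_email"]
```

The function only reads the file locally; nothing is sent to Google.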
Google Sheet
The application will read a Google Sheet with a simple structure like this:
When you are editing the sheet you can find its ID in the URL:
You’ll need this ID, and IT’S BETTER NOT TO KEEP IT IN YOUR REPO
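The ID is the long token that follows '/spreadsheets/d/' in the address bar. As a small illustration, this Python sketch extracts it from a copied URL; the function name and the URL used in the usage note are made up for the example.

```python
import re
from typing import Optional

# The sheet ID is the path segment right after /spreadsheets/d/
_SHEET_ID = re.compile(r"/spreadsheets/d/([a-zA-Z0-9_-]+)")


def sheet_id_from_url(url: str) -> Optional[str]:
    """Return the spreadsheet ID contained in a Google Sheets URL, or None."""
    match = _SHEET_ID.search(url)
    return match.group(1) if match else None
```

For example, sheet_id_from_url("https://docs.google.com/spreadsheets/d/abc123_XYZ/edit#gid=0") returns "abc123_XYZ".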
For the application to be able to read this sheet we need to share it with the service account email created previously, so click on 'Share' and add your service account as a collaborator (don’t send the notification email, because nobody will be listening and you will receive a delivery error email)
Application
You can download the application from https://gitlab.com/jorge-aguilera/scrapping-prices
Basically it is a "single class application" that:
reads the range "A1:E99" of a Google Sheet (yes, this example only works with a maximum of 99 articles),
opens a Geb Browser per row,
uses a custom CSS selector to find the price element,
if the value of the item is lower or higher than the associated rule, adds the item to a list,
at the end, sends an HTTP POST to the channel with the summary
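The comparison step can be sketched in a few lines of Python. This is not the code of the application (which uses Geb); the Rule structure, the "<" / ">" encoding of the rule and the price formats handled are all assumptions made for illustration, since the real sheet layout may differ.

```python
import re
from typing import Iterable, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    name: str         # label of the product (hypothetical column)
    operator: str     # "<" or ">" (hypothetical encoding of the rule)
    threshold: float  # price that triggers the alert


def parse_price(text: str) -> Optional[float]:
    """Extract the first number from text such as '19,99 €' or '$1,299.00'."""
    match = re.search(r"\d[\d.,]*", text)
    if not match:
        return None
    number = match.group(0).rstrip(".,")
    last_dot, last_comma = number.rfind("."), number.rfind(",")
    if last_dot != -1 and last_comma != -1:
        # Both separators present: the right-most one is the decimal mark.
        if last_comma > last_dot:
            number = number.replace(".", "").replace(",", ".")
        else:
            number = number.replace(",", "")
    elif last_comma != -1:
        # Only commas: assume a decimal comma, as in '19,99'.
        number = number.replace(",", ".")
    try:
        return float(number)
    except ValueError:
        return None


def is_triggered(rule: Rule, price: float) -> bool:
    """True when the price is below ('<') or above ('>') the threshold."""
    if rule.operator == "<":
        return price < rule.threshold
    if rule.operator == ">":
        return price > rule.threshold
    raise ValueError(f"Unknown operator: {rule.operator!r}")


def build_summary(checks: Iterable[Tuple[Rule, str]]) -> str:
    """Join one line per triggered rule; the input pairs a rule with scraped text."""
    lines = []
    for rule, text in checks:
        price = parse_price(text)
        if price is not None and is_triggered(rule, price):
            lines.append(f"{rule.name}: {price:.2f} ({rule.operator} {rule.threshold:.2f})")
    return "\n".join(lines)
```

Note that a value with only a comma, such as '1,299', is read as 1.299, so a shop that uses the comma as a thousands separator would need a different parsing rule.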
Gitlab
The repo is hosted on Gitlab and uses its pipeline capability to execute the run task every day
Basically we need to configure some environment variables:
SHEET_ID (the id of the sheet)
TABS ("Sheet 1" or whatever you use)
TELEGRAM_TOKEN (the token obtained via BotFather)
TELEGRAM_CHANNEL (the Telegram user id)
GOOGLE_APPLICATION_CREDENTIALS (use the File type instead of Variable and paste the content of credentials.json; Gitlab then stores the content in a temporary file and puts its path in the variable)
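A job that depends on these variables fails in confusing ways when one of them is missing, so it can be worth checking them up front. The following Python sketch shows the idea; it is not part of the application, the function names are hypothetical, and treating TABS as a comma-separated list is an assumption.

```python
import os
from typing import Dict, List, Mapping

REQUIRED = (
    "SHEET_ID",
    "TABS",
    "TELEGRAM_TOKEN",
    "TELEGRAM_CHANNEL",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required variables, or fail listing every missing one."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}


def sheet_ranges(tabs: str, cells: str = "A1:E99") -> List[str]:
    """Build one A1-notation range per tab (assumes comma-separated tab names)."""
    # Sheet names containing spaces must be wrapped in single quotes.
    return [f"'{tab.strip()}'!{cells}" for tab in tabs.split(",") if tab.strip()]
```

With TABS set to "Sheet 1", sheet_ranges returns ["'Sheet 1'!A1:E99"], which is the range described in the Application section.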
And set the schedule we want to use, for example:
Conclusion
To recap:
you have a Telegram bot with a TOKEN
you have your Telegram id and you’ve started a conversation with your bot
you have a Google Sheet, you have the sheet ID and you’ve added the service account as a collaborator
you have a credentials file in a safe place
you have a repo in Gitlab with several environment variables configured
Basically we have someone (the Gitlab pipeline) running our application every day, reading a Google Sheet (via a service account) and sending us a message (via our Telegram bot)