Deploying Data Visualizer for Nepal Stock Exchange using Streamlit on Heroku

qviper

Viper

Posted on November 25, 2020

Deploying Data Visualizer for Nepal Stock Exchange using Streamlit on Heroku

Introduction

Visualise Nepal Stock Exchange Data and Deploy it On Heroku Using Streamlit and Plotly.

Originally Published on q-viper.github.io

If you are from Nepal then you already know what Nepal Stock Exchange means. Honestly I don't know how to describe stock market but after watching movie Pi, I wanted to take a look at NEPSE data by myself. I am not sure about how old data are present on the NEPSE but I think there are data available from 1980. I started to scrap data on 2019 January and it was very difficult task for me to write the right codes. But there are great and kind programmers who shares their great skills on GitHub. So I came across this repository. I started to scrap all the data of past transaction and it was quite frustrating after server hanged up so I found the best way of scrapping is by scrapping data by each year and then write it on CSV file. Please follow this mine repository for more scrapping code but note that We don't use any of those data so just look that repository for more information only (if available). And rest is on this blog.

Here is a demo Heroku app that we will be building.

Demo

Where was I lost?

From nearly a month, I had not written any blogs because we had our huge festival in Nepal, Dashain and Tihar. Also I was playing lots of games like Injustice Gods Among Us. And I have also been busy watching series like Silicon Valley and Mr. Robot. And good news about me must be I have joined NAAMI as Junior Unity Software Developer from last week.

Inspiration

2020 is not a great year for anyone because of the pandemic and students like me are suffering from hard anxiety of their career and I am one of them. But I have met some awesome people on the web whose work always keeps me inspiring to write more. And I have noticed that Jithendra's blogs and LinkedIn posts are very insightful for learning Data Science.

Contribution

If you are looking right into this blog please try to contribute to Agriculture field by finding a model to classify and detect corn leaf infection. I have worked hard to prepare dataset and made it live on kaggle. Also if you are interested in my more awesome work then Contour Based Writing System must be your first choice.

Getting Systems Ready

Of course we need python along with few other libraries. If you are wondering what should be the requirements then please follow the repository and download all the codes then TADAAAAA!!!! But to describe it on steps, please follow below steps:

  • Make a virtual environment using venv
  • Install below libraries best way to download is using pip install -r requirements.txt(Assuming that you have downloaded my repository above):

    beautifulsoup4==4.8.0
    numpy==1.17.3
    pandas==0.25.1
    plotly==4.12.0
    requests==2.18.4
    requests-oauthlib==1.2.0
    requests-toolbelt==0.9.1
    retrying==1.3.3
    streamlit==0.71.0
    urllib3==1.22
    uvicorn==0.12.2
    matplotlib==3.3.3
    html5lib==1.0.1
    lxml==4.4.1
    

Importing Dependencies

Make a python file nepse.py under your project folder.

import streamlit as st
import numpy as np
import pandas as pd
from bs4 import BeautifulSoup as BS
import requests
import urllib3
import time
import matplotlib.pyplot as plt
import plotly.express as px

Enter fullscreen mode Exit fullscreen mode
  • streamlit is used to make a web app and honestly this is the most easiest, fastest and awesome way of deploying ML APPS.
  • numpy as usual.
  • pandas for data processing (mostly masking, making dataframes).
  • BeautifulSoup, we can scrap the data from NEPSE's site.
  • requests for making HTTP requests.
  • urllib3 for handling HTTP client.
  • time as usual.
  • matplotlib for making simple visualization.
  • plotly for making cool visualization.

Method: company_names()

import streamlit as st
import numpy as np
import pandas as pd
from bs4 import BeautifulSoup as BS
import requests
import urllib3
import time
import matplotlib.pyplot as plt
import plotly.express as px

def company_names():
    http = urllib3.PoolManager()
    http.addheaders = [('User-agent', 'Mozilla/61.0')]
    web_page = http.request('GET', "http://www.nepalstock.com.np/company?_limit=500")
    soup = BS(web_page.data, 'html5lib')
    table = soup.find('table')
    company=[]
    rows = [row.findAll('td') for row in table.findAll('tr')[1:-2]]
    col = 0
    notfirstrun = False
    for row in rows:
        companydata =[]
        for data in row:
            if col == 5 and notfirstrun:
                companydata.append(data.a.get('href').split('/')[-1])
            else:
                companydata.append(data.text.strip())
            col += 1
        company.append(companydata)
        col =0
        notfirstrun = True

    df = pd.DataFrame(company[1:],columns=company[0])
    df.rename(columns={'Operations':'Symbol No'},inplace=True)
    df.index.name = "SN"
    df.drop(columns='',inplace=True)
    df.drop(columns='S.N.',inplace=True)
    return df
company_names()
Enter fullscreen mode Exit fullscreen mode
.dataframe tbody tr th:only-of-type {
    vertical-align: middle;
}

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
Enter fullscreen mode Exit fullscreen mode
Stock Name Stock Symbol Sector Symbol No
SN
0 10 % NMB DEBENTURE 2085 NMBD2085 Corporate Debenture 2850
1 10% Himalayan Bank Debenture 2083 HBLD83 Corporate Debenture 2873
2 10% Nabil Debenture 2082 NBLD82 Corporate Debenture 2892
3 10% Nepal SBI Bank Debenture 2086 SBIBD86 Corporate Debenture 2890
4 10% Prabhu Bank Debenture 2084 PBLD84 Corporate Debenture 2904
... ... ... ... ...
268 UNIVERSAL POWER COMPANY LTD UPCL Hydro Power 2810
269 Unnati Sahakarya Laghubitta Bittiya Sanstha Li... USLB Microfinance 2774
270 Upper Tamakoshi Hydropower Ltd UPPER Hydro Power 2792
271 Vijaya laghubitta Bittiya Sanstha Ltd. VLBS Microfinance 687
272 Womi Microfinance Bittiya Sanstha Ltd. WOMI Microfinance 706

273 rows × 4 columns

This method is used to get list of company names.

After calling the method above, a dataframe just like above must be shown.
It is clear that there are 273 companies. We will be using columns like Stock Symbol and Symbol No.

Visualising Only Limited Data on Streamlit

Why to show entire dataframe on webapp while we can show only desired? So we will be using iloc on dataframe i.e. slicing on index.

@st.cache(suppress_st_warning=True)
def load_data():
    # data = pd.read_csv(DATA_URL, nrows=nrows)
    data = company_names()

    lowercase = lambda x: str(x).lower()
    data.rename(lowercase, axis='columns', inplace=True)
    return data

st.subheader("View the list of Companies.")
def view_company_names():
    st.markdown("(Base URL is [http://www.nepalstock.com.np/company?_limit=500](http://www.nepalstock.com.np/company?_limit=500))")
    num = st.number_input("Enter how many records to view?", min_value=1, max_value=None, step=1)
    data_load_state = st.text('Loading data...')
    data = load_data()
    data_load_state.text("Done! (using st.cache)")
    if st.checkbox('Show raw data'):
        st.subheader('Raw data')
        st.write(data.iloc[:num])
    return data

cdf = view_company_names()
Enter fullscreen mode Exit fullscreen mode

2 methods are given above.

Method: load_data()

This method uses the decorator cache to store data in cache and to ignore the cache warning, suppress_st_warning=True is used. This method calls the method comapny_names() and gets the dataframe then makes all the column name to lowercase and then returns that dataframe.

Method: view_company_names()

This method is for visualizing actual data. It allows us to input the number and then the dataframe is sliced up to that index.

  • Streamlit allows us to take number input using number_input() method.
  • streamlit allows us to show text usign text() method.
  • Then we load our data from previous method and then after data has been loaded, show log like text.
  • streamlit allows us to use checkbox using checkbox() then if it is ticked then show data up to the number user has entered.
  • Also return the original data because it will be used later.

Now save the code above and then from terminal opening on same directory as this file(assume that file name is neplse.py), do

streamlit run nepse.py
Enter fullscreen mode Exit fullscreen mode

If everything is okay then streamlit will provide us a local link with unique port number. By clicking it, we can visualize our first ever NEPSE data visualizer. It should look like below.
image

View Company Details

st.subheader("View Company Details.")
def view_company_details():
    symbol = st.text_input("Enter Stock Symbol. (For Reference, see above record table.)")
    symbol_no = None
    if len(symbol)>=2:
        url = "http://www.nepalstock.com/company/"

        try:
            req = requests.post(url, data={"stock_symbol":symbol}, verify=False)
            symbol_no = cdf[cdf["stock symbol"]==symbol]["symbol no"]
        except requests.exceptions.RequestException as e:
            print(e)

        response = req.text
        soup = BS(response, "lxml")
        table = soup.find("table")

        for row in table.findAll("tr")[4:]:
            col = row.findAll("td")
            st.write(col[0].string,": ",col[1].string)

    return symbol_no

symbol_no = view_company_details()
if  symbol_no is not None:
    symbol_no = symbol_no.tolist()[0]
Enter fullscreen mode Exit fullscreen mode

The method above is used to visualize the company details.

  • Ask user for the Stock Symbol (ex NMB, ADBL etc).
  • Same as earlier method, it makes post request to url http://www.nepalstock.com/company/ and then uses that result to view details.
  • If the Stock Symbol matches then get symbol number assigned to it. Then do some data cleaning and visualize the result by st.write()
  • Return the Symbol Number.

The result must be something like below.
img

Visualize Data of some time period

First we need to get that data. As I stated earlier, sometimes scraping data of entire history will not be feasible and last time I tried that, my laptop crashed so scrapping part by part will be great idea.

Method: CompanyStocksTransactions()

Method: view_by_year()

  • Method takes start date, end date and symbol number of that company symbol.
  • Call the method CompanyStocksTransactions to get dataframe within that date.
  • Return that dataframe.

Before calling methods, we try to get date from the streamlit's date_input method and then pass that date to view_by_year method. Also make a checkbox whether to see raw dataframe or not.

st.subheader("Check Company's Progress in Years")
def CompanyStocksTransactions(SymbolNo,startDate, endDate):
    url="http://www.nepalstock.com.np/company/transactions/%s/0/?startDate=%s&endDate=%s&_limit=9000000"%(SymbolNo,startDate, endDate)
    #print("Connecting to %s "%url)
    http = urllib3.PoolManager()
    http.addheaders = [('User-agent', 'Mozilla/61.0')]
    web_page = http.request('GET',url)
    #print("Adding to DataFrame")
    soup = BS(web_page.data, 'html5lib')
    table = soup.find('table')
    FloorSheet=[]
    rows = [row.findAll('td') for row in table.findAll('tr')[1:-2]]
    for row in rows:
          FloorSheet.append([data.text.strip() for data in row])
    if(len(FloorSheet) != 0):
          FloorSheetdf = pd.DataFrame(FloorSheet[1:],columns=FloorSheet[0])
          FloorSheetdf['Date']=pd.to_datetime(FloorSheetdf['Contract No'], format='%Y%m%d%H%M%f', errors='ignore')
          return (1, FloorSheetdf)
    else:
          return (0, None)

@st.cache(suppress_st_warning=True)    
def view_by_year(start_date="2020-1-1", end_date="2020-1-2", symbol="2810"):
    st.write("From year %s to %s "%(start_date, end_date))

    success, dftest=CompanyStocksTransactions(symbol, start_date, end_date)
    if success==1:
        st.write("Successfully scrapped data. Showing results.")
        # st.write(dftest)
    else:
        st.write("Can't scrap data. Try using another symbol. Or Another date.")
    return dftest
start_date = st.date_input("Please input start date.")
end_date = st.date_input("Please input end date.", min_value=start_date)
dfyear = view_by_year(start_date=str(start_date), end_date=str(end_date), symbol=symbol_no)
show_df = st.checkbox("Show Data")

if show_df:
    st.write(dfyear)
Enter fullscreen mode Exit fullscreen mode

The result must be like below:
img

Do Real Visualization

Now is the time for doing visualization of our time series data. We have already scrapped the data of our interest now is the time to use image to tell story.

  • Prepare the checkboxes to visualize data.
    • Date vs Buyer Broker
    • Date vs Seller Broker
    • Date vs Amount
    • Date vs Rate
date_vs_bbroker = st.checkbox("Date Vs Buyer Broker")
date_vs_sbroker = st.checkbox("Date Vs Seller Broker")
date_vs_amount = st.checkbox("Date vs Amount")
date_vs_rate = st.checkbox("Date vs Rate")

def visualise_broker():
    if date_vs_bbroker:
        st.subheader("Date Vs Buyer Broker")
        fig = px.scatter(dfyear, x="Date", y="Buyer Broker")
        st.plotly_chart(fig)

    if date_vs_amount:
        st.subheader("Date Vs Amount")
        fig = px.scatter(dfyear, x="Date", y="Amount")
        st.plotly_chart(fig)

    if date_vs_sbroker:
        st.subheader("Date Vs Seller Broker")
        fig = px.scatter(dfyear, x="Date", y="Seller Broker")
        st.plotly_chart(fig)

    if date_vs_rate:
        st.subheader("Date Vs Rate")
        fig = px.scatter(dfyear, x="Date", y="Rate")
        st.plotly_chart(fig)

visualise_broker()
Enter fullscreen mode Exit fullscreen mode
  • Use plotly to scatter our data by giving dataframe and then x and y axis.
  • Use the figure made by plotly to view it on web app using streamlit's plotly_chart.

Our result must be like below.
img

If you have came this far then congratulations, you have just made your streamlit app to visualize NEPSE data. Now is the time to make it live on Heroku.

To deploy it on Heroku, we need to have a Heroku Account. Please make one. After making an account, make a new heroku app from here. Give it a proper name.

Preparing additional files

Assume that your main file name is nepse.py. I am not going to explain in detail but if you are concerned then please follow this blog where i have written about deploying Face Mask Classifier on Heroku. Additional files required are:

  • requirements.txt: Using

    pip freeze > requirements.txt
    
  • Procfile: Process file.

    web: sh setup.sh && streamlit run nepse.py
    
  • setup.sh: Setup file.

    mkdir -p ~/.streamlit
    
    echo "[server]
    headless = true
    port = $PORT
    enableCORS = false
    " > ~/.streamlit/config.toml
    

After making all files then make one Git repository and push all the changes to it because we will be integrating our repository to Heroku and Heroku itself will process our files but beware of free version because we have limited number of build number available.

  • Make sure you are selecting GitHub on Deployement Method.
    img

  • Then select the repository you are using to visualize on App Connected to GitHub.
    img

  • Chose the right branch.

  • If you want automatic deploys, i.e re deploy whenever changes found. Using manual deploy will be best for free version.

  • Click on Deploy Branch and let it finish the install dependencies and then after completing run launch the app.

  • If error occurs please let me know on comment section.

Finally

Thank you so much for reading my blog and I really hope that this blog has been some help for you too. I am always eager to meet awesome people like you so please find me on Twitter as QuassarianViper and LinkedIn as Ramkrishna Acharya.

Why not read more?

💖 💪 🙅 🚩
qviper
Viper

Posted on November 25, 2020

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related