End to End Deployment of Heart Disease Prediction Through Flask With Machine Learning Algorithm
Karan Choudhary
Posted on September 3, 2020
Artificial Intelligence directly translates to conceptualizing and building machines that can think and are hence independently capable of performing tasks, thus exhibiting intelligence. Whether this advancement in technology is a boon or a bane to humans and our surroundings is a never-ending debate.
Every coin has two faces, so it is difficult to pass quick judgment on a technology. Everything in this world, living or not, has both positive and negative impacts; a technology gains popularity when it is fruitful for society, despite whatever ill or long-term effects it may have on living beings.
Healthcare is no different. Particularly in the case of automation, machine learning, and artificial intelligence (AI), doctors, hospitals, insurance companies, and industries with ties to healthcare have all been impacted - in many cases in more positive, substantial ways than other industries.
Before starting, we should discuss the dataset we are using for the prediction.
age, sex, cp, trestbps, chol, fbs, restecg, thalach, exang, oldpeak, slope, ca, thal, target
These are the major independent features of the dataset on which we train for the prediction of heart disease, with target as the dependent variable we predict. The names match the standard UCI heart disease dataset: cp is chest pain type, trestbps is resting blood pressure, chol is serum cholesterol, fbs is fasting blood sugar, restecg is the resting ECG result, thalach is maximum heart rate achieved, exang is exercise-induced angina, oldpeak is exercise-induced ST depression, slope is the slope of the peak exercise ST segment, ca is the number of major vessels colored by fluoroscopy, and thal is the thalassemia result.
Installing Flask on your Machine
Installing Flask is simple and straightforward. Here, I am assuming you already have Python 3 and pip installed. To install Flask, run one of the following commands:
sudo apt-get install python3-flask
pip install flask
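To confirm the installation worked, a quick check from Python (a minimal sketch; importlib.metadata needs Python 3.8+, and your version number will differ):
import importlib.metadata
import flask  # if this import succeeds, Flask is installed
print(importlib.metadata.version("flask"))  # prints the installed Flask version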
That's it! You're all set to dive into the problem statement and take one step closer to deploying your machine learning model through Flask.
Starting the implementation
Here is the folder structure of the machine learning deployment model through Flask; a typical layout is sketched below.
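A minimal sketch of the project layout, with illustrative names (only heart_disease.csv, the pickle file, home.html, and result.html are named in this article; app.py and style.css are assumptions):
heart-disease-flask/
├── app.py                          # Flask application
├── heart_disease.csv               # dataset
├── heart_disease_detector.pickle   # dumped model
├── templates/
│   ├── home.html                   # input form (home page)
│   └── result.html                 # prediction output page
└── static/
    └── style.css                   # CSS styling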
We will implement this code in Jupyter and the Sublime Text editor. To implement the machine learning models, let's start by importing the libraries.
Import libraries
import pandas as pd # for data manipulation or analysis
import numpy as np # for numeric calculation
import matplotlib.pyplot as plt # for data visualization
import seaborn as sns # for data visualization
import pickle # for dumping the model; we could also use the joblib library
The next step is to load the data through pandas.
heart_df = pd.read_csv('heart_disease.csv')
Next, let's look at the DataFrame.
Head of heart DataFrame
heart_df.head(6)
Info about the DataFrame (info() shows the non-null count and dtype of each column, which helps spot missing values)
Information of heart Dataframe
heart_df.info()
Numerical description of the data (describe() gives the mean, standard deviation, min/max, and quartiles of each numeric feature).
Numerical distribution of data
heart_df.describe()
Heatmap
heatmap of DataFrame
plt.figure(figsize=(16,9))
sns.heatmap(heart_df)
Heatmap of a correlation matrix
heart_df.corr() # gives the pairwise correlation between features
Heatmap of the correlation matrix of the heart DataFrame
plt.figure(figsize=(20,20))
sns.heatmap(heart_df.corr(), annot = True, cmap ='coolwarm', linewidths=2)
Split DataFrame into Train and Test
Input variable
X = heart_df.drop(['target'], axis = 1)
X.head(6)
Output variable
y = heart_df['target']
y.head(6)
Split the dataset into train and test
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state= 5)
Feature scaling of data
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train_sc = sc.fit_transform(X_train) # fit the scaler on the training data only
X_test_sc = sc.transform(X_test) # apply the same scaling to the test data
Machine Learning Model Building
1. Support Vector Classifier
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score #for classification report
SVM model
from sklearn.svm import SVC
svc_classifier = SVC()
svc_classifier.fit(X_train, y_train)
y_pred_scv = svc_classifier.predict(X_test)
accuracy_svm = accuracy_score(y_test, y_pred_scv)
print(accuracy_svm) # Output 0.5789473684210527
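This accuracy is low partly because SVC is distance-based and sensitive to feature scale, and it was fit on the unscaled data. A minimal sketch of retraining on the standardized features from the feature-scaling step above (not part of the original pipeline; the exact score will vary):
svc_classifier_sc = SVC()
svc_classifier_sc.fit(X_train_sc, y_train) # train on the standardized features
y_pred_svc_sc = svc_classifier_sc.predict(X_test_sc)
print(accuracy_score(y_test, y_pred_svc_sc)) # typically improves over the unscaled fit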
2. Logistic Regression
from sklearn.linear_model import LogisticRegression
lr_classifier = LogisticRegression(random_state = 51, penalty = 'l1', solver = 'liblinear') # the liblinear solver supports the l1 penalty
lr_classifier.fit(X_train, y_train)
y_pred_lr = lr_classifier.predict(X_test)
accuracy_lr = accuracy_score(y_test, y_pred_lr)
print(accuracy_lr) # Output 0.9736842105263158
3. Decision Tree Classifier
from sklearn.tree import DecisionTreeClassifier
dt_classifier = DecisionTreeClassifier(criterion = 'entropy', random_state = 51)
dt_classifier.fit(X_train, y_train)
y_pred_dt = dt_classifier.predict(X_test)
accuracy_dt = accuracy_score(y_test, y_pred_dt)
print(accuracy_dt) # Output 0.9473684210526315
4. XGBoost Classifier
from xgboost import XGBClassifier
xgb_classifier = XGBClassifier()
xgb_classifier.fit(X_train, y_train)
y_pred_xgb = xgb_classifier.predict(X_test)
accuracy_xgb = accuracy_score(y_test, y_pred_xgb)
print(accuracy_xgb) # Output 0.9823684210526315
Similarly, we evaluate all the classifiers on the held-out test data to check that there is no overfitting or underfitting; a good model should show low bias and low variance. Accuracy of all the classifiers on the test data:
Accuracy of Support Vector Classifier - 0.5789456522520
Accuracy of Decision Tree Classifier - 0.8473684210526315
Accuracy of Logistic Regression - 0.570456140350877
Accuracy of XGBoost Classifier - 0.982456140350877
As we can see, the XGBoost classifier gives the best result on the test data, with low bias and variance.
For further improvement we should apply tuning methods such as randomized search and grid search on XGBoost, because we want the accuracy to be more optimal while also satisfying constraints like precision, recall, F-beta, and support, which matter for controlling Type I and Type II errors.
Randomized search
Randomized search works on a sample of the parameter combinations, so it runs much faster than an exhaustive search method.
params = {
    "learning_rate": [0.05, 0.10, 0.15, 0.20, 0.25, 0.30],
    "max_depth": [3, 4, 5, 6, 8, 10, 12, 15],
    "min_child_weight": [1, 3, 5, 7],
    "gamma": [0.0, 0.1, 0.2, 0.3, 0.4],
    "colsample_bytree": [0.3, 0.4, 0.5, 0.7]
}
# Randomized Search
from sklearn.model_selection import RandomizedSearchCV
random_search = RandomizedSearchCV(xgb_classifier, param_distributions=params,
                                   scoring='roc_auc', n_jobs=-1, verbose=3)
random_search.fit(X_train, y_train)
Finding the best and most optimized parameters:
random_search.best_params_
# Output: {'min_child_weight': 1, 'max_depth': 12, 'learning_rate': 0.3, 'gamma': 0.3, 'colsample_bytree': 0.7}
random_search.best_estimator_
# Output:
# XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
#               colsample_bynode=1, colsample_bytree=0.7, gamma=0.3,
#               learning_rate=0.3, max_delta_step=0, max_depth=12,
#               min_child_weight=1, missing=None, n_estimators=100, n_jobs=1,
#               nthread=None, objective='binary:logistic', random_state=0,
#               reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
#               silent=None, subsample=1, verbosity=1)
# training XGBoost classifier with best parameters
xgb_classifier_pt = XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
                                  colsample_bynode=1, colsample_bytree=0.4, gamma=0.2,
                                  learning_rate=0.1, max_delta_step=0, max_depth=15,
                                  min_child_weight=1, missing=None, n_estimators=100,
                                  n_jobs=1, nthread=None, objective='binary:logistic',
                                  random_state=0, reg_alpha=0, reg_lambda=1,
                                  scale_pos_weight=1, seed=None, silent=None,
                                  subsample=1, verbosity=1)
xgb_classifier_pt.fit(X_train, y_train)
y_pred_xgb_pt = xgb_classifier_pt.predict(X_test)
Accuracy after tuning:
accuracy_score(y_test, y_pred_xgb_pt) # Output 0.9824561403508771
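To double-check that the tuned model's low bias and variance hold beyond a single train/test split, k-fold cross-validation is a quick sanity check. A minimal sketch (not part of the original pipeline; scores will vary by split):
from sklearn.model_selection import cross_val_score
# 5-fold cross-validated accuracy of the tuned XGBoost classifier
cv_scores = cross_val_score(xgb_classifier_pt, X, y, cv=5, scoring='accuracy')
print(cv_scores.mean(), cv_scores.std()) # a small std suggests stable generalization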
Grid search
Grid search works on the whole parameter grid, evaluating every combination rather than a random sample.
Training the model:
from sklearn.model_selection import GridSearchCV
grid_search = GridSearchCV(xgb_classifier, param_grid=params,
                           scoring='roc_auc', n_jobs=-1, verbose=3)
grid_search.fit(X_train, y_train)
Now comes implementing it:
xgb_classifier_pt_gs = XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
                                     colsample_bynode=1, colsample_bytree=0.3, gamma=0.0,
                                     learning_rate=0.3, max_delta_step=0, max_depth=3,
                                     min_child_weight=1, missing=None, n_estimators=100,
                                     n_jobs=1, nthread=None, objective='binary:logistic',
                                     random_state=0, reg_alpha=0, reg_lambda=1,
                                     scale_pos_weight=1, seed=None, silent=None,
                                     subsample=1, verbosity=1)
xgb_classifier_pt_gs.fit(X_train, y_train)
y_pred_xgb_pt_gs = xgb_classifier_pt_gs.predict(X_test)
accuracy_score(y_test, y_pred_xgb_pt_gs) # Output 0.9824561403508771
As we are getting nearly the same accuracy with both tuning methods, we will keep the grid-search model. Now comes the classification report and the types of error.
Confusion matrix
It gives the counts of true positives, false positives, true negatives, and false negatives, which helps us judge how well the model is optimized for prediction.
from sklearn.metrics import confusion_matrix, classification_report
cm = confusion_matrix(y_test, y_pred_xgb_pt)
plt.title('Heatmap of Confusion Matrix', fontsize = 15)
sns.heatmap(cm, annot = True)
plt.show()
Saving model for deployment
# dump the trained model to disk
pickle.dump(xgb_classifier_pt, open('heart_disease_detector.pickle', 'wb'))
# load model
heart_disease_detector_model = pickle.load(open('heart_disease_detector.pickle', 'rb'))
Our model is now dumped into a pickle file. It's time to deploy the model through Flask. We switch to the Sublime Text editor; the main aim is to use HTML and CSS together with Flask.
from flask import Flask, render_template, url_for, request
import pandas as pd
import joblib # sklearn.externals.joblib is deprecated; import joblib directly
import numpy as np

app = Flask(__name__)

@app.route('/')
def home():
    return render_template('home.html')

def getParameters():
    parameters = []
    parameters.append(request.form['age'])
    parameters.append(request.form['sex'])
    parameters.append(request.form['cp'])
    parameters.append(request.form['trestbps'])
    parameters.append(request.form['chol'])
    parameters.append(request.form['fbs'])
    parameters.append(request.form['restecg'])
    parameters.append(request.form['thalach'])
    parameters.append(request.form['exang'])
    parameters.append(request.form['oldpeak'])
    parameters.append(request.form['slope'])
    parameters.append(request.form['ca'])
    parameters.append(request.form['thal'])
    return parameters

@app.route('/predict', methods=['POST'])
def predict():
    model = open("data/Heart_model.pkl", "rb")
    clfr = joblib.load(model)
    if request.method == 'POST':
        parameters = getParameters()
        inputFeature = np.asarray(parameters).reshape(1, -1)
        my_prediction = clfr.predict(inputFeature)
        return render_template('result.html', prediction = int(my_prediction[0]))

if __name__ == '__main__':
    app.run(debug=True)
This code loads the dumped model and serves home.html as the home page, which we discuss further below. The predict function runs when the user submits the 13 input fields: the values are collected into an array of size 13, reshaped for the model, and result.html is rendered with the output, opening a new webpage that tells the user whether the person is suffering from heart disease or not, based on the values they entered.
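To try the /predict route without a browser, Flask's built-in test client can post the same form fields. A minimal sketch, assuming the model file referenced in predict exists at data/Heart_model.pkl; the field values here are illustrative only (the names match getParameters above):
# exercise the /predict route with the Flask test client
sample = {'age': '52', 'sex': '1', 'cp': '0', 'trestbps': '125', 'chol': '212',
          'fbs': '0', 'restecg': '1', 'thalach': '168', 'exang': '0',
          'oldpeak': '1.0', 'slope': '2', 'ca': '2', 'thal': '3'}
with app.test_client() as client:
    response = client.post('/predict', data=sample)
    print(response.status_code) # 200 means result.html rendered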
result.html stores the prediction we get from the data entered by the user, so the best possible prediction is shown on the basis of the input values. It works together with home.html, where the user enters the data based on his/her profile. home.html serves as the home page of the web app: it lets the user enter the 13 details of their profile, and clicking the Predict button returns an output about their heart's health, classifying whether the person is suffering from heart disease or not. The inputs reflect a person's profile, helping them understand and manage their heart health for the long term.
If you want to implement the code yourself, the full end-to-end code with a brief explanation is available through the link above. If you liked this article and share an interest in similar projects, we can grow our network and work on more real-time projects together. For more details, connect with me on my LinkedIn account. Thanks!