How To Evaluate Machine Learning Model Performance Using Accuracy Score Metric

Machine learning systems have been the go-to solution for most real-world problems. Just like any other system, a machine learning model is also prone to errors. It is always important to evaluate the performance and trustworthiness of a machine learning model.

Evaluation metrics vary across different machine learning algorithms. In this article, you will learn the common technique used to determine the performance of a machine learning classification algorithm.

Prerequisites

A basic understanding of machine learning models(you have built at least one classification model).
A basic understanding of evaluation metrics(this article will teach you how to implement accuracy score metrics).
A basic understanding of python and data science libraries(Numpy, Pandas, Matplotlib, Scikit learn, etc).

What is accuracy score

Accuracy score can be defined as the fraction of correct predictions by our machine learning model.
In other words, accuracy can be said to as the number of correct predictions divided by the total number of predictions.
Accuracy is not a good evaluation metric to use when you have an imbalanced dataset.

Implementing accuracy score

To implement accuracy score, it is compulsory to train a machine learning model.
Therefore, let's build a simple classification model and use the accuracy score to determine the model's performance.

We will use a Kaggle dataset to build a churn prediction model. This model will
classifies a customer as churning or not churning, based on the customer data.
Churn prediction is a system that detects sets of customers that a likely to stop using your products or services. For example, it can be applied to internet provider service, telecommunication service, and so on.

Let's start building.

Note: This article focuses on teaching you how to evaluate the performance of a machine learning model in making predictions, therefore other processes that precede and proceed building a model won't be elaborated.

This article won't show you how to clean, analyze, and preprocess data.

Step 1:

Import data science libraries

# import libraries 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction import DictVectorizer

Step 2:

Read the downloaded data

data=pd.read_csv(train.csv") #read the CSV data 
data.head().T #displays the first five columns of the data

Code output

Step 3:

Data preparation


data.churn=(data.churn=="yes").astype(int)
data_train,data_test= train_test_split(data,test_size=30,random_state=1)
Y_train=data_train["churn"]
Y_test=data_test["churn"]
del data_train["churn"]
del data_test["churn"]
train_dicts=data_train.to_dict(orient='record')
test_dicts=data_test.to_dict(orient='record')
dv=DictVectorizer(sparse=False)
dv.fit(train_dicts)
X_train=dv.transform(train_dicts)
X_test=dv.transform(test_dicts)

Explanation:

data.churn = (data.churn == "yes").astype(int) This line converts the "churn" column in the data DataFrame into numeric values. It assigns 1 to rows where the value is "yes" (indicating churn) and 0 to rows where the value is not "yes" (indicating no churn).

data_train, data_test = train_test_split(data, test_size=30, random_state=1) This line splits the data into training and testing datasets using the train_test_split function from the scikit-learn library. It randomly splits the data, with "30%" of the data assigned to the test dataset and the remaining 70% to the training dataset used for training the model.

Y_train = data_train["churn"]and Y_test = data_test["churn"] These lines extract the target variable ("churn") from the training and testing datasets and assign them to Y_train and Y_test, respectively.

del data_train["churn"] and del data_test["churn"] These lines remove the "churn" column from the training and testing datasets since it has been separated into the target variables.

train_dicts = data_train.to_dict(orient='record') and test_dicts = data_test.to_dict(orient='record') These lines create dictionaries for the training and testing datasets that include only the selected categorical and numeric columns.

dv = DictVectorizer(sparse=False) and dv.fit(train_dicts) These lines create an instance of the DictVectorizer class and fit it on the training dictionaries (train_dicts). The DictVectorizer class is used to convert the dictionaries into a numeric representation suitable for machine learning algorithms. It's a simple way of implementing one One-Hot encoding. To learn about One-Hot encoding with DictVectorizer check out this helpful guide

X_train = dv.transform(train_dicts) and X_test = dv.transform(test_dicts) These lines transform the training and testing dictionaries into feature matrices.

Step 4:

Logistic regression model

from sklearn.linear_model import LogisticRegression
model=LogisticRegression()
model.fit(X_train,Y_train)

We have successfully built a classification model using the LogisticRegression algorithm. We now have a model to evaluate.

Let's go on to evaluate the performance of this model on a test dataset by checking how accurately the model predicts customers that will churn.

One way of checking the accuracy of a model is by using the Scikit Learn library.
Let's implement it.

from sklearn.metrics import accuracy_score
prediction= model.predict(X_test)
accuracy=accuracy_score(Y_test,prediction)
print("Your Logistic Regression model has an accuracy of {}".format(accuracy))

Code output

Explanation:

from sklearn.metrics import accuracy_score This imports the accuracy_score function from the sklearn.metrics module.

prediction = model.predict(X_test) This line predicts the target variable (Y_test) using the trained Logistic Regression model (model) on the test features (X_test). The predicted values are stored in the prediction variable.

accuracy = accuracy_score(Y_test, prediction) The accuracy_score function takes two arguments: the true target variable (Y_test) and the predicted values (prediction). It compares the predicted values with the actual values and calculates the accuracy of the model.

print("Your Logistic Regression model has an accuracy of {}".format(accuracy)) This line prints the accuracy of the Logistic Regression model.

Note that,

Y_test refers to the actual values of the target variable from the test dataset.
X_test refers to the feature variables of the test dataset.

Source Code

Conclusion

Our model has an accuracy score of approximately 0.87, this indicates that our model's predictions are correct about 87% of the time.
In other words, when we apply the model to new, unseen data, we expect the model to be correct 87% of the time based on the accuracy score.
It is important to note that accuracy score alone may not provide a complete picture of our model performance, and it's beneficial to consider other metrics. Therefore, I will make continuations of this article, there you will learn other important model evaluation techniques.

If you have any questions or spot any errors, reach out to me on Twitter, or LinkedIn. I appreciate your engagement and look forward to hearing from you.

Blog