Support Vector Regression (SVR) Using Python : A practical approach to Predictive Modeling
Nitin Kendre
Posted on June 2, 2023
Introduction :
Support Vector Regression (SVR) is a powerful algorithm for solving regression problems. It belongs to the Support Vector Machine (SVM) family and, with a suitable kernel, can model nonlinear relationships between variables.
In this article we will learn how to implement it in Python.
Understanding SVR :
The goal of SVR is to find the hyperplane that best fits the data points while allowing a margin of tolerance for errors. Traditional regression models try to minimize every error, whereas SVR ignores errors that fall inside a specified margin and penalizes only those outside it. SVR operates on the premise that only the support vectors, the data points lying on or outside that margin, significantly affect the model.
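To make the "margin of tolerance" idea concrete, here is a minimal sketch of the epsilon-insensitive loss that SVR is built around. The function name `epsilon_insensitive_loss` and the sample numbers are made up for illustration; `epsilon=0.1` is used because it is the default in sklearn's SVR.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Errors smaller than epsilon cost nothing; larger errors cost only the excess."""
    residual = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return np.maximum(0.0, residual - epsilon)

# Made-up values: two predictions inside the margin, two outside it
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.05, 2.5, 3.0, 3.0])

losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
print(losses)
```

The first and third points sit inside the margin and contribute zero loss, so they would not end up as support vectors. Only the second and fourth points are penalized.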
For more information on SVR you can refer to this blog post LINK.
Implementing SVR Using Python :
We will implement the SVR algorithm using the sklearn library in Python.
Below are the steps of the implementation -
Step 1 : Importing Necessary Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Step 2 : Loading the Dataset and preparing it
We discussed all the data preprocessing steps in a previous article. You can refer to it for more data preparation tools LINK
salary_data = pd.read_csv('Position_Salaries.csv')
## Below line will print first 10 rows from data.
salary_data.head(10)
Creating Variables
We will create the dependent and independent variables in this step.
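## Creating independent variable
## Assumption: the CSV has a text column first (Position), then Level, then Salary.
## 1:-1 skips the text column, which StandardScaler cannot scale, and keeps x 2D.
x = salary_data.iloc[:, 1:-1].values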
## Creating dependent variable
y = salary_data.iloc[:, -1].values
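If the slicing is hard to picture, the sketch below runs the same kind of `iloc` selection on a tiny hand-made DataFrame. The column layout (a text `Position` column, then `Level`, then `Salary`) and the rows are assumptions for illustration, not the contents of the actual CSV.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; the rows are made up for illustration
demo = pd.DataFrame({
    "Position": ["Analyst", "Manager", "Director"],
    "Level": [1, 2, 3],
    "Salary": [45000, 80000, 150000],
})

# All rows, every column between the first and the last -> 2D array
x_demo = demo.iloc[:, 1:-1].values
# All rows, last column only -> 1D array
y_demo = demo.iloc[:, -1].values

print(x_demo.shape)  # (3, 1)
print(y_demo.shape)  # (3,)
```

The independent variable comes out 2D and the dependent variable 1D, which is why only y needs reshaping before scaling.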
Step 3 : Feature Scaling
Feature scaling is an important step in machine learning that brings all features onto a similar scale. It ensures that no single feature dominates the learning process because of differences in magnitude. With scaled features, algorithms can converge faster and perform better, leading to more accurate and reliable models.
Below is the code example to do feature scaling.
from sklearn.preprocessing import StandardScaler
x_sc = StandardScaler()
y_sc = StandardScaler()
Here the y variable is a 1D array, but StandardScaler expects a 2D array, so we have to reshape it. Below is the code for reshaping.
y_res = y.reshape(len(y),1)
The code below fits the scalers and transforms the x and y variables to a similar scale.
x_scld = x_sc.fit_transform(x)
y_scld = y_sc.fit_transform(y_res)
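To see what the scaling step does numerically, here is a small sketch on made-up numbers. It reshapes a 1D array to 2D, scales it with StandardScaler, and repeats the computation by hand (subtract the mean, divide by the standard deviation) to check that both agree.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up values for illustration
values = np.array([10.0, 20.0, 30.0, 40.0])   # 1D, shape (4,)
values_2d = values.reshape(-1, 1)             # 2D, shape (4, 1)

scaler = StandardScaler()
scaled = scaler.fit_transform(values_2d)

# Same computation by hand: subtract the mean, divide by the population std
manual = (values_2d - values_2d.mean()) / values_2d.std()

print(scaled.ravel())
print(np.allclose(scaled, manual))  # True
```

After scaling, the column has mean 0 and standard deviation 1, which is what lets features with very different magnitudes be treated on equal terms.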
Step 4 : Training The SVR Model
In this step we will use the scaled data from above to train our model.
## Importing SVR
from sklearn.svm import SVR
## Creating SVR model
svr_rbf = SVR(kernel='rbf') ## here `rbf` stands for Radial Basis Function kernel
## Training SVR model
svr_rbf.fit(x_scld, y_scld.ravel()) ## ravel() flattens the scaled y back to 1D, which fit() expects
Here rbf is one of the kernels available in the SVR model. To learn more, you can refer to this Link
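For intuition about the kernel, the RBF kernel measures the similarity of two points as exp(-gamma * squared distance). The sketch below computes it by hand and compares it with sklearn's `rbf_kernel` helper. The helper name `rbf`, the two points, and `gamma = 0.5` are arbitrary choices for illustration, not values used by the model above.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf(a, b, gamma):
    """RBF kernel: 1 for identical points, decaying toward 0 as they move apart."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

# Arbitrary example points and gamma
a = np.array([0.0, 0.0])
b = np.array([1.0, 1.0])
gamma = 0.5

manual = rbf(a, b, gamma)
library = rbf_kernel(a.reshape(1, -1), b.reshape(1, -1), gamma=gamma)[0, 0]

print(manual, library)
```

Here the squared distance is 2, so both values equal exp(-1). A larger gamma makes the similarity fall off faster, so each training point influences a smaller neighbourhood.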
Step 5 : Predicting New Result
In this step we predict the result for a new input value (here, 6.5).
result = y_sc.inverse_transform(svr_rbf.predict(x_sc.transform([[6.5]])).reshape(-1,1))
print(result)
Here the inverse_transform() method converts the scaled prediction back to the original scale.
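The scale, predict, and inverse-transform sequence can also be bundled into a single estimator. The sketch below is one possible alternative, not the approach used in this article: sklearn's `TransformedTargetRegressor` scales y and un-scales the predictions automatically, while a pipeline scales x. The data is synthetic and the names `levels` and `salaries` are made up for illustration.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic data for illustration only (not the article's CSV)
levels = np.arange(1, 11, dtype=float).reshape(-1, 1)
salaries = 40000.0 + 5000.0 * levels.ravel() ** 2

model = TransformedTargetRegressor(
    # x is scaled inside the pipeline before it reaches SVR
    regressor=make_pipeline(StandardScaler(), SVR(kernel="rbf")),
    # y is scaled for fitting, and predictions are inverse-transformed
    transformer=StandardScaler(),
)
model.fit(levels, salaries)

prediction = model.predict([[6.5]])
print(prediction)
```

The prediction comes back in the original units, so no manual call to inverse_transform() is needed. Doing the steps by hand, as in this article, is still useful for seeing what happens at each stage.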
Step 6 : Visualizing the result and data with Smooth Curve
## This line creates an array from the min to the max value of x in steps of 0.1
x_grid = np.arange(x.min(), x.max(), 0.1)
## This line will reshape above 1D array to 2D array.
x_grid = x_grid.reshape(len(x_grid),1)
## This line will plot the scatter plot for x and y variable
plt.scatter(x,y,color='red')
## This line draws the model's predictions over the grid as a smooth curve
plt.plot(x_grid,y_sc.inverse_transform(svr_rbf.predict(x_sc.transform(x_grid)).reshape(-1,1)),color='blue')
## This line will give the title to our plot
plt.title('Actual vs Predicted (SVR), smooth curve')
## this line label the x axis
plt.xlabel('Level')
## this line will label the y axis
plt.ylabel('Salary')
## This line will help to show the plot
plt.show()
All of the steps above were implemented by me personally, based on what I learned online. If anyone finds a mistake, please leave a comment and I will be happy to fix it.
Conclusion :
SVR combined with data preprocessing can provide accurate predictions for regression tasks. By following the steps above, anyone can implement it in Python.