Machine Learning - Regression - Simple Linear Regression
Nikhil Dhawan
Posted on January 8, 2024
Hi all, in this post we will see how to train a simple linear regression model. Once the data has been preprocessed, we can train the model directly with LinearRegression from sklearn.linear_model. First we train the model, then we predict values for the test set, which helps us see how good the model is. To train, we call the fit method on a LinearRegression object; to test, we call the predict method on the same object with the test data.
Before we move to the code, here is some behind-the-scenes background on simple linear regression (SLR). Our data has two columns: years of experience and the corresponding salary.
We have to build a model that can predict a salary from the years of experience we give it. SLR is usually written as:
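ŷ = b0 + b1 * x1
where ŷ is the predicted value of the dependent variable (salary), b0 is the y-intercept (a constant), b1 is the slope coefficient, and x1 is the independent variable (years of experience).
The model looks for the best-fit line using ordinary least squares: it chooses b0 and b1 so that the sum of (yi - ŷi)^2 over all training points is minimized, where yi is the actual value and ŷi is the predicted value for point i.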
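To make the least-squares idea concrete, here is a minimal sketch in plain Python that computes b0 and b1 with the standard closed-form formulas for a single feature. The function names fit_slr and sum_squared_errors and the five data points are made up for illustration; they are not the dataset used in this post.

```python
def fit_slr(xs, ys):
    """Return (b0, b1) minimizing the sum of squared residuals for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b1 = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = s_xy / s_xx
    # The least-squares line always passes through (mean_x, mean_y)
    b0 = mean_y - b1 * mean_x
    return b0, b1


def sum_squared_errors(xs, ys, b0, b1):
    """Sum of (y - y_hat)^2, the quantity that least squares minimizes."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: years of experience and salary (not the post's dataset)
years = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [39000.0, 46000.0, 50000.0, 56000.0, 59000.0]

b0, b1 = fit_slr(years, salary)
sse_best = sum_squared_errors(years, salary, b0, b1)
# Nudging the slope away from the fitted value increases the error
sse_other = sum_squared_errors(years, salary, b0, b1 + 500.0)
print(b0, b1, sse_best, sse_other)
```

Comparing sse_best with sse_other shows why the fitted line is called the best fit: any other slope gives a larger sum of squared errors on the same points.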
Let's move on to the code.
Training the model
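from sklearn.linear_model import LinearRegression
# X_train and y_train come from the earlier preprocessing step
regressor = LinearRegression()
# Learn the intercept and slope from the training data
regressor.fit(X_train, y_train)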
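One detail that is easy to miss: fit expects the features as a 2D structure (one row per sample, one column per feature), even when there is only one feature. The sketch below assumes scikit-learn is installed and uses a tiny made-up dataset in place of the X_train and y_train from preprocessing. After fitting, intercept_ and coef_ hold b0 and b1.

```python
from sklearn.linear_model import LinearRegression

# Hypothetical stand-ins for X_train / y_train (not the post's dataset).
# X is 2D: one inner list per sample, one value per feature.
X_demo = [[1.0], [2.0], [3.0], [4.0], [5.0]]
y_demo = [39000.0, 46000.0, 50000.0, 56000.0, 59000.0]

demo_regressor = LinearRegression()
demo_regressor.fit(X_demo, y_demo)

b0 = float(demo_regressor.intercept_)  # y-intercept
b1 = float(demo_regressor.coef_[0])    # slope for the single feature
print(b0, b1)

# predict applies y_hat = b0 + b1 * x to each row it is given
y_hat = demo_regressor.predict([[6.0]])
print(float(y_hat[0]))
```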
Predicting the test results
y_pred = regressor.predict(X_test)
Let's now visualize the training set. For that we use pyplot from matplotlib.
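import matplotlib.pyplot as plt
# Red dots: actual training values
plt.scatter(X_train, y_train, color = 'red')
# Blue line: the model's predictions for the training inputs
plt.plot(X_train, regressor.predict(X_train), color = 'blue')
plt.title('Salary vs Experience (Training set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()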
Plot for Test set
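# Red dots: actual test values
plt.scatter(X_test, y_test, color = 'red')
# Blue line: the same fitted line, drawn over the training inputs on purpose
plt.plot(X_train, regressor.predict(X_train), color = 'blue')
plt.title('Salary vs Experience (Test set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()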
If you look closely, we draw the same regression line in both plots. That is because the model has a single fitted line, and it uses that same line to predict any input, whether seen during training or not. Even if you change the plot call to use X_test, the line stays the same; only the range of x values it is drawn over changes. The red dots in the second graph are the actual values from the test data, and the points along the blue line are the predicted values. Most of the dots lie close to the line, but some do not. No model fits the data 100%.
Thanks for reading, I hope it was useful. See you next time!
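To put a number on how closely the predictions follow the actual values, two common measures are the mean squared error (MSE) and the coefficient of determination (R^2). These metrics are not part of the walkthrough above; the sketch below computes them in plain Python on made-up values standing in for y_test and y_pred.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """1 - SS_res / SS_tot; 1.0 means every prediction is exact."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical stand-ins for y_test and y_pred (not the post's data)
y_test_demo = [39000.0, 46000.0, 50000.0, 56000.0, 59000.0]
y_pred_demo = [40000.0, 45000.0, 50000.0, 55000.0, 60000.0]

mse = mean_squared_error(y_test_demo, y_pred_demo)
r2 = r2_score(y_test_demo, y_pred_demo)
print(mse, round(r2, 4))
```

With the real data, sklearn.metrics offers mean_squared_error and r2_score with the same meaning, and regressor.score(X_test, y_test) returns R^2 directly.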