Intro to Machine Learning in Python: Part III
Brett Hammit
Posted on January 17, 2021
Table of Contents
- Model Evaluation
- Predictions From the Model
- Regression Evaluation Metrics
- End Thoughts
This is the last part in my three-part series, Introduction to Machine Learning with Python. It has been a blast to write, and hopefully some of you have gotten something out of reading these. In the last post, Intro to Machine Learning: Part II, we talked about how to split, train, test, and fit our linear regression model. In this one we are diving into what comes after we have built and tested our model. To do this we need to evaluate the model by printing its intercept and coefficients, looking at the predictions from our model, and then evaluating it from a quantitative angle. That's our summary, so let's go see how this is done with examples and some code!
Model Evaluation
To start evaluating our linear regression model we are going to be printing the intercept and coefficients from our model. The first thing that we need to do is print our intercept with the code looking like this:
print(lm.intercept_)
This should then print the intercept. After you've completed this you can move on to printing the coefficients. The coefficients correspond to the columns in X_train, so we are going to create a data frame out of them to make them more readable and easier to interpret. The code in Python could look like the following:
coeff_df = pd.DataFrame(lm.coef_,X.columns,columns=['Coefficient'])
coeff_df
This will make a data frame that lists each feature with its corresponding coefficient next to it. From here we move on to interpreting these coefficients.
Each coefficient means that, holding everything else fixed, a 1-unit increase in that feature is associated with a change in the target of that many units. So depending on the data and what question you are trying to answer, interpreting these is going to be case dependent.
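To make the interpretation concrete, here is a small sketch on made-up data where the true relationship is known in advance. The feature names (sqft_units, num_rooms) and the coefficients (3 and 2) are invented for illustration, not from the original dataset; because the toy data has no noise, the model should recover them almost exactly.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical noiseless data: target = 50 + 3*sqft_units + 2*num_rooms
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.uniform(0, 10, size=(100, 2)),
                 columns=["sqft_units", "num_rooms"])
y = 50 + 3 * X["sqft_units"] + 2 * X["num_rooms"]

lm = LinearRegression()
lm.fit(X, y)

print(lm.intercept_)  # recovers ~50
coeff_df = pd.DataFrame(lm.coef_, X.columns, columns=["Coefficient"])
print(coeff_df)
# Reading the table: holding num_rooms fixed, a 1-unit increase in
# sqft_units is associated with a ~3-unit increase in the target.
```

On real, noisy data the recovered coefficients will only approximate the true relationship, but the reading stays the same.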
Now that we've gotten some insights from our coefficients we can go into looking at the predictions data from our model.
Predictions From the Model
To analyze our predictions from our model we need to:
- Store our predictions from our test data in a variable
- Use matplotlib to make a scatter plot
- Use seaborn to make a distplot
Storing our predictions in a variable takes one simple line of code with the help of the scikit-learn package:
predictions = lm.predict(X_test)
Now that we've got our predictions in a variable, we can visualize them to qualitatively analyze our model by looking at how linear the fit is and at the distribution of its residuals.
To do this we will use matplotlib to create a scatter plot of the test values versus the predictions by using:
plt.scatter(y_test,predictions)
Your result of the scatter plot should look a little something like this:
We want a tight, straight line, which means your predictions and the test values are very close to one another.
The other way that we are going to qualitatively analyze our model is by looking at the distribution of its residuals. To do this we will use the seaborn package's distplot:
sns.distplot((y_test-predictions),bins=50);
After running this it should yield a plot that looks something like this:
We want the residuals in this plot to form a roughly normal, bell-shaped distribution, like we talked about in Part II, to reassure us that we have fitted our model decently.
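Alongside the visual check, a quick numeric look at the residuals can back up what the distplot shows. This sketch uses made-up numbers standing in for the real y_test and predictions arrays; for a decent fit the residuals should average out near zero, with the standard deviation giving a rough sense of the typical error size.

```python
import numpy as np

# Hypothetical values standing in for y_test and predictions above
y_test = np.array([200.0, 250.0, 310.0, 180.0])
predictions = np.array([195.0, 258.0, 305.0, 183.0])

residuals = y_test - predictions   # [5, -8, 5, -3]
print(residuals.mean())            # close to zero for a decent fit
print(residuals.std())             # rough spread of the errors
```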
Now that we have qualitatively analyzed our model, we can go one step further and analyze it quantitatively using regression evaluation metrics.
Regression Evaluation Metrics
When it comes to analyzing our linear regression model, we want to see it not only visually but also in numbers. Lucky for us, scikit-learn can do this math for us. That doesn't mean, though, that it isn't good to know how we got there.
Here are the three most common regression evaluation metrics and how they compare:
- MAE (mean absolute error) is the simplest because it is just the average error
- MSE (mean squared error) tends to be more useful than MAE because squaring punishes larger errors, which usually matter more in practice
- RMSE (root mean squared error) is the most popular of the three because it is interpretable in the same units as "y"
It is important to note that these are all loss functions, so we are trying to minimize them as much as we can with our models. They will become clearer when you run the code to compute them, which looks like this:
import numpy as np
from sklearn import metrics
print('MAE:', metrics.mean_absolute_error(y_test, predictions))
print('MSE:', metrics.mean_squared_error(y_test, predictions))
print('RMSE:', np.sqrt(metrics.mean_squared_error(y_test, predictions)))
This should print all three of your regression evaluation metrics. Remember, we want less error in our models, so when comparing several candidate models, the one with the smallest values for these metrics is performing best.
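Since it's good to know how scikit-learn got there, here's a quick sketch of the same three metrics computed by hand with NumPy on a tiny made-up example, checked against scikit-learn's versions:

```python
import numpy as np
from sklearn import metrics

# Tiny made-up example so the arithmetic is easy to follow
y_test = np.array([3.0, 5.0, 7.0])
predictions = np.array([2.0, 5.0, 9.0])

errors = y_test - predictions      # [1, 0, -2]
mae = np.mean(np.abs(errors))      # (1 + 0 + 2) / 3 = 1.0
mse = np.mean(errors ** 2)         # (1 + 0 + 4) / 3 ≈ 1.667
rmse = np.sqrt(mse)                # ≈ 1.291

# The hand-rolled versions agree with scikit-learn's
assert np.isclose(mae, metrics.mean_absolute_error(y_test, predictions))
assert np.isclose(mse, metrics.mean_squared_error(y_test, predictions))
print(mae, mse, rmse)
```

Notice how the single error of 2 dominates the MSE (contributing 4 of the 5 total squared error), which is exactly the "punishes larger errors" behavior described above.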
Now we have finally learned how to analyze our linear regression model qualitatively as well as quantitatively. This was the last step, and we've completed the series of Introduction to Machine Learning!
End Thoughts
This concludes my three-part series, and I hope you guys enjoyed at least one of these articles or found something to take away from them. I am by no means an expert in this stuff; I am writing to possibly help out people like me who might be starting their journey and looking for some tips to help them along the way. I know that I really enjoy it when I find good articles that speed up my process of learning, so I hope that maybe some of these articles can help someone reading this. Thank you for taking the time to read, and if you didn't catch my past articles I'll link them below if anybody would like to read them. If you enjoyed this, leave a like or a comment telling me things that I could do better and some stuff you'd like to see me write about next. Thank you again, and good luck wherever you're at in your journey.
Intro to Machine Learning: Part I
Intro to Machine Learning: Part II