AI Log #1: Machine Learning To Linear Regression

ctrlaltvictoria

Victoria Johnston

Posted on November 13, 2023


I am an experienced software engineer diving into AI and machine learning. Are you also learning/interested in learning? Learn with me! I’m sharing my learning logs along the way.

Disclosure: I have already learnt basic machine learning, but I am starting this learning log from the beginning because I need a refresher 😅.

Log 1: Linear Regression

My learning journey starts with Linear Regression because it is a fundamental building block for understanding machine learning.

However, before diving into linear regression, I will jog my memory on machine learning, supervised learning, and regression models.

Machine Learning

This definition was the one I found most helpful:

The use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyse and draw inferences from patterns in data — Bing

The way I understand it:

Machine Learning is when we use machines to determine an outcome without explicitly coding the logic that determines that outcome. Instead, we determine the outcome based on example data plus fancy math (algorithms and statistical models).

Here is an example I will use throughout this learning log.

Say I want to predict the amount of snow (centimetres) based on temperatures (degrees) below freezing.

Non-machine learning way - I use some hard-coded logic: ‘The amount of snow will be the absolute value of the temperature. So if it’s -1°C, we get 1 cm of snow.’
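For illustration, here is roughly what that hard-coded rule could look like in Python; the function name is just something I made up for this sketch.

```python
def predict_snow_cm(temperature_c: float) -> float:
    """Hard-coded rule: snowfall equals the absolute value of the temperature."""
    return abs(temperature_c)

print(predict_snow_cm(-1))  # 1 cm of snow at -1°C
```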

The non-machine learning method can, at times, be suitable. However, a clear set of rules cannot define all situations! Sometimes, finding patterns in data and using algorithms and statistical models is more appropriate, and this method would require machine learning. Unlike traditional programming, where we code all possible conditions and outcomes, machine learning algorithms learn to make decisions from data.

Interestingly, the machine learning method is the solution that I think of instinctively if someone asks me how I would predict the amount of snow based on temperature.

Machine Learning Lingo

Input Features (x): The data we feed in, like temperature in degrees.
Outcome Variable (y): What we’re predicting, like snowfall in centimetres.
Predicted Outcome (ŷ): The outcome our model predicts.
Model (f): The math that transforms ‘x’ into ‘ŷ’.
Example Data: The historical data we use for training.
We use ‘m’ to denote the number of training examples we have.
We refer to a specific training example as x_i or y_i, where i is the example’s index (or row).

All together now: f(x) = ŷ

I want to make something very clear:

f(x_i) will not necessarily be y_i because the statistical model we determine may not predict every example perfectly; in fact, a perfect fit is unlikely, as it would require a model that passes through every point exactly!

For this reason, we have the symbol ŷ as the output of f(x); it is the predicted value and may differ from the actual value!
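To make the notation concrete, here is a tiny Python sketch using the snow example. The numbers are made up, and the model is just a guess rather than something trained.

```python
# Example data (made-up numbers): m = 4 training examples.
x = [-1, -2, -5, -10]      # input features: temperature in degrees
y = [1.2, 1.8, 3.1, 5.0]   # outcome variables: snowfall in cm
m = len(x)                 # m = number of training examples

def f(x_i):
    """A candidate model; its output is ŷ, the *predicted* outcome."""
    return -0.5 * x_i      # a guess, not a trained model

y_hat = [f(x_i) for x_i in x]   # ŷ for each example
# f(x[0]) = 0.5 but y[0] = 1.2: the model does not fit every example perfectly.
```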

Supervised Learning

Machine learning has two primary subcategories: supervised and unsupervised learning.

Supervised learning is when the type of our desired outcome is known, a.k.a. we tell the model what we want it to predict.

We tell the model, ‘Here — given these inputs, I want you to predict this outcome type.’

The example data we train with must have input features and outcome variables.

The snow/temperature example I used above is an example of supervised learning. In our example: ‘Here — given negative temperature in degrees, I want you to predict the centimetres of snow.’

I will cover unsupervised learning in future diary entries, but out of curiosity, it is when we don’t tell the model the outcome we want. We give it data and ask it to find something interesting! The model finds patterns, and we can hopefully use the patterns to help us solve our problem.

Regression

Regression is a method of supervised learning that predicts a continuous numeric value. A continuous numeric value is any real number, whether whole or fractional.

For example, the number of centimetres of snow that the model can predict is 3.00cm.
It can also be 2.33212cm.
It can also be 1.2393cm.
And so on.

I am using my snow/temperature example above because it, too, can be a regression problem. The outcome we want? The number of centimetres of snow.

Linear Regression

Linear regression is a supervised machine learning algorithm where the predicted output is continuous and has a constant slope. It’s used to predict values within a continuous range. — ML cheatsheet

How I make sense of it:

It falls under a regression method because its output is continuous.
It is ‘linear’ because its model relies on a mathematical equation with a constant slope; it models a linear relationship between input and output.

There are two main types: simple linear regression and multi-variable linear regression.

Simple linear regression.

It is also known as univariate linear regression.
Remember this straight-line equation from high school math that we used on 2D graphs (with just an x and a y axis)?

y = mx + c

x is the value on the x-axis, y is the value on the y-axis, m is the gradient (the slope), and c is the intercept

The equation for simple linear regression in machine learning is the same, but the gradient is ‘w’ for weight, and the intercept is ‘b’ for bias.

f(x) = ŷ = wx + b

The model function definition sometimes includes ‘w’ and ‘b’.

f_w,b(x) = wx + b


A simple linear regression equation could be
f(x) = -1/2x
If the temperature is -2 degrees, the predicted snow amount would be -1/2 × (-2) = 1 cm.
Another simple linear regression equation could be
f(x) = -1/2x + 1
If the temperature is -2 degrees, then the predicted amount of snow would be (-1/2 × (-2)) + 1 = 2 cm
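Just to double-check that arithmetic, here is the same model as a quick Python sketch, plugging in w = -1/2 and the two bias values:

```python
def f_wb(x, w, b):
    """Simple linear regression model: f_wb(x) = w*x + b."""
    return w * x + b

print(f_wb(-2, w=-0.5, b=0.0))  # first equation: 1.0 cm
print(f_wb(-2, w=-0.5, b=1.0))  # second equation: 2.0 cm
```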

We are using math to predict the outcome! The logic is not explicitly declared or hard-coded. We use past examples to help us determine the equation we use to do it, and the equation’s weight and bias are likely to keep changing as we add more examples.

Multi-variable linear regression.

Multi-variable linear regression is a linear regression that relies on multiple variables. Very often, our models will depend on more than one variable! Regarding our snow/temperature example, we will likely want to use more than temperature to determine the amount of snowfall. We could also use the altitude, humidity, etc.

Multi-variable linear regression employs the same technique as simple linear regression, but with a constant slope across multiple dimensions: each input feature gets its own weight.

Here is what its equation looks like:

f(x_1, x_2) = ŷ = w_1x_1 + w_2x_2 + b
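Here is a minimal NumPy sketch of the same idea, assuming two made-up features (temperature and altitude) and made-up weights; the weighted sum becomes a dot product:

```python
import numpy as np

# Made-up features: x_1 = temperature in degrees, x_2 = altitude in km.
x = np.array([-2.0, 1.5])
w = np.array([-0.5, 0.8])   # one weight per feature (made-up values)
b = 1.0

def f_wb(x, w, b):
    """Multi-variable linear regression: ŷ = w_1*x_1 + w_2*x_2 + b."""
    return np.dot(w, x) + b

print(f_wb(x, w, b))  # (-0.5 * -2) + (0.8 * 1.5) + 1 = 3.2
```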


Calculating the weights and bias.

Instinctively, for our linear regression equation to be most effective, we would want it to fit our example data as closely as possible. We want to find values for our weights and biases that minimise the error between ŷ and the actual value of y across all our example data.

We use a ‘cost function’ to quantify the difference across all example data; thus, we want to find the weights and biases that produce the minimal cost function output.

We can then use a method known as ‘gradient descent’ to determine which weights and biases produce the minimal cost function output. In future learning logs, I will cover cost functions and gradient descent in more detail.
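As a preview of what those future logs will cover, here is a minimal sketch of gradient descent minimising a mean-squared-error cost for the one-variable model; the data, learning rate, and iteration count are all made up for illustration:

```python
import numpy as np

# Made-up example data: temperature (degrees) -> snowfall (cm).
x = np.array([-1.0, -2.0, -5.0, -10.0])
y = np.array([1.2, 1.8, 3.1, 5.0])
m = len(x)

w, b = 0.0, 0.0        # start from an arbitrary guess
learning_rate = 0.01

for _ in range(10_000):
    y_hat = w * x + b                  # model predictions ŷ
    error = y_hat - y
    cost = (error ** 2).mean() / 2     # mean-squared-error cost
    # Gradients of the cost with respect to w and b.
    dw = (error * x).mean()
    db = error.mean()
    w -= learning_rate * dw            # step 'downhill' on the cost
    b -= learning_rate * db

print(w, b)  # weight and bias that (approximately) minimise the cost
```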

Summary

Machine Learning = machines help us determine an outcome without explicitly coding the logic that determines that outcome.
Supervised learning = subcategory of machine learning where we know the outcome type we want.
Regression algorithm = a method of supervised learning where we predict a continuous numeric outcome.
Linear regression = A type of regression algorithm where the predicted output is continuous and has a constant slope.

Disclosure

I am taking Andrew Ng’s Machine Learning Specialization, and these learning logs contain some of what I learned from it. It’s a great course. I highly recommend it!
