Introduction to Machine Learning

The field of Machine Learning has made many breakthroughs in the last few years, and it seems like this is only the beginning. The web has also evolved in many ways. It is no longer used just to host websites; it can also host web applications that rival native apps. The beauty of the web is that you can share anything with just one link.

In this series of articles, I will introduce Machine Learning to you, the web developer who is curious about the field. Throughout the series, we will learn various concepts and implement them in code. No prior knowledge of Machine Learning or mathematics/statistics is needed; all you need to follow along is familiarity with JavaScript.

Some intermediate JavaScript knowledge helps, such as ES6 features, promises, async/await, and basic data structures. Even if you are not familiar with most of these, you can learn them as you go.

What is Machine Learning

Machine Learning is the act of teaching a machine to perform a task without giving it explicit instructions for that task. Normally, as a programmer, you give instructions to your computer in the form of code, but in ML you give it data instead and let it work out the rules.

Data and Datasets

Data is the most important aspect of the Machine Learning process. In Machine Learning, a collection of data is organized into a dataset. Depending on the task, the dataset can be structured, as in tables, or unstructured, as in images. In a labeled dataset like the one below, each data point contains the actual data and its corresponding label.

Sentence	label
I hate you	0
I love you	1
Go f*ck yourself	0
Burn in Hades	0
Have a nice day	1
This is awesome	1

The table above is a dataset for the task of sentiment analysis. Sentiment analysis involves predicting whether the sentiment of a sentence is positive or negative. Each row pairs the data (the sentence) with a label that tells you whether the sentence is positive (1) or negative (0).
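To make the table concrete, here is one possible way to hold it in JavaScript. This is only a sketch: the property names `sentence` and `label` and the variables `X` and `y` are assumptions chosen for illustration, not a required format.

```javascript
// The sentiment dataset from the table above, as an array of objects.
const dataset = [
  { sentence: "I hate you", label: 0 },
  { sentence: "I love you", label: 1 },
  { sentence: "Go f*ck yourself", label: 0 },
  { sentence: "Burn in Hades", label: 0 },
  { sentence: "Have a nice day", label: 1 },
  { sentence: "This is awesome", label: 1 },
];

// Separate the data (X) from the labels (y).
const X = dataset.map((point) => point.sentence);
const y = dataset.map((point) => point.label);

console.log(X.length, y.length); // 6 6
console.log(X[1], y[1]); // I love you 1
```

Keeping the data and the labels in two parallel arrays means the label of `X[i]` is always `y[i]`.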

Datasets are usually divided into two parts:

Training sets
Testing sets

The training set is used to train the ML model while the testing set is used to check the performance of the model after training.

Let's take our sentiment dataset and split it into training and testing sets.

Training set

Sentence	label
I hate you	0
I love you	1
Go f*ck yourself	0
This is awesome	1

Testing set

Sentence	label
Burn in Hades	0
Have a nice day	1
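In practice the split is usually done in code rather than by hand. The sketch below uses a hypothetical helper, `trainTestSplit`, that shuffles a copy of the dataset and cuts it in two. Because the shuffle is random, it will generally pick different rows than the hand-made tables above; the helper name, the `testRatio` parameter, and the object shape are all assumptions.

```javascript
// Hypothetical helper: shuffle a copy of the dataset, then cut it in two.
function trainTestSplit(dataset, testRatio = 0.33) {
  const shuffled = [...dataset];
  // Fisher-Yates shuffle so the split does not depend on the original order.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const testSize = Math.round(shuffled.length * testRatio);
  return {
    testSet: shuffled.slice(0, testSize),
    trainSet: shuffled.slice(testSize),
  };
}

const sentimentData = [
  { sentence: "I hate you", label: 0 },
  { sentence: "I love you", label: 1 },
  { sentence: "Go f*ck yourself", label: 0 },
  { sentence: "Burn in Hades", label: 0 },
  { sentence: "Have a nice day", label: 1 },
  { sentence: "This is awesome", label: 1 },
];

const { trainSet, testSet } = trainTestSplit(sentimentData);
console.log(trainSet.length, testSet.length); // 4 2
```

Shuffling before splitting is a common precaution: if the rows happen to be sorted by label, a plain cut could leave one set with only positive or only negative sentences.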

Real-world datasets are usually massive and can contain millions of data points. In general, the more data that is available, the better the model tends to perform.

What is a Model

A model is what takes in the data and produces a desired result. To get there, the model goes through a process called training. Training is the act of showing a model data together with its corresponding labels. It is usually performed on the training set. After being supplied with data and labels, the model is able to pick up patterns in the data.

A model could be a mathematical equation or an advanced algorithm that is capable of learning. For the sake of simplicity, let's create a dummy model using JavaScript classes.

class Model {
  constructor() {
    // Initialize the model
  }

  train(X, y) {
    // Train the model using the training set
  }

  test(Xtest, ytest) {
    // Test the model using the testing set
  }

  predict(x) {
    // Predict the result for x
  }
}

const model = new Model()

The Model class above is a skeleton with no implementation yet, but it shows how actual ML models are typically used. The train method takes in the training set as X and y, where X represents the data and y represents its labels, and handles the training of the model. The test method takes in a testing set and uses it to evaluate the performance of the model. The testing phase usually returns an accuracy score.

model.test(Xtest, ytest)
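An accuracy score is simply the fraction of predictions that match the true labels. As a rough sketch, a test method could compute it with a helper like the one below; the function name `accuracy` and its arguments are hypothetical and not part of any library.

```javascript
// Fraction of predictions that match the true labels (between 0 and 1).
function accuracy(predictions, labels) {
  if (predictions.length !== labels.length || labels.length === 0) {
    throw new Error("predictions and labels must be non-empty and the same length");
  }
  let correct = 0;
  for (let i = 0; i < labels.length; i++) {
    if (predictions[i] === labels[i]) correct++;
  }
  return correct / labels.length;
}

// Three of the four predictions match their labels.
console.log(accuracy([0, 1, 1, 0], [0, 1, 0, 0])); // 0.75
```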

When it is time to use your model, you can call the predict method. Let us say we trained our model on the sentiment dataset above and we want to see whether it performs well on data it has never seen.

let sent = "I love this world"
model.predict(sent)

If our model was trained properly, it should predict 1, meaning the sentence is positive.
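To see the whole train, test, and predict cycle run end to end, here is a deliberately naive model. It is a toy sketch and not how real sentiment models work: the class name `WordScoreModel` and its word-counting rule are assumptions made up for this example. It gives each word a score based on the sentences it appeared in, then adds up the scores of the words in a new sentence.

```javascript
// Toy model: words seen in positive sentences gain 1, in negative ones lose 1.
class WordScoreModel {
  constructor() {
    this.scores = new Map();
  }

  // Lowercase the sentence and split it into words.
  tokenize(sentence) {
    return sentence.toLowerCase().split(/\s+/).filter(Boolean);
  }

  train(X, y) {
    X.forEach((sentence, i) => {
      const delta = y[i] === 1 ? 1 : -1;
      for (const word of this.tokenize(sentence)) {
        this.scores.set(word, (this.scores.get(word) || 0) + delta);
      }
    });
  }

  // Sum the word scores; unseen words count as 0. Positive total means label 1.
  predict(x) {
    let total = 0;
    for (const word of this.tokenize(x)) {
      total += this.scores.get(word) || 0;
    }
    return total > 0 ? 1 : 0;
  }

  // Return the accuracy on the testing set.
  test(Xtest, ytest) {
    const correct = Xtest.filter((x, i) => this.predict(x) === ytest[i]).length;
    return correct / Xtest.length;
  }
}

const toyModel = new WordScoreModel();
toyModel.train(
  ["I hate you", "I love you", "Go f*ck yourself", "This is awesome"],
  [0, 1, 0, 1]
);

console.log(toyModel.predict("I love this world")); // 1
console.log(toyModel.test(["Burn in Hades", "Have a nice day"], [0, 1])); // 0.5
```

The accuracy of 0.5 comes from "Have a nice day": none of its words appear in the four training sentences, so the toy model falls back to 0. This is a small illustration of why tiny datasets are not enough.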

Machine Learning Subcategories

Machine Learning is broadly divided into two categories:

Supervised ML
Unsupervised ML

Supervised Machine Learning: This branch of ML deals with models that require labels to perform their task. The model we sketched above is a supervised model because it needs labels y along with its data X. Supervised ML has two sub-categories:

Regression: This involves predicting a continuous value, like the price of a product, given a set of inputs.
Classification: This involves predicting discrete values, like whether a sentence is positive or negative.
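The sentiment task is a classification problem, since its output is a discrete 0 or 1. To show what a continuous prediction looks like, here is a small regression sketch that fits a straight line with ordinary least squares. The data is made up for illustration (it follows price = 2 * size + 1 exactly), and the names `fitLine` and `predictPrice` are hypothetical.

```javascript
// Fit y = slope * x + intercept with ordinary least squares.
function fitLine(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let numerator = 0;
  let denominator = 0;
  for (let i = 0; i < n; i++) {
    numerator += (xs[i] - meanX) * (ys[i] - meanY);
    denominator += (xs[i] - meanX) ** 2;
  }
  const slope = numerator / denominator;
  return { slope, intercept: meanY - slope * meanX };
}

// Made-up training data: product size (input) and price (label).
const sizes = [1, 2, 3, 4];
const prices = [3, 5, 7, 9];

const { slope, intercept } = fitLine(sizes, prices);
const predictPrice = (size) => slope * size + intercept;

console.log(slope, intercept); // 2 1
console.log(predictPrice(5)); // 11
```

Unlike the sentiment model, the prediction here can be any number, not just one of a fixed set of labels.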

Here are a few supervised learning models:

Unsupervised Machine Learning: This branch of ML deals with models that don't require labels to perform their task. If our model were unsupervised, here is how it would work.

const model = new Model()
model.train(X)

This train method doesn't require the labels y for training; all it needs is the data.
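As an illustration of training without labels, the sketch below groups plain numbers into two clusters using a simplified one-dimensional k-means with k = 2. Everything here is an assumption for the sake of the example: the class name `TwoClusterModel`, the choice of starting centers, and the fixed number of steps.

```javascript
// Toy unsupervised model: 1-D k-means with two clusters.
class TwoClusterModel {
  // Index (0 or 1) of the center nearest to x.
  closest(x, centers) {
    return Math.abs(x - centers[0]) <= Math.abs(x - centers[1]) ? 0 : 1;
  }

  // Note: train only receives data, no labels.
  train(X) {
    // Start the two centers at the smallest and largest value.
    let centers = [Math.min(...X), Math.max(...X)];
    for (let step = 0; step < 10; step++) {
      // Assign every point to its nearest center.
      const groups = [[], []];
      for (const x of X) groups[this.closest(x, centers)].push(x);
      // Move each center to the mean of its group (keep it if the group is empty).
      centers = groups.map((group, i) =>
        group.length ? group.reduce((a, b) => a + b, 0) / group.length : centers[i]
      );
    }
    this.centers = centers;
  }

  predict(x) {
    return this.closest(x, this.centers);
  }
}

const clusterModel = new TwoClusterModel();
clusterModel.train([1, 2, 3, 10, 11, 12]);

console.log(clusterModel.centers.join(", ")); // 2, 11
console.log(clusterModel.predict(4), clusterModel.predict(9)); // 0 1
```

The model was never told which numbers are "small" or "large"; it found the two groups from the data alone, and the cluster numbers 0 and 1 are just group ids, not labels supplied by us.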

Here are a few Unsupervised models:

This article has served as an intro to the field of ML. It is not an in-depth guide; it is more of a build-up to what is to come in the series. For a more in-depth guide, check out the following:

Crash Course AI: A YouTube series that explains Machine Learning concepts in a fun and exciting way.
Chapter 1 of Deep Learning with JavaScript by Shanqing Cai et al.
What is Machine Learning?: Short video by Google Cloud Tech.
Intro to ML by the TensorFlow YouTube channel.

In the next article, we will be looking at the fundamental data structure of ML, the tensor.
