Introduction to Machine Learning
Neal Davis
Posted on January 26, 2022
If you haven’t heard about Machine Learning (ML), Deep Learning (DL) or Artificial Intelligence (AI), you should consider catching up with the latest technology trends. These technologies are set to change numerous industry sectors.
What is Machine Learning?
Machine Learning, for instance, is being used in the automotive industry to predict whether a machine is likely to fail soon and to earmark it for an immediate checkup (“predictive maintenance”). It is also used in the healthcare industry to predict whether a patient has a particular disease, either from an X-ray scan or by analyzing the patient’s symptoms.
There have also been significant breakthroughs in fraud detection models which are capable of automatically identifying if a financial transaction is fraudulent or not.
Is artificial intelligence pure magic? How could a model possibly be able to solve these challenging problems? What is the difference between Artificial Intelligence, Machine Learning and Deep Learning? In this article you’ll get answers to all these questions and more.
The three terms in the figure above all fall under the same umbrella, which is Artificial Intelligence. In fact, Deep Learning is a subset of Machine Learning, which is in turn a subset of Artificial Intelligence. This hierarchy is also chronological: Machine Learning emerged as a breakthrough within the older field of Artificial Intelligence, and Deep Learning emerged in the same way within Machine Learning.
Artificial Intelligence is defined in Wikipedia as:
Intelligence demonstrated by machines, as opposed to natural intelligence displayed by animals including humans. Leading AI textbooks define the field as the study of "intelligent agents": any system that perceives its environment and takes actions that maximize its chance of achieving its goals.
You can think of Artificial Intelligence as a machine demonstrating intelligence rather than blindly following rules. AI consists of many subfields, and in this article we are going to concentrate on machine learning and deep learning.
Why do we need Machine Learning?
Let’s say you run a car dealership that sells new and used cars. You have no problem setting the selling price of a new car: you add your desired profit to the cost of the car. Easy, right?
So, what if someone wants to sell you their used car? Based on your experience, you would first look at the car’s model, color, miles driven, dents, year of production, etc. From these factors, you can estimate what the car is worth today, decide on your desired profit, and then sell the car to someone else.
What if you’re new to this business, or you have lost money before because you priced cars incorrectly? All you have is a price history of used cars sold in the last 2 years, along with their condition and several other factors describing each car.
This leaves you with two options. The first is to write a program with explicit rules, for example: if the car’s model is “X”, the year of production is “1997”, and the miles driven are between “20000” and “30000”, then the price of the car is estimated at $3,000, or even as a range.
But you can see where this is going, right? You cannot write a rule for every combination of features that affects the price of a used car. Even if you’re an expert, it would take a huge number of rules for the program to estimate prices accurately.
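To make the rule-based option concrete, here is a minimal Python sketch of the single rule described above. The function name `rule_based_price` and its parameters are hypothetical and chosen only for illustration.

```python
from typing import Optional


def rule_based_price(model: str, year: int, miles: int) -> Optional[int]:
    """Return a hard-coded price estimate, or None when no rule matches."""
    # The one rule from the text: model "X", built in 1997,
    # driven between 20,000 and 30,000 miles, is estimated at $3,000.
    if model == "X" and year == 1997 and 20000 <= miles <= 30000:
        return 3000
    # Every other combination would need its own hand-written rule.
    return None


print(rule_based_price("X", 1997, 25000))  # matches the rule
print(rule_based_price("X", 1998, 25000))  # no rule covers this car
```

Changing just one feature is enough for the program to have no answer, which is why this approach needs an impractical number of rules.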
How to Use Machine Learning
Solving the problem requires a scenario in which you have the sales history of the used cars containing the car’s features and the actual amount it was sold for. The data may look like this:
You may notice that we reversed the situation here. We didn’t just rely on the expert to provide rules for our program to estimate the price of a used car given its features. We also supplied the program with the data of the cars that were sold along with their features and we’re expecting the model to extract the rules by itself.
But how should the model know how to estimate the selling price from that historical sales data and those features? To answer this question, we will simplify the problem. Instead of having 3 features governing the car’s selling price, we will only have 1 feature, which is “Miles driven”.
Just imagine that the car’s selling price depends only on the number of miles it has been driven, regardless of its type, year of production and other factors. We should now have a much simpler dataset that looks like this:
You can observe that you can fit the data using a straight line like this:
You can also observe that there is some error, but that’s okay. In the end, not everything is perfect. So how did the machine learning model come up with such an equation in the first place? Is it a one-shot solution or is there a progression towards finding that straight line?
Well, before figuring out the perfect equation, which is shown in the above figure, the model first assumed random numbers for the slope and the intercept. Let’s say slope = 1.5 and y-intercept = 600. Then we would have a graph like this:
You can immediately spot that this is a bad start, as it predicts values which are not even close to the actual sales values. You can calculate a basic error formula in which you add up the errors to produce an indicator of how bad the model is.
The formula may be something like ∑ (actual value − predicted value), where the actual value is the point on the graph and the predicted value is the corresponding point on the straight line. In practice, the differences are usually squared or taken as absolute values before they are added up, so that positive and negative errors do not cancel each other out.
So, we could say that the first actual value is 20,000 while our prediction was 0. Consequently, the model now knows that it should reduce this error, adjusting its predictions step by step until it reaches a final, near-accurate formula, like this:
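As a rough illustration of the error calculation, the sketch below scores the starting guess from the text (slope = 1.5, y-intercept = 600) against a small sales history. The data values and the function names are made up for this example and are not taken from the article’s dataset. The squared version is a common alternative to the plain sum, included here as an assumption about how such an error is usually measured.

```python
# Hypothetical sales history (made up for illustration, not real data).
miles_driven = [10000, 30000, 50000, 70000]
actual_prices = [18000, 14000, 10000, 6000]


def predict(miles, slope, intercept):
    """Straight-line prediction: price = slope * miles + intercept."""
    return slope * miles + intercept


def sum_of_errors(xs, ys, slope, intercept):
    """The basic formula from the text: sum of (actual - predicted)."""
    return sum(y - predict(x, slope, intercept) for x, y in zip(xs, ys))


def mean_squared_error(xs, ys, slope, intercept):
    """Average of squared errors, so opposite signs cannot cancel out."""
    squared = [(y - predict(x, slope, intercept)) ** 2 for x, y in zip(xs, ys)]
    return sum(squared) / len(squared)


# Score the random starting guess from the text.
print(sum_of_errors(miles_driven, actual_prices, 1.5, 600))
print(mean_squared_error(miles_driven, actual_prices, 1.5, 600))

# A line that passes through every made-up point scores (close to) zero.
print(mean_squared_error(miles_driven, actual_prices, -0.2, 20000))
```

The large error for the starting guess is the signal that the slope and intercept need to change.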
This is an iterative process of training the machine learning model, in which each pass over the dataset is called an “epoch”. In each epoch, the model updates its slope and intercept to try to reduce the error across the whole dataset, gradually arriving at an equation that fits the problem.
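The iterative process can be sketched with gradient descent, one common way to make these updates. This is an assumption for illustration: the article does not say which update rule the model uses. The data is hypothetical and rescaled (miles in units of 100,000, prices in thousands of dollars) so that a simple fixed learning rate works, and the starting values, `learning_rate` and `epochs` are arbitrary choices.

```python
# Hypothetical data, rescaled so gradient descent behaves well:
# x is miles driven in units of 100,000 miles, y is price in $1,000s.
xs = [0.1, 0.3, 0.5, 0.7]
ys = [18.0, 14.0, 10.0, 6.0]


def mean_squared_error(xs, ys, m, c):
    """Average squared difference between actual and predicted prices."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, m=0.0, c=0.0, learning_rate=0.5, epochs=2000):
    """Fit y = m*x + c, making one update per pass over the data."""
    n = len(xs)
    losses = []
    for _ in range(epochs):
        errors = [y - (m * x + c) for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to m and c.
        grad_m = -2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_c = -2.0 / n * sum(errors)
        # Move both parameters a small step against their gradients.
        m -= learning_rate * grad_m
        c -= learning_rate * grad_c
        losses.append(mean_squared_error(xs, ys, m, c))
    return m, c, losses


slope, intercept, losses = train(xs, ys)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"error after first epoch={losses[0]:.3f}, after last={losses[-1]:.6f}")
```

The made-up points lie exactly on the line y = 20 − 20x, so the learned slope and intercept should approach −20 and 20 as the error shrinks.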
Taking this to the next level, remember that our original problem has 3 features, which we simplified to 1 feature. We will take the same steps, but instead of solving for 1 feature using the equation $y = mx + c$, where $y$ is the selling price and $x$ is the miles driven, we will solve for 3 features using the equation $y = m_1 x_1 + m_2 x_2 + m_3 x_3 + c$.
This equation describes a hyperplane, which is hard to visualize but can still be fitted by training the model. The best part is that the model can take various inputs and extract a relationship between them and the output, if one exists.
In the previous example, we were talking about a linear relationship between the inputs and the outputs. This may seem counterintuitive because the example had more than one feature, but the relationship is still linear because each feature is only multiplied by a constant. A non-linear relationship, by contrast, could look like $y = m_1 x_1^2 + m_2 x_2^2 + m_3 x_3^2 + c$.
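The same training idea extends to 3 features. The sketch below again assumes gradient descent, which the article does not confirm. The prices are synthetic, generated from known coefficients (`TRUE_M`, `TRUE_C`) so the result can be checked. The feature values are hypothetical and already scaled to the range 0 to 1, and all names are illustrative.

```python
# Synthetic example: prices are generated from known coefficients so we can
# check that training recovers them. Features are hypothetical and scaled
# to 0..1: (miles driven, age of the car, dent severity).
TRUE_M = [-10.0, -5.0, -2.0]
TRUE_C = 20.0

rows = [
    [0.1, 0.2, 0.0],
    [0.3, 0.1, 1.0],
    [0.5, 0.6, 0.0],
    [0.7, 0.4, 1.0],
    [0.9, 0.9, 0.5],
    [0.2, 0.8, 0.5],
]


def predict(x, m, c):
    """y = m1*x1 + m2*x2 + m3*x3 + c for one car."""
    return sum(mi * xi for mi, xi in zip(m, x)) + c


prices = [predict(x, TRUE_M, TRUE_C) for x in rows]


def train(rows, prices, learning_rate=0.1, epochs=20000):
    """Fit one coefficient per feature plus an intercept."""
    n, k = len(rows), len(rows[0])
    m, c = [0.0] * k, 0.0
    for _ in range(epochs):
        # Errors are computed once per epoch, before any parameter changes.
        errors = [y - predict(x, m, c) for x, y in zip(rows, prices)]
        for j in range(k):
            grad_j = -2.0 / n * sum(e * x[j] for e, x in zip(errors, rows))
            m[j] -= learning_rate * grad_j
        c -= learning_rate * (-2.0 / n * sum(errors))
    return m, c


learned_m, learned_c = train(rows, prices)
print([round(v, 3) for v in learned_m], round(learned_c, 3))
```

Nothing in the loop depends on there being exactly 3 features, which is what lets the model take various inputs.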
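The difference between the two kinds of relationship can be shown by evaluating both equations on the same inputs. The coefficients and feature values below are arbitrary numbers picked for illustration, and the function names are hypothetical.

```python
def predict_linear(x, m, c):
    """Linear: each feature is multiplied by its coefficient."""
    return sum(mi * xi for mi, xi in zip(m, x)) + c


def predict_squared(x, m, c):
    """Non-linear: each feature is squared before being multiplied."""
    return sum(mi * xi ** 2 for mi, xi in zip(m, x)) + c


m = [2.0, 0.5, 1.0]  # arbitrary coefficients m1, m2, m3
c = 1.0              # arbitrary intercept
x = [1.0, 2.0, 3.0]  # arbitrary feature values x1, x2, x3

print(predict_linear(x, m, c))   # 2*1 + 0.5*2 + 1*3 + 1 = 7.0
print(predict_squared(x, m, c))  # 2*1 + 0.5*4 + 1*9 + 1 = 14.0
```

In the linear version, increasing a feature by one unit always changes the output by the same amount. In the squared version, the change depends on how large the feature already is.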
Conclusion
This article provided an example of how machine learning can be used to solve challenging problems through “training” a model to find patterns and relationships in datasets. This is a complex topic and there are many great resources available if you want to further increase your knowledge and skills with Machine Learning.
The example in this document uses Amazon SageMaker and the linear learner algorithm. You can learn more about it here:
You can also get certified by Amazon Web Services (AWS) with the AWS Certified Machine Learning Specialty certification. This certification demonstrates skills with developing, architecting, or running machine learning/deep learning workloads in the AWS Cloud.