Deep Learning for Dummies #1 - Introduction

Deep learning is the state of the art today for many machine learning problems. With artificial neural networks, we can automate tasks that we once thought only humans could do. The future is now, and these algorithms bring technology closer to what humans can do. The best part is that engineers no longer have to hand-code the logic for the computer: the network learns how to perform a task from the data we feed it. In this post, I explain what deep learning is and how it works.

The problem

Machine learning problems generally fall into two types:

Classification: In this problem, the machine has to learn to tell classes apart. It should be able to classify the input data and assign a class to each input. For example, if you give it an image of a cat, it should learn to identify that image as a cat.
Regression: In this problem, the machine tries to predict a value from the input data. This value can be any number, and the machine should estimate it as closely as it can. For example, if you give it the last 100 daily closing prices of the S&P 500, it should try to predict the next day's closing price.

There are other kinds of problems in the literature, but these two are the most typical.

The Maths and Neurons

Let's think about the regression problem. Here the machine learns to return an output value based on an input value. This is similar to a mathematical function: you put in one or more numbers and get another number out. For example, take the following function:

f(x) = x + 1

Given a value, this function returns the next number in the sequence of natural numbers. In a regression problem, however, you don't have the mathematical expression that links your input data (x) to your output data (f(x)), and your goal is to find it. But how can you find this expression if you don't know what it looks like? The answer starts with Taylor polynomials.
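To make the idea of "finding the expression from data" concrete, here is a minimal sketch in Python. It assumes the hidden function is a straight line and recovers it with ordinary least squares; the helper name `fit_line` and the sample points are my own choices for illustration, not part of any library.

```python
# Hypothetical example: recover f(x) = x + 1 from samples only,
# assuming the unknown function is a line y = w * x + b.

def fit_line(xs, ys):
    """Return (w, b) that minimize the squared error of y = w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b

# The "data": inputs and outputs, without telling the code the formula.
xs = [0, 1, 2, 3, 4]
ys = [1, 2, 3, 4, 5]

w, b = fit_line(xs, ys)
print(w, b)  # the slope and intercept recovered from the data
```

With these points the recovered slope and intercept match f(x) = x + 1. Real data is noisy and rarely a perfect line, which is where more flexible approximations come in.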

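A Taylor polynomial has an expression like this:

f(x) ≈ f(a) + f'(a)(x - a) + f''(a)/2! (x - a)^2 + ... + f^(n)(a)/n! (x - a)^n

You can interpret it as a polynomial that estimates the real function's value from powers of its input multiplied by coefficients, which play the role of weights. The expression can have more terms, and the more terms it has, the more precise its estimates become near the point a.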
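As a quick sanity check of the "more terms, more precision" claim, here is a small Python sketch. I picked e^x expanded around a = 0 as the example function because its Taylor coefficients are simply 1/k!; the function name `taylor_exp` is hypothetical.

```python
import math

def taylor_exp(x, degree):
    """Taylor polynomial of e**x around a = 0, truncated at `degree`."""
    return sum(x ** k / math.factorial(k) for k in range(degree + 1))

x = 1.0
degrees = (1, 2, 4, 8)
# Absolute error of each truncated polynomial against the true value.
errors = [abs(math.exp(x) - taylor_exp(x, n)) for n in degrees]

for n, err in zip(degrees, errors):
    print(f"degree {n}: error {err:.6f}")
```

The printed error shrinks as the degree grows, which is the behaviour the paragraph above describes.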
In deep learning, we try something similar but with a more elaborate approach, because this expression resembles the way a neuron works. The neuron is the smallest unit of computation in deep learning, and it works as shown in the following diagram:

[Diagram: a neuron with several weighted inputs and a single output]
This is the simplest neuron in deep learning: it takes some inputs and returns one output. The output is a linear combination of its inputs. In other words, the neuron approximates a function by adding up its inputs multiplied by weights, and these weights are the values the neuron learns in order to approximate the function better.
With a single neuron you can already try to approximate a function, but the real power of neurons appears when they are combined.
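The neuron described above can be sketched in a few lines of Python. This is only an illustration: the function name `neuron` and the example numbers are mine, and the `bias` term is an assumption I added (a constant offset that most real neurons have but that this post does not discuss).

```python
def neuron(inputs, weights, bias=0.0):
    """Linear neuron: the sum of inputs multiplied by weights, plus a bias.

    `bias` is an assumption for illustration; with the default of 0.0
    this is exactly the weighted sum described in the text.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# Three inputs, three weights, one output.
output = neuron([1.0, 2.0, 3.0], [0.5, -1.0, 2.0], bias=0.5)
print(output)  # 1.0*0.5 + 2.0*(-1.0) + 3.0*2.0 + 0.5
```

Training a neuron means adjusting `weights` (and `bias`) until the output gets close to the values you want.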

Neural Networks

A single neuron is a good way to estimate a function, but it works like linear regression. What should you do if the function you need to estimate is more complex than a line? You can combine neurons, feeding the outputs of some into the inputs of others, to build a neural network.
Let's think about a neural network of two neurons like this:

[Diagram: two neurons in sequence, the output of the first feeding the input of the second]
With this approach, you are nesting the value of the first neuron inside the function that the second neuron is trying to estimate. Mathematically, it is a composition of functions, where f1 is the first neuron and f2 is the second:

y = f2(f1(x))
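Here is a rough Python sketch of two chained neurons. The weights are arbitrary numbers I made up, and the use of `math.tanh` as a nonlinear activation function is my assumption, since this post does not cover activations. I include it because chaining two purely linear neurons still gives a straight line, while adding a nonlinearity between them does not.

```python
import math

def linear_neuron(x, w, b):
    """One-input linear neuron: w * x + b."""
    return w * x + b

def chain_linear(x):
    """Two linear neurons in sequence: 3 * (2x + 1) - 1, which is 6x + 2."""
    hidden = linear_neuron(x, w=2.0, b=1.0)
    return linear_neuron(hidden, w=3.0, b=-1.0)

def chain_tanh(x):
    """Same chain, but the first neuron's output goes through tanh (assumed)."""
    hidden = math.tanh(linear_neuron(x, w=2.0, b=1.0))
    return linear_neuron(hidden, w=3.0, b=-1.0)

# A line changes by the same amount for every equal step in x.
linear_steps = [chain_linear(x + 1) - chain_linear(x) for x in (0.0, 1.0)]
tanh_steps = [chain_tanh(x + 1) - chain_tanh(x) for x in (0.0, 1.0)]

print(linear_steps)  # both steps are equal: still a line
print(tanh_steps)    # steps differ: no longer a line
```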

With this expression we can estimate more complex functions, and if you want to estimate even more complex ones, you can append more neurons. Note that this only works if each neuron applies a nonlinear activation function to its weighted sum; a chain of purely linear neurons collapses into a single linear function. But what neural networks are really famous for is their layers. A layer is a group of neurons positioned at the same level in the chain. The neurons of one layer are connected to the neurons of the next, building deep neural networks. These networks can estimate functions with good performance, which is why they are called universal function approximators. An example is shown in the following image:

[Image: a deep neural network with several layers of connected neurons]
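To show how layers fit together, here is a minimal forward pass in plain Python. Everything in it is an assumption for illustration: the names `dense_layer` and `forward`, the layer sizes, the hand-picked weights, and the choice of `math.tanh` as the activation. A real network would learn its weights during training rather than have them typed in.

```python
import math

def identity(z):
    """No activation: return the weighted sum unchanged."""
    return z

def dense_layer(inputs, weights, biases, activation):
    """One layer: each row of `weights` is one neuron that reads all inputs."""
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Feed the outputs of each layer into the next one."""
    values = inputs
    for weights, biases, activation in layers:
        values = dense_layer(values, weights, biases, activation)
    return values

layers = [
    # Hidden layer: 3 neurons, each reading the 2 inputs.
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1], math.tanh),
    # Output layer: 1 neuron reading the 3 hidden values, no activation.
    ([[1.0, -1.0, 0.5]], [0.2], identity),
]

prediction = forward([1.0, 2.0], layers)
print(prediction)  # a list with a single predicted value
```

Adding more entries to `layers` makes the network deeper; adding more rows to a layer's weights makes that layer wider.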
This architecture is the basis of all neural network architectures. Given good data and good training, these networks can often estimate functions better than other state-of-the-art machine learning algorithms.
In the next post, I will show you how these neural networks learn and what the training process looks like.

Antonio Triguero
