Neural Networks describe the World!! - Neural Networks Part 1

What you're currently watching is a neural network trying to learn a sine function. This is the first part of a neural networks series in which we will work through the inner workings of neural networks, from neurons, layers, and activation functions to optimization and backpropagation, all from scratch in Python. In this post, we will first see why neural networks can accomplish almost anything, then look at their architecture, and finally dive deep into the power that a single neuron holds.

I would like to start with a bold statement: neural networks can learn almost anything and model the world around us, because they are universal function approximators. To understand why that is the case, we first need to understand why functions are so important. Well, that's because

Functions Describe the World! - Thomas A. Garrity

Everything, and I mean everything, is described by functions. The sound of your voice can be described by a trigonometric function, an object thrown at an angle follows projectile motion, which can be described by a quadratic function, the alternating current flowing inside your device can be described by a sine or cosine function, the orbit of the Earth around the Sun can also be described by a function, and the list goes on and on.

So the world can fundamentally be described with numbers and the relationships between those numbers, and we call these relationships functions. We can use functions to understand, model, and predict the world around us. Functions are input-output machines: they take an input set of numbers and output a corresponding set of numbers.

The goal of Artificial Intelligence is to understand, model, and approximate the functions that describe the world around us, and that is what neural networks do: they are function-building machines. The problem that neural networks solve is approximating functions whose definitions we don't know. They learn to map inputs to the desired outputs from examples, without being explicitly programmed with the rule.

We provide the neural network with a sample of data points, each an input paired with its corresponding output, and it must approximate a function that fits these data points and lets us accurately predict the outputs for new inputs. This process is called curve fitting, as shown in the above example, where we have provided the neural network with a sample data set of a sine function.
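
To make "a sample of data points" concrete, here is a minimal sketch of what such a data set could look like for the sine example. The number of points and the range are assumptions picked for illustration, not the values used in the animation.

```python
import math

# Assumed for illustration: 20 evenly spaced inputs over one full period
num_points = 20
xs = [2 * math.pi * i / (num_points - 1) for i in range(num_points)]

# The matching outputs: this is the function the network has to approximate
ys = [math.sin(x) for x in xs]

# Show the first few (input, output) pairs
for x, y in list(zip(xs, ys))[:3]:
    print(f"input={x:.3f} -> output={y:.3f}")
```

The network never sees `math.sin` itself, only pairs like these.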

To understand how exactly neural networks achieve this, we must have a look at their architecture.

Architecture of a Neural Network

Above is a condensed example of what a neural network looks like under the hood. We call them neural networks because they kind of look like a network of neurons. They are composed of interconnected nodes called neurons, organized into layers. When data passes through this network, it starts at the input layer, then gets passed through the hidden layers, and finally through the output layer. Let us have a look at the building blocks of a neural network.

Neurons

Neurons, also known as nodes, are the basic processing units of a neural network. In a fully connected network like the one above, every neuron has its own connection to each neuron in the previous layer. For a given neuron, the output of every neuron in the previous layer feeds into it as an input.

Weights and Biases

Each connection between neurons has a weight associated with it. These weights determine the strength of the connection and are learned during the training process. Additionally, each neuron has a bias term that allows for fine-tuning the neuron's behavior. You can look at weights and biases as little knobs that we can tune to improve the output. A weight changes the magnitude of its input, while the bias offsets the final output positively or negatively.

Layers

Neural networks are organized into layers, with each layer consisting of a group of neurons.

Input Layer: The input layer receives the initial dataset.
Hidden Layer: Hidden layers are the intermediate layers that process the data. We call them hidden because we do not set or observe their values directly: they sit between the input and the output, and their weights and biases are tuned by algorithms like backpropagation.
Dense Layer: In a dense layer, each neuron of a given layer is connected to every neuron of the next layer, which means that its output value becomes an input for the next neurons.
Output Layer: The output layer is the final layer of the neural network, and it produces the network's output based on the processed information from the hidden layers. The number of nodes in the output layer varies based on our output requirements.
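
As a rough preview of what a dense layer computes, here is a minimal sketch of three neurons that each receive the same four inputs. All numbers are made up for illustration, and activation functions are left out on purpose since they come later in the series. The per-neuron calculation it uses is explained in the next section.

```python
# Outputs of the previous layer (made-up values)
inputs = [1.0, 2.0, 3.0, 2.5]

# One row of weights per neuron, one weight per incoming connection
weights = [
    [0.2, 0.8, -0.5, 1.0],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, -0.27, 0.17, 0.87],
]

# One bias per neuron
biases = [2.0, 3.0, 0.5]

layer_outputs = []
for neuron_weights, neuron_bias in zip(weights, biases):
    # Weighted sum of all inputs for this neuron, plus its bias
    total = sum(x * w for x, w in zip(inputs, neuron_weights)) + neuron_bias
    layer_outputs.append(total)

print(layer_outputs)  # roughly [4.8, 1.21, 2.385]
```

Each value in `layer_outputs` would then become an input to every neuron in the next layer.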

This might seem rather complicated. Neural networks are considered to be “black boxes” in that we often have no idea why they reach the conclusions they do. We do understand how they do this, though. Let us first understand how each neuron works at a base level by simulating only one neuron from the above network.

Simulating a single neuron

If we isolate one single neuron and look at how it works, its output is the sum of the products of its input values and their corresponding weights, plus a bias term.

Output Equation
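
Written as a formula, the description above looks like this. The symbol names are chosen here for illustration: a neuron with n inputs x_i, one weight w_i per input, and a bias b.

```latex
\[
\text{output} = \sum_{i=1}^{n} x_i w_i + b
              = x_1 w_1 + x_2 w_2 + \dots + x_n w_n + b
\]
```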

Implementation in Python

Let's say we are given a single neuron with three inputs. As in most cases when you initialize parameters in neural networks, our network will start with randomly initialized weights and biases set to zero. We then incrementally update and correct the weights and biases using backpropagation and optimization, which will become apparent later on. To simulate the workings for now, we will assume dummy values for the inputs, weights, and bias.

inputs = [3, 2, 1]            # inputs from previous neurons
weights = [0.4, -0.8, 0.1]    # weights of the inputs
bias = 3                      # our neuron's unique bias

# output
output = inputs[0] * weights[0] + inputs[1] * weights[1] + inputs[2] * weights[2] + bias
print(output)

The output should come out to be about 2.7 (floating-point arithmetic may print a few extra trailing digits).
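
The same calculation can be written so that it works for any number of inputs. This is a minimal sketch, and the function name `neuron_output` is made up for this post.

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, for any number of inputs."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    # Pair every input with its weight, multiply, and add everything up
    return sum(x * w for x, w in zip(inputs, weights)) + bias


result = neuron_output([3, 2, 1], [0.4, -0.8, 0.1], 3)
print(result)  # about 2.7, same as the hand-written version
```

Using `zip` keeps each input next to its weight, so adding a fourth input only means adding one value to each list.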

Understanding how exactly weights and biases come into play

To understand how weights and bias affect the output, we will be training a neural network to predict taxi fare based on distance. To do so we will condense our neural network to have only one neuron, one weight, one bias, and one output.

The output equation for our above neural network will be

output = weight × input + bias

If you notice, the output is a linear equation

y = mx + c

where the slope (m) is our weight and the y-intercept (c) is our bias.

From the above depiction, weights impact the slope of our network's output line. As we increase the value of the weight, the slope gets steeper. If we decrease the weight, the slope gets flatter. If we negate the weight, the slope changes sign and the line falls instead of rising.

Now let's take a look at how biases impact our network. The purpose of the bias is to offset the output positively or negatively. As we increase the bias, the function output overall shifts upward. If we decrease the bias, then the overall function output will move downward.
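
Both effects are easy to check with a few numbers. This is a small sketch with a one-input neuron, where the function name `neuron` and all the values are picked for illustration.

```python
def neuron(x, weight, bias):
    """A single neuron with one input: a straight line y = weight * x + bias."""
    return weight * x + bias


xs = [0, 1, 2, 3]

baseline = [neuron(x, 1.0, 0.0) for x in xs]    # [0.0, 1.0, 2.0, 3.0]
steeper = [neuron(x, 2.0, 0.0) for x in xs]     # [0.0, 2.0, 4.0, 6.0]
negated = [neuron(x, -1.0, 0.0) for x in xs]    # [0.0, -1.0, -2.0, -3.0]
shifted_up = [neuron(x, 1.0, 2.0) for x in xs]  # [2.0, 3.0, 4.0, 5.0]

print(baseline, steeper, negated, shifted_up)
```

Changing the weight changes how fast the output grows with the input, while changing the bias moves every output by the same amount.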

With these concepts in mind, we can apply them to train our neural network to predict taxi fares based on distance.

In the above example, we have randomly generated distance and fare values such that the taxi fare increases as the distance increases. The neural network is trained on this data: it iteratively tunes its weight and bias until it predicts fare prices accurately.
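
Here is a minimal sketch of what that iterative tuning could look like for our one-neuron network. It makes several assumptions: the data is generated from a hypothetical pricing rule (a base fare of 2.5 plus 1.8 per km, with some noise), the error is measured as mean squared error, and the updates use plain gradient descent. The training method itself is covered properly later in the series, so treat this as a preview rather than the exact procedure behind the animation.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical pricing rule, used only to generate toy data
TRUE_RATE, TRUE_BASE = 1.8, 2.5
distances = [random.uniform(1.0, 10.0) for _ in range(100)]
fares = [TRUE_RATE * d + TRUE_BASE + random.gauss(0.0, 0.3) for d in distances]

weight, bias = 0.0, 0.0  # start with an untrained neuron
learning_rate = 0.01     # assumed step size


def mean_squared_error(w, b):
    """Average squared gap between predicted and actual fares."""
    return sum((w * d + b - f) ** 2 for d, f in zip(distances, fares)) / len(fares)


initial_loss = mean_squared_error(weight, bias)

for step in range(5000):
    grad_w = 0.0
    grad_b = 0.0
    for d, f in zip(distances, fares):
        error = (weight * d + bias) - f  # prediction minus actual fare
        grad_w += 2 * error * d          # how the squared error changes with the weight
        grad_b += 2 * error              # how the squared error changes with the bias
    # Nudge both knobs a little in the direction that lowers the average error
    weight -= learning_rate * grad_w / len(fares)
    bias -= learning_rate * grad_b / len(fares)

final_loss = mean_squared_error(weight, bias)
print(f"weight={weight:.2f}, bias={bias:.2f}, loss={final_loss:.3f}")
```

After training, the weight should land close to the per-km rate and the bias close to the base fare, which is exactly the slope and intercept reading from before.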

With this, you should have some understanding of how weights and biases both shape the output of a neuron, each in a slightly different way: the weight scales the input, and the bias shifts the result.

Conclusion

We have learned that neural networks are powerful computational models that learn from data to understand and predict various aspects of our world. Neural networks consist of interconnected nodes called neurons that adjust their weights and biases to solve complex problems.

We demonstrated how they can approximate functions by predicting taxi fares based on distance, and that only scratches the surface of the power they hold. These networks excel at tasks like image recognition, object detection, language processing, and more. Trained on data, neural networks become sophisticated problem solvers, making them valuable tools for solving real-world challenges and advancing artificial intelligence.

This is the first part of the neural network series that I am starting. In the later parts, we will be learning about activation functions and backpropagation, and coding our neural networks along the way. Buckle up, because it's about to get interesting.

I would like to thank all of you for reading this article. Writing these blogs pushes me to learn as well, and I enjoyed writing this one. I have linked below all the resources I used in writing this blog, along with some more interesting content I found. Please check them out as well.