Forward Propagation In Neural Networks


Let us get to the topic directly. Exactly what is forward propagation in neural networks? Well, if you break down the words, forward implies moving ahead and propagation means the spreading of something. Forward propagation means we are moving in only one direction, from the input to the output, in a neural network. Think of it as moving across time, where we have no option but to forge ahead, and just hope our mistakes don’t come back to haunt us.

Now, if you are thinking that neural networks have very little use in trading, we have to tell you that many quant hedge funds are said to have moved beyond simple neural networks to deep learning and AI to keep an edge over the others. Firms such as Renaissance Technologies and Two Sigma are often cited as examples.

A brief history of Neural Networks

We have tried to understand how humans work since time immemorial. In fact, even philosophy is, in effect, an attempt to understand the human thought process. But it was only in recent years that we started making progress on understanding how our brain operates. And this is where conventional computers differ from humans.

You see, while we can develop an algorithm to solve a problem, we have to make sure we have taken into account all sorts of possibilities. Humans, on the other hand, might start off with limited or incomplete information, but we ‘learn’ and solve problems faster and with greater accuracy. Well, at least that’s what everyone says!

Thus, we started to research and develop artificial brains, which are actually called neural networks now.

The basic structure in the neural network is the perceptron, which is modelled after the neurons in our brains.

There are inputs to the neuron marked with yellow circles, and the neuron emits an output signal after some computation.

The input layer resembles the dendrites of the neuron and the output signal is the axon. Each input signal is assigned a weight, w_i. This weight is multiplied by the input value, and the neuron computes the weighted sum of all the input variables.

An activation function is then applied to the weighted sum, which results in the output signal of the neuron.
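The weighted sum and activation described above can be sketched in a few lines of Python. This is only a minimal illustration: the step (threshold) activation, the function names and the sample inputs and weights are all assumptions made for the example, not something specified in this article.

```python
def step(z):
    """Threshold activation (an assumed choice): 1 if z is positive, else 0."""
    return 1 if z > 0 else 0


def neuron_output(inputs, weights, activation=step):
    """Multiply each input by its weight, add them up, then apply the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)


# Hypothetical inputs and weights, chosen only for illustration.
sample_inputs = [1.0, 0.5, -1.0]
sample_weights = [0.4, 0.6, 0.2]
print(neuron_output(sample_inputs, sample_weights))
```

Any other activation can be passed in through the `activation` argument; the step function is simply the easiest one to follow by hand.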

A popular example of neural networks is the image recognition software which can identify faces and is able to tag the same person in different lighting conditions as well. That being said, let us understand forward propagation in more detail now.

What is forward propagation in Neural Networks?

One of the first neural networks used the concept of forward propagation. I’ll try to explain forward propagation with the help of a simple equation of a line.

We all know that a line can be represented with the help of the equation: y = mx + b

Where y is the y coordinate of the point, m is the slope, x is the x coordinate and b is the y-intercept i.e. the point at which the line crosses the y-axis.

But why are we jotting the line equation here? This will help us later on when we understand the components of a neural network in detail.

Remember how we said neural networks are supposed to mimic the thinking process of humans? Well, let’s just assume that we do not know the equation of a line, but we do have graph paper and we draw a line randomly on it.

For the sake of this example, you drew a line through the origin and when you saw the x and y coordinates, they looked like this:

This looks familiar. If I asked you to find the relation between x and y, you would directly say it is y = 3x. But let us go through the process of how forward propagation works.

We will assume here x is the input and y is the output.

The first step here is the initialisation of the parameters. We will guess that y must be some multiple of x. So we will assume that y = 5x and see the results. Let us add this to the table and see how far we are from the answer.

Note that taking the number 5 is just a random guess and nothing else. We could have taken any other number here. I should point out that here we can term 5 as the weight of the model.
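As a rough sketch, this first guess can be written as a one-weight forward pass. The sample points are an assumption: we take x = 1 to 5 on the line y = 3x, which is consistent with the relation and the error figures quoted in this article. The function name `forward` is likewise just an illustrative choice.

```python
def forward(x, weight):
    """Forward pass of the one-weight model: prediction = weight * x."""
    return weight * x


# Assumed sample points on the line y = 3x (x = 1..5 is an assumption).
xs = [1, 2, 3, 4, 5]
ys = [3 * x for x in xs]

guess = 5  # the randomly chosen initial weight from the text
predictions = [forward(x, guess) for x in xs]

for x, y, p in zip(xs, ys, predictions):
    print(f"x={x}  actual y={y}  predicted y={p}")
```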

All right, this was our first attempt, now we will see how close (or far) we are from the actual output.

One way to do that is to take the difference between the actual output and the output we calculated. We will call this the error. Here, we aren’t concerned with the positive or negative sign, and hence we take the absolute value of the difference. Thus, we will update the table now with the error.

If we take the sum of this error, we get the value 30. But why did we total the error? Since we are going to try multiple guesses to come to the closest answer, we need to know how close or how far we were from the previous answers. This helps us refine our guess and calculate the correct answer.
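The total above can be checked with a short sketch. It again assumes the points x = 1 to 5 on y = 3x (an assumption that reproduces the total of 30), and the helper name is hypothetical.

```python
def total_absolute_error(xs, ys, weight):
    """Sum of |actual - predicted| for the one-weight model y = weight * x."""
    return sum(abs(y - weight * x) for x, y in zip(xs, ys))


# Assumed sample points on the line y = 3x.
xs = [1, 2, 3, 4, 5]
ys = [3 * x for x in xs]

print(total_absolute_error(xs, ys, weight=5))  # prints 30
```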

Wait. But if we just add up all the error values, it feels like we are giving equal weight to all the answers. Shouldn’t we penalise the values which are way off the mark? For example, an error of 10 is far worse than an error of 2. It is here that we introduce the somewhat famous “Sum of Squared Errors”, or SSE for short. In SSE, we square all the error values and then add them. Thus, the error values which are very high get exaggerated, which helps us in knowing how to proceed further.
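To see how squaring exaggerates the larger errors, here is a small sketch that prints the absolute and squared error side by side. The points x = 1 to 5 on y = 3x are assumed for illustration, and the function name is hypothetical.

```python
def squared_errors(xs, ys, weight):
    """Per-point squared error for the one-weight model y = weight * x."""
    return [(y - weight * x) ** 2 for x, y in zip(xs, ys)]


# Assumed sample points on the line y = 3x.
xs = [1, 2, 3, 4, 5]
ys = [3 * x for x in xs]

per_point = squared_errors(xs, ys, weight=5)
for x, y, sq in zip(xs, ys, per_point):
    print(f"x={x}  absolute error={abs(y - 5 * x)}  squared error={sq}")

sse = sum(per_point)  # the Sum of Squared Errors
print("SSE:", sse)
```

Notice that the largest error is five times the smallest before squaring, but twenty-five times the smallest after squaring.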

Let’s put these values in the table below.

Now the SSE for the weight 5 (recall that we assumed y = 5x) is 220: with individual errors of 2, 4, 6, 8 and 10, the squares 4, 16, 36, 64 and 100 add up to 220. We call this the loss function. The loss function tells us how well the neural network is performing, and it also helps us when we incorporate backpropagation in the neural network.

All right, so far we have understood the principle of how the neural network tries to learn. We have also seen the basic principle of the neuron. Let us now understand forward propagation in the neural network itself.
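Since the idea is to try several guesses and compare their losses, a minimal sketch of that comparison might look like this. The candidate weights and the sample points (x = 1 to 5 on y = 3x) are assumptions for illustration, and this brute-force comparison is not how a real network updates its weights; that is the job of backpropagation.

```python
def sse_loss(xs, ys, weight):
    """Sum of squared errors for the one-weight model y = weight * x."""
    return sum((y - weight * x) ** 2 for x, y in zip(xs, ys))


# Assumed sample points on the line y = 3x.
xs = [1, 2, 3, 4, 5]
ys = [3 * x for x in xs]

# Hypothetical guesses for the weight, starting from the article's guess of 5.
candidate_weights = [5, 4, 3, 2]
losses = {w: sse_loss(xs, ys, w) for w in candidate_weights}

for w, loss in losses.items():
    print(f"weight={w}  loss={loss}")

best_weight = min(losses, key=losses.get)
print("best guess:", best_weight)
```

The loss drops to zero at the weight 3, which matches the relation y = 3x we spotted by eye.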

Components of forward propagation model

In the above diagram, we see a neural network consisting of three layers. The first and the third layer are straightforward, input and output layers. But what is this middle layer and why is it called the hidden layer?

Now, in our example, we had just one equation, so we have only one neuron in each layer.

Nevertheless, the hidden layer consists of two functions:

• Pre-activation function: The weighted sum of the inputs, plus a bias term, is calculated in this function.
• Activation function: Here, an activation function is applied to the weighted sum. This is what makes the network non-linear, which lets it learn relationships more complex than a straight line. The non-linearity comes from the activation function itself; the bias only shifts the weighted sum before the activation is applied.

That’s all there is to know about forward propagation in Neural networks.
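For readers who would like to see the two steps in code, here is a minimal sketch of a forward pass through a network with one input, one hidden neuron and one output. The sigmoid activation, the parameter values and all function names are assumptions made for this example; the article does not prescribe a particular activation function.

```python
import math


def pre_activation(x, weight, bias):
    """Weighted input plus bias, the same shape as the line equation y = mx + b."""
    return weight * x + bias


def sigmoid(z):
    """Sigmoid activation (an assumed choice): squashes z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One input -> one hidden neuron -> one output, moving strictly forward."""
    z = pre_activation(x, w_hidden, b_hidden)  # step 1: pre-activation
    a = sigmoid(z)                             # step 2: activation
    return pre_activation(a, w_out, b_out)     # output layer, kept linear here


# Hypothetical parameter values, chosen only for illustration.
print(forward(0.0, w_hidden=1.0, b_hidden=0.0, w_out=2.0, b_out=0.5))
```

With these values the hidden pre-activation is 0, the sigmoid turns it into 0.5, and the output layer maps that to 2 × 0.5 + 0.5 = 1.5.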