Neural Networks are at the core of Artificial Intelligence. They are used by organisations around the world to innovate, automate tasks and become more competitive. In this article I will explain in plain terms and without any mathematics or technical terms what they are and how you can use them to innovate in your domain.
When the term “Artificial Intelligence” comes up in a conversation, most people probably think of a robotic brain made of electronic parts that is capable of self-awareness… I recently wrote a blog post about the amazing and boring truth about Artificial Intelligence, where I explain in simple terms why there is nothing inherently intelligent about Artificial Intelligence… It’s just computer software using maths to make complicated calculations.
Neural Networks are one of the most popular ways of doing Machine Learning. As it turns out, we gave them that name simply because the architecture of the software that does all the calculations was inspired by our understanding of the human brain.
The human brain is the central processing unit of the human body, and it is made up of approximately 86 billion cells called neurons. Each neuron has a nucleus and connects to other neurons through axons that form synapses. By some estimates we have up to one quadrillion synapses in our brain, and they carry signals from one neuron to the next. Each neuron works as a “gateway” that decides whether the electrical signal it receives needs to pass to the next neuron or not.
Every time there is an “input” from our body, say for example our hand touches a burning surface, an electric signal is sent from the hand to our brain and in a very short period of time, the electrical signal will travel from neuron to neuron, processing the information, until an “output” signal is sent back to the hand, telling it to get away from the burning surface as soon as possible.
In the late 1950s, an American psychologist by the name of Frank Rosenblatt came up with the concept of the perceptron. The perceptron is an algorithm for supervised learning that was modeled after the human neuron and as such, it has:
The process repeats itself multiple times and the outputs become new inputs to other processing units.
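If you are curious what a single processing unit looks like in practice, here is a minimal sketch in Python (feel free to skip it). It assumes the usual textbook form of a perceptron: inputs, one weight per input, a bias, and a yes/no output. The function name and all the numbers are made up for illustration.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum of the inputs is above zero, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0

# Hypothetical numbers: two inputs, one weight per input, and a bias.
signal = perceptron(inputs=[1.0, 0.5], weights=[0.6, -0.4], bias=-0.3)
print(signal)  # 1 means "pass the signal on", 0 means "stay quiet"
```

The output of one such unit can then be handed to another unit as one of its inputs, which is the chaining described above.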
Let’s say I am a very famous individual and every day I receive hundreds of invitations to participate in events in my mail. I am losing too much time going through each invitation and deciding if I should participate or not. I decide I want to train a virtual secretary and provide it with a Neural Network so that it can do the job for me.
The secretary will be able to filter future invitations and only send me those that are worth my time and attention.
For the sake of simplicity, let’s say there are only two factors that I want to take into consideration when deciding whether to attend an event or not: how interesting the topic is and how far the event is from my place.
If the topic is very interesting and the event is close to my place, I will definitely attend it. If on the contrary, the event is boring and it is also very far away from my place, I will most definitely not go. The neural network would look something like this:
Let’s say I want to spend some time with my virtual secretary training it to correctly filter my invitations. If I take 20 examples, I can teach my virtual secretary the correct outputs for each example.
If I plot all the examples on a two-dimensional graph, the results would look something like this.
The ultimate goal of Machine Learning is to find the mathematical equation that most accurately represents the decision boundary, so that it can correctly predict whether I will attend a future event or not when I feed it new data.
With the given training data, we can see this is a classical example of binary classification, where the boundary that separates my decision making is a simple straight line.
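For readers who want to see the training step, here is a small sketch using the classic perceptron learning rule. The 20 examples are invented for illustration (interest and distance are both scored from 0 to 10), and the function names are hypothetical. This is one simple way to find a straight-line boundary, not the only one.

```python
# Hypothetical training set: (interest, distance, attended).
examples = [
    (9, 1, 1), (8, 2, 1), (7, 3, 1), (10, 5, 1), (6, 2, 1),
    (9, 6, 1), (8, 5, 1), (5, 1, 1), (7, 4, 1), (10, 7, 1),
    (1, 9, 0), (2, 8, 0), (3, 7, 0), (5, 10, 0), (2, 6, 0),
    (6, 9, 0), (5, 8, 0), (1, 5, 0), (4, 7, 0), (7, 10, 0),
]

def predict(weights, bias, interest, distance):
    """Answer 1 (attend) if the point falls on the 'yes' side of the line."""
    score = weights[0] * interest + weights[1] * distance + bias
    return 1 if score > 0 else 0

def train(examples, max_epochs=1000):
    """Nudge the weights after every wrong answer until none are left."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for interest, distance, attended in examples:
            error = attended - predict(weights, bias, interest, distance)
            if error != 0:
                weights[0] += error * interest
                weights[1] += error * distance
                bias += error
                mistakes += 1
        if mistakes == 0:  # every example is classified correctly
            break
    return weights, bias

weights, bias = train(examples)
correct = sum(predict(weights, bias, i, d) == a for i, d, a in examples)
print(f"{correct} of {len(examples)} examples classified correctly")
```

The learned weights and bias are the “equation” of the straight line: everything on one side is an event I attend, everything on the other side is one I skip.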
Sometimes the dataset will not necessarily fit a straight line…
What would happen for instance if the event topic is interesting but VERY far away from my place, say for example in another city?
It is very likely there is a maximum distance I am willing to travel to attend any given event no matter how interesting it is.
In this case we would need to adapt our mathematical equation so that it can provide more accurate results and the result would look a little bit like this.
This is known as a “non-linear” decision boundary and it is probably a more accurate representation of reality. Non-linear models can be applied to many different datasets to draw the prediction or “classification” of a decision more accurately.
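One way to picture “adapting the equation” is to add a curved term to it. The sketch below is only an illustration: the weights are invented, and squaring the distance is just one of many possible ways to bend the boundary so that far-away events are rejected no matter how interesting they are.

```python
def attends(interest, distance):
    """Hypothetical curved rule: distance counts more and more as it grows."""
    score = 1.0 * interest - 0.2 * distance - 0.1 * distance ** 2
    return score > 0

print(attends(6, 3))    # fairly interesting and close by
print(attends(10, 12))  # as interesting as it gets, but very far away
```

With these made-up weights, even an event with the maximum interest score of 10 is turned down once the distance gets large enough, which a single straight line could not express.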
In our example, we only used two parameters to predict an outcome, but in reality, there are multiple criteria that could influence my decision, such as:
Not every input has the same weight on the output. For instance, an event can be very interesting and close to my place, but if I am very sick, there is no way I am getting out of bed.
In this case the input “health” has a very high weight (impact) on the output.
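Here is a small sketch of what “weight” means in practice. The input names and numbers are hypothetical; the only point is that one input with a large weight can override all the others.

```python
# Hypothetical weights: how strongly each input pushes the decision.
weights = {"interesting": 1.0, "close_by": 1.0, "sick": -10.0}
bias = -1.0

def will_attend(event):
    """Add up each input multiplied by its weight and check the sign."""
    score = sum(weights[name] * value for name, value in event.items()) + bias
    return score > 0

healthy = {"interesting": 1, "close_by": 1, "sick": 0}
sick = {"interesting": 1, "close_by": 1, "sick": 1}
print(will_attend(healthy), will_attend(sick))
```

Both events are equally interesting and equally close, but the heavy negative weight on “sick” flips the answer.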
Sometimes it is simply impossible to classify all data with a line. In 1989, mathematician George Cybenko proved the Universal Approximation Theorem. The theorem showed that a neural network with a hidden layer can approximate virtually any continuous function, which means multilayer networks can draw boundaries that no single straight line can.
Multilayer networks have different “hidden” layers of inputs that are used as additional steps in the final calculation of the output.
It is not necessarily simple to understand what these hidden units mean: the more hidden layers there are in a neural network, the more abstract the concepts they represent become.
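To make the idea of a hidden layer a bit more concrete, here is a tiny hand-wired sketch. In a real network the weights are learned rather than typed in, and the hidden units rarely have such tidy meanings; the two hidden units here (“worth the trip” and “too far”) and all the numbers are assumptions chosen for illustration.

```python
def step(x):
    return 1 if x > 0 else 0

def layer(inputs, units):
    """Run every unit of a layer; each unit is (weights, bias)."""
    return [
        step(sum(w * x for w, x in zip(unit_weights, inputs)) + unit_bias)
        for unit_weights, unit_bias in units
    ]

# Hidden layer: unit 1 fires when interest beats distance ("worth the trip"),
# unit 2 fires when the distance is above 8 ("too far").
hidden_units = [([1.0, -1.0], 0.0), ([0.0, 1.0], -8.0)]
# Output layer: attend when "worth the trip" fires and "too far" does not.
output_units = [([1.0, -1.0], -0.5)]

def secretary(interest, distance):
    hidden = layer([interest, distance], hidden_units)
    return layer(hidden, output_units)[0]

print(secretary(9, 3), secretary(10, 9), secretary(2, 5))
```

The outputs of the hidden layer become the inputs of the output layer, which is all that “multilayer” means.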
In supervised learning, the ultimate goal is to figure out the right mathematical equation that will be able to predict or classify an output. For this, the Machine Learning scientist needs both to define the architecture of the neural network (number of inputs, number of hidden layers, etc.) and to “teach” the model the “right parameters”, that is, the weights for each input. To do so, they will use something called “backpropagation”, a topic I will probably cover in a different article.
You can forget about the virtual secretary 🙂
Now that we have covered the basics behind Neural Networks, just imagine that the input sources for a model can be anything you choose to use. Inputs can be financial data, market numbers, salary statistics or anything else you can imagine.
Inputs can also be the pixels of an image, and this opens the door to very interesting applications like facial recognition, autonomous driving and medical diagnosis, just to name a few.
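As a rough illustration of “pixels as inputs”, the sketch below turns a made-up 3×3 grayscale image into a flat list of numbers between 0 and 1. Real systems use much larger images and usually more specialised network designs; this only shows that an image is, in the end, just a list of numbers.

```python
# Hypothetical 3x3 grayscale image: 0 is black, 255 is white.
image = [
    [0, 255, 0],
    [255, 255, 255],
    [0, 255, 0],
]

# Flatten the grid row by row and scale every pixel to the range 0..1.
inputs = [pixel / 255 for row in image for pixel in row]
print(len(inputs), "inputs:", inputs)
```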
The applications of neural networks are infinite and the only limit is our imagination.
As I mentioned before, there is nothing inherently intelligent about Artificial Intelligence. Neural Networks were loosely modeled on our current understanding of the human brain and its synapses, but this is a field where much research is currently being done and there is still much to be discovered.
In the end, it’s just a bunch of mathematics that takes place in order to predict an outcome. The wonderful thing now is that cloud computing has given us the power to do all these calculations at a very low price.
If Machine Learning inspires you and you think you would like to implement a use case in your organisation, please get in touch. We are vendor agnostic and we will recommend and integrate the technology that best fits your needs. If no available technology satisfies your needs, we can always train a custom model tailored to your project.