An ANN is made up of nodes (artificial neurons), single processing units that work in parallel, organized in layers: an input layer, one or more hidden layers, and an output layer. The nodes "weigh" the input data, categorizing its features, and, through their connections to other nodes, pass the results on to the next layer until the output is obtained.
A weight is the strength of the connection between two nodes and represents the influence, positive or negative, of each input on the specific feature to be identified.
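As a minimal sketch of this mechanism, a single node can be reduced to a weighted sum of its inputs plus a bias term (the bias is not mentioned above but is standard in ANN formulations; the values below are purely illustrative):

```python
import numpy as np

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias term.

    Each weight scales the influence, positive or negative,
    of the corresponding input on the node's output.
    """
    return np.dot(inputs, weights) + bias

# Illustrative example: three inputs feeding one node.
x = np.array([0.5, -1.0, 2.0])   # input values
w = np.array([0.8, 0.2, -0.4])   # connection weights
b = 0.1                          # bias
print(neuron_output(x, w, b))    # -0.5
```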
Each processing node has its own "bubble" of knowledge, made up of what it has already learned in previous experiences and of rules either originally programmed or defined autonomously by the node itself. The input layer communicates with the external environment and, since it does not receive data from previous layers, is generally made up of passive neurons; it is in the hidden layers that the "thinking" takes place.
To define these rules and decide what to pass to the next layer, ANNs follow different principles, implemented by specific algorithms: for example gradient descent, fuzzy logic, Bayesian inference, and others. Simplifying, a hidden layer is a set of neurons with a specific activation function, whose task is to process the inputs coming from the previous layers in order to extract specific features.
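As an illustration, here is a minimal sketch of such a hidden layer, assuming a ReLU activation (one common choice among many) and arbitrary layer sizes:

```python
import numpy as np

def relu(z):
    # ReLU activation: passes positive values, zeroes out negatives.
    return np.maximum(0.0, z)

def hidden_layer(x, W, b):
    """One hidden layer: a linear combination of the inputs
    followed by a non-linear activation, extracting features
    from the output of the previous layer."""
    return relu(W @ x + b)

# Illustrative shapes: 3 inputs feeding 4 hidden neurons.
rng = np.random.default_rng(0)
x = rng.normal(size=3)           # output of the previous layer
W = rng.normal(size=(4, 3))      # connection weights
b = np.zeros(4)                  # biases
print(hidden_layer(x, W, b))
```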
The number of these layers depends on the complexity of the problem to be solved, but it must be calibrated, since it is not true as a rule that the more hidden layers there are, the greater the accuracy of the results.
Finally, the output layer collects the processed information and delivers it in the required form, with a number of nodes that depends on the type of ANN.
Neural networks can be characterized by the number of hidden layers or even by the number of nodes, and we speak of a Deep Neural Network (DNN) only if there are two or more hidden layers.
The layers are independent of each other and can have an arbitrary number of nodes, although typically the number of hidden nodes is greater than the number of input nodes. Considering specific functional characteristics we can identify: the Multilayer Perceptron (MLP), the most classic type; the Spiking Neural Network (SNN, presented in the magazine in June of last year), whose nodes are activated only when a certain threshold is reached; the Convolutional Neural Network (CNN), used mainly for image recognition; and the Recurrent Neural Network (RNN), conceived for problems involving temporal references, for example speech recognition and, more generally, Natural Language Processing. It should be noted that multiple ANNs can be organized in stacks to create hybrid architectures, with a first network feeding a subsequent one of a different type, and that one possibly another, to achieve higher levels of performance.
For example, a CNN or an RNN is not always used alone: one possible architecture provides for a stack with a CNN as input, an LSTM (Long Short-Term Memory, a particular variant of the RNN) in the middle, and an MLP as output, hence the name CNN-LSTM architecture.
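A minimal sketch of such a stack, written with the Keras API of TensorFlow; the input shape (sequences of 100 time steps with 8 features each), the layer sizes and the binary output are assumptions made for the example, not prescribed by the architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(100, 8)),
    # CNN stage: extracts local patterns along the sequence.
    layers.Conv1D(filters=32, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    # LSTM stage: models the temporal dependencies in the extracted features.
    layers.LSTM(64),
    # MLP stage: fully connected layers producing the final output.
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```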
The term "perceptron", which can be translated as perceptron, dates back to the late 1950s, when an algorithm was developed at Cornell University, the perceptron, for the recognition of objects in images by machines.
A Perceptron is, as such, a binary classifier, and can be considered the simplest type of Feedforward Neural Network (FNN), that is, a network in which data flows through the nodes in a single direction, without loops or backflow.
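A minimal sketch of a Perceptron, with a step activation and the classic perceptron learning rule; training it on the logical AND function is an arbitrary choice for the example:

```python
import numpy as np

def perceptron_predict(x, w, b):
    # Step activation: the Perceptron is a binary classifier.
    return 1 if np.dot(w, x) + b > 0 else 0

def perceptron_train(X, y, epochs=20, lr=0.1):
    """Classic perceptron learning rule: nudge the weights
    toward each misclassified example. Data only flows forward."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            error = yi - perceptron_predict(xi, w, b)
            w += lr * error * xi
            b += lr * error
    return w, b

# Toy problem: learn the logical AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = perceptron_train(X, y)
print([perceptron_predict(xi, w, b) for xi in X])  # [0, 0, 0, 1]
```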
FNNs represent the basis for more complex structures, are used for general classification of datasets, and are often stacked with other ANNs.
There can be some terminological confusion, as MLP sometimes refers to any FNN and sometimes only to networks made up of multiple Perceptron layers.
In detail, a Multilayer Perceptron provides at least three layers of nodes: an input layer, a hidden layer and an output layer.
Each node, apart from the input ones, uses a non-linear activation function, which defines the output of that node based on an input or a set of inputs; non-linearity means, in general terms, that the change in the output is not proportional to the change in the input.
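For instance, with the sigmoid function, a classic non-linear activation, doubling the input does not double the output:

```python
import numpy as np

def sigmoid(z):
    # Classic non-linear activation used in MLPs.
    return 1.0 / (1.0 + np.exp(-z))

# Doubling the input does not double the output:
# the change in output is not proportional to the change in input.
for z in (0.5, 1.0, 2.0, 4.0):
    print(z, round(sigmoid(z), 4))
# 0.5 0.6225, 1.0 0.7311, 2.0 0.8808, 4.0 0.982
```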
For the training of the network, a supervised learning technique called backpropagation is used.
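As a rough illustration of the idea, a toy network with one hidden layer can be trained with a hand-written backpropagation loop; the XOR task, the learning rate and the layer sizes below are arbitrary choices for the example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy supervised task (XOR), solvable only thanks to the hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output

lr = 1.0
for _ in range(5000):
    # Forward pass through the network.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the output error toward the input,
    # computing the gradient of the squared error layer by layer.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())  # typically approaches [0, 1, 1, 0]
```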