
Slide 1. Kazan National Research Technical University named after A.N. Tupolev
German-Russian Institute of Advanced Technologies (GRIAT)

NEURAL NETWORKS

by Dr. Igor Anikin


Slide 2. Table of contents
The basic concepts of neural networks
Artificial neural networks.
The structure of an artificial neuron.
Activation functions.
Basic paradigms of neural networks.
Fundamentals of learning and training samples.
Using neural networks in practice
Single layer neural networks
Rosenblatt's single layer perceptron.
Learning single layer neural networks.
Associative memory and its realization on single layer neural networks.
Using single layer neural networks for pattern recognition and time series forecasting.
Multilayer perceptrons
The structure of multilayer perceptrons
Back propagation of error.
Using multilayer perceptrons for pattern recognition and time series forecasting.

Slide 3. Self-organizing maps
The principle of unsupervised learning.
Kohonen self-organizing maps.
Learning Kohonen networks.
Practical use of Kohonen networks
Recurrent neural networks
Neural networks with feedback.
Hopfield neural network.
Hamming neural network.
Training Hopfield and Hamming neural networks.
Practical use of Hopfield and Hamming neural networks.
Training and Testing
Training error and testing error.

Slide 4. References
David Kriesel. A Brief Introduction to Neural Networks // http://www.dkriesel.com/en/science/neural_networks
Raul Rojas. Neural Networks. A Systematic Introduction // http://www.inf.fu-berlin.de/inst/ag-ki/rojas_home/documents/1996/NeuralNetworks/neuron.pdf
L.P.J. Veelenturf. Analysis and Application of Artificial Neural Networks // http://www.ru.lv/~peter/zinatne/ebooks/Analysis%20and%20Applications%20of%20Artificial%20Neural%20Networks.pdf
Artificial Neural Networks – Methodological Advances and Biomedical Applications // InTech.ORG

Slide 5. The basic concepts of neural networks


Slide 6. Questions for motivation discussion
What tasks are machines good at doing that humans are not?
What tasks are humans good at doing that machines are not?
What tasks are both good at?
What does it mean to learn?
How is learning related to intelligence?
What does it mean to be intelligent?
Do you believe a machine will ever be intelligent?
If a computer were intelligent, how would you know?

Slide 7. Types of learning
Knowledge acquisition from an expert.
Knowledge acquisition from data:
Supervised learning – the system is supplied with a set of training examples consisting of inputs and corresponding outputs, and is required to discover the relation or mapping between them.
Unsupervised learning – the system is supplied with a set of training examples consisting only of inputs. It is required to discover what the appropriate outputs should be.

Slide 8. Artificial Neural Network
An extremely simplified model of the human brain
Transforms inputs into the best outputs (some neural networks are universal function approximators).

Slide 9. Artificial Neural Networks
Development of Neural Networks dates back to the early 1940s.
The field experienced an upsurge in popularity in the late 1980s due to the discovery of new NN training techniques.
Some NNs are models of biological neural networks and some are not, but historically, much of the inspiration for the field of NNs came from the desire to produce artificial systems capable of sophisticated, perhaps intelligent, computations similar to those that the human brain routinely performs, and thereby possibly to enhance our understanding of the human brain.
Most NNs have some sort of training rule. In other words, NNs learn from examples (as children learn to recognize dogs from examples of dogs) and exhibit some capability for generalization beyond the training data.

Slide 10. ANN vs Computers
Computers have to be explicitly programmed
Analyze the problem to be solved.
Write the code in a programming language.
Neural networks learn from examples
No requirement for an explicit description of the problem.
No need for a programmer.
The neural computer adapts itself during a training period, based on examples of similar problems, even without a desired solution for each problem. After sufficient training the neural computer is able to relate the problem data to the solutions (inputs to outputs), and it is then able to offer a viable solution to a brand-new problem.


Slide 11. ANN vs Computers
Digital Computers
Deductive reasoning. We apply known rules to input data to produce output.
Computation is centralized, synchronous, and serial.
Memory is literally stored, and location addressable.
Not fault tolerant: one transistor fails and it no longer works.
Exact.
Static connectivity.
Applicable if well-defined rules are accessible and input data are precise.

Neural Networks
Inductive reasoning. We use given input and output data (training examples) to infer the mapping.
Computation is collective, asynchronous, and parallel.
Memory is distributed, internalized, short term and content addressable.
Fault tolerant, with redundancy and sharing of responsibilities.
Inexact.
Dynamic connectivity.
Applicable if rules are unknown or complicated, or if data are noisy or partial.


Slide 12. Biological neuron


Slide 13. Biological neuron
Many “neurons” co-operate to perform the desired function
Basic elements:
Axon
Dendrite
Synapse


Slide 14. Artificial neuron structure
The output of a neuron is a function of the weighted sum of the inputs plus a bias
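A minimal sketch of this computation in Python (NumPy assumed; the names and values are illustrative, not from the slides): the neuron computes y = f(w·x + b).

import numpy as np

def neuron_output(x, w, b, f):
    """Output of a single artificial neuron: activation of the weighted sum plus a bias."""
    return f(np.dot(w, x) + b)

# Example with a binary threshold (step) activation
step = lambda s: 1.0 if s >= 0 else 0.0
x = np.array([0.5, -1.0, 2.0])   # inputs
w = np.array([0.4, 0.3, 0.1])    # weights
b = -0.2                         # bias
print(neuron_output(x, w, b, step))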



Slide 15. Common activation functions
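The slide's figure is not reproduced in this transcript. As a hedged illustration, the following Python sketch shows four activation functions commonly used in this setting (binary threshold, logistic sigmoid, tanh, ReLU); NumPy is assumed.

import numpy as np

def threshold(s):   # binary step, output 0 or 1
    return np.where(s >= 0, 1.0, 0.0)

def sigmoid(s):     # logistic function, output in (0, 1)
    return 1.0 / (1.0 + np.exp(-s))

def tanh(s):        # hyperbolic tangent, output in (-1, 1)
    return np.tanh(s)

def relu(s):        # rectified linear unit
    return np.maximum(0.0, s)

s = np.linspace(-2, 2, 5)
for f in (threshold, sigmoid, tanh, relu):
    print(f.__name__, f(s))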


Slide 17. Examples of ANN topologies
Single layer ANN
Multilayer ANN
ANN with one recurrent layer


Slide 18. Fundamentals of learning and training samples
The weights in a neural network are the most important factor in determining its function.
A training set is a set of training patterns, which we use to train our neural net.
Training is the act of presenting the network with some sample data and modifying the weights to better approximate the desired function.

Slide 19. Fundamentals of learning and training samples
There are two main types of training
Supervised training
Supplies the neural network with inputs and the correct outputs (results).
We can estimate an error vector for a certain input.
The response of the network to the inputs is measured. The weights are modified to reduce the difference between the actual and desired outputs.
Unsupervised training
The training set only consists of input patterns.
The neural network adjusts its own weights so that similar inputs cause similar outputs. The network identifies the patterns and differences in the inputs without any external assistance.

Slide 20. Fundamentals of learning and training samples
A training pattern is an input vector p with the components x1, x2, . . . , xn whose desired output is known.
By entering the training pattern into the network we receive an output that can be compared with the desired output.
The set of training patterns is called P. It contains a finite number of ordered pairs (p, t) of training patterns with the corresponding desired output t.

Slide 21. Fundamentals of learning and training samples
Teaching input. Let j be an output neuron. The teaching input tj is the desired and correct value that j should output after the input of a certain training pattern.
Analogously to the vector p, the teaching inputs t1, t2, . . . , tn of the neurons can also be combined into a vector t. This vector always refers to a specific training pattern p and is contained in the set P of training patterns.

Slide 22. Fundamentals of learning and training samples
Error vector. For several output neurons Ω1, Ω2, . . . , ΩO, the difference between the output vector and the teaching input under a training input p is referred to as the error vector.
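The formula itself is lost from this transcript. Assuming the usual definition, with t the teaching input and y the actual output for the training pattern p, each component of the error vector is

Ep,Ωi = tΩi − yΩi,  i = 1, ..., O   (i.e. Ep = t − y)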



Slide 23. Fundamentals of learning
Let P be the set of training patterns. In the learning procedure we perform a finite number of iterations or epochs.
Epoch – a single presentation of the entire data set to the neural network. Typically many epochs are required to train the neural network.
Iteration – the process of providing the network with a single input and updating the network's weights.

Slide 24. General learning procedure
Let P be the set of n training patterns p1, ..., pn
For i = 1 to n
begin
Calculate the NN output vector yi for the training pattern pi.
Compare yi with the desired output ti. Then calculate the output error and modify the weights.
end
If the total error for the training set P is greater than some threshold, go to step 2.
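A minimal Python sketch of this loop (NumPy assumed; predict and update are placeholders for the concrete network and learning rule, which later slides define):

import numpy as np

def train(patterns, predict, update, threshold=1e-3, max_epochs=1000):
    """Generic supervised training loop over a set of (input, target) pairs.

    predict(p) -> network output for pattern p
    update(p, error) -> modifies the network weights using the error vector
    """
    for epoch in range(max_epochs):
        total_error = 0.0
        for p, t in patterns:                 # one iteration per pattern
            y = predict(p)                    # forward pass
            error = t - y                     # compare with the desired output
            update(p, error)                  # modify the weights
            total_error += float(np.sum(error ** 2))
        if total_error <= threshold:          # stop when the total error is small enough
            break
    return total_error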

Slide 25. Using training samples
We have to divide the set of training samples into two subsets:
one training set actually used for training;
one verification set to test the progress of learning.
The usual division is 70% for training data and 30% for verification data (randomly chosen).
We can finish the training process when the network provides good results on the training data as well as on the verification data.

Slide 26. Learning curve
The learning curve indicates the progress of the error, which can be determined in various ways. This curve can indicate whether the network is progressing or not.

Slide 27. Error measurement
Let Ω be an output neuron and O be the set of output neurons.
The specific error Errp is based on a single training sample.
The total error Err is based on all training samples.
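The two formulas are lost from this transcript. A common squared-error form, consistent with the notation above and with the Kriesel text cited on slide 4, is assumed here:

Errp = 1/2 · Σ(Ω in O) (tΩ − yΩ)²
Err = Σ(p in P) Errp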


Slide 28. When do we stop learning?
Generally, the training process is stopped when the user in front of the learning computer "thinks" the error is small enough.

Slide 29. Using neural networks in practice (discussion)
Classification
In marketing: consumer spending pattern classification
In defence: radar and sonar image classification
In medicine: ultrasound and electrocardiogram image classification, EEGs, medical diagnosis
Recognition and identification
In general computing and telecommunications: speech, vision and handwriting recognition
In finance: signature verification and bank note verification
Assessment
In engineering: product inspection monitoring and control
In defence: target tracking
In security: motion detection, surveillance image analysis and fingerprint matching
Forecasting and prediction
In finance: foreign exchange rate and stock market forecasting
In agriculture: crop yield forecasting
In marketing: sales forecasting
In meteorology: weather prediction

Slide 30. Single layer neural networks


Slide 31. Single layer network with binary threshold activation function



Matrix form


Slide 32. Single layer network with binary threshold activation function








Slide 33. Practice with a single layer neural network
Performing calculations in single layer neural networks using the direct and matrix forms. Using various activation functions.
Using a single layer neural network with a binary threshold activation function as a linear classifier. Adjusting the linear classifier based on training samples.



Slide 34. Hebbian learning rule

Introduced by Donald Hebb in his 1949 book "The Organization of Behavior".
Describes a basic mechanism for synaptic plasticity.
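The rule's formula is not shown in this transcript. In its common textbook form (assumed here), each weight grows in proportion to the joint activity of the connected neurons, Δwij = α · xi · yj; a small Python sketch:

import numpy as np

def hebbian_update(w, x, y, alpha=0.1):
    """Hebbian rule: strengthen w[i, j] when input x[i] and output y[j] are active together.

    Delta w[i, j] = alpha * x[i] * y[j]
    """
    return w + alpha * np.outer(x, y)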

Slide 35. Hebbian learning rule (matrix form)





Slide 36. Practice with the Hebbian learning rule

Constructing a neural network based on the Hebbian learning rule to model the logical OR operator.


Slide 37. Delta rule (Widrow-Hoff rule)
The delta rule is a gradient descent learning rule for updating the weights of the inputs to artificial neurons in a single-layer neural network.
The goal is to minimize the error between the actual outputs and the target outputs in the training data.
For each (input, output) training pair, the delta rule determines the direction in which you need to adjust wij to reduce the error for that training pair.
Derivatives are used for teaching.
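In the usual notation (assumed here: α the learning rate, tj the target output, yj the actual output, xi the input), the update for each training pair is

Δwij = α · (tj − yj) · xi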

Slide 38. Delta rule (Widrow-Hoff rule)
ADALINE (ADAptive LINear Element) network




Slide 39. Delta rule (Widrow-Hoff rule)

Gradient descent method: find the steepest way down the slope from where you are, and take a step in that direction.

Slide 40. Delta rule algorithm
Define the training speed α (0 < α ≤ 1) and the maximum admissible error Em.
Initialize the weights and the bias with small random values.
Take an input pattern and calculate the output vector.
Modify the weights and the bias according to the delta rule.
Do steps 3-4 until E < Em.
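A minimal NumPy sketch of this algorithm for one ADALINE-style linear neuron; the learning rate, error threshold and data are illustrative assumptions, not taken from the slides:

import numpy as np

def train_adaline(X, T, alpha=0.05, Em=1e-3, max_epochs=1000):
    """Delta-rule (Widrow-Hoff) training of one linear neuron: y = w.x + b."""
    rng = np.random.default_rng(0)
    w = rng.uniform(-0.1, 0.1, X.shape[1])   # small random weights
    b = rng.uniform(-0.1, 0.1)               # small random bias
    for _ in range(max_epochs):
        E = 0.0
        for x, t in zip(X, T):
            y = np.dot(w, x) + b             # linear output
            w += alpha * (t - y) * x         # delta rule for the weights
            b += alpha * (t - y)             # delta rule for the bias
            E += 0.5 * (t - y) ** 2
        if E < Em:                           # stop when the total error is small enough
            break
    return w, b

# Illustrative data: targets generated by a known linear function t = 0.5*x1 - 0.3*x2 + 0.1
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
T = 0.5 * X[:, 0] - 0.3 * X[:, 1] + 0.1
w, b = train_adaline(X, T)
print(w, b)   # should approach [0.5, -0.3] and 0.1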

Slide 41. Linear classifiers


Slide 42. Practice with the delta rule
Constructing an ADALINE neural network (a linear classifier with minimum error value) based on given training patterns.



Slide 43. Rosenblatt's single layer perceptron
The perceptron is an algorithm for supervised classification of an input into one of several possible non-binary outputs.
It is a type of linear classifier.
It was invented in 1957 by Frank Rosenblatt as a machine for image recognition.

Slide 44. Rosenblatt's single layer perceptron

Learning rule


Slide 45. Rosenblatt's learning algorithm
Initialise the weights and the threshold. Weights may be initialised to 0 or to a small random value.
Take an input pattern x from X and calculate the output vector y from Y.
If yi = tj then wij does not change.
If yi ≠ tj then wij(t+1) = wij(t) + α xi tj
Do steps 2-4 until yi = tj for the whole training set.
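A minimal Python sketch of this procedure for a single threshold neuron with bipolar (+1/-1) outputs; the data (logical OR) and learning rate are illustrative assumptions:

import numpy as np

def train_perceptron(X, T, alpha=1.0, max_epochs=100):
    """Rosenblatt's learning rule for a single threshold neuron with bipolar output."""
    w = np.zeros(X.shape[1])            # weights may be initialised to 0
    b = 0.0                             # the threshold is handled as a bias term
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(X, T):
            y = 1.0 if np.dot(w, x) + b >= 0 else -1.0
            if y != t:                  # update only on a misclassification
                w += alpha * t * x      # w(t+1) = w(t) + alpha * x * t
                b += alpha * t
                errors += 1
        if errors == 0:                 # stop when the whole training set is correct
            break
    return w, b

# Illustrative, linearly separable data (logical OR with bipolar coding)
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
T = np.array([-1, 1, 1, 1], dtype=float)
print(train_perceptron(X, T))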


Slide 46. Rosenblatt's single layer perceptron
It was quickly proved that perceptrons could not be trained to recognize many classes of patterns.
It is a linear classifier. For example, it is impossible for this class of network to learn the XOR function.

Slide 47. Practice with Rosenblatt's perceptron
Constructing a linear classifier (Rosenblatt's perceptron) based on given training patterns.

Slide 48. Associative memory
Associative memory (computer science) – a data-storage device in which a location is identified by its informational content rather than by names, addresses, or relative positions, and from which the data may be retrieved. This memory enables one to retrieve a piece of data from only a tiny sample of itself.

Associative memory (psychology) – recalling a previously experienced item by thinking of something that is linked with it, thus invoking the association.

Slide 49. Associative memory
Autoassociative memories are capable of retrieving a piece of data upon presentation of only partial information from that piece of data.
Heteroassociative memories can recall an associated piece of data from one category upon presentation of data from another category.

Slide 50. Autoassociative memory based on the sign activation function

Neural network structure:
Number of neurons in the input layer = number of neurons in the output layer

Activation function

Learning rule (adapted Hebbian rule)

Example:
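The slide's formulas and worked example are not reproduced in this transcript. The sketch below follows the usual construction (an assumption): weights are built with an outer-product (adapted Hebbian) rule over bipolar patterns, and recall applies the sign function to W·x.

import numpy as np

def build_memory(patterns):
    """Hebbian-style weights for an autoassociative memory over bipolar (+1/-1) patterns."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)              # adapted Hebbian rule
    np.fill_diagonal(W, 0)               # no self-connections
    return W

def recall(W, x):
    """One-shot recall: sign of the weighted sums (sign activation function)."""
    y = np.sign(W @ x)
    y[y == 0] = 1                        # break ties consistently
    return y

patterns = np.array([[1, 1, 1, -1, -1, -1],
                     [-1, -1, 1, 1, 1, -1]], dtype=float)
W = build_memory(patterns)
noisy = np.array([1, -1, 1, -1, -1, -1], dtype=float)   # first pattern with one flipped bit
print(recall(W, noisy))                                  # recovers the first stored pattern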


Slide 51. Practice with autoassociative memory
Realization of the associative memory based on the sign activation function.
Working with multiple patterns.
Recognition of original and noisy patterns.
Investigation of the properties and constraints of the associative memory based on the sign activation function.

Slide 52. Using single layer neural networks for time series forecasting
A time series is a sequence of data points, typically measured at successive points in time spaced at uniform intervals.

Slide 53. Using single layer neural networks for time series forecasting

Training samples
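The slide's table of training samples is not reproduced here. The usual construction (assumed) slides a window of the last k values over the series and uses the next value as the target; a Python sketch:

import numpy as np

def make_training_samples(series, window=3):
    """Turn a time series into (input, target) pairs using a sliding window.

    Input: the `window` previous values; target: the next value.
    """
    X, T = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        T.append(series[i + window])
    return np.array(X), np.array(T)

series = [1.0, 1.2, 1.1, 1.3, 1.5, 1.4, 1.6]   # illustrative values
X, T = make_training_samples(series, window=3)
print(X)   # e.g. first row [1.0, 1.2, 1.1] with target 1.3
print(T)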


Slide 54. Practice with time series forecasting
Using ADALINE neural networks for currency forecasting:
Creating the training set from the raw data (www.val.ru).
Learning the ADALINE.
Training the ADALINE network using the delta rule and estimating the error.

Slide 55. Multilayer perceptron


Slide 56. Multilayer perceptron
A multilayer perceptron (MLP) is a feed-forward artificial neural network model that maps sets of input data onto a set of appropriate outputs.
It consists of multiple layers (input, output, one or several hidden layers) of nodes in a directed graph, with each layer fully connected to the next one.
Neurons have a nonlinear activation function.
It utilizes a supervised learning technique called backpropagation of error.


Typical structure


Slide 57. Multilayer perceptron
Structure (2 hidden layers)
Calculating the output Y for an input vector X
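A minimal NumPy sketch of this forward computation for an MLP with two hidden layers and sigmoid activations; the layer sizes and weights are illustrative assumptions:

import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def mlp_forward(x, layers):
    """Forward pass through a multilayer perceptron.

    `layers` is a list of (W, b) pairs; each layer computes sigmoid(W @ a + b).
    """
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

# Illustrative network: 3 inputs -> 4 -> 4 -> 2 outputs, random weights
rng = np.random.default_rng(0)
sizes = [3, 4, 4, 2]
layers = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(sizes[:-1], sizes[1:])]
x = np.array([0.2, -0.5, 0.9])
print(mlp_forward(x, layers))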

Slide 58. Multilayer perceptron
The activation function is not a threshold
Usually a sigmoid function
Function approximator
Not limited to linear problems
Information flows in one direction
The outputs of one layer act as inputs to the next layer

Slide 59. Classification ability
A single layer network can only find a linear discriminant function.
It can divide the input space by means of a hyperplane (straight lines in two-dimensional space).


Slide 60. Classification ability
Universal function approximation theorem
An MLP with one hidden layer can approximate arbitrarily closely every continuous function that maps intervals of real numbers to some output interval of real numbers

f: [0,1]^n -> [0,1]
2n+1 neurons in the hidden layer.
Can form single convex decision regions
One hidden layer is sufficient for the large majority of problems

Slide 61. Classification ability
Any function can be approximated to arbitrary accuracy by a network with two hidden layers.
An MLP with two hidden layers can classify sets of any form. It can form arbitrary disjoint decision regions.

Slide 62. Backpropagation algorithm
D. Rumelhart, G. Hinton, R. Williams (1986)
The most common method of obtaining the weights in a multilayer perceptron
A form of supervised training
The basic backpropagation algorithm is based on minimizing the error of the network using the derivatives of the error function
Backpropagation of error generalizes the delta rule

Slide 63. Basic steps
Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.
Backward propagation of the output's error through the neural network, using the training pattern's target, in order to generate the deltas of all output and hidden neurons.

Slide 64. Backpropagation


Slide 65. Backpropagation
We use the gradient descent method to minimize the error


Slide 66. Backpropagation
Theorem. For any hidden layer i of the neural network, the error of neuron i is calculated recursively through the errors of the neurons of the next layer j.


where m – number of neurons in the next layer j
wij – weights between neuron i and the neurons in the next layer j
Sj – weighted sum for neuron j in the next layer.
Proof


Slide 67. Backpropagation
Theorem. We can calculate the derivatives of the error E with respect to the weights w and the bias T in the following way.


Proof


Slide 68. Backpropagation
Backpropagation rule


Slide 69. Backpropagation algorithm
Define the training speed α (0 < α ≤ 1) and the maximum admissible error Em.
Initialize the weights and biases randomly.
Take all input patterns x from X consecutively.
Calculate the output vector y in the following way

Realize the backpropagation scheme in the following way


Modify the weights and biases in the following way




Slide 70. Backpropagation algorithm
4. Calculate the overall error for all patterns

5. If E > Em then go to step 3.
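A minimal NumPy sketch of the whole procedure on slides 69-70 for a network with one hidden layer and sigmoid neurons; the data (XOR), network size, α and Em are illustrative assumptions, not values from the slides:

import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_backprop(X, T, hidden=4, alpha=0.5, Em=0.01, max_epochs=5000):
    """Backpropagation for a one-hidden-layer MLP with sigmoid neurons."""
    rng = np.random.default_rng(0)
    W1 = rng.uniform(-0.5, 0.5, (hidden, X.shape[1]))   # input -> hidden weights
    b1 = rng.uniform(-0.5, 0.5, hidden)
    W2 = rng.uniform(-0.5, 0.5, (T.shape[1], hidden))   # hidden -> output weights
    b2 = rng.uniform(-0.5, 0.5, T.shape[1])
    for _ in range(max_epochs):
        E = 0.0
        for x, t in zip(X, T):
            # forward pass
            h = sigmoid(W1 @ x + b1)
            y = sigmoid(W2 @ h + b2)
            # backward pass: output deltas, then hidden deltas
            delta_out = (y - t) * y * (1 - y)            # sigmoid derivative: y(1-y)
            delta_hid = (W2.T @ delta_out) * h * (1 - h)
            # weight and bias modification (gradient descent step)
            W2 -= alpha * np.outer(delta_out, h)
            b2 -= alpha * delta_out
            W1 -= alpha * np.outer(delta_hid, x)
            b1 -= alpha * delta_hid
            E += 0.5 * float(np.sum((y - t) ** 2))
        if E <= Em:                                      # overall error check
            break
    return (W1, b1, W2, b2), E

# Illustrative data: XOR, which a single-layer network cannot learn (slide 46)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
_, E = train_backprop(X, T)
print("final total error:", E)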





Slide 71. Practice. Calculating delta-rule expressions for various activation functions
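For reference, the derivatives that enter these delta-rule expressions for two common activation functions (standard results, stated here as a reminder rather than copied from the slide):

logistic sigmoid: f(S) = 1 / (1 + e^(−S)), f'(S) = f(S) · (1 − f(S))
hyperbolic tangent: f(S) = tanh(S), f'(S) = 1 − f(S)²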


Slide 72. Some problems
The learning rate is important
Too small: convergence is extremely slow
Too large: may not converge

The result may converge to a local minimum.

Possible solution:
Use an adaptive learning rate


Slide 73. Some problems
Overfitting
The number of hidden neurons is very important; it defines the complexity of the decision boundary:
Too few: underfits the data – the network does not have enough free parameters to fit the training data well.
Too many: overfits the data – the NN learns insignificant details.
Try different numbers and use a validation set to choose the best one.
Start small and increase the number until satisfactory results are obtained.

Slide 74. What constitutes a "good" training set?
Samples must represent the general population.
Samples must contain members of each class.
Samples in each class must contain a wide range of variations or noise effects.

Slide 75. Practice with the multilayer perceptron
Using an MLP for noisy digit recognition and using an MLP for time series forecasting.
- Training set preparation.
- MLP learning in the Deductor software.
- Estimating the error.

Slide 76. Recurrent neural networks
Capable of influencing themselves by means of recurrences, e.g. by including the network output in the following computation steps.
Hopfield neural network
Hamming neural network

Slide 77. Hopfield network

1. Invented by John Hopfield in 1982.
2. Content-addressable memory with binary threshold nodes (-1,1 or 0,1)
3. wij = wji, wii = 0

Slide 78. Hopfield network


Slide 79. Hopfield network as associative memory


Slide 80. Using the Hopfield network as associative memory






Slide 81. Hopfield network as associative memory
Take a noisy pattern y
Realize iterations



until we reach a stable state (attractor)
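A minimal sketch of this recall loop, assuming bipolar states, synchronous updates and Hebbian-style outer-product weights (the stored patterns are illustrative):

import numpy as np

def hopfield_weights(patterns):
    """Hebbian-style weights: symmetric (wij = wji) with zero diagonal (wii = 0)."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)
    return W

def hopfield_recall(W, y, max_iters=100):
    """Iterate y <- sign(W y) until a stable state (attractor) is reached."""
    for _ in range(max_iters):
        y_new = np.sign(W @ y)
        y_new[y_new == 0] = 1
        if np.array_equal(y_new, y):      # stable state reached
            return y_new
        y = y_new
    return y

patterns = np.array([[1, 1, 1, -1, -1, -1],
                     [-1, 1, -1, 1, -1, 1]], dtype=float)
W = hopfield_weights(patterns)
noisy = np.array([1, 1, -1, -1, -1, -1], dtype=float)   # first pattern with one flipped bit
print(hopfield_recall(W, noisy))                         # converges to the first pattern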



Slide 82. Example


Slide 83. Practice with the Hopfield network
Realization of the associative memory based on the Hopfield neural network.
Working with multiple patterns.
Recognition of original and noisy patterns.
Investigation of the properties and constraints of the associative memory based on the Hopfield network.


Slide 84. Hamming network
R. Lippmann (1987)
The Hamming network is a two-network bipolar classifier. The first layer is a single-layer perceptron; it calculates the Hamming distance between the vectors. The second network is a Hopfield network.




Slide 85. Hamming network


Slide 86. Hamming network working algorithm
Define the weights wij, Tj.
Get the input pattern and initialize the Hopfield weights.
Make iterations in the Hopfield network until we get a stable output.
Take the output neuron with value 1.
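A hedged Python sketch of this algorithm: the first layer scores each stored bipolar pattern as n minus its Hamming distance to the input (weights xi/2 and bias n/2), and a simple winner-take-all competition (a MAXNET-style stand-in for the Hopfield layer described above) suppresses all but the best match. The patterns and parameters are illustrative.

import numpy as np

def hamming_classify(patterns, x, eps=None, max_iters=100):
    """Hamming-network-style classification of a bipolar input vector x."""
    M, n = patterns.shape
    W = patterns / 2.0                      # first-layer weights: wij = x_i^(m) / 2
    T = n / 2.0                             # bias
    y = W @ x + T                           # score = n - HammingDistance(pattern, x)
    eps = eps if eps is not None else 1.0 / M
    for _ in range(max_iters):              # winner-take-all competition
        y_new = y - eps * (y.sum() - y)     # each node is inhibited by the others
        y_new = np.maximum(y_new, 0.0)
        if np.allclose(y_new, y):
            break
        y = y_new
    return int(np.argmax(y))                # index of the stored pattern closest to x

patterns = np.array([[1, 1, 1, -1, -1, -1],
                     [-1, -1, 1, 1, 1, -1],
                     [1, -1, 1, -1, 1, -1]], dtype=float)
x = np.array([1, 1, 1, -1, -1, 1], dtype=float)   # closest to the first stored pattern
print(hamming_classify(patterns, x))               # expected: 0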

Slide 87. Self-organizing maps


Slide 88. Self-organizing maps
Unsupervised training
The training set only consists of input patterns.
The neural network adjusts its own weights so that similar inputs cause similar outputs. The network identifies the patterns and differences in the inputs without any external assistance.

Slide 89. Self-organizing maps (SOM)
A self-organizing map (SOM) is a type of artificial neural network that is trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map.
Self-organizing maps are different from other artificial neural networks in the sense that they use a neighborhood function to preserve the topological properties of the input space.
The model was first described as an artificial neural network by the Finnish professor Teuvo Kohonen.

Slide 90. Self-organizing maps
We only ask which neuron is active at the moment.
We are not interested in the exact output of the neuron but in knowing which neuron provides output.
These networks are widely used for clustering.

SOMs (like our brain) solve the task of mapping a high-dimensional input (N dimensions) onto areas in a low-dimensional grid of cells (G dimensions).

Slide 92. Scheme of training of a self-organizing map


Slide 93. Competitive learning
Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data.




Slide 94. Competitive learning


Slide 95. Vector quantization
It works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms.


Slide 96. Vector quantization

Choose random weights from [0;1].
t = 1
Take all input patterns Xl, l = 1..L

t = t + 1

Applications:
data compression
pattern recognition
Video codecs
QuickTime
Cinepak
Indeo, etc.
Audio codecs
Ogg Vorbis
TwinVQ
DTS, etc.

Slide 97. Kohonen maps


Slide 98. Kohonen maps



Slide 99. Kohonen maps learning procedure
Choose random weights from [0;1].
t = 1
Take input pattern Xl and calculate Dij = (Xl − Wij), where i, j = 1..m
Detect the winner neuron: D(k1,k2) = min(Dij)
Calculate, for every output neuron,

Modify the weights in the following way

Repeat steps 3-6 for all input patterns
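A minimal sketch of this procedure for a small two-dimensional map, assuming a Gaussian neighborhood function and a fixed learning rate (the slide's exact neighborhood and decay formulas are not reproduced in this transcript):

import numpy as np

def train_som(X, m=5, alpha=0.3, sigma=1.0, epochs=20, seed=0):
    """Kohonen self-organizing map: an m x m grid of weight vectors."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(0.0, 1.0, (m, m, X.shape[1]))        # random weights from [0;1]
    grid = np.stack(np.meshgrid(np.arange(m), np.arange(m), indexing="ij"), axis=-1)
    for _ in range(epochs):
        for x in X:
            D = np.linalg.norm(W - x, axis=-1)            # distance of x to every unit
            k1, k2 = np.unravel_index(np.argmin(D), D.shape)   # winner neuron
            d2 = np.sum((grid - np.array([k1, k2])) ** 2, axis=-1)
            h = np.exp(-d2 / (2 * sigma ** 2))            # neighborhood function
            W += alpha * h[..., None] * (x - W)           # move units toward the input
    return W

# Illustrative 2-D data drawn from two clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.2, 0.05, (50, 2)), rng.normal(0.8, 0.05, (50, 2))])
W = train_som(X)
print(W.shape)   # (5, 5, 2): one 2-D prototype per map unit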

Slide 100. Training and Testing


Slide 101. Training
The goal is to achieve a balance between correct responses for the training patterns and correct responses for new patterns.

Slide 102. Training and verification
The set of all known samples is broken into two independent sets:
Training set
A group of samples used to train the neural network
Testing set
A group of samples used to test the performance of the neural network
Used to estimate the error rate

Slide 103. Verification
Provides an unbiased test of the quality of the network.
A common error is to "test" the neural network using the same samples that were used to train it.
The network was optimized on these samples, and will obviously perform well on them.
This doesn't give any indication as to how well the network will be able to classify inputs that weren't in the training set.

Slide 104. Summary (discussion)
Artificial neural networks are inspired by the learning processes that take place in biological systems.
Artificial neurons and neural networks try to imitate the working mechanisms of their biological counterparts.
Learning can be perceived as an optimisation process.
Biological neural learning happens by the modification of the synaptic strength. Artificial neural networks learn in the same way.
The synapse strength modification rules for artificial neural networks can be derived by applying mathematical optimisation methods.

Slide 105. Summary
Learning tasks of artificial neural networks can be reformulated as function approximation tasks.
Neural networks can be considered as nonlinear function approximating tools (i.e., linear combinations of nonlinear basis functions), where the parameters of the networks should be found by applying optimisation methods.
The optimisation is done with respect to the approximation error measure.
In general it is enough to have a single hidden layer neural network (MLP or other) to learn the approximation of a nonlinear function.

Slide 106. Questions and comments


Slide 107. Thank you for your attention

