The Performance of Neural Network – Statistical Evaluation
Abstract—The problem of evaluating the performance of a neural network on the basis of a study of the probabilistic behavior of the network is considered in this paper. A feed-forward network consisting of a layer of input nodes, a hidden layer, and an output layer is examined. For such networks, estimates of some statistical characteristics of the neural network in the case of two recognized classes are obtained. The problem of constructing and implementing a neural network of the self-organizing map type for solving the data classification problem is also considered. The neural network receives an input vector, preprocesses it by extracting the significant components of the input data, and then solves the classification problem.
Index Terms—neural network, neuron weight, stimulus recognition, self-organizing map.
INTRODUCTION
Artificial neural networks are often used for a variety of applications, such as pattern recognition. For the successful application of artificial neural networks it is necessary to choose the right network architecture and to select its parameters, threshold elements, activation functions, and so on [1], [2], [6], [7], [9]. Currently, neural computation at different levels of implementation, from specialized hardware to neural network software packages, is becoming more widely used.
Neural networks are successfully used to solve a number of problems, such as forecasting economic and financial indicators, predicting complications in patients in the postoperative period, biometric identification based on various characteristics, image processing, and others.
All this is possible because of the networks' ability to learn, that is, to establish associative connections in the input data. Typically, depending on the task, one has to resort to different methods of transforming the input data that allow a correct judgment about the regularities and special features in the data reflecting their qualitative characteristics [4].

This raises the problem of filtering the data and mapping it into a space of smaller dimension.
PROBLEM DESCRIPTION
In neural network technologies, a particularly important role is played by the selection of the network architecture and the values of its parameters, which affect the network's efficiency.
An artificial neural network is a mathematical model of biological neurons. It consists of neurons connected to each other. A neural network is trained by presenting it with input information in the form of numerical sequences. Training establishes internal connections between the neurons, through which the network acquires the ability to recognize unfamiliar images.
There are different types of neural networks, differing in both topology and learning algorithms. Among the most effective for classification problems are self-organizing maps, which use an unsupervised learning algorithm. They consist of two layers: the input layer and the functional layer. The neurons of the functional layer are located on a grid composed of cells. Each neuron occupies a single cell and is connected to all the neurons of the input layer (Fig. 1).
A self-organizing map operates on the "winner takes all" principle [3]. That is, when a sequence of input vectors is presented to the network, all neurons of the functional layer evaluate an activation function that reflects the relationship between the input data and the weight characteristics of these neurons. The winner is the neuron whose activation function (for example, the Euclidean distance between the input data vector and the weight vector of the neuron) takes the optimal value, such as the maximum or minimum.
If V1, V2, V3, ..., Vn is the input vector and the functional layer of the self-organizing map contains k neurons, the winner is the neuron i for which

\[ \sum_{j=1}^{n} \left( V_j - W_{ij} \right)^2 \to \min, \tag{1} \]

where Wij is the weight coefficient of the i-th neuron of the functional layer associated with the j-th neuron of the input layer.
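As an illustration, a minimal Java sketch of the winner search in (1) is given below; the class and method names are hypothetical and not taken from the paper.

```java
// Minimal sketch of the winner search in Eq. (1); names are illustrative.
public final class WinnerSearch {
    // Returns the index of the functional-layer neuron whose weight vector
    // has the smallest squared Euclidean distance to the input vector v.
    // w[i][j] is the weight of the i-th functional neuron for the j-th input.
    public static int findWinner(double[] v, double[][] w) {
        int winner = 0;
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < w.length; i++) {
            double d = 0.0;
            for (int j = 0; j < v.length; j++) {
                double diff = v[j] - w[i][j];
                d += diff * diff; // squared Euclidean distance
            }
            if (d < best) {
                best = d;
                winner = i;
            }
        }
        return winner;
    }
}
```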
At the training stage, a training sequence of data is presented to the network. The essence of training is that at each step the winning neuron is determined, and the neurons located in its neighborhood move their weights closer to it. As a result of training, areas of neurons corresponding to the training sequence are formed on the grid.
The radius of the neighborhood of the winning neuron varies according to the formula

\[ \sigma(t) = \sigma_0 \exp(-t/\lambda). \tag{2} \]

The initial value of the neighborhood radius is taken as

\[ \sigma_0 = \frac{\max(\mathrm{width}, \mathrm{height})}{2}, \tag{3} \]

where width is the width of the grid of the functional layer and height is its height; the time constant λ is calculated by the formula

\[ \lambda = \frac{n}{\ln \sigma_0}, \tag{4} \]

where n is the number of training iterations.
During training, the winning neuron and the neurons in its σ-neighborhood change their weights as follows:

\[ W(t+1) = W(t) + \theta(t) L(t) \left( V - W(t) \right), \tag{5} \]

where t is the iteration number and L0 is the initial value of the learning coefficient,

\[ L(t) = L_0 \exp(-t/\lambda), \tag{6} \]

\[ \theta(t) = \exp\left( -\frac{\mathrm{dist}^2}{2\sigma^2(t)} \right), \tag{7} \]

and dist is the distance on the grid of the functional layer between a neuron and the winning neuron. The function L(t) is called the learning coefficient, and θ(t) reflects the influence of the neuron's location on the grid. Obviously, the weight of the winning neuron changes the most (Fig. 2).
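A minimal sketch of one training iteration combining (1)-(7) is given below, reusing the findWinner sketch above; the grid coordinates gridPos and the initial values sigma0 and l0 are illustrative assumptions.

```java
// Sketch of one SOM training iteration per Eqs. (2)-(7).
// gridPos[i] = {row, column} of neuron i on the functional-layer grid;
// sigma0 and l0 (initial radius and learning rate) are illustrative choices.
public final class SomTrainingStep {
    public static void trainStep(double[] v, double[][] w, int[][] gridPos,
                                 int t, int n, double sigma0, double l0) {
        double lambda = n / Math.log(sigma0);           // Eq. (4)
        double sigma = sigma0 * Math.exp(-t / lambda);  // Eq. (2), neighborhood radius
        double rate = l0 * Math.exp(-t / lambda);       // Eq. (6), learning coefficient L(t)

        int win = WinnerSearch.findWinner(v, w);        // winner by Eq. (1)
        for (int i = 0; i < w.length; i++) {
            double dx = gridPos[i][0] - gridPos[win][0];
            double dy = gridPos[i][1] - gridPos[win][1];
            double dist2 = dx * dx + dy * dy;           // squared grid distance
            if (dist2 > sigma * sigma) continue;        // outside the sigma-neighborhood

            double theta = Math.exp(-dist2 / (2 * sigma * sigma)); // Eq. (7)
            for (int j = 0; j < v.length; j++) {
                w[i][j] += theta * rate * (v[j] - w[i][j]);        // Eq. (5)
            }
        }
    }
}
```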
Before training, the network transforms the input vectors by extracting their principal components, whereby the dimension of the input space decreases and, consequently, so does the number of neurons in the input layer of the network. The trained network is then tested on control input data to evaluate the effectiveness of its work. A trained and tested neural network is capable of recognizing data similar to the data on which it was trained.
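The paper does not specify the component-extraction method; the sketch below assumes plain power iteration on the sample covariance matrix as one simple possibility for finding the leading principal component.

```java
import java.util.Arrays;

// Hedged sketch of the preprocessing step: finding the leading principal
// component by power iteration on the sample covariance matrix.
public final class PrincipalComponent {
    public static double[] topComponent(double[][] data, int iterations) {
        int d = data[0].length;
        double[] mean = new double[d];
        for (double[] x : data)
            for (int j = 0; j < d; j++) mean[j] += x[j] / data.length;
        // Covariance matrix C = (1/N) * sum (x - mean)(x - mean)^T.
        double[][] c = new double[d][d];
        for (double[] x : data)
            for (int a = 0; a < d; a++)
                for (int b = 0; b < d; b++)
                    c[a][b] += (x[a] - mean[a]) * (x[b] - mean[b]) / data.length;
        // Power iteration: v <- C v / ||C v|| converges to the top eigenvector.
        double[] v = new double[d];
        Arrays.fill(v, 1.0 / Math.sqrt(d));
        for (int it = 0; it < iterations; it++) {
            double[] cv = new double[d];
            for (int a = 0; a < d; a++)
                for (int b = 0; b < d; b++) cv[a] += c[a][b] * v[b];
            double norm = 0.0;
            for (double x : cv) norm += x * x;
            norm = Math.sqrt(norm);
            for (int a = 0; a < d; a++) v[a] = cv[a] / norm;
        }
        return v; // project an input x onto it as dot(x - mean, v)
    }
}
```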
Fig. 2. Effect of the distance from the winning neuron on the weight change.

Of particular interest is the study of the probabilistic behavior of neural networks [8]. A feed-forward neural network consisting of a layer of input nodes, a hidden layer, and an output layer is investigated. The neurons have one-way connections: there are no links between elements within a layer and no backward links between layers. The neurons of the input layer are connected to the hidden-layer neurons by excitatory and inhibitory connections at random.
The outputs of all hidden-layer neurons are connected to the neurons of the output layer. The neurons of each layer are referred to as input, hidden, and output elements, respectively [2], [7]. In addition, we note that software engineers can develop and implement software of the self-organizing map type for solving data classification problems, and can develop a library of special Java classes providing parallel processing; a sketch of such parallelization is given below.
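The paper does not describe its Java library; as a hedged sketch under that caveat, the winner search from (1) could be parallelized with standard java.util.stream parallel streams as follows.

```java
import java.util.stream.IntStream;

// Illustrative parallelization of the winner search with java.util.stream;
// the paper's actual Java classes are not specified.
public final class ParallelWinnerSearch {
    public static int findWinner(double[] v, double[][] w) {
        return IntStream.range(0, w.length).parallel()
                .boxed()
                .min((a, b) -> Double.compare(dist2(v, w[a]), dist2(v, w[b])))
                .orElseThrow(IllegalArgumentException::new);
    }

    private static double dist2(double[] v, double[] wi) {
        double d = 0.0;
        for (int j = 0; j < v.length; j++) {
            double diff = v[j] - wi[j];
            d += diff * diff;
        }
        return d;
    }
}
```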
PROBLEM
If the two classes have l1 and l2 representatives, respectively, then after a training sequence of L = l1 + l2 stimuli of the two classes, the k-th hidden element has accumulated the weight

\[ V_k = \sum_{i=1}^{2} \delta_i \sum_{j=1}^{l_i} \eta_k(j) + V_0, \tag{8} \]

where V0 is the initial weight of the k-th hidden element. When a stimulus ξt is presented at the input of the network, the weight fed to the input of the output layer is (k = 1, ..., NA)

\[ U_t = \sum_{k=1}^{N_A} V_k \, \eta_k(t), \tag{9} \]

where ηk(t) = 1 if the k-th hidden element is active for ξt and 0 otherwise, so that (neglecting the initial weights V0)

\[ U_t = \sum_{k=1}^{N_A} \sum_{i=1}^{2} \delta_i \sum_{j=1}^{l_i} \eta_k(j,t), \tag{10} \]

\[ \eta_k(j,t) = \begin{cases} 1, & \text{if the } k\text{-th hidden element is active for } \xi_j \text{ and } \xi_t, \\ 0, & \text{otherwise.} \end{cases} \tag{11} \]
The membership of the stimulus ξt in one of the two classes is determined by comparing the weight Ut with the threshold θR of the output element. If Ut > θR, the stimulus is assigned to the first class; if Ut < θR, to the second class; if Ut = θR, recognition is refused.
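As an illustration of (10), (11) and the threshold rule, the sketch below assumes boolean activity indicators (activeTrain, activeControl) and per-stimulus increments delta as hypothetical data structures.

```java
// Sketch of the decision rule based on Eqs. (10), (11).
// activeTrain[k][j]: k-th hidden element is excited by training stimulus xi_j;
// activeControl[k]:  k-th hidden element is excited by the control stimulus xi_t;
// delta[j]: weight increment delta_i for the class of training stimulus xi_j.
public final class OutputDecision {
    public static double weightU(boolean[][] activeTrain,
                                 boolean[] activeControl, double[] delta) {
        double u = 0.0;
        for (int k = 0; k < activeTrain.length; k++) {
            if (!activeControl[k]) continue;          // eta_k(j,t) = 0 unless active for xi_t
            for (int j = 0; j < delta.length; j++)
                if (activeTrain[k][j]) u += delta[j]; // eta_k(j,t) = 1
        }
        return u;
    }

    // Returns 1 or 2 for the recognized class, 0 for refusal of recognition.
    public static int classify(double u, double thetaR) {
        if (u > thetaR) return 1;
        if (u < thetaR) return 2;
        return 0;
    }
}
```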
For a given network, a given training sequence, and a given control stimulus ξt, the weight Ut has a definite value. However, over the class of neural networks, Ut is a random variable. To determine the probability that a network selected from this class correctly classifies a stimulus ξx, it is necessary to calculate the probabilistic characteristics of the random weight Ux fed to the input of the output layer.
According to the modified Chebyshev inequality [5], for any random variable z with mathematical expectation Mz = µ and any dispersion Dz = σ² the following relations hold:

\[ P(z > 0) \ge 1 - \frac{\sigma^2}{\mu^2}, \quad \mu > 0, \tag{12} \]

\[ P(z < 0) \ge 1 - \frac{\sigma^2}{\mu^2}, \quad \mu < 0, \tag{13} \]

where P is the probability of the corresponding event and σ is the mean square deviation. Relations (12), (13) can be used to estimate the probability of a correct network response with θR = 0 to ξx in the case of two recognized classes.
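For completeness, a one-line derivation sketch of (12) from the ordinary Chebyshev inequality (the case µ > 0; relation (13) follows symmetrically):

```latex
% Take epsilon = |mu| in Chebyshev's inequality P(|z - mu| >= epsilon) <= sigma^2/epsilon^2.
\[
P(z \le 0) \;\le\; P\bigl(|z - \mu| \ge \mu\bigr) \;\le\; \frac{\sigma^2}{\mu^2}
\quad\Longrightarrow\quad
P(z > 0) \;\ge\; 1 - \frac{\sigma^2}{\mu^2}, \qquad \mu > 0.
\]
```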
From (12), (13) it follows that the probability of correct recognition increases as the ratio σ²/µ² tends to zero.

Given the corresponding probabilistic characteristics of the network, we can estimate the probability of a correct network response to ξx. If the ratio σ²(Ux)/µ²(Ux) can be made arbitrarily small, then for a selected network with θR = 0 the probability that a stimulus ξx is classified correctly approaches unity.
Let NA be the number of hidden elements of the network and L = li + lj, where li (lj) is the length of the training sequence of the i-th (j-th) class, δi (δj) is the increase in the weight of a hidden element when a stimulus of the i-th (j-th) class is presented, Pi is the probability of excitation of a hidden element when a stimulus of the i-th class is presented, and Pij is the probability of excitation of a hidden element when stimuli of the i-th and j-th classes are presented. To find the expectation of the weight at the input of the output layer, the following theorem was proved [8].
Theorem. Let a set of stimuli Ω = {ξ}, Ω = Ω1 ∪ Ω2, Ω1 ∩ Ω2 = ∅, and a training sequence ξ1, ξ2, ..., ξL be given. Then the expectation of the weight at the input of the output layer when a stimulus of the i-th class is presented equals

\[ \mu_i = N_A \left( \delta_i P_i l_i + \delta_j P_{ij} l_j \right), \quad i, j = 1, 2. \tag{14} \]
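As a purely illustrative check of (14), with hypothetical values NA = 100, δ1 = δ2 = 1, P1 = 0.8, P12 = 0.3, l1 = l2 = 50:

```latex
% Hypothetical numbers, only to illustrate the structure of Eq. (14).
\[
\mu_1 = N_A(\delta_1 P_1 l_1 + \delta_2 P_{12} l_2)
      = 100\,(1 \cdot 0.8 \cdot 50 + 1 \cdot 0.3 \cdot 50)
      = 100 \cdot 55 = 5500.
\]
```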
Having µi, let us estimate the dispersion of the random variable Ux when the stimulus ξx appears at the input of the network. For j = 1, ..., L and r = 1, ..., L we obtain

\[ \sigma^2(U_x) = N_A L^2 \sum_{j=1}^{L} \sum_{r=1}^{L} v_j v_r \delta_j \delta_r \left( P_{jrx} - P_{jx} P_{rx} \right), \tag{15} \]

where Pjx is the excitation probability of a hidden element when the stimuli ξj, ξx are presented, Prx is the excitation probability when ξr, ξx are presented, and Pjrx is the excitation probability when ξj, ξr, ξx are presented.
Since the ratio σ²(Ux)/µ²(Ux) does not depend on the length of the training sequence, any number of repetitions of the same training sequence does not change the characteristics of the system.

Let us consider the ratio σ²(Ux)/µ²(Ux) for a control stimulus ξx in the cases ξx ∈ Ω1 and ξx ∈ Ω2.
RESULTS
By estimating the probabilities Pjrx, Pjx, Prx, expressions for the dispersion are obtained. The following cases are considered:
First case: assume that the control stimulus ξx ∈ Ω1. For the stimuli ξj and ξr we obtain the following probabilities:

a) if ξj ∈ Ω1 and ξr ∈ Ω1, then Pjrx = Pjx = Prx = P1;
b) if ξj ∈ Ω1 and ξr ∈ Ω2, then Pjrx = P12, Pjx = P1, Prx = P12;
c) if ξj ∈ Ω2 and ξr ∈ Ω1, then Pjrx = P12, Pjx = P12, Prx = P1;
d) if ξj ∈ Ω2 and ξr ∈ Ω2, then Pjrx = Pjx = Prx = P12.
For these cases a)-d), the corresponding relations are calculated:

(16)

(18)

(19)
The probability of recognition for the first class increases with the maximum of the quantity P1 − P12, called the characteristic function of the perceptron (CFP) [5]. Obviously, the CFP approaches its maximum value as P1 → 1.
In assessing the value of σ²(Ux)/µ²(Ux), we again see the role of P1 → 1 (in most of the discussed cases). For the first class of stimuli, a condition is obtained under which CFP → max, i.e., the probability of correct recognition for stimuli of the first class increases.
Second case: assume that the control stimulus ξx ∈ Ω2. For the stimuli ξj and ξr the following probabilities are obtained:

a) if ξj ∈ Ω2 and ξr ∈ Ω2, then Pjrx = Pjx = Prx = P2;
b) if ξj ∈ Ω1 and ξr ∈ Ω2, then Pjrx = P12, Pjx = P12, Prx = P2;
c) if ξj ∈ Ω2 and ξr ∈ Ω1, then Pjrx = P12, Pjx = P2, Prx = P12;
d) if ξj ∈ Ω1 and ξr ∈ Ω1, then Pjrx = Pjx = Prx = P12.
Similarly, calculating σ²(Ux)/µ²(Ux) for the second class, the condition of correct recognition is obtained:

(20)

Hence

(21)
CONCLUSIONS
The resulting estimates of certain statistical characteristics of a neural network in the case of two recognized classes have shown their efficiency in the training of neural networks. Moreover, a new condition for improving the accuracy of recognition of stimuli of the i-th class has been obtained.
REFERENCES
[1] Anil K. Jain, Jianchang Mao, K. M. Mohiuddin. Artificial Neural Networks: A Tutorial. Computer, Special Issue on Neural Computing (companion issue to IEEE Computational Science & Engineering, Spring 1996), Volume 29, Issue 3, March 1996.
[2] Barsky A.B. Neural Networks: Recognition, Management, Decision-Making. M.: Finance and Statistics, 2004, 176 p. (in Russian)
[3] Buraga S.C., Cioca M., Cioca A. Grid-Based Decision Support System Used in Disaster Management. Studies in Informatics and Control, 16(3), 2007.
[4] Cioca M., Cioca L.I., Buraga S.C. Using Semantic Web Technologies to Improve the Design Process in the Context of Virtual Production Systems. WSEAS Transactions on Computers, 4(12), pp. 1788-1793, 2005.
[5] Ivakhnenko A.G. Perceptron – A Pattern Recognition System. Naukova Dumka, Kiev, 1975, 426 p. (in Russian)
[6] Neural Networks. Matlab 6. Dialog-MIFI, 2002, 496 p. (in Russian)
[7] Panteleev S.V. Development and Research of the Use of Neural Network Algorithms. 2001, 496 p. (in Russian)
[8] Sargsyan S.G. Determination of the Probability Characteristics of an Adaptive Recognition System. Trans. of Conf. "Adaptable Software", Kishinev, 1990, pp. 46-51. (in Russian)
[9] Vesna Rankovic, Jasna Radulovic, Nenad Grujovic, Dejan Divac. Neural Network Model Predictive Control of Nonlinear Systems Using Genetic Algorithms. INT J COMPUT COMMUN, 7(3), ISSN 1841-9836, pp. 540-549, 2012.