Buletin Științific – Universitatea din Pitești
Seria Matematică și Informatică, Nr. 21(2015), pg. 1-N
IC's estimation performance by
multi-unit algorithms based on
negentropy function
Doru CONSTANTIN
Month, 2015
Abstract
The aim of this paper is to present a comparative study of the convergence of multi-unit algorithms based on the negentropy function for estimating independent components. The first part of the paper presents the fundamental notions about the negentropy function and the gradient algorithm for one independent component. Then the fast fixed-point algorithm is described, together with the multi-unit algorithms based on deflationary orthogonalization and on symmetric orthogonalization. The paper concludes by presenting experimental results on the convergence of the multi-unit algorithms.
2010 Mathematics Subject Classification: 68; 68T10, 68T05, 68U10
Keywords and phrases: ICA, BSS, signal processing.
University of Pitesti, Department of Mathematics and Computer Science, Str. Targu din Vale, no. 1, Pitesti

1 Introduction

A fundamental problem in neural network research, as well as in many other disciplines, is finding a suitable representation of multivariate data, that is, of random vectors. For reasons of computational and conceptual simplicity, the representation is often sought as a linear transformation of the
original data. In other words, each component of the representation is a linear combination of the original variables. Well-known linear transformation methods include principal component analysis, factor analysis, and projection pursuit. Independent component analysis is a recently developed method in which the goal is to find a linear representation of non-Gaussian data so that the components are statistically independent, or as independent as possible [9, 6]. Such a representation seems to capture the essential structure of the data in many applications, including feature extraction and signal separation.
2 One-unit algorithms based on negentropy function
The negentropy function is a measure of nongaussianity and is defined based on the entropy function. The entropy function H of a random vector y with density function p_y(\eta) has the expression:

H(y) = -\int p_y(\eta) \log p_y(\eta) \, d\eta    (1)
A fundamental result of information theory is that a gaussian variable
has the largest entropy among all random variables of equal variance [6].
This means that entropy could be used as a measure of nongaussianity.
To obtain a measure of nongaussianity that is zero for a gaussian variable and always nonnegative, one often uses a normalized version of differential entropy, called negentropy. Negentropy J is defined as follows:
J(y) = H(y_{gauss}) - H(y)    (2)

where y_{gauss} is a gaussian random variable with the same correlation (and covariance) matrix as y.
Negentropy approximations. There are several approximations of the negentropy function used in practical applications. The classical method of approximating negentropy uses higher-order cumulants:

J(y) \approx \frac{1}{12} E\{y^3\}^2 + \frac{1}{48} \, kurt(y)^2    (3)
where y is assumed to be of zero mean and unit variance.
Another approximation is based on two nonquadratic functions G_1 and G_2, such that G_1 is odd and G_2 is even, and we obtain:

J(y) \approx k_1 (E\{G_1(y)\})^2 + k_2 (E\{G_2(y)\} - E\{G_2(\nu)\})^2,    (4)

where k_1 and k_2 are positive constants, \nu is a gaussian variable of zero mean and unit variance, and y is assumed to have zero mean and unit variance.
In the case where we use only one nonquadratic function G, the approximation becomes:

J(y) \approx [E\{G(y)\} - E\{G(\nu)\}]^2    (5)
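The one-function approximation (5) is straightforward to estimate from samples. Below is a minimal sketch in Python/NumPy, not taken from the paper: the function name negentropy_approx and the particular choice G(u) = -exp(-u^2/2) are illustrative assumptions, and E\{G(\nu)\} is estimated by Monte Carlo sampling of a standard gaussian.

```python
import numpy as np

def negentropy_approx(y, G=lambda u: -np.exp(-u**2 / 2),
                      n_gauss=100_000, seed=0):
    """Sample estimate of J(y) ~ [E{G(y)} - E{G(nu)}]^2 as in (5).

    y is standardized to zero mean and unit variance, as the
    approximation requires; E{G(nu)} for a standard gaussian nu is
    estimated by Monte Carlo.
    """
    rng = np.random.default_rng(seed)
    y = (y - y.mean()) / y.std()
    e_G_nu = G(rng.standard_normal(n_gauss)).mean()
    return (G(y).mean() - e_G_nu) ** 2

# A gaussian sample should give J(y) close to zero; a uniform
# (sub-gaussian) sample gives a strictly larger value.
rng = np.random.default_rng(1)
j_gauss = negentropy_approx(rng.standard_normal(50_000))
j_unif = negentropy_approx(rng.uniform(-1.0, 1.0, 50_000))
```

As expected from (5), the estimate is nonnegative by construction and nearly zero for gaussian data.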
The gradient algorithm. Taking the gradient of the approximation of negentropy in (5) with respect to w, and taking into account the normalization E\{(w^T z)^2\} = \|w\|^2 = 1, we obtain:

\Delta w \propto \gamma E\{z g(w^T z)\}    (6)

w \leftarrow \frac{w}{\|w\|}    (7)

where \gamma = E\{G(w^T z)\} - E\{G(\nu)\}, with \nu being a standardized gaussian random variable. For the function g we may use:

g_1(y) = \tanh(a_1 y)    (8)

g_2(y) = y \exp(-y^2/2)    (9)

g_3(y) = y^3    (10)

where 1 \le a_1 \le 2 is a constant.
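For later use in the fixed-point update, the derivatives g' of (8)-(10) are also needed. A small sketch (the dictionary name and layout are mine, not the paper's), taking a_1 = 1:

```python
import numpy as np

A1 = 1.0  # the constant a_1 in (8); any value in [1, 2] is allowed

# Each entry maps a name to the pair (g, g'); the derivative g' is
# required by the FastICA fixed-point update derived below.
NONLINEARITIES = {
    "tanh":  (lambda u: np.tanh(A1 * u),
              lambda u: A1 * (1.0 - np.tanh(A1 * u) ** 2)),
    "gauss": (lambda u: u * np.exp(-u**2 / 2),
              lambda u: (1.0 - u**2) * np.exp(-u**2 / 2)),
    "cube":  (lambda u: u**3,
              lambda u: 3.0 * u**2),
}
```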
The algorithm for one independent component estimation

Step 1: Data centering (make its mean zero).
Step 2: Data preprocessing (whitening) and obtain z.
Step 3: Choose an initial value for w of unit norm and an initial value for \gamma.
Step 4: Update by the scheme

\Delta w \propto \gamma z g(w^T z),

where the function g is defined in (8), (9), (10).
Step 5: Normalize the vector w:

w \leftarrow \frac{w}{\|w\|}.

Step 6: If the sign of \gamma is not known a priori, update

\Delta\gamma \propto (G(w^T z) - E\{G(\nu)\}) - \gamma.

Step 7: If the algorithm has not converged, go back to Step 4.
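Steps 1-7 can be sketched as follows. This is a hypothetical NumPy implementation, not the paper's code: the learning rate mu is my assumption, the expectations are estimated as sample means over the whole batch, and \gamma is re-estimated each sweep instead of via the incremental Step 6 update.

```python
import numpy as np

def gradient_ica_one_unit(z, g=np.tanh,
                          G=lambda u: np.log(np.cosh(u)),
                          mu=0.1, n_iter=500, seed=0):
    """One-unit gradient algorithm on whitened data z of shape (d, T)."""
    rng = np.random.default_rng(seed)
    d, _ = z.shape
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)                              # Step 3
    e_G_nu = G(rng.standard_normal(100_000)).mean()     # E{G(nu)} by sampling
    gamma = 1.0
    for _ in range(n_iter):
        y = w @ z
        w = w + mu * gamma * (z * g(y)).mean(axis=1)    # Step 4
        w /= np.linalg.norm(w)                          # Step 5
        gamma = G(y).mean() - e_G_nu                    # Step 6 (batch form)
    return w

# Demo on an assumed two-source mixture: center (Step 1) and whiten
# (Step 2) the observations, then run the one-unit algorithm.
rng = np.random.default_rng(2)
s = np.vstack([rng.laplace(size=5000), rng.uniform(-1.0, 1.0, 5000)])
x = np.array([[1.0, 0.5], [0.3, 1.0]]) @ s
x = x - x.mean(axis=1, keepdims=True)
d_vals, E = np.linalg.eigh(np.cov(x))
z = E @ np.diag(d_vals ** -0.5) @ E.T @ x
w_est = gradient_ica_one_unit(z)
```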
The fixed-point algorithm for ICA model estimation. From the gradient method in (6) we may establish the following fixed-point iteration:

w \leftarrow E\{z g(w^T z)\}    (11)

Rewriting relation (11), we have:

w = E\{z g(w^T z)\} \Leftrightarrow (1 + \beta) w = E\{z g(w^T z)\} + \beta w    (12)

According to the Lagrange conditions, the optima of E\{G(w^T z)\} under the constraint E\{(w^T z)^2\} = \|w\|^2 = 1 are obtained at points where the gradient of the Lagrangian is zero:

E\{z g(w^T z)\} + \beta w = 0    (13)

Now let us try to solve this equation by Newton's method, which is equivalent to finding the optima of the Lagrangian by Newton's method. Denoting the function on the left-hand side of (13) by F, we obtain its gradient:

\frac{\partial F}{\partial w} = E\{z z^T g'(w^T z)\} + \beta I    (14)

We apply a reasonable approximation: E\{z z^T g'(w^T z)\} \approx E\{z z^T\} E\{g'(w^T z)\} = E\{g'(w^T z)\} I. Thus we obtain the following approximate Newton iteration:

w \leftarrow w - \frac{E\{z g(w^T z)\} + \beta w}{E\{g'(w^T z)\} + \beta}    (15)

This algorithm can be further simplified by multiplying both sides of (15) by \beta + E\{g'(w^T z)\}. This gives the following form:

w \leftarrow E\{z g(w^T z)\} - E\{g'(w^T z)\} w    (16)

This is the basic fixed-point iteration in FastICA.
The FastICA algorithm for estimating one independent component

Step 1: Data centering.
Step 2: Data preprocessing and obtain z.
Step 3: Choose an initial value for the vector w of unit norm.
Step 4: Apply the updating rule:

w \leftarrow E\{z g(w^T z)\} - E\{g'(w^T z)\} w,

where the function g is defined in (8), (9), (10).
Step 5: Normalize the vector w:

w \leftarrow \frac{w}{\|w\|}.

Step 6: If the algorithm has not converged, go back to Step 4.
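The FastICA steps above can be sketched as a short NumPy routine. This is an illustrative implementation, not the paper's code; the convergence test |w_new^T w| -> 1, which is invariant to the sign flips allowed by ICA, is a standard choice and an assumption here.

```python
import numpy as np

def fastica_one_unit(z, g=np.tanh, dg=lambda u: 1.0 - np.tanh(u) ** 2,
                     max_iter=200, tol=1e-8, seed=0):
    """One-unit FastICA on whitened data z of shape (d, T), rule (16)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(z.shape[0])
    w /= np.linalg.norm(w)                                  # Step 3
    for _ in range(max_iter):
        y = w @ z
        w_new = (z * g(y)).mean(axis=1) - dg(y).mean() * w  # Step 4: rule (16)
        w_new /= np.linalg.norm(w_new)                      # Step 5
        if 1.0 - abs(w_new @ w) < tol:                      # Step 6: converged
            return w_new
        w = w_new
    return w

# Demo on an assumed two-source mixture (centered, then whitened).
rng = np.random.default_rng(3)
s = np.vstack([rng.laplace(size=5000), rng.uniform(-1.0, 1.0, 5000)])
x = np.array([[1.0, 0.5], [0.3, 1.0]]) @ s
x = x - x.mean(axis=1, keepdims=True)
d_vals, E = np.linalg.eigh(np.cov(x))
z = E @ np.diag(d_vals ** -0.5) @ E.T @ x
w_est = fastica_one_unit(z)
y_est = w_est @ z
# The extracted component should match one true source up to sign.
corr = [abs(np.corrcoef(y_est, s[i])[0, 1]) for i in range(2)]
```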
3 Multi-unit algorithms for ICA model estimation

It is possible to find more independent components by running a one-unit algorithm many times with different initial points, under the constraint that the vectors w_i corresponding to different independent components are orthogonal in the whitened space [9, 6].
3.1 The IC's estimation by deflationary orthogonalization

Deflationary orthogonalization uses the Gram-Schmidt method. This means that we estimate the independent components one by one and alternate the following steps:
Step 1: Set the desired number m of ICs to estimate and initialize p = 1.
Step 2: Initialize w_p.
Step 3: Do an iteration of a one-unit algorithm and obtain w_p.
Step 4: Apply the orthogonalization transformation:

w_p \leftarrow w_p - \sum_{j=1}^{p-1} (w_p^T w_j) w_j    (17)

Step 5: Normalize the vector w_p:

w_p \leftarrow \frac{w_p}{\|w_p\|}.

Step 6: If w_p has not converged, go back to Step 3.
Step 7: Set p \leftarrow p + 1. If p is not greater than m, go back to Step 2.
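The deflationary scheme above can be sketched in NumPy as follows. This is illustrative code, not the paper's: it assumes the one-unit FastICA update (16) as the inner Step 3, and the convergence test is the same sign-invariant criterion used for the one-unit case.

```python
import numpy as np

def fastica_deflation(z, m, g=np.tanh,
                      dg=lambda u: 1.0 - np.tanh(u) ** 2,
                      max_iter=200, tol=1e-8, seed=0):
    """Estimate m ICs one by one; rows of the returned W are the w_p."""
    rng = np.random.default_rng(seed)
    d = z.shape[0]
    W = np.zeros((m, d))
    for p in range(m):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)                                  # Step 2
        for _ in range(max_iter):
            y = w @ z
            w_new = (z * g(y)).mean(axis=1) - dg(y).mean() * w  # Step 3
            w_new -= W[:p].T @ (W[:p] @ w_new)                  # Step 4, (17)
            w_new /= np.linalg.norm(w_new)                      # Step 5
            if 1.0 - abs(w_new @ w) < tol:                      # Step 6
                break
            w = w_new
        W[p] = w_new                                            # Step 7
    return W

# Demo: three assumed non-gaussian sources, mixed, centered, whitened.
rng = np.random.default_rng(4)
s = np.vstack([rng.laplace(size=5000),
               rng.uniform(-1.0, 1.0, 5000),
               rng.standard_t(5, 5000)])
A = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.2, 0.3, 1.0]])
x = A @ s
x = x - x.mean(axis=1, keepdims=True)
d_vals, E = np.linalg.eigh(np.cov(x))
z = E @ np.diag(d_vals ** -0.5) @ E.T @ x
W_est = fastica_deflation(z, 3)
```

By construction the rows come out orthonormal in the whitened space, since each new vector is deflated against all previously found ones before normalization.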
3.2 The IC's estimation by symmetric orthogonalization

In this case the vectors w_i are estimated in parallel, not one by one. Thus the symmetric orthogonalization methods enable parallel computation of the ICs. The general form of this algorithm is:

Step 1: Set the desired number m of ICs to estimate.
Step 2: Initialize w_i, i = 1, ..., m.
Step 3: Do an iteration of a one-unit algorithm on every w_i in parallel.
Step 4: Do a symmetric orthogonalization of the matrix W = (w_1, ..., w_m)^T.
Step 5: If W has not converged, go back to Step 3.
The symmetric orthogonalization of W can be accomplished by:

W \leftarrow (W W^T)^{-1/2} W    (18)

The inverse square root (W W^T)^{-1/2} is obtained from the eigenvalue decomposition of W W^T = E \, diag(d_1, \dots, d_m) \, E^T:

(W W^T)^{-1/2} = E \, diag(d_1^{-1/2}, \dots, d_m^{-1/2}) \, E^T    (19)
A simpler alternative is the following iterative algorithm:

Step 1: Compute W \leftarrow W / \|W\|.
Step 2: Compute W \leftarrow \frac{3}{2} W - \frac{1}{2} W W^T W.
Step 3: If the matrix W W^T is not close enough to the identity matrix, go back to Step 2.
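Both variants of the symmetric orthogonalization can be sketched in NumPy (illustrative code; the function names are mine). The closed form follows (18)-(19); the iterative version follows Steps 1-3, where the initial division by the Frobenius norm ensures all singular values of W are at most 1, which is what makes the iteration converge.

```python
import numpy as np

def sym_orth(W):
    """Closed-form W <- (W W^T)^{-1/2} W via the eigendecomposition (19)."""
    d, E = np.linalg.eigh(W @ W.T)       # W W^T = E diag(d) E^T
    return E @ np.diag(d ** -0.5) @ E.T @ W

def sym_orth_iter(W, tol=1e-10, max_iter=100):
    """Iterative alternative, Steps 1-3 of the simpler algorithm."""
    W = W / np.linalg.norm(W)            # Step 1 (Frobenius norm)
    for _ in range(max_iter):
        if np.allclose(W @ W.T, np.eye(W.shape[0]), atol=tol):  # Step 3
            break
        W = 1.5 * W - 0.5 * W @ W.T @ W  # Step 2
    return W

# Both routes should agree: each returns the orthogonal polar factor of W,
# since the iteration keeps the singular vectors and drives all singular
# values to 1.
W0 = np.random.default_rng(5).standard_normal((3, 3))
W_closed = sym_orth(W0)
W_iter = sym_orth_iter(W0)
```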
4 Experimental results for convergence of the multi-unit algorithms

Using the FastICA algorithm we can determine the independent components; we considered the problem of estimating the independent components of a mixture of signals. The observed signals are obtained by applying the mixing matrix to the original signals. For estimating the ICA model we have two multi-unit algorithms: the algorithm based on deflationary orthogonalization and the algorithm based on symmetric orthogonalization. In the experimental applications we chose the following nonlinearities for the function g used in the algorithms:

- the default function g(u) = u^3;
- the tanh function g(u) = tanh(u);
- the gauss function g(u) = u exp(-u^2/2);
- the function g(u) = u^2.

To compare the convergence of the two types of approaches, by deflationary and by symmetric orthogonalization, using the four functions mentioned above, we considered for example the following mixing matrix:
A = \begin{pmatrix}
1 & 2 & 3 & 1 & 2 & 3 & 1 & 2 \\
1 & 2 & 3 & 1 & 2 & 3 & 1 & 2 \\
1 & 2 & 3 & 1 & 2 & 3 & 1 & 2 \\
1 & 2 & 3 & 1 & 2 & 3 & 1 & 2 \\
1 & 2 & 3 & 1 & 2 & 3 & 1 & 2 \\
1 & 2 & 3 & 1 & 2 & 3 & 1 & 2 \\
1 & 2 & 3 & 1 & 2 & 3 & 1 & 2 \\
1 & 2 & 3 & 1 & 2 & 3 & 1 & 2
\end{pmatrix}    (20)
The application establishes the approximations of the seven independent components of the original signals; the convergence is shown in the next table as the average number of iterations:
Table I: The mean number of steps for convergence.

Function                  | Symmetric | Deflationary
1. g(u) = u^3             | 83 steps  | 12-8-8-5-5-5-2
2. g(u) = tanh(u)         | 18 steps  | 16-14-14-10-5-4-2
3. g(u) = u exp(-u^2/2)   | 16 steps  | 12-8-16-21-17- -
4. g(u) = u^2             | 17 steps  | 14-13-16-26- - -
The table above presents the number of convergence steps of the multi-unit algorithms with symmetric and deflationary orthogonalization. Note that for the algorithm based on symmetric orthogonalization the functions of type 3, 4 and 1 produce suitable convergence results, expressed as the number of steps, while for the algorithm based on deflationary orthogonalization the functions of type 1 and 2 produce suitable convergence results.
5 Conclusions

For estimating the independent components, the negentropy function was used as a contrast function. By using the negentropy we may derive the updating rule for ICA estimation and obtain the general gradient one-unit algorithm, the FastICA algorithm and the multi-unit algorithms based on symmetric and deflationary orthogonalization. For the multi-unit algorithms based on the negentropy function with symmetric and deflationary orthogonalization, experimental results were established that illustrate the performance of original signals recognition in terms of convergence.
References
[1] C. Balcau, Combinatorica si Teoria Grafurilor, Ed. Univ. Pitesti, 2007.
[2] C.M. Bishop, Neural Networks for Pattern Recognition, Clarendon Press, 1995.
[3] A. Cichocki, R. Unbehauen, Neural Networks for Signal Processing and Optimization, Wiley, 1994.
[4] T.M. Cover, J.A. Thomas, Elements of Information Theory, Wiley, 1991.
[5] R. Gonzalez, P. Wintz, Digital Image Processing, Addison-Wesley, 1987.
[6] A. Hyvärinen, J. Karhunen, E. Oja, Independent Component Analysis, John Wiley & Sons, 2001.
[7] T.W. Lee, Independent Component Analysis: Theory and Applications, Kluwer, 1998.
[8] D.A. Popescu, N. Bold, O. Domsa, A generator of sequences of hierarchical tests which contain specified keywords, International Symposium on Applied Computational Intelligence and Informatics, May 12-14, Timisoara, 2016.
[9] J.V. Stone, Independent Component Analysis: A Tutorial Introduction, MIT Press, 2004.