Blind separation of mixed-kurtosis signals using an adaptive threshold nonlinearity
Blind separation of sources, part II
F=(I+C)
~, w i t h D i a g { C } = 0 .
(3)
This constraint is assumed in the HJ algorithm. Now, it remains to find matrix C, that is, only n ( n - 1 ) unknowns. Ideally, there are several equivalent solutions given by
Wege hergeleitet worden. In diesem Beitrag versuchen wir, ihn von einem statistischen Gesichtspunkt aus zu beschreiben. Z u m Beispiel kann der Erneuerungsterm der synaptischen Wirksamkeitsmatrix nicht der Gradient eines einzelnen C 2Funktionals sein - obwohl er gelegentlich so interpretiert wird. Wir zeigen, dab der HJ-Algorithmus in der Tat gemeinsame Nullstellen von n Funktionalen mittels stochastischer, im Pipelining ausgefiihrter Iterationen sucht. A u f der Grundlage von Simulationen werden Vorteile u n d Bergrenzungen sowie mrgliche Verbesserungen nach kurzer theoretischer Analyse dargelegt. R6sum6. Bien qu'il 6veille maintenant de plus en plus de curiosit6, l'algorithme it&atif HJ n'a jusqu'~ pr6sent jamais 6t6 justifi6 en termes math6matiques rigoureux. Par exemple, le terme de remise ~ jour des poids synaptiques ne peut pas ;~tre le gradient d ' u n e fonctionnelle unique de classe C 2, contrairement h ce qui est paffois reconnu. N o u s essayons dans cet article de le pr6senter en empruntant une approche statisique. Nous montrons que I'algorithme HJ recherche en r6alit6 les z6ros c o m m u n s de n fonctionnelles par des it6rations stochastiques pipelin6es. Apr~s une courte analyse th6orique, et en nous a p p u y a n t sur des simulations, nous soulignons les avantages et les limitations de l'algorithme, ainsi q u ' u n certain nombre d'am61iorations possibles.
盲源分离英文
Mathematical Mode of BSS
The mathematical mode of liner BSS can be expressed as X(t)=A(t)S(t)+N(t) where X(t)=[x1(t), …, xm(t)] is the tested signals of ‘m’ sensors; S(t)=[s1(t), …, sn(t)] is the ‘n’ source signals; A(t) is the unknown mixing matrix; and N(t)=[n1(t), …, nm(t)] is the testing noise. n1 s1 ˆ1 s x1 s2 n2 ˆ2 mixing x s 2 s3 matrix( BSS ˆ3 s nm xm A) sn ˆ s
Source signal satisfy gaussian random distribution, only need to Noof or the sourceorder signal is a signal the second obey gaussian and statistics signaldistribution, can be completely the source signals between described. each other have good which has good sparse independent statistical source signals is more zero properties. The source element in the signal sourcematrix, signal. mixing matrix and mixed signal of all elements of nonnegative matrix.
Blind partial separation of instantaneous mixtures of sources
Blind partial separation of instantaneous mixtures ofsourcesD.T.PhamLaboratoire de Mod´e lisation et Calcul,BP53,38041Grenoble Cedex,FranceDinh-Tuan.Pham@imag.frAbstract.We introduce a general criterion for blindly extracting a subset ofsources in instantaneous mixtures.We derive the corresponding estimation equa-tions and generalize them based on arbitrary nonlinear separating functions.Aquasi-Newton algorithm for minimizing the criterion is presented,which reducesto the FastICA algorithm in the case when only one source is extracted.Theasymptotic distribution of the estimator is obtained and a simulation example isprovided.1IntroductionBlind source separation(BSS)has attracted much attention recently,as it has many use-ful applications.The simplest and most widely used BSS model assumes that the ob-servations are linear mixtures of independent sources with the same number of sources as the number of mixtures:X=AS where X and S represent the observation and the source vectors,both of a same dimension K,and A is an invertible matrix.The aim is to extract the sources from their mixtures,without relying on any specific knowl-edge about them and quite a few good algorithms have been proposed for this task.In many applications(biomedical for ex.)however,the number K of mixtures can be very large and therefore one may be interested in extracting only a small number of(inter-esting)sources.In such case,many sources would be nearly Gaussian and since BSS algorithms rely on the non Gausianity,these sources would not be reliably extracted.In fact,in BSS problem with very large number of mixtures,one routinely discards most of the extracted sources and only retain some of them.To extract only a small number of sources,one may of course proceed sequentially by extracting them one by one,using,for example,the(one-unit)FastICA algorithm[1]. However,such procedure entails a loss of performance as the accuracy of an extracted source is affected by the inaccuracy of the previously extracted ones,since the former is constrained to be uncorrelated with the later.Further,as it will be shown later,even for thefirst extracted source,the performance on the FastICA(with the optimal choice of the nonlinearity)is still less than extracting all sources simultaneously(through an optimal algorithm)and then retaining only one(adequately chosen)source.However, there is no loss of performance if one extracts simultaneously only m<K sources, provided that the remaining K−m are Gaussian.In this paper we shall develop a class of algorithms for extracting only m<K sources.For m=1,this class contains the one-unit FastICA algorithm,and for m=K,2Phamit contains the quasi-maximum likelihood algorithm in [2]and the mutual information based algorithm in [3].In the sequel,we shall assume,for simplicity,that the sources have zero means.If they are not,one just centers them,which amounts to centering the observed vector X .2Estimation methodFor the full extraction of sources,that is for the case m =K ,the mutual information approach leads to the criterion [3]:Ki =1H (Y i )−log det B (1)(to be minimized with respect to B )where Y i are the components of Y =BX and H (Y )denotes the Shannon differential entropy of the random variable Y :H (Y )=−E[log p Y (Y )],p Y denoting the density of Y and E denoting the expectation oper-ator [4].This criterion can be written up to an additive constant as K i =1H (Y i )−(1/2)log det(BCB T )where C =E(XX T )denotes the covariance matrix of X .The nice thing is that it involves only the statistical properties of the variables Y 1,...,Y K ,as BCB T represents the covariance matrix of the vector Y .Thus one can easily extend it to the case where only m <K sources are sought.More precisely,we will consider the criterion C (B )=m i =1H (Y i )−12log det(BCB T ),(2)in which B is a m ×K matrix and Y 1,...,Y m are the components of Y =BX .It has been shown in [5]that in the case where m =K ,one can generalize the criterion (1)by replacing H (Y i )by log Q (Y i )where is Q is a class II superadditive functional.Recall that [6]a functional Q of the distribution of a random variable Y ,is said to be of class II if it is scale equi-variant 1,in the sense that Q (aY )=|a |Q (Y )for any real number a ,and it said to be superadditive if;Q 2(Y +Z )≥Q 2(Y )+Q 2(Z )(3)for any pair of independent random variables Y,Z .It is proved in [5]that this general-ized criterion is still a contrast,in the sense that it can attain its minimum if and only if each of the Y 1,...,Y K is proportional to a different source.We can show that this result carries to the case m <K ,but the proof is omitted due to lack of space.Thus,in (2),one may take H =log Q where Q is a class II superadditive functional.Note that the exponential of the entropy functional has this property [6].The superadditivity condition is quite strong because (3)must be satisfied for any pair of independent random variables Y,Z ,but actually it is enough that this holds for random variables which are linear mixtures of sources.Thus (2)may still be a valid 1The definition of class II in [6]also requires that Q be translation invariant,but since we are working with zero-mean random variables,we drop this requirementBlind partial separation of sources3 criterion if H is only a class II functional.In fact,for such functional,the point B for which the components of Y are proportional to distinct sources,is still a stationary point of the criterion.Indeed,the gradient of the criterion(2)can be seen to beE[ψY(Y)X T]−(BCB T)−1BC(4)whereψY(Y)is the vector with componentsψY1(Y1),...ψYm(Y m)andψY denotesthe“coordinate free”derivative of the functional H,defined by the conditionlim→0[H(Y+ Z)−H(Y)]/ =E[ψY(Y)Z](5)for any random variable Z.(For H the entropy functional,this is the score function[3].) Setting the gradient(4)to zero yields the estimation equation(for the stationary point of the criterion),which can be seen to be equivalent toE[ψY(Y)S T]−[E(YY T)]−1E(YS T)=0,(6)since X=AS.Note that if Y i is proportional to a source Sπi,E[ψYi (Y i)S j]=E[ψYi(Y i)Y i]E(Y i Sπi)/E(Y2i),j=πi0,j=πiThus,provided that E[ψYi (Y i)Y i]=1,equation(6)is satisfied as soon as Y1,...,Y mare proportional to distinct sources.On the other hand,since Q is of class II,H(Y+ Y)=H(Y)+log(1+ ),which by(5)yields immediately E[ψY(Y)Y]=1.A simple example of class II functional is Q(Y)=exp{E[G(Y/σY)]}σY,where G is somefixed function andσY=[E(Y2)]1/2.This functional yields,in the case m=1,the same criterion as in the FastICA algorithm.Indeed,in the case m=1and with H=E[G(Y/σY)]+logσY,(2)becomesC(b)=E[G(Y/σY)],Y=bX,where we have used the symbol b in place of B to emphasize that it is a row vector. The corresponding functionψY is then given byψY(y)=g(y/σY)/σY+{1−E[g(Y/σY)Y/σY]}y/σ2Y(7) where g is the derivative of G.In practice the(theoretical)criterion C would be replaced by the empirical criterion ˆC,defined as in(2)but with H replaced by an estimateˆH and C replaced by the sample covariance matrixˆC of X.The gradient ofˆC is still given by(4)but with C replaced byˆC andψY replaced byˆψY i[Y i(t)]=n∂ˆH(Y i)/∂Y i(t),t=1,...,n,(8) Y(t)=BX(t)and X(1),...,X(n)being the observed sample[3].In the case H(Y)=E[G(Y/σY)]+logσY,its estimatorˆH is naturally defined by the same ex-pression but with E replaced by the sample average operatorˆE andσ2Y replaced by4Phamˆσ2Y =ˆE(Y 2).The function ˆψY is again given by (7)but with E replaced by ˆE and σY replaced by ˆσY .The above argument shows that one can even start with the system of estimating equations obtained by equating (4)to zero,with ψY i being arbitrary functions (depend-ing on the distribution of Y i )subjected to the only condition that E[ψY i (Y i )Y i ]=1.In practice,one would replace ψY i by some estimate ˆψY i ,E by ˆE and C by ˆC ,which results in the empirical estimating equationˆE[ˆψY (Y )X T ]−(B ˆCB T )−1BC =0,Y =BX .(9)We only require ˆψY i to satisfy ˆE[ˆψY i (Y i )Y i ]=1,which holds automatically if it is given by (8)and ˆHis scale equi-variant,in the sense that ˆH (αY )=ˆH (Y )+log |α|.3The quasi Newton algorithmIn this section,we develop the quasi-Newton algorithm for solving (9).In the Newton algorithm,one replaces B in the right hand side of (9)by B +δB and linearizes the result with respect to δB .Here B denotes a current estimate and the new estimate is obtained by adding to it the solution δB of the linearized equations.In the quasi Newton algorithm,the system matrix of the linearized equations is further approximated.Instead of working with δB ,it is much more convenient to work with its coefficients in a basis which contains the rows of B as its basis vectors.Thus we shall complete B to a square matrix ¯B by adding K −m rows,which are chosen to be orthogonal to the rows of B and among themselves,in the sense of the metric ˆC .More precisely,the matrix ¯B satisfies ¯B ˆC ¯B T = B ˆCB T 00I.(10)Let E ij ,i =1,...,m,j =1,...,K ,be the element of the matrix δB ¯B −1,then δY =δBX has components δY i = K j =1E ij Y j ,where Y j denote the components of ¯BX(or of Y if j ≤m ).Thus,ˆE[ˆψY i +δY i(Y i +δY i )X T ]is linearized as ˆE[ˆψY i (Y i )X T ]+Kj =1ˆE {[ˆψ Y i(Y i )Y j +˙ˆψY i ;Y j (Y i )]X T }E ij where ψ Y i is the derivative of ˆψY i and ˙ˆψY i ;Y j is the derivative of ψY i +hY j with respect to h at h =0.We shall replace the last term in the above expression by an appropriate approx-imation.To this end,we shall assume that B is close to the solution so that the ex-tracted sources Y 1,...,Y m are nearly proportional to S π1,...,S πm for some distinct set of indexes π1,...,πm in {1,...,K }.Since the Y m +1,...,Y K ,by construction,have zero sample correlation with Y 1,...,Y m ,they would be nearly uncorrelated with S π1,...,S πm and hence must be nearly linear combinations of the sources other than S π1,...,S πm .Thus we may treat the Y 1,...,Y m as independent among themselves and (Y m +1,...,Y K )as independent of (Y 1,...,Y m ).Further,we shall approximate ˆEBlind partial separation of sources5by the expectation operator E and vice versa and regardˆψYi as afixed(non random)function.With such approximation Kj=1ˆE{[ˆψY i(Y i)Y j+˙ˆψYi;Y j(Y i)]Y k}E ij≈ˆE[ˆψY i(Y i)]ˆE(Y2k)E ik k=iˆE[ˆψY i(Y i)Y2i+˙ˆψYi;Y j(Y i)Y i]E ii k=iButˆE[(Y i+hY i)ˆψYi +hY i(Y i+hY i)=1,hence by taking the derivative with respectto h and letting h=0,one getsˆE[ˆψ Yi (Y i)Y2i+˙ˆψYi;Y j(Y i)Y i]=−1.Therefore,thelinearization ofˆE[ˆψY+δY(Y+δY)X T]is approximatelyˆE[ψY(Y)X T]+∆¯B−1T where∆is a m×K matrix with general element∆ij= ˆE[ˆψY i(Y i)]ˆE(Y2j)E ij,j=i−E ii,j=i(11)On the other hand,the linearization of[(B+δB)ˆC(B+δB)T]−1(B+δB)ˆC with respect toδB is(BˆCB T)−1BˆC+(BˆCB T)−1δBˆC−(BˆCB T)−1(δBˆCB T+BˆCδB T)(BˆCB T)−1BˆC(12) Multiplying the above expression by¯B T and using(10)yields[I0]+(BˆCB T)−1[0E c]−[E T0]where E and E c are the matrices formed by thefirst m columns and by the last K−m columns of B−1δB,respectively.Note that the off diagonal elements of BCB T nearly vanish since the Y1,...,Y m are nearly independent,hence one may replaces BCB T by diag(BCB T),where diag denotes the operator with builds a diagonal matrix from the diagonal elements of its argument.Finally,ˆE[ˆψY+δY(Y+δY)X T]−[(B+δB)ˆC(B+δB)T]−1(B+δB)ˆC can be approximately linearized asE[ˆψY(Y)X T]+{∆−[I0]+E T0−diag(BˆCB T)−1[0E c]}¯B−1TEquating this expression to zero yields,after a multiplication by¯B T,ˆE[ψY(Y)(¯BX)T]−[I0]+∆+[E T0]−diag(BˆCB T)−1[0E c]=0. This equation can be written explicitly as,noting that their i,i elements already yield the identity0=0and that the diagonal elements of BˆCB T equalˆE(Y21),...,ˆE(Y2m) andˆE(Y2m+1)=···=ˆE(Y2K)=1,E[ˆψYi (Y i)Y j]+ˆE[ˆψ Yi(Y i)]ˆE(Y2j)E ij+E ji=0,1≤i=j≤m(13)ˆE[ˆψY i (Y i)Y j]+{ˆE[ˆψ Yi(Y i)]−1/ˆE(Y2i)}E ij=0,1≤i≤m,m<j≤K.(14)These equations can be solved explicitly for E ij and then the new value of B is given by B+E B+E c B c where B c is the matrix formed by the last K−m rows of¯B.6PhamIt should be noted that the matrix B c is not unique as one can pre-multiply it by any orthogonal matrix of size K −m without affecting (10).Thus the matrix E c is also not unique.However,the product E c B c is.Indeed,by (14),E c =−D −1Y ˆE[ˆψY (Y )(B c X )T ]where D Y is the diagonal matrix with diagonal elements ˆE[ˆψ Y i (Y i )]−1/ˆE(Y 2i ).Hence E c B c =−D −1Y ˆE[ˆψY (Y )X T ]B c T B c .But by (10),ˆC −1=¯B T (B ˆCB T )−100I¯B =B T (B ˆCB T )−1B +B c T B c .yielding B c T B c =ˆC−1−B T (BCB T )−1B .Therefore,one can rewrite the algorithm in a form independent of the choice of B c asB ←B +E B +D −1Y {ˆE[ˆψY (Y )Y T ](B ˆCB T )−1B −ˆE[ˆψY (Y )X T ]ˆC −1},E being the m ×m matrix with zero diagonal and off diagonal elements solution of (13).Note In the case where m =1and the extracted source is normalized to have unit sample variance,the algorithm becomes:b ←b +b −ˆE[ˆψY (Y )X T ]ˆC −1ˆE[ˆψ Y (Y )]−1=ˆE[ˆψ Y (Y )]b −E[ˆψY (Y )X T ]ˆC −1ˆE[ˆψ Y(Y )]−1.The new b is not normalized (but is nearly so),therefore one has to renormalize it and thus the denominator in the last right side is irrelevant.In the case where ψY is given by (7)with σY replaced by ˆσY =[ˆE(Y 2)]1/2=1,the numerator takes the same form but with ˆψY replaced by g .One is thus led to the fixed point FastICA algorithm [1].4Asymptotic distribution of the estimatorConsider the asymptotic distribution of the estimator ˆB,solution of the estimating equa-tions (9).We shall assume that this estimator converge (as the sample size n goes to infinity)to an unmixing solution,that is a matrix B ,with rows proportional to distinct rows of A −1.Let ˆδB =ˆB −B ,we may repeat the same calculations as in previous section.However,we now complete B to ¯B in a slightly different way:the last K −m rows of ¯B are chosen so that (10)holds with the true covariance matrix C in place of ˆC.By the same argument as in previous section,ˆE[ˆψY +δY (Y +δY )X T ]≈ˆE[ˆψY (Y )X T ]+∆¯B −1where ∆is defined as before by (11).We shall made further approximation by replacing ˆψY in the above right hand side by ψY and ˆE and ˆψ Y i in (11)by E and ψ Y i .On the other hand,[B +δB )ˆC (B +δB )T ]−1(B +δB )ˆC may be linearized with respect to δB as (12)as before.But since δB is small and ˆC converges to C ,one can replace,in the last two term in (12),ˆC by C .Then by the same argument as in previous section and noting that ¯Bsatisfies (10)with C in place of ˆC ,the resulting expression can be written as{[I (B ˆCBT )−1B ˆCB c T ]+(BCB T )−1[0E c ]−[E T 0]}¯B T −1Blind partial separation of sources7 Note that BCB T=diag(BCB T)since the Y i are uncorrelated.Further,since BˆCB c T→0,one may replace(BˆCB T)−1BˆCB c T by[diag(BCB T)]−1BˆCB c T, which is the matrix with general elementˆE(Y i Y m+j)/E(Y2i).Then by the same ar-gument as before,the elements E ij ofδB¯B−1can be seen to be approximatively the solution ofˆE[ψY i (Y i)Y j]+E[ψ Yi(Y i)]σ2YjE ij+E ji=0,1≤i=j≤mˆE{[ψY i (Y i)−σ−2Y iY i]Y j}+{E[ψ Yi(Y i)]−σ−2Y i]E ij=0,1≤i≤m,m<j≤K.whereσ2Yi=E(Y2i).The solution isE ij E ji=−E[ψY i(Y i)]σ2Y j11E[ψY j(Y j)]σ2Y i−1 ˆE[ψYi(Y i)Y jˆE|ψY i(Y i)Y j],1≤i<j≤m,E ij=−ˆE{[ψY i(Y i)−σ−2Y iY i]Y j}E[ψY i(Y i)]−σ−2Y i,1≤i≤m,m<j≤K.One then can show,using the Central Limit Theorem,that the vectors[E ij E ji]T,1≤i<j≤m and the random variables E ij,1≤i≤m,m<j≤K are asymptotically independently normally distributed,with covariance matrices1 nσYi/σYj0σYj/σYiλ−1i11λ−1j−1ρ−2i11ρ−2jλ−1i11λ−1j−1σYi/σYj0σYj/σYiand variances(σ2Yi /n)(ρ−2i−1)/(λ−1i−1)2,where n is the sample size andρi=1σYiE[ψ2i(Y i)]=corr{Y i,ψi(Y i)},λi=1σ2Y iE[ψi(Y i)].One can prove that the asymptotic variance is smallest whenψYi is the scorefunction of Y i,in this caseλi=ρi and the asymptotic variance of E ij equals(σ2Y i /σ2Y i)ρ−2j/(ρ−2iρ−2j−1)if1≤j≤m andσ2Y i/(ρ−2i−1)if m<j≤K.Thus,assuming that the extracted sources are normalized to have unit variance,there is a loss of accuracy with respect to the case where all sources are extracted,since1/(ρ−2i −1)>ρ−2j/(ρ−2iρ−2j−1)forρ2j<1.But the loss could be negligible if theρj,m<j≤K are close to1,that is if the non extracted sources are nearly Gaussian. This would not be the case if only one source is extracted since it is unlikely that all the remaining sources are nearly Gaussian.5An example of simulationIn a simulation experiment,we have generated10source signals of length n=1000: thefirst is a sinusoid,the second is a sequence of uniform random variables,the third is a sequence of bilateral exponential variables and the remaining are sequences of Gaussian variables.All sources have zero mean and unit variance.8PhamAs it can be shown,our algorithm is“transformation invariant”in the sense that its behavior when applying to a mixtures with mixing matrix A and starting with a matrix B is the same as when applying to unmixed sources and starting with the global matrix G=BA.Thus we shall apply our algorithm to the unmixed sources with a starting matrix G with elements randomly generated as independent standard normal variates. The following table shows the initial value of G and thefinal value produced by the algorithm after convergence.Initial matrix G1234567891010.70070.76691.0257-0.62380.9284 1.04470.0076-0.0686 1.56200.40702-0.87750.49971.08760.1395-0.0442-0.6111-0.2117 1.8387-0.9778-0.622230.6501-1.43550.13990.3051-0.8784 2.30580.0912-1.1623 1.0585 1.0601Final matrix G12345678910 1-0.03760.13443.70400.01840.08140.0712-0.05970.1758-0.1045-0.026028.43420.03540.1003-0.0941-0.0667-0.03370.1271-0.12670.0704-0.124330.0524-6.44880.0060-0.1386-0.02760.14220.05130.01220.05810.1500Table1.Initial andfinal matrices GOne can see from the above table that the algorithm have correctly extracted the first three sources(but in the order third,first,second).However,we have observed that depending on the starting value,the algorithm may extract only two non Gaussian source and the other is a mixture of the Gaussian sources.The problem is that the algorithm may be stuck with a local minimum of the criterion;and it may be shown that any point B for which the random variable Y1,...,Y m are independent and at most one of them can be Gaussian,is a local minimum point of the criterion(2).Thus the algorithm may not produce the most interesting sources but only some sources and possibly a mixture of Gaussian sources in the case where there are several Gaussian sources.We currently investigate ways to avoid this problem.References1.Hyv¨a rinen,A.:Fast and robustfixed-point algorithms for independent component analysis.IEEE Trans.Neural Networks10(1999)626–6342.Pham,D.T.,Garat,P.:Blind separation of mixtures of independent sources through a quasimaximum likelihood approach.IEEE Trans.Signal Processing45(1997)1712–17253.Pham,D.T.:Fast algorithms for mutual information based independent component analysis.IEEE Trans.on Signal Processing52(2004)2690–27004.Cover,T.M.,Thomas,J.A.:Elements of Information Theory.Wiley,New-York(1991)5.Pham,D.T.:Contrast functions for blind seperation and deconvolution of sources.In Lee,T.W.,Jung,T.P.,Makeig,S.,Sejnowski,T.J.,eds.:Proceeding of ICA2001Conference,San-Diego,USA(2001)37–426.Huber,P.J.:Projection pursuit.Ann.Statist.13(1985)435–475。
盲源分离程序流程
盲源分离程序流程Blind source separation is a process that involves separating a set of mixed signals into their individual source components. This is a challenging problem in signal processing, and there are several methods and algorithms used to address it. 盲源分离是一个涉及将一组混合信号分离成其各自源组件的过程。
这在信号处理中是一个具有挑战性的问题,有几种方法和算法用于解决它。
One common approach to blind source separation is independent component analysis (ICA). ICA is based on the assumption that the observed signals are linear mixtures of independent source signals, and it aims to recover the original sources by estimating the mixing matrix. 盲源分离的一种常见方法是独立成分分析(ICA)。
ICA基于这样的假设:观测到的信号是独立源信号的线性混合,并且旨在通过估计混合矩阵来恢复原始信号源。
Another method for blind source separation is non-negative matrix factorization (NMF). NMF is based on the idea that a matrix can be factorized into two non-negative matrices, which can be used to represent the original data. This method has been widely applied inspeech and audio signal processing. 盲源分离的另一种方法是非负矩阵分解(NMF)。
盲信号分离及其应用医学PPT课件
合肥工业大学 计算机与信息学院 图像信息处理研究室 Tel:2901393 Email:images@ /organ/images
数学建模
线性瞬时混合盲信号分离的数学建模 线性卷积混合盲信号分离的数学建模 非线性(Post-Nonlinear, PNL)混合 盲信号分离的数学建模
合肥工业大学 计算机与信息学院 图像信息处理研究室 Tel:2901393 Email:images@ /organ/images
基本理论
盲信号分离的数学建模 盲信号分离的可解性与独立性分析 盲信号分离的目标函数 盲信号分离的优化算法
盲分离问题需要解决的问题就是如 何从接收到的观察信号中,估计出源信号 S1(t),S2(t) … Sn(t)和混合矩阵的过程。 实际上式还应该存在一个干扰存项,如果 考虑到噪声的迅在,那么上式可以推广到 更一般的情况,即为:
X (t ) AS(t ) n(t )
合肥工业大学 计算机与信息学院 图像信息处理研究室 Tel:2901393 Email:images@ /organ/images
来表示。它们有如下关系:
合肥工业大学 计算机与信息学院 图像信息处理研究室 Tel:2901393 Email:images@ /organ/images
X 1 (t ) a11 S1 (t ) a1n S n (t ) X (t ) a S (t ) a S (t ) m1 1 mn n m
线性瞬时混合 线性卷积混合 非线性混合
返回
合肥工业大学 计算机与信息学院 图像信息处理研究室 Tel:2901393 Email:images@ /organ/image号处理方法 对其研究始于二十世纪八十年代中后期 有关的理论和算法都已经取得了较大的发展 对于线性瞬时混合信号的分离问题、卷积混 合信号的分离问题以及非线性混合信号的分 离问题都做了深入的研究,提出了许多经典 算法 用于语音信号分离、图像特征提取和医学脑 电信号的分离等方面
Globally convergent blind source separation based on a multiuser kurtosis maximization criterio
Globally Convergent Blind Source Separation Based on a Multiuser Kurtosis Maximization CriterionConstantinos B.Papadias,Member,IEEEAbstract—We consider the problem of recovering blindly(i.e., without the use of training sequences)a number of independent and identically distributed source(user)signals that are trans-mitted simultaneously through a linear instantaneous mixing channel.The received signals are,hence,corrupted by interuser interference(IUI),and we can model them as the outputs of a linear multiple-input-multiple-output(MIMO)memoryless system.Assuming the transmitted signals to be mutually inde-pendent,i.i.d.,and to share the same non-Gaussian distribution, a set of necessary and sufficient conditions for the perfect blind recovery(up to scalar phase ambiguities)of all the signals exists and involves the kurtosis as well as the covariance of the output signals.We focus on a straightforward blind constrained criterion stemming from these conditions.From this criterion,we derive an adaptive algorithm for blind source separation,which we call the multiuser kurtosis(MUK)algorithm.At each iteration, the algorithm combines a stochastic gradient update and a Gram–Schmidt orthogonalization procedure in order to satisfy the criterion’s whiteness constraints.A performance analysis of its stationary points reveals that the MUK algorithm is free of any stable undesired local stationary points for any number of sources; hence,it is globally convergent to a setting that recovers them all. Index Terms—Adaptive filtering,blind equalization,blind source separation,constant modulus algorithm,high order sta-tistics,kurtosis,multiple-input multiple-output(MIMO)systems, narrowband channels.I.I NTRODUCTIONT HE field of unsupervised(commonly referred to as blind) deconvolution has known a big growth since its beginning more than two decades ago.Its goal,which is the recovery of signals transmitted through an unknown channel based solely on the channel’s output(i.e.,without access to its input)seems to be finding a growing number of applications,especially in modern digital communication systems.The two most common appli-cations of blind deconvolution in communications are channel equalization and source separation.The former deals with the recovery of a single-user signal that is transmitted through a linear dispersive channel;it is thus received in the presence of intersymbol interference(ISI).The latter deals with the simul-taneous recovery of a number of(typically independent)signals that are transmitted through a linear mixing channel.Each of the received signals is then a linear mixture of all the transmitted signals;it is hence corrupted by interuser interference(IUI).Manuscript received January15,1999;revised August9,2000.Parts of the results presented in this work were previously reported in[31]and[43]–[45]. The associate editor coordinating the review of this paper and approving it for publication was Dr.Sergio Barbarossa.The author is with Lucent Technologies/Bell Labs Research,Holmdel,NJ 07733-0400USA(e-mail:papadias@).Publisher Item Identifier S1053-587X(00)10139-4.In this paper,we will be concerned with the problem of blind source separation(BSS)of instantaneous mixtures where all the source(user)signals share the same statistical properties.This scenario seems to be finding an increasing number of applica-tions.In multiuser communications,such as in code division multiple access(CDMA)systems[1],a number of independent users use the same modulation format to transmit simultane-ously through the same medium(despite the presence of multi-path,the channel can be in a number of cases modeled with good approximation as flat in frequency[2]).In multiple antenna sys-tems,a single user’s information sequence is demultiplexed into multiple independent substreams that are then transmitted over a(typically narrowband)mixing channel[3].A rich literature on the BSS problem has emerged in the last decade or so.Similarly to the case of single-user blind equaliza-tion(BE),the majority of BSS techniques are derived through the construction of statistical criteria on the receiver output sig-nals that typically reflect some known structural properties of the transmitted signals.These criteria usually involve higher order statistics(HOS)(e.g.,through the use of cumulants)of the received signals since second-order statistics(SOS)criteria in most cases do not suffice for the complete separation of the sources(they typically result in a unitary matrix ambiguity). Early fundamental work on HOS-based BSS can be traced back to Donoho[4],who treated the problem from an entropy point of view,and was later expanded by Comon[5],[6],who first introduced the notion of contrasts(a contrast is a cumu-lant-based function of the outputs that is maximized if and only if separation is achieved).These works have set the stage for the development of efficient BSS algorithms by providing im-portant identifiability results for different cases of mixtures(i.e., necessary and sufficient conditions that guarantee perfect sepa-ration under some conditions).On the algorithm side,an early bootstrapping approach was proposed,first in[7]and later in a neural network context in[8](where the well-known“Jutten-Hérault algorithm”was introduced).In these techniques,the re-ceiver filter operates on both the received and the output signals (whence we get the term“bootstrapping”).More traditionally, the receiver operates in a linear fashion only on the received signals(whereas its computation is driven by its outputs).Some techniques that rely directly on HOS cumulants were introduced in[9]–[12],whereas second-order approaches include[13]and [14].Similarly to the single-user BE problem[15],[16],con-stant-modulus(CM)-type techniques have also been proposed for BSS.These are derived from fourth-order statistical criteria that are expressed as deviations from a constant modulus.Early approaches of this type include[17]and[18],wherein a single desired signal is extracted,and the other user signals are treated1053–587X/00$10.00©2000IEEEas undesired ter,multiuser extensions attempted to recover all the input signals either in a single-stage[19]–[22] or in a multistage[23],[24]fashion.Another class of algo-rithms that relate closely to CM techniques are based on kur-tosis-type criteria(Shalvi and Weinstein were the first to show in[16]the close relation between CM and kurtosis-type criteria ina single-user context).Early kurtosis-based techniques include the adaptive techniques proposed in[25]and the deflation al-gorithm of[26].More recently,similar kurtosis-based criteria appeared in[6]and[27]–[34].An important issue in blind source separation is that of convergence of the employed algorithms.The statistical cri-teria that are optimized by the algorithms often contain several classes of stationary points where the algorithm may potentially converge.This reflects the multimodality of the corresponding HOS-based functions.A subset of these points typically include the so-called“desired”convergence points,wherein the recov-ered signals reveal the source signals up to minimal ambiguities (such as arbitrary ordering and scalar phase rotation of each source).This is guaranteed by the construction of the criterion, which is usually based on some known identifiability condition that guarantees signal recovery.However,some other classes of stationary points are“undesired”in the sense that they do not correspond to the recovery of all the source signals.If these points are stable,the algorithm may be“trapped”in one of them,failing to converge to an acceptable setting.Despite the volume of the available literature on this topic, the majority of BSS techniques cannot be shown to be free of undesired stable stationary points,especially when the number of sources is larger than two(see[21]and[35]–[37]for some results in the case of two sources).Two notable exceptions of algorithms that are globally convergent to ideal desired settings are the deflation-type approach of Delfosse and Loubaton[26], [38]and multistage constant-modulus algorithms(such as the one analyzed by Tugnait in[24]).Deflation techniques can be classified as single stage since the recovery of each source does not involve the direct use of previously recovered sources.How-ever,during their adaptation,the convergence of each output is assisted by the simultaneous convergence of other outputs.The deflation technique of[26]seems to be the first algorithm with proven global convergence for an arbitrary number of sources, but it does not operate directly on the receiver filter parameters; it requires a parameter transformation that may introduce bias and extra computational complexity.On the other hand,the mul-tistage CM algorithm analyzed in[24]operates directly on the equalizer parameters,but its multistage structure(the sources are extracted one by one)imposes a larger number of required iterations before final convergence.Its iterative source extrac-tion may also introduce bias in the final solution due to error accumulation originating from the nonperfect subtraction of the detected signals in previous stages.The global convergence of BSS algorithms that are both single-stage and direct in the re-ceiver filter parameters seems to remain an open question in the BSS literature.11A deflation-type variant of the MU-CMA algorithm first introduced in[20] was recently shown in[39]to be,under some conditions,globally convergent in the case of real sub-Gaussian sources and real channels.A generalization of this result to the complex case will soon appear elsewhere.In this paper,we will introduce an algorithm for blind source separation that is direct in the receiver parameters and can be shown to be free of undesired stationary points for an arbitrary number of user(source)signals.The algorithm is derived from a constrained multiuser kurtosis optimization criterion.At each iteration,it performs a stochastic gradient adaptation,followed by a Gram–Schmidt orthogonalization that projects the updated settings on the constraint.As will become clear later,its absence of undesired stationary points is intimately linked to its deflation structure.The rest of the paper is organized as follows.In Section II, we present our problem formulation and assumptions,as well as a brief summary of the single-user necessary and sufficient conditions for BE that were derived in[16].In Section III,we review an extension of these necessary and sufficient conditions for perfect recovery to the multiuser linear instantaneous mix-ture case.These conditions lead naturally to the shaping of a constrained multiuser kurtosis criterion for BSS in Section IV. Based on this criterion,in Section V,we derive the multiuser kurtosis(MUK)algorithm through a stochastic-gradient update combined with a Gram–Schmidt orthogonalization procedure that is used for the satisfaction of the criterion’s constraints.In Section VI,we perform a study of the stationary points of the MUK algorithm,which reveals the absence of stable undesired maxima for any number of source signals,leaving as only stable solutions those where perfect(in the absence of noise)separa-tion of all the sources is achieved.In Section VII,we present some numerical results that confirm the expected globally con-vergent behavior of the algorithm,whereas Section VIII con-tains our conclusions.II.P ROBLEM F ORMULATION AND B ACKGROUNDWe assumethatthat share the same statistical properties are transmitted througha MIMO linear memoryless channel that introduces IUI.The received signal model then takes the familiarformvector of transmitted(source)signals;“matrixequalizer”that producesthe vectoroutput.The receiver output can be hence mathematically representedaswhere global response matrix,andshould ideallymatch the transmittedsignals.The equalizeroutputs can be written(in the absence of noise)as(3)wherechannel inputs tothe.........(5)The above-described linear mixture setup is depicted in Fig.1and can be used to model narrowband antenna-array systems,CDMA systems,or oversampled systems.In analogy to[16],we may express each output’s varianceand kurtosisas(6)(7)respectively.the variance ofany(8)respectively,where.The problem we will treat can be stated as follows:We aim atthe retrieval of the inputsequences,forall,based only on the statistics of the equalizeroutputs.In the following,we will denote theimaginary.Review:Single-User Equalization ProblemA dual model to the one described above holds in the caseof single user linear equalization of ISI channels.In this case,the single equalizer output can be writtenasis a(possibly infinite-length)vector containing thecoefficientsis the subset of the set of integers spanned by theequalizer’s nonzero entries indices.Based on these expressions,Shalvi and Weinstein showed that the following set of condi-tions are necessary and sufficient forBE(11)which stems from the factthat(12)Notice that the equality in(12)holds only in the case that thereexists a unique nonzeroelement.Based on(11),the following constrained optimizationproblem was suggested for BE in[16]denotes Hermitian transpose.An important resultabout the behavior of the criterion(13)was also shown in[16],namely,the costfunctionifIII.N ECESSARY AND S UFFICIENT C ONDITIONS FOR B LINDS IGNAL R ECOVERY As is typically assumed in blind equalization,we will allow each transmitted signal to be recovered up to a unitary scalar rotation.This is due to the inherent inability of statistical blind techniques to distinguish between different rotated versions of the input signals.Therefore,blind recovery will be achieved if,after suitable reordering of the equalizer outputs,the followingholds:andallis an i.i.d.zero-meansequence,are statistically independentfor and share the same statistical distribution,then the following set of conditions are necessary and sufficient for the recovery of all the transmitted signals at the equalizer outputsC1C2C3for((17)where the single nonzero element can be at any position.Now,combining C3)with (4),weget,then .Hence,thedifferentinputsDiscussion:While the acquisition of the conditions C)is straightforward given the conditions of Shalvi and Weinstein in [16]for the BE problem,their consequences are important for blind source separation.In all practical cases where the assump-tions of Theorem 1are met,all successful BSS methods will satisfy these conditions.As a result,they can be used as guide-lines for the design of source separation criteria.Moreover,they could be used as a statistical test for the measure of the suc-cess of blind source separation methods.It should be noted at this point that with minor differences,similar necessary and suf-ficient conditions were previously presented in [6],[25],[27],[37],[40],and [41].They are restated here in the form of The-orem I since they constitute an integral part of the algorithm development and analysis that follow.IV .M ULTIUSER K URTOSIS (MUK)C RITERIONAs mentioned in Section I,a number of HOS-based optimiza-tion criteria have been proposed for blind source separation.In this section,we will focus on the following criterion that stems in a straightforward fashion from conditionsC):identity matrix.The constraint in (19)comes from the fact that,according to C2)andC3)(20)We call (19)the multiuser Kurtosis (MUK)maximization criterion.Notice the similarity between this criterion and the single-user criterion for BE (13)proposed in [16].According to Theorem 1,the global optima of the MUK criterion (19)achieve blind recovery (see also [34]),and hence,any adaptive algorithm that attempts to optimize the MUK criterion will have a chance of reaching the optimal global desired solutions.We should note at this point that algorithms that are inspired,in one form or another,by the criterion (19)have already appeared in the literature.As mentioned earlier,Delfosse and Loubaton were the first to propose a kurtosis-based technique in [26],which is,however,not direct on the equalizer parameters.Cardoso and Laheld have also proposed kurtosis-based equi-variant techniques in [28],whose global convergence has not been shown though for the case of more than two users.More recently,Kung and Mejuto use in [33]a normalized kurtosis criterion to derive an iterative algorithm wherein the sources are extracted one by one.Ding and Nguyen also proposed in [32]a normalized kurtosis-based algorithm that aims at identifying a target source while projecting away from the others.In the next section,we will derive a constrained stochastic-gradient-type algorithm for the adaptive implementation of the MUK cri-terion (19).The algorithm is single stage (it has a deflation2Anextension of this theorem to the case of convolutive mixture channels can be found in [37]and [45];however,as mentioned before,in this study,we confine ourselves to the memoryless (instantaneous mixture)case.structure),is direct in the equalizer parameters,and aims at the joint recovery of all the sources.More importantly,as will be shown later,it encounters no undesired stationary points for an arbitrary number of input signals.V .MUK A LGORITHMWe first defineby the following setofcomplexmatrices:(21)An adaptive stochastic-gradient algorithm for the constrained criterion (19)can be obtained as follows.We first compute thegradientof.Bywriting and by invoking C2)and assuming symmetrical inputs equals (similarly to[16])to beunitary .That can beachieved,for example,by a number of second-order blind sepa-ration methods that typically converge to a unitary instantaneous mixture of the inputs (see,e.g.,[42]and references therein,as well as a discussion in Section VII).3The MUK method can take over from there on to undo blindly the remaining unitarymixture.This would also make theequalizer,thus saving computational complexity.Then,to satisfy the con-straint (25),it suffices tosatisfy,i.e.,(26)3Inthe more general case of convolutive mixtures,a “space-time”prewhitening is required.This reduces to a purely temporal prewhitening in the case of single user BE,as discussed in [16].In general,as there is no guaranteethat will satisfythe constraint (26),we have to transform it into aunitary.We propose tochoosematrix satisfying (26)that is as close as possibletoin the Euclidean sense.Dropping for convenience the time index,this can be achieved with an iterative procedure that satisfies the followingcriterion successivelyfor:(27)wheresquared Euclidean norm ofvector definedfrom(28)Problem (27)can also be writtenas,whichequals(30)whereare real scalar parameters.Settingthe gradientof:(32)whichgives(33)According to(33)(34)where).We summarize the MUK algorithm in Table I.Notice that steps 4–8correspond to a Gram–Schmidt orthog-onalizationof.We also note that the projection de-scribed by steps 4–8of the algorithm would reduce to the mere normalization of step 4in the case of a single user,similarly to [16].A nice property of the MUK algorithm of Table I is that it can handle equally wellsub-Gaussian(36)wheredenotes empirical averaging.In conclusion,the MUK algorithm can be used to handle any non-Gaussian,possibly nonsymmetrical source signals.This is to be contrasted with CM-type alternatives (such as in [20],[24],[39],[41]),which can only handle sub-Gaussian and symmetrical signals.In the next section,we will perform a study of the stationary points of the MUK algorithm described in Table I.The results of this analysis are expected to reflect on the algorithm’s con-vergence behavior.VI.A NALYSIS OF S TATIONARY P OINTSWe will now show the following theorem regarding the per-formance of the MUK algorithm described in Table I:Theorem 2:If the conditions of Theorem 1are satisfiedand that correspond to the perfectrecovery (in the absence of noise and up to an arbitrary permu-tation and scalar rotation each)of all the input signals,i.e.,isapermu-tation matrix.Proof:We first show the following lemma.TABLE I MUK ALGORITHMLemma 1:The MUK algorithm in Table I can be alterna-tively obtained by optimizing in parallel the following “array”ofcriteria:constrained opti-mization criteria that are implemented “in parallel,”i.e.,simul-taneously.Similarly to the derivation in Section V ,in order to derive an algorithm for the criteria in (38),we need to perform first an updateofin ,asfollows:,the followingconstraints:to so that it satisfies (40).Doing this,similarly to Section V ,in a minimum squared norm fashion,results again in (35).Hence,the derived algorithm is identical to the MUK algorithm of Table I.This concludes the proof of thelemma.domain.By observing (38),we first notice thateach(35)onis assumedunitaryandin (38)are equivalent to theconstraintssubjecttodiag de-notes the dot (pair-wise)product.We now show the followinglemma.Lemma 2:Consider the following optimizationcriterion:subjecttois a positive integer,and column vector.Then,the only stable maxima of the criterion in (44)arethe(45)wherepositions.Proof:The proof of this lemma is similar to the proof of the global convergence of the SW algorithm in [16],the main difference being that in ourcase,diagonal matrix definedasandforall(47)where-element subsetof .We now study separately the following cases:•.In thiscase,locations.These settings are the global maximaof thefunctionover ;hence,they are stable maxima.•.These solutions have at least two nonzero elements at some po-sitionsofon:isa,the difference between the squared mag-nitudes oftheand(50)which,by construction,is only nonzeroforrequiresthatis a small positive constant.Noticethatis allowedto have this formsince.Byevaluating,weobtain .In thiscase,on;there-fore,these settings are global minimaofon .In conclusion,the only stable maxima of (44)are the Dirac-type solutions (45).This concludes the proof of thelemma.Fig.2.Performance of the MUK algorithm for p=24-QAM signals. From Lemma2,we conclude that the only stable maximafor,corresponding to the recovery of one of the sources.Now,subjectto,it is only of interest to examine the possible convergencesettingsof,and the orthogonalconstraint.Then,we canwrite.The stationary pointsfor(54)where,the set of(54)reducesto;hence,the outputof(56)(after possible rearrangement of entries)whose only stablemaxima,according to Lemma2,are Dirac-type solutions withtheir unique unit-magnitude nonzero element at a differentposition than all the previous outputs after convergence.Thiscorresponds to recovering,each time,a different input source.After sources have been recovered,up to an arbitrary phase rotation each,correspondingtoWe now notice that due to the assumed full rankofdomain hold intheinput usersignals.VII.N UMERICAL R ESULTSWe now present some examples on the performance of theMUK algorithm described in Table I.We first consider theFig.3.Performance of the MUK algorithm for p =2super-Gaussian signals.simple case of a unitary matrix channel chosen randomlyasand withand we run the MUK algorithm withandinde-pendent 4-QAM sources.In order to test the algorithm’s ro-bustness to more demanding and realistic conditions,we nowconsider a case ofusers and antenna elements,wherein each entry of themixing matrix,which pro-vides the following decomposition:(57)wherea diagonal matrix withreal entries.The denoising of the estimatecan be easily done in the caseas the average of the smallest eigenvaluesofFig.4.Average performance of the MUK algorithm for p=4on100random channels.Given thatideally,),the“nonzero part”of thematrixequalsis a unitary matrix and where we construct as thematrix that containsthe.Then,we prefilter the received signal asfollows:(59)where#denotes matrix pseudo-inverse.Notice that the effec-tive channel between theinputand is ideally unitary.As mentioned before,the dimension ofis,thus reducing the computational complexity of thereceiver processing.For each of100different experiments,weprewhiten the received signal as described above,and then,werun the MUK algorithmwith.Afterconvergence has been reached,the output signals are differen-tially decoded to remove the phase ambiguities and are thencompared with the corresponding input signals.Fig.4shows theaverage performance of the algorithm over the100runs,wherewe display the average evolution of the instantaneous squarederror,in decibels,between each of the decoded outputs andthe corresponding inputs.We first observe that differential en-coding/decoding allows for the successful removal of the phaseambiguity of the output signals.More importantly,we observethat the algorithm recovers,in a completely blind fashion,all theinput source signals,thus confirming its expected globally con-vergent behavior.This behavior is especially appealing since itindicates that the algorithm is robust to the sample-based imper-fect prewhitening operation,to the resulting noise coloring thatthe prewhitening induces,and to the non-negligible level of theadditive white Gaussian noise.VIII.C ONCLUSIONIn this paper,we have studied the problem of unsupervised(blind)source separation of independent and identically dis-tributed i.i.d.sources.Based on a set of necessary and sufficientconditions for the correct retrieval of all the sources,we deriveda multiuser kurtosis(MUK)adaptive algorithm for the blindsource separation of instantaneous mixtures.Due to its con-strained optimization criterion,the algorithm combines at eachiteration a stochastic gradient update with a Gram–Schmidtorthogonalization that projects the updated parameters tothe whiteness constraints.One feature of the Gram–Schmidtcomponent of the algorithm is that it results in the algorithm’supdating procedure having a deflation structure,which turnsout to be key in its convergence behavior.More specifically,an analysis of the stationary points of the MUK algorithmreveals its absence of undesired stationary points,making itglobally convergent to settings that recover(in the absence ofnoise and up to scalar phase and permutation ambiguities)allthe input signals perfectly.This result holds for any numberof input signals.The good performance of the algorithmwas demonstrated through extensive computer simulations,where its global convergence was confirmed,together withits robustness to the sources’distance from Gaussianity,thenonperfect received signal prewhitening,and the presence ofadditive white Gaussian noise.Based on this behavior,it is ourbelief that the MUK algorithm is a powerful technique that canbe used in a number of blind source separation applications.。
ITERATIVE TECHNIQUES FOR BLIND SOURCE SEPARATION
EUSIPCO ’92ITERATIVE TECHNIQUES FOR BLIND SOURCE SEPARATIONUSING ONLY FOURTH-ORDER CUMULANTSJean-François CardosoTélécom Paris.Dept Signal,46rue Barrault,75634Paris CEDEX13,FRANCE.Email:cardoso@sig.enst.frAbstract."Blind source separation"is an array processing problem without a priori information(no array manifold).This model can be identified resorting to4th-order cumulants only via the concept of4th-order signal subspace(FOSS)which is defined as a matrix space.This idea leads to a"Blind MUSIC"approach where identification is achieved by looking for the(approximate)intersections between the FOSS and the manifold of1D projection matrices.Pratical implementations of these ideas are discussed and illustrated with computer simulations.1.INTRODUCTIONThis paper adresses the problem of blind source separation or independent component analysis in the complex case where it can be seen as a narrow-band array processing problem:the output,denoted x(t),of an array of m sensors listening at n discrete sources takes the form: (1)x(t)=p=1,nΣs p(t)a p+n(t)where s p(t)denotes the complex signal emmited by the p-th source;where a p is afixed(deterministic)vector called the p-th source signature;and where n(t)is an independent additive noise assumed to be normally distributed with arbitrary covariance.In"standard"array processing,the same data model is used but the source signatures depend on very few location parameters and this dependence is assumed to be known via the array manifold.In contrast,we adress here the blind problem where no a priori information is available about source signatures.Hence blind qualifies any processing based on the sole obervations."Blindness"is compensated by exploiting the hypothesized assumption of source independence.In the following it is assumed that:•source signals are statistically independent,•source signals have non vanishing kurtosis,•source signatures are linearly independent.Blind source separation is understood as estimation of the source signals s p(t),while blind identification refers to the estimation of the source signatures a p.In the following, we focus on identfication since separation can be based on signature estimates.The blind problem is interesting because its solution allows to process narrow band array data without explicit knowledge about array geometry and without assumptions about wavefront shapes.Various solutions relying on the use of both2nd-order and4th-order cumulant statistics of the array output have already been reported[1-4].These approaches make use of2nd-order information to whiten the data,and4th-order information is then used to process the resulting orthogonalized problem.The additive noise is assumed to be normally distributed:it has no effect on the(exact)4th-order cumulants but2nd-order noise cumulants do not vanish so that the spatial structure of the noise covariance has to be known,modelled or estimated in order to achieve consistent2nd-order prewhitening. These limitations can be overcome by giving up the idea of 2nd-order whitening and resorting to4th-order cumulants only[5-6].It is the purpose of this communication to show how the concept of fourth-order signal subspace yields simple implementations of4th-order only blind identification.2.BLIND IDENTIFICATION2.1On identifiability.The blind context does not lead to full identifiability of the model(1)because any complex factor can be exchanged between s p and a p without modifying the observation.Hence if no a priori information is available,each signature can be identified only up to a scale factor.We take advantage of this to assume,without any loss of generality,that each signature a p has unit norm.With this constraint,an unidentifiable phase term is still present in a p.For each source p,we denote asΠp the orthogonal projector onto the1D space where the p th component lives.It is the space spanned by the signature a p and the projector onto it is(2)Πp=∆a p a p∗This hermitian matrix is unaffected by any phase term in a p and conversely determines a p up to a phase term.It follows that the projectorsΠp are the algebraic quantities that can be,at best,identified in the blind context.It iseasily seen that knowing these projectors is sufficient to perform blind separation of the source signals s p(t) because they allow to construct for each source the linear filter that zeroes all the components but the one specified. We then define blind identification as the problem of estimating the projectorsΠp from sample statistics only. 2.2Blind MUSIC.The approach to blind identification presented in this contribution can be seen as a blind4th-order version of the celebrated MUSIC algorithm.The MUSIC technique is based on the concept of signal subspace which is the vector space spanned by the steering vectors.It can be summarized as i)Estimate the signal subspace using the covariance.ii)Search for the steering vectors which are the closest to the signal subspace.The search is,of course,across the array manifold:this is how MUSIC exploits the a priori information contained in the parameterization of the steering vectors.The fourth-order signal subspace(FOSS)is defined as the real span of the projectorsΠp i.e.as the linear matrix space made of all possible linear combinations with real coefficients of the projectorsΠp:(3)FOSS={M|M=p=1,nΣγp a p a p∗,γp∈R}Let us now consider the following idea for blind identification i.e.estimation of theΠp:i)Estimate the FOSS using4th-order cumulants.ii)Search for the orthogonal1D projectors which are the closest to the FOSS.The closest projectors to the FOSS are taken as estimates of the source projectorsΠp.Such an idea could be termed"blind MUSIC"because,in spite of its strong analogy with the classical MUSIC,no signature parameterization is assumed here:the search is across the so called rank one manifold(ROM)which is defined as the set of all rank-one unit-norm hermitian matrices i.e.across all the1D orthogonal projectors.The reason why Blind MUSIC works is that a matrix space with structure as in eq.(3)is shown,under mild conditions,to contain no other1D projectors than the ones used in its construction,i.e.theΠp.This is obviously true as soon as the signatures a p are linearly independent since in that case,a matrix M as in eq.(3)has a rank equal to the number of non-zero coefficientsγp.If M is a1D projector, it has rank one,hence all the coefficientsγp but one are zero and M is then necessarily one of theΠp.We now have to discuss i)FOSS estimation from4th-order cumulants ii)practical implementations of the blind MUSIC search.3.FOURTH-ORDER SIGNAL SUBSPACE3.1Quadricovariance.Wefind convenient to make temporary use of indexed notations to express the cumulants of the vector process x.Let us denote by x i the i-th coordinate of vector x and by x i the i-th coordinate of its dual x∗.Of course,only orthonormal basis are used: x i is just the complex conjugate of x i.The covariance classically is the matrix R whose(i,j)-coordinate denoted r i j is the2nd-order cumulant of x i and x j:(4)r i j=∆Cum(x i,x j)1≤i,j≤m Similarly,we define the quadricovariance of x as the set of m4complex scalars,q il jk,1≤i,j,k,l≤m:(5)q il jk=∆Cum(x i,x j,x k,x l)Our approach to process4th-order information is to consider the quadricovariance as a matrix mapping denoted Q,which to any matrix M with coordinates m i j associates the matrix N=Q(M)with coordinates n i j according to:(6)n i j=1≤k,l≤mΣq il jk m k lThe quadricovariance has the two following properties:i) it maps any hermitian matrix to another hermitian matrix. ii)it is itself an hermitian operator in the(usual)sense that for any matrices M and N we have<N|Q(M)>∗= <M|Q(N)>with the Euclidian scalar product <M|N>=∆Tr(NM H).These are trivial consequences of cumulant symmetries.It follows[3]that the quadricovariance admits m2real eigenvalues,denoted µi,i=1,m2and m2corresponding orthonormal hermitian eigen-matrices,denoted E i,i=1,m2,verifying:∀i=1,m2Q(E i)=µi E i withE i=E i H,µi∈R,Tr(E i E j)=δ(i,j)As a simple consequence[3]of cumulant additivity and multilinearity,the quadricovariance of a linear mixture(1) of independent components takes the special form:(7)Q(M)=p=1,nΣk p a p∗Ma a p a p a p∗with no contribution from the additive noise(since it has been assumed Gaussian and independent of the signals) and where the kurtosis of the p-th source is denoted by k p: (8)k p=∆Cum(s p,s p∗,s p∗,s p)Equation(7)evidences that the image space of the quadricovariance Q is spanned by the projectors Πp=a p a p∗(hence the name"FOSS").It has exactly rank n if no kurtosis k p is zero and if the projectorsΠp are linearly independent.This last condition is fulfilled whenever the signatures a p are themselves independent.It follows that quadricovariance eigen-decomposition shows only n non-zero eigenvalues.Let us assume that they are numbered in such a way that the corresponding n eigen-matrices are(E i|i=1,n).These eigen-matrices form an hermitian orthonormal basis of the FOSS.3.2FOSS estimation.When a strongly consistent estimate of the signal covariance is used,the2nd-order signal subspace estimate obtained via eigen-decomposition also is strongly consistent.The same can be shown to hold for the FOSS estimates obtained from an eigen-decomposition of the sample quadricovariance into eigen-matrices.This should be the preferred FOSS estimation method for small arrays but eigen-decomposition of the quadricovariance may be too expensive with large arrays.Note however that only a small number of eigen-matrices need to be estimated(n and not m2)and this fact can lead to large computational savings(see[7]).Even in that case, the whole set of4th-order cumulants is needed,and quadricovariance estimation cost may be prohibitive. Fortunately,the FOSS can be estimated in a simpler manner.We demonstrate this in the(rather common)case where signals are circurlarly distributed.Cumulant expression in terms of the moments then reduces to: (9)q il jk=E{x i x j x k x l}−r i j r l k−r l j r i kand it is readily checked that the quadricovariance image of any matrix M accordingly reduces to:(10)Q(M)=E{(x∗Mx x)x x∗}−RMR−R Tr(MR) This expression admits an obvious sample counterpart showing that Q(M)can be estimated at a cost similar to the covariance.This suggests to choose a priori a set of n hermitian matrices M i and to estimate Q(M i)according to (10).The result will be a set of n almost surely independent matrices of the FOSS.They can then be orthonormalized(by a Gram-Schmidt procedure for instance)into an orthonormal hermitian basis of the FOSS. Such a procedure obviously yields FOSS estimates with higher variance than those obtained by eigen-decomposition of the whole set of4th-order cumulants. Since it is not possible to ensure in advance that the Q(M i) actually are independent,a safer solution would be to use a number of M i larger than n.4.BLIND MUSIC IMPLEMENTATIONFrom now on,we assume that a FOSS estimate is available in the form of a set of n hermitian orthonormal matrices:(M i|i=1,n)forming a basis of the estimated FOSS.The following search implementations do not depend on the particular FOSS estimation technique.Blind MUSIC can be implemented as searching through the FOSS the closest ROM matrix(see the PQN technique below)or,alternatively,as searching through the ROM the closest FOSS matrix(seeΠV3)In both cases,the suggested techniques do not implement the search via a gradient(or similar)approach but expresses the Blind MUSIC estimates asfixed points of an appropriate mapping.In our simulations,we have found that thesefixed points were the only stable points:the blind MUSIC search can then be implemented as the iteration of these mappings with arbitrary starting points.4.1ΠV3Search.The natural approach to blind MUSIC is to maximize the norm of the projection onto the FOSS of a matrix A under the constraint that it is a1D projector. Using an orthonormal hermitian basis,the squared norm of this projection,denoted d,is the sum of the squared projections onto each basis matrix.(11)d=i=1,nΣTr2(A M i)Blind MUSIC estimates are obtained as the maximizers of d under the constraint that A=v v∗with v∗v=1.Since Tr(A M i)=v∗M i v,the variation with respect to v of a Lagrange function L=1/2d−λv∗v associated to this constrained optimization problem is:δL=i=1,nΣ(v∗M i v)(v∗M iδv+δv∗M i v)−λ(v∗δv+v∗δv) Defining the cubic vector mapping v→φ(v)as: (13)φ(v)=∆Σi=1,n(v∗M i v)M i vthe Lagrange function variation is rewritten in:(14)δL=(φ−λv)∗δv+δv∗(φ−λv)which is zero for anyδv iffφ(v)=λv.This is equivalent to v being afixed point of the mapping v→Φ(v)where: (15)Φ(v)=∆φ(v)/|φ(v)|TheΠV3search(where V3is a reminder for the cubic dependence on the iterated vector)starts with a random vector and then iteratively computes its image throughΦ.4.2PQN Search.An alternate approach is to search for matrices of the FOSS that are as close as possible to the ROM.The basic idea is to start with an arbitrary matrix of the FOSS and to repeatedly project it onto the ROM and back onto the FOSS.Projection onto the ROM is equivalent to truncating the matrix to itsfirst principal eigen-component,which requires an eigen-decomposition at each step.On the other hand,repeatedly squaring a matrix has the effect of enhancing the dominant eigenvalue.In our experiments,we have found that the projection onto the ROM,being included in the iteration loop,could be replaced by a simple matrix squaring followed by renormalization The PQN algorithm is just cycling through the three steps of projection,quadration, and normalization,hence the acronym"PQN".After convergence,the dominating eigen-vector is extracted, providing an estimate of one of the source signatures.Quadration and projection can be efficiently implemented in a single step by representing the iterated matrix,say A,by its(real)coordinates a i,i=1,n in the FOSS basis:A=Σa i M i.The squared matrix then is A2=Σa i a j M i M j.Since an orthonormal basis is used, the projection of A2onto the FOSS has coordinates a k′given by:(16)a k′=Σi,j t ijk a i a j with t ijk=∆Tr(M i M j M k) Hence,by pre-computing the table t ijk,each quadration-projection is computed in the single step(16),involving only n3real multiplications.4.3Simulation results The following simulation results are for a uniform linear half-wavelength array of4sensors, two independent PSK modulated sources of unit variance located respectively at0and20degrees(i.e.under the same lobe).The signal is corrupted by additive Gaussian white noise with covarianceσI.We performed20Monte-Carlo runs using the PQN algorithm for data lengths of50, 100,200,500,1000and for noise levelsσ=-10,0,10,20 dB.For each run,we plot a performance indexρdefinedas (17)ρ=n1p =1,nΣ1−Tr (Πˆp Πp)where each Πˆpis the estimated p -th projector.This index also is the squared sine of the angle between each signature and its estimate (averaged on the sources).Perf.index10E-110E-210E-310E-4501002005001000-10dB SNR•••••••••••••••••••••••••••••••Perf.index10E-110E-210E-310E-45010020050010000dB SNR•••••••••••••••••••••••••••••••••••••••••••••••••Perf.index10E-110E-210E-310E-450100200500100010dB SNR•••••••••••••••••••••••••••••••••••••••••••••••••••••••••••For negative SNR,the data lengths used here do notallow any correct estimation :the performance index is close to 0.5,indicating "random estimation"!But as the SNR gets positive meaningful estimates are obtained with relatively short data lengths.Also note that there is no significant improvement when the SNR goes from 10dB to 20dB.This is a general feature of 4th-order-only blind techniques :when the noise level is low enough,the performance is dominated by the sample size.This could be contrasted with the standard parametric MUSIC (2nd-order or 4th-order)where,at low noise levels,the varianceSample sizePerf.index10E-110E-210E-310E-450100200500100020dB SNR •••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••of the estimates is proportional to the noise power.Another remark is that these simulations are for equipowered sources :the performance would degrade for a source when its power gets weaker.This effect shows at any order but is naturally more severe at fourth-order than at 2nd-order.Just as in the MUSIC case,asymptotic performance can be obtained in closed form but room is definitely lacking for exposition of the results.They will be presented at the conference.CONCLUSION.The notion of fourth-order signal subspace (FOSS)has been introduced.This matrix space,function of the 4th-order cumulants,is the natural 4th-order counterpart of the classical 2nd-order signal subspace.Exploited in a "blind MUSIC"fashion,it allows for blind identification without resorting to 2nd-order information.This can be done with one of the low cost fixed point techniques presented here.REFERENCES[1]on,"Independent Component Analysis",Proc.Int.Workshop on Higher-Order Stat.,Chamrousse,France,Jul.91,pp.111-120.[2]M.Gaeta,coume,"Source Separation Without A Priori Knowledge :the Maximum Likelihood Solution",Proc.EUSIPCO,Barcelona,Spain,Sept.90,pp.621-624.[3]J.F.Cardoso,"Eigen-structure of the fourth-order cumulant tensor with application to the blind source separation problem",Proc.ICASSP’90,pp.2655-2658,Albuquerque,1990.[4]V.C.Soon,L.Tong,Y.F.Huang,R.Liu,"An extended fourth order blind identification algorithm in spatially correlated noise",Proc.ICASSP’90,pp.1365-1368,Albuquerque,1990.[5]J.F.Cardoso,"Super-Symmetric Decomposition of the Fourth-Order Cumulant Tensor.Blind Identification of More Sources than Sensors",Proc.ICASSP’91,pp.3109-3112,Toronto,1991.[6]G.Giannakis,S.Shamsunder,"Modelling of non Gaussian array data using cumulants :DOA estimation with less sensors than sources",Proc.of conf.on Info.Sci.and Syst.,Baltimore,MD,1991.[7]J.F.Cardoso,on,"Tensor-based Independent Component Analysis",Proc.EUSIPCO,Barcelona,Spain,Sept.90,pp.673-676.。
盲源分离文档
盲源分离什么是盲源分离?盲源分离(Blind Source Separation)是一种信号处理技术,用于从混合信号中将源信号分离出来,而不需要关于源信号的先验信息。
盲源分离在许多领域都有广泛的应用,例如语音信号处理、图像处理、生物医学工程等。
盲源分离的原理盲源分离的原理基于独立成分分析(Independent Component Analysis,ICA)的概念。
ICA假设混合信号是源信号的线性组合,并尝试找到一个转换矩阵,使得通过转换后的混合信号在各个维度上最大程度上变得相互独立。
通过独立成分分析,盲源分离技术可以将混合信号恢复为源信号。
盲源分离的应用语音信号处理在语音信号处理中,盲源分离可以用来从混合语音信号中分离出不同的说话者的语音信号。
这对于语音识别、语音增强、人机交互等应用非常重要。
图像处理在图像处理中,盲源分离可以用来从混合图像中分离出不同的成分,例如前景和背景、深度信息等。
这对于图像增强、图像分析、计算机视觉等应用非常有用。
生物医学工程在生物医学工程中,盲源分离可以用来分离脑电图(EEG)信号中不同脑区的活动。
这对于研究脑功能和脑疾病诊断都具有重要意义。
盲源分离的挑战盲源分离面临着一些挑战。
首先,混合信号的混合过程往往是非线性的,这给分离过程带来了一定的困难。
其次,混合信号中的噪声会影响分离效果,因此需要对噪声进行建模和处理。
最后,盲源分离问题本质上是一个不适定问题,即存在无穷多个与观测数据一致的解。
为了解决这些挑战,研究者们提出了许多改进的盲源分离方法,包括非负矩阵分解、卷积神经网络等。
盲源分离的应用工具目前,有许多开源的软件包和工具可用于实现盲源分离。
以下是一些常用的工具:•FastICA:基于独立成分分析的算法,可用于分离混合信号。
•BSS Eval:用于评估盲源分离算法性能的工具包。
•MIRtoolbox:用于音频信号处理和音乐信息检索的工具包,包含盲源分离的功能。
结论盲源分离是一种重要的信号处理技术,可以在没有先验信息的情况下从混合信号中分离出源信号。
基于无监督学习的盲信号源分离技术研究
第33卷第1期电子科技大学学报V ol.33 No.1 2004年2月Journal of UEST of China Feb. 2004 基于无监督学习的盲信号源分离技术研究傅 彦,周俊临(电子科技大学计算机科学与工程学院成都 610054)【摘要】以独立分量分析为主要对象, 描述了盲信号源分离技术的基本模型,介绍了盲分离的主要方法和数学原理, 分析了盲信号源的可辨识性。
提出基于神经网络无监督学习的盲分离方法,并改进了分离效果评判指标。
在生物信息处理的背景下将人工神经网络和信息理论相结合,解决了盲信号源分离,自适应地求得分离矩阵,且可以同时分离具有正峭度和负峭度的信号源,对盲信号源分离的研究有极大的促进作用。
关键词盲信号源分离; 神经网络; 无监督学习; 独立分量分析板中图分类号TN911; TN911.7 文献标识码 AResearch of Blind Source Separation Technology whichBased on Unsupervised LearningFu Yan,Zhou Junlin(School of Computer Science and Engineering, UEST of China Chengdu 610054)Abstract This paper focuses on the independent component analysis presenting a review on the basic model, the main method, the mathematical principle of blind source separation,analyzing the possibility of separation. The paper puts forward the method that based on the neural network unsupervised learning, also, improves the index on separation effects. Under the biology information processing background, combining artificial neural network with information theory to resolve this kind of problem can get the separation matrix adaptively by itself. It can separate mixtures which have both positive and negative kurtosis, and promotes the research of blind source separation greatly.Key words blind source separation; neural network; unsupervised learning; independent component analysis盲源分离在语音、阵列处理、无线数据通信、图像、医学和地震信号处理等领域有广阔的应用前景。
盲信号分离
盲信号分离=盲源分离BSS Blind Signal/Source SeparationHerault、Jutten 1985从多个观测到的混合信号中分析出没有观测的原始信号。
观测到的混合信号来自多个传感器的输出,且传感器的输出信号线性不相关。
文献:盲信号分离技术研究与算法综述_周治宇、陈豪1.盲信号分离的“盲”是什么意思?已知原信号和传输通道的先验知识时,通过滤波器的信号处理能够在一定程度上完成信号分离的任务。
但是在没有原信号和传输通道的先验知识时,上述通过滤波的信号处理方法无法完成信号分离的任务,必须通过盲信号分离技术来解决。
“盲”是指(1)原始信号并不知道;(2)对于信号混合的方式也不知道。
也就是仅根据观测到的混合信号估计源信号。
2.什么是“信号分离”?是信号处理中的一个基本问题。
从接收到的混合信号(感兴趣信号+干扰+噪声)中分别分离或恢复出原始信号。
各种时域滤波器、频域滤波器、空域滤波器或码域滤波器都可以看作是一种信号分离器,完成信号分离任务。
3.盲信号分离如何实现的?独立分量分析ICA Independent Component Analysis是为了解决盲信号分离问题而逐渐发展起来的一种新技术,是目前主要采用的方法。
将接收到的混合信号按照统计独立的原则通过优化算法分解为若干独立分量,这些独立分量作为源信号的一种近似估计。
4.盲信号分离结果存在两个不确定性分离结果排列顺序不确定、分离结果幅度不确定。
由于要传送的信息往往包含在信号波形中, 因此这两个不确定性并不影响在实际中的应用。
5.目前主要应用领域目前盲信号处理技术已经在生物医学信号处理、语音信号处理、雷达信号分选、电子侦察、数字波束形成、无线通信、地震信号处理、机械故障诊断、图像处理、数字水印、人脸识别和金融数据分析等领域得到了广泛应用。
独立分量分析ICA Independent Component Analysis一种有效的对高阶数据进行分析的方法不仅可以处理非高斯信号(?),而且可以用于解决非线性、非稳态信号的问题分析,在特征提取方面有着独特的优点和广阔的前景。
blind nonlinear source separation using ekens learning and MLP network
1: Nonlinear multilayer perceptron (MLP) mixing model
In this work, we assume that the observed data vectors
1 The score function [10] is an estimated function of the vector y(t), it is a derivative of the likelihood function of output y(t) which attains its minimum when separation is achieved.
x(t) at time t have been generated by respective unknown source vectors s(t) through an unknown nonlinear mapping H (•) and unknown linear mixing matrix A. The nonlinear mixing system (from the unknown sources s(t) to the observation) is modelled with the multi-layer perceptron (MLP) network as illustrated in Fig. 1. This nonlinear mapping system is then followed by inverse transfer functions and a linear unmixing matrix. This model is appropriate for multiple inputs and multiple outputs (MIMO) channels; where an n-sensor array provides the output signal x(t) = [x1 (t), x2 (t), . . . , xn (t)]T , T denotes transposition. We assume the signals x(t) are nonlinear memoryless mixtures of m unknown statistically independent sources s(t) = [s1 (t), s2 (t), . . . , sm (t)]T , x (t) = H (As(t))
基于BSS的单路与多路RFID混合信号的防碰撞技术
The anti-collision technology of mixed signal based on BSS in singal path and multiple paths RFID
Thesis Submitted to Nanjing University of Posts and Telecommunications for the Degree of Master of Engineering
单位代码: 10293
密
级:
硕 士 学 位 论 文
论文题目:
基于 BSS 的单路与多路 RFID 混合信号的防碰撞技术
学 姓 导 学 研 科 究 专 方
号 名 师 业 向
Y1010020706 陈 成谢锋 晨 教授
电路与系统 智能信息与应用 工学硕士 2013 年 02 月 26 日
申请学位类别 论文提交日期
关键词: 射频识别,独立成分分析,经验模态分层,小波包分层,独立子波函数,欠定盲 分离
I
Abstract
As the rapid development of Radio Frequency Identification and Ultra Frequency Identification Technology, we have being settled an incrementally higher qualification to the collision avoiding technology in the Radio Frequency Identification Technology, and in the course BSS is the focus of research of Radio Frequency Identification Technology recently.In this paper we analyzed the signal mixing model of RFID, then drew the conclusion of statistical independence among the radio frequency identification signals, the insensitivity of the range and polarity, and the insensitivity to the arranging order requirements of the radio frequency identification signals.In synthesis, radio frequency identification multi-label signals satisfy the requirements of the two indeterminations of Blind Source Separation algorithmic, so independent component analyzing algorithmic could be used to deal with the mixed signals identified by the radio frequency.Based by this we set two experiments: The Blind Source Separation adapting to mixture of radio frequency identification baseband signals and The Blind Source Separation adapting to mixture of radio frequency identification modulating signals, then discussed the separation capability and observed that the simulating coefficient approaches 1 unlimitedly, which means independent component analyzing algorithmic could be made use of in the Blind Source Separation of Radio Frequency Identification.First, we settled the underdetermined blind source separation mathematics model of single-way Radio Frequency Identification mixed signals, discussed the delaminating technology based on EMD and wavelet packet, analyzed the distilling methods of independent sub-wave functions and the determining methods of the number of independent sub-wave functions, then distilled the underdetermined blind source separation method of single-way mixed RFID signals based on independent sub-wave functions.Through imitating experiments we continue to analyze the physical characteristics of mixed RFID signals, discussed the influence of the separation effect of the frequency difference, and compared the effect of underdetermined blind source separation based on the two separation technologies of EMD and wavelet packet. The research result indicates that the collision avoiding technology of BSS used in the single-way and multi-way mixed RFID signals is an effective method, we can reduce the quantity of antennas in the electron object identification system effectively, enhance the reliability of the implement of RFID collision avoidance and reduce economical costs by the application of underdetermined blind source separation technology which is implemented in RFID.
Blind Source Separation - Finding Needles in Haystacks:盲源分离在草堆里找针
• Conclusions
Blind Source Separation (BSS) A Simple Math Example
s(k)
x(k)
y(k)
A
B
• Let s1(k), s2(k),…, sm(k) be signals of interest • Measurements: For 1 ≤ i ≤ m,
• Criteria Based On This Goal:
Density Modeling Contrast Functions Property Restoral [e.g. (Non-)Constant Modulus
Algorithm]
• Implications:
Separating capability of the criteria will be similar Implementation details (e.g. optimization strategy)
Example: Wireless Signal Separation
Example: Wireless Signal Separation
Example: Wireless Signal Separation
Example: Wireless Signal Separation
Outline of Talk
Speech Enhancement Methods
• Classic (frequency selective) linear filtering
Only useful for the simplest of situations
盲源分离(ICA)
三、分离算法
▪ 负熵:信息论中的“熵” 是随机变量的随机性越大,熵就越大,高斯变量是所有等 方差的随机变量中熵最大的。负熵是任意随机变量与高斯随机变量之间的相对熵, 定义如下:
J[p(y)]值越大表示它距离高斯分布越远,可用来作为非高斯性的度量。
第九页,共29页。
三、分离算法
(2)互信息极小化准则(Minimization of Mutual Information, MMI) ▪ 当 y中各分量统计独立时,互信息 I ( y ) =0,互信息定义如下:
▪ 这一过程又称为独立分量分析(Independent Component Analysis-ICA)。
▪ 如果考虑到时间延迟对观测信号的影响,那么观测到的信号应该是源信号和通
道的卷积,对卷积混迭信号进行盲分离通常被称为盲反卷积(Blind Deconvolusion)。
第二页,共29页。
二、数学模型
正交化步骤,把已提取过的分量去掉。由于 U 是正交归一阵,所以可采用
Gram-schmidt正交分解法来实现。
第二十三页,共29页。
三、分离算法
▪ 基于负熵的, 提取多个源信号的固定点算法步骤如下:
第二十四页,共29页。
四、仿真结果
第二十五页,共29页。
四、仿真结果
源信号只含一个随机噪声分离后得到的波形图
▪ 自适应算法
▪ 常规的随机梯度法
▪ 自然梯度与相对梯度法 ▪ 扩展的Infomax法 ▪ 非线性PCA的自适应算法
▪ 逐次提取法
▪ 梯度算法 ▪ 固定点算法(Fixed-point algorithm)
第十六页,共29页。
三、分离算法
(三)快速独立分量分析 (a Fast algorithm of Independent Component Analysis,
基于mMSE、Kurtosis和小波-ICA的EEG去噪说明书
Denoising EEG using mMSE, Kurtosis andWavelet-ICAGautam KaushalPunjabi University, Patiala (Punjab)Amanpreet SinghPunjab Technical University (PTU),Jalandhar (Punjab)V. K. JainSant Longowal Institute of Engineering and TechnologyLongowal (Punjab)Abstract —Electroencephalogram (EEG) is electrical signal recorded from the scalp which represents the neural activity of human brain. EEG is often contaminated by the ocular artifacts viz. saccades, voluntarily or involuntarily eye movement and eye blink. Various methods have been proposed both in signal processing field as well as in neuroscience for identification and correction of ocular artifacts. Among many methods based on wavelet transform, adaptive filters, independent component analysis have shown promising results in removal of such artifacts. In this paper unsupervised robust and computationally fast algorithm using multi scale sample entropy (mMSE) and Kurtosis is used to automatically identify independent artifactual components and then denoising these components using wavelet decomposition. Results have shown improved reconstructed EEG signals. The proposed algorithm does not need manual identification of artifactual components.Keywords — Electroencephalography (EEG), ocular Artifact (OAs), discrete wavelet transform (DWT), Kurtosis, modified multi-scale sample entropy (mMSE)I. INTRODUCTIONEEG is recording of electrical signal generated due to neural activity in the brain and it is used to diagnose different abnormalities viz. sleep disorders, brain death, coma, tumors, epilepsy, trauma, stroke etc. The signal is recorded either by placing electrodes on scalp or by recording local field potential from prefrontal cortex. A signal generated in the absence of any stimulus is termed as spontaneous EEG whereas signal generated with an external stimulus is known as Event Related Potential (ERP). For a normal person, EEG amplitude ranges from 10100V μ- having following frequency components:()()()()()0.14,48,813,1330, 30.Delta Hz Theta Hz Alpha Hz Beta Hz Gamma above Hz ----The recorded EEG signal is often contaminated by spurious signals from other unwanted sources. This kind of contamination in medical terminology is named as artifact. An erroneous potential difference appears at the electrodes due to presence of these artifacts in the EEG signal. Among these artifacts, most of them are Electro-oculogram (EOG) due to eye blink or eye movement; muscle activity and Electrocardiogram (ECG) due to electrical activity of heart[1]. An optimized way to correct for an EEG contaminated with EOG signal is to first detect the EOG signal and then to clean the corresponding EEG signal instead of cleaning the whole EEG signal. This method is not only computationally efficient but is also cost effective.The EEG data set used in this work is created by BIH Sleep Laboratory. The dataset can be downloaded from Physio-Net ATM [2]. The dataset consists of 7 recorded channels, each having 2500 samples with sampling rate of 250 Hz, and existing MATLAB code [3] is used for FastICA.From the dataset, we took an EEG signal and an EOG signal and then mixed these two signals to form a two channel corrupted EEG signal. This corrupted EEG signal is utilized for the examination of mMSE based algorithm for artifact removal.In this work, Section 1 describes dataset used for this work. Section 2 gives related work for artifacts removal from EEG. Section 3 demonstrates the mMSE based algorithm used for the detection and correction of the EEG signal. Performance measurements and quantitative analysis is given in Section 4. Results, discussion and conclusion are presented in Section 5.II.RELATED WORKIn recent years, research community both in medical science and in engineering has examined the various artifacts present in EEG. Among these artifacts, ocular artifacts are shown to cause a significant deterioration of EEG signal. Several methods to remove ocular artifacts have been proposed from decades.V. Krishnaveni et al. [4] attempted to deal with such artifacts using wavelet based adaptive thresholding algorithm only to the identified OA zones. Adaptive thresholding applied only to OA zones preserves the shape of EEG signal in non artifact zones. The method has been shown to give promising results in removal of ocular artifacts in their method. Power spectral density plots and frequency correlation plots are used, which gives only an estimate in providing an interference relating to relative superiority of algorithm used for removing ocular artifact removal from EEG. In the algorithm, finding of artifact rising and falling edges are complex, locating the OA zones and calculating the edges is lengthy. Further, performance indices based on poweralgorithm. Block diagram of the algorithm is shown in Fig. 1.Fig.1: Block diagram of automatic artifacts removal algorithm usingtools Kurtosis and mMSE).Fig.2: Amplitude vs. sample plot for automatic artifacts removal。
眼科学英语词汇
眼科学英语词汇ophthalmology, OPH, Ophth 眼科学visionics 视觉学visual optics 视觉光学visual physiology 视觉生理学physiology of eye 眼生理学visual electro physiology 视觉电生理学pathology of eye 眼病理学dioptrics of eye 眼屈光学neuro ophthalmology 神经眼科学ophthalmiatrics 眼科治疗学ophthalmic surgery 眼科手术学cryo ophthalmology 冷冻眼科学right eye, RE, oculus dexter, OD 右眼left eye, LE, oculus sinister, OS 左眼oculus uterque, OU 双眼eyeball phantom 眼球模型eye bank 眼库prevention of blindness, PB 防盲primary eye care 初级眼保健low vision 低视力blindness 盲totol blindness 全盲imcomplete blindness 不全盲congenital blindness 先天性盲acquired blindness 后天性盲曾用名“获得性盲”。
functional blindness 功能性盲organic blindness 器质性盲occupational blindness 职业性盲legal blindness 法定盲visual aura 视觉先兆visual disorder 视觉障碍visual deterioration 视力减退transitional blindness 一过性盲amaurosis 黑amaurosis fugax 一过性黑toxic amaurosis 中毒性黑central amaurosis 中枢性黑uremic amaurosis 尿毒性黑cortical blindness 皮质盲macropsia 视物显大症曾用名“大视”。
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
BLIND SEPARATION OF MIXED-KURTOSIS SIGNALSUSING AN ADAPTIVE THRESHOLD NONLINEARITYHeinz Mathis,Thomas P.von Hoff,and Marcel JohoSignal and Information Processing Laboratory,Swiss Federal Institute of Technology Zurich,Switzerlandmathis@isi.ee.ethz.ch,vonhoff@isi.ee.ethz.ch,joho@isi.ee.ethz.chIndependent Component Analysis and Blind Signal Separation ICA2000,June19–22,2000,Helsinki,Finland,accepted for publicationABSTRACTA parameterized threshold nonlinearity,which separates a mix-ture of signals with any distribution(except for Gaussian),is intro-duced.This nonlinearity is particularly simple to implement,since it neither uses hyperbolic nor polynomial functions,unlike most nonlinearities used for blind separation.For some specific distri-butions,the stable region of the threshold parameter is derived, and optimal values for best separation performance are given.If the threshold parameter is made adaptive during the separation process,the successful separation of signals whose distribution is unknown is demonstrated and compared against other known methods.1.INTRODUCTIONBlind signal separation using higher-order statistics either explic-itly or implicitly has attracted many researchers whose main goal is to separate a set of mixed signals as fast as possible with the smallest residual mixing.Most approaches require complete or at least some knowledge of the source distributions.If sources of dif-ferent distributions are mixed,such techniques may fail to work.Throughout this paper we assume a linear mixing and separa-tion process,where the measured signals to be processed are linear combinations of the original source signals,weighted by scalars,which are the elements of the mixing matrix.denotes the number of sources as well as the number of sensors.Recovery of the signals is carried out by a blind adaptive algorithm adjusting the coefficients of the separation matrix.The output of the algorithm is therefore(1) In order to successfully separate the signals,should ap-proximate as closely as possible a scaled permutation matrix.A possible update equation for the separation matrix results from either the minimization of the mutual information of the output signals[1],the output entropy maximization[2],the ML estima-tor[3],the maximum negentropy[4],and applying the natural gra-dient[1]to these methods,or exchanging a non-blind criterion of an RLS-like algorithm with a blind criterion[5],(2) where is the step size,the identity matrix,and a Bussgang nonlinearity[6].2.THE THRESHOLD NONLINEARITYThe nonlinearity plays a central role in blind signal separation.Its nature is defined by the objective or contrast function,which is often some kind of information-theoretic measure,such as entropy or mutual information.Very frequently,the nonlinearities derived by different methods are similar in nature for a given probability distribution of the signals to separate.In fact,the exact curve of the nonlinearity might not matter[7].Whereas the minimization of the mutual information leads to a pdf-independent polynomial with several terms[1],both the Infomax and the Maximum-Likelihood approach[8]lead to(3) where and are the pdf and its derivative,respec-tively,of the source signals.In the following,the range of will be assumed as that given in(3)if not indicated otherwise.(3)is referred to as the score function of a certain pdf.From(3)it can be seen,that super-Gaussian signals typically have sigmoidal nonlinearities such as or,whereas sub-Gaussian signals can be separated using a nonlinearity of the formodd(4) being odd ensures the validity of the sign after the nonlinearity. If(4)is rewritten as(5) is no longer restricted to odd integers,but can be any rational number greater than one.(5)also has the advantage that it is di-rectly applicable to complex signals.A very simple nonlinearity for the separation of sub-Gaussian signals has been derived in[9] starting from the pdf of a generalized Gaussian signal(7)is the gamma function given by,and shows a recursive property simi-lar to the factorial function,.For large,(7)yields(9) In the limit for(uniform distribution)we get the threshold nonlinearity(10)A more detailed derivation of Eqs.(7)to(10)can be found in the appendix.The infinite gain in(10)will of course cause conver-gence problems for afinite step size.The gain can therefore be traded off against a lower threshold for a specified output power. To this end,we make use of the scaling equation[9](11)which is a result of the Bussgang property.is a source dis-tribution with unit variance.Solving(11)for the uniform distribution and a given threshold results in afinite gain of(12)for(14) which is the same as(13)for(22) Thus,the stability condition results inFigure2:Range of threshold(shaded area)as a function of the generalized Gaussian parameter.ution with.While the upper limit of the stability range isas the pdf gets close to the uniform distribution.Note that only for the uniform distribution the threshold nonlinearity is the score func-tion(except for thefinite gain).Generally speaking,for sub-Gaussian signals,the stable range is between1and(29) we get with Eq.(22)for the gain of the threshold nonlinearity(30) Using(30)in(28)wefind after some calculations the optimal threshold asotherwise(32) where the scaling factor has already been computed by Eq.(12). Eq.(28)then results in(33) Numerical inspection of(33)on the interval.4.BLIND SEPARATION OF ARBITRARY SOURCES 4.1.Known MethodsIn practice,the distributions of the sources are often identical, since the nature of their origin is related.In that case,the nonlin-earity will be the same for all outputs.However,we mayfind the situation where some sources have different distributions,possibly with a different sign of their respective kurtoses.If in a mixture some sources are sub-Gaussian and some super-Gaussian distrib-uted,the appropriate nonlinearity might be chosen in advance,as long as the number of sub-and super-Gaussian sources is known. Doing so,the system is deprived of some degree of freedom,due to the restriction of permutation within the group of equal kurtosis sign.In other words,once a nonlinearity is chosen,only a signal with the appropriate kurtosis can be separated at that specific out-put.Other signals are forced away to outputs with the appropriatenonlinearity.Global convergence can be greatly accelerated by let-ting the system choose its permutation closest to some initial mix-ing condition.This can be achieved by an adaptive nonlinearity. If the number of sub-Gaussian and the number of super-Gaussian sources is unknown,adaptive nonlinearities are a necessity.Douglas et al.[13]switch between two nonlinearities,namelyand(34) where and separate sub-and super-Gaussian signals, respectively.The algorithm does not try to normalize its output power regardless of the distribution.A sufficient stability condi-tion is therefore(35) The left-hand side of(35)is constantly evaluated for the two non-linearities and.The larger value decides which nonlin-earity is applied.Similarly,Lee et al.[14]present an extended Infomax algo-rithm,where the update equation for the separation matrix is for-mulated as(36)with being the vector of signs.is positive for a super-Gaussian and negative for a sub-Gaussian sig-nal,respectively.If the distributions are unknown,the sign might be switched according to a kurtosis estimation at the output or some parameter expressing the stability of the nonlinearity cur-rently used.Similarly to(35)it follows(37) By choosing the same sign as the rest of(37),the algorithm is stabilized.Thus,the sign must be adapted as(38) Again,output powers are not normalized,and depend on the source distributions.4.2.Adaptive Threshold NonlinearitySince we know that any non-Gaussian distribution can be sepa-rated by the threshold nonlinearity with either or, we can set up an algorithm in which the update equation for the separation matrix is given by(2)with(39)Each threshold is chosen from as that value which maximizes the right-hand side of(17)with of(39).The use of the stability equation to switch between the two threshold val-ues has two important disadvantages.First,the value of is difficult to work out since the function is unknown.Second, although the threshold nonlinearity successfully separates discrete distributions,is generally zero for discrete distributions as used in data communications,making the switching criterion in-valid.An alternative is to adapt the threshold vectorsaccording to(40) where is an estimate of the output kurtoses of the vector at sample time.Additionally,is clipped at0and1.5to keep it inside a meaningful region.4.3.Output normalizationIn the analysis of the stability we have shown that the scaling factors in have to be chosen according to Eq.(21)in order to obtain output signals with unit variance.In an environment where the probability distributions are given,Eq.(21) can be evaluated off-line and is thusfixed for the adaptation part.If the distributions are unknown,however,itself has to be found adaptively.To this end,we note that for unimodal,symmet-ric distributions,is a monotonously decreasing function of the output standard deviation.Vice versa,can be written as ,where is a monotonously decreasing function for,henceinto a different step size and write(46) with being the vector of power esti-mates and a vector of ones,respectively.(46)is a simple AGC (automatic gain control)algorithm,which normalizes the outputs of the separation process.It runs along with the adaptation ofand.Alternately,the normalization can of course be performed by a separate AGC stage after the separation process.This is for example necessary if the mixture contains binary sources.It is intuitively clear that a normalized source with symbol values produces zero output after the threshold nonlinearity with.In summary,the adaptive threshold nonlinearity algorithm is given asAdaptive Threshold Nonlinearity Algorithm(47)(48)(49)(50)5.SIMULATIONSFor the following simulations of the convergence behavior of blind signal separation using the adaptive threshold device,in-dependent source signals were mixed by random matrix,whose condition number is chosen(the singular values of are logarithmically distributed).Block processing with a block length was applied.With this length the kurtosis esti-mation for the purpose of threshold adaptation is accurate enough, and inter-block memory does not offer any advantage.Figure3:Course of some statistics during the separation process of mixed-kurtosis signals.Top left:Separation performance.Top right:Output powers.Bottom left:Adaptive threshold values. Bottom right:Kurtoses of output signals.In thefirst computer experiment we mixed three Laplacian, three uniform,three16-PAM,and one Gaussian source.If more sources are Gaussian distributed,they can still be separated from other sources by the adaptive threshold nonlinearity,but remain mixed among themselves,leading to a disturbed permutation ma-trix.This is an inherent limitation of blind separation using higher-order statistics,and is usually circumvented by the restriction to at most one Gaussian source.The adaptive threshold nonlinear-ity algorithm(47)to(50)was then used to separate the signals in a block-processing manner.The step size of the adaptation was adjusted for a residual mixing of ICI dB,where the performance measureICI(51)is the average interchannel interference and is described in[15].Fig.3shows the adaptation process.The effect of the AGC can be observed as well as the convergence of the kurtoses of the output signals to the respectivevalues for Laplacian,for uniform,for16-PAM,and for Gaussian distributions. The threshold values approach either or,depending on their kurtoses,and converge around10000samples,except for the out-put with the Gaussian distribution where the threshold remains un-decided.In the next simulation we wanted to compare the adaptive threshold nonlinearity algorithm with the algorithms found by Douglas et al.[13]and Lee et al.[14].To this end,we mixed5 Laplacian and5uniform sources.The three algorithms were then run with as similar parameters as possible to allow a fair compari-son.A block processing with the block size was used for all algorithms.The step sizes of the adaptation algorithms were adjusted individually for a residual mixing of ICI dB. Averaging over50runs with different matrices(all with the char-Figure4:Separation performance of adaptive threshold nonlinear-ity.acteristics as described above)were carried out to get typical be-havior.Fig.4shows the separation performance for all tested al-gorithms.The adaptive threshold nonlinearity algorithm and the algorithm by Douglas et al.reach the dB point at exactly the same time on average,whereas the extended Infomax algorithm needs considerably more time.A comparison with simulations of Laplacian sources only[5] shows,that the convergence time associated with Laplacian sources only is about the same as the convergence time obtained here.Simulations also confirm that the adaptive threshold concept is advantageous for mixed-kurtosis signals even if the distributions are known,since byfixing the nonlinearity in advance we deprivethe system of some degree of freedom by restricting the distrib-utions to the outputs with the appropriate nonlinearity,whereas with the adaptive threshold,the system is free to choose among more permutations,thus reducing convergence time.6.CONCLUSIONSA threshold nonlinearity for the blind separation of any non-Gaussian sources has been derived.The threshold nonlinearity is not just a simplification of polynomial functions but the true score function for the uniform distribution.As shown in a stability analysis,it separates any sub-Gaussian signals and,if the threshold is reduced to zero,even super-Gaussian signals.Using the kurtosis of the output signal to control the threshold parameter,the adaptive threshold nonlinearity might be used for the blind separation of mixed-kurtosis signals.The threshold nonlinearity offers very simple implementation options,since the set of possible output values of the nonlinearity only contains three values,and for negative-kurtosis signals, and two values,for positive-kurtosis signals,respectively.The threshold operation can easily be implemented by two comparators only.7.APPENDIX:DETAILED DERIV ATION OF THETHRESHOLD NONLINEARITY Differentiating(6)with respect to leads to(54) For,Eq.(54)gives(57) Both terms are close to for large values of,so that simplification of(57)yields(8).The first term of the Taylor expansion of a sine function for a small argument is just the argument itself,leading to(9).Eq.(10)finally is a consequence of the behavior of depending on being less or greater than one.8.REFERENCES[1]S.-I.Amari,A.Cichocki,and H.H.Yang,“A new learningalgorithm for blind signal separation,”Advances in Neural Information Processing Systems,vol.8,pp.757–763,1996.[2] A.J.Bell and T.J.Sejnowski,“An information-maximiza-tion approach to blind separation and blind deconvolution,”Neural Computation,vol.7,pp.1129–1159,1995.[3]J.-F.Cardoso,“Blind signal separation:Statistical princi-ples,”Proc.IEEE,vol.86,no.10,pp.2009–2025,Oct.1998.[4]M.Girolami and C.Fyfe,“Negentropy and kurtosis as pro-jection pursuit indices provide generalised ICA algorithms,”in Proc.NIPS,Aspen,CO,Dec,7,1996.[5]M.Joho and H.Mathis,“Performance comparison of com-bined blind/non-blind source separation algorithms,”in Proc.ICA,Aussois,France,Jan.11–15,1999,pp.139–142.[6]mbert and A.J.Bell,“Blind separation of multi-ple speakers in a multipath environment,”in Proc.ICASSP, Munich,Germany,Apr.21–24,1997,pp.423–426.[7] A.Hyvärinen and E.Oja,“Independent component analysisby general nonlinear Hebbian-like learning rules,”Signal Processing,vol.64,pp.301–313,1998.[8]T.-W.Lee,M.Girolami,A.J.Bell,and T.J.Sejnowski,“Aunifying information-theoretic framework for independent component analysis,”Int.J.Math.and Comp.Modeling,in press,1999.[9]H.Mathis,M.Joho,and G.S.Moschytz,“A simple thresh-old nonlinearity for blind signal separation,”in Proc.ISCAS, Geneva,Switzerland,May28–31,2000,accepted for publi-cation.[10]T.P.von Hoff,A.G.Lindgren,and A.N.Kaelin,“Trans-pose properties in the stability and performance of the classic adaptive algorithms for blind source separation and deconvo-lution,”Signal Processing,2000,submitted.[11]J.-F.Cardoso and held,“Equivariant adaptive sourceseparation,”IEEE Trans.Signal Processing,vol.44,no.12, pp.3017–3030,Dec.1996.[12]S.-I.Amari,T.-P.Chen,and A.Cichocki,“Stability analysisof adaptive blind source separation,”Neural Networks,vol.10,no.8,pp.1345–1351,Aug.1997.[13]S.C.Douglas,A.Cichocki,and S.Amari,“Multichan-nel blind separation and deconvolution of sources with arbi-trary distributions,”in IEEE Workshop on Neural Networks for Signal Processing,Almelia Island Plantation,FL,Sept.1997,pp.436–445.[14]T.-W.Lee,M.Girolami,and T.J.Sejnowski,“Independentcomponent analysis using an extended infomax algorithm for mixed sub-Gaussian and super-Gaussian sources,”Neural Computation,vol.11,no.2,pp.417–441,1999.[15]M.Joho,H.Mathis,and G.S.Moschytz,“An FFT-based al-gorithm for multichannel blind deconvolution,”in Proc.IS-CAS,Orlando,FL,May30–June2,1999,pp.III–203–206.[16]mbert,Multichannel Blind Deconvolution:FIR Ma-trix Algebra and Separation of Multipath Mixtures,Ph.D.thesis,University of Southern California,1996.[17]I.N.Bronshtein and K.A.Semendyayev,Handbook ofMathematics,Springer Verlag,3rd edition,1997.。