Kolmogorov Complexity with Error
Quantum Kolmogorov Complexity Based on Classical Descriptions
a r X i v :q u a n t -p h /0102108v 2 9 O c t 2001Quantum Kolmogorov Complexity Based onClassical DescriptionsPaul M.B.Vit´a nyiAbstract —We develop a theory of the algorithmic informa-tion in bits contained in an individual pure quantum state.This extends classical Kolmogorov complexity to the quan-tum domain retaining classical descriptions.Quantum Kol-mogorov complexity coincides with the classical Kolmogorov complexity on the classical domain.Quantum Kolmogorov complexity is upper bounded and can be effectively approx-imated from above under certain conditions.With high probability a quantum object is incompressible.Upper-and lower bounds of the quantum complexity of multiple copies of individual pure quantum states are derived and may shed some light on the no-cloning properties of quantum states.In the quantum situation complexity is not sub-additive.We discuss some relations with “no-cloning”and “approximate cloning”properties.Keywords —Algorithmic information theory,quantum;classical descriptions of quantum states;information the-ory,quantum;Kolmogorov complexity,quantum;quantum cloning.I.IntroductionQUANTUM information theory,the quantum mechan-ical analogue of classical information theory [6],is ex-periencing a renaissance [2]due to the rising interest in the notion of quantum computation and the possibility of re-alizing a quantum computer [16].While Kolmogorov com-plexity [12]is the accepted absolute measure of information content in a individual classical finite object,a similar ab-solute notion is needed for the information content of an individual pure quantum state.One motivation is to extend probabilistic quantum information theory to Kolmogorov’s absolute individual notion.Another reason is to try and duplicate the success of classical Kolmogorov complexity as a general proof method in applications ranging from com-binatorics to the analysis of algorithms,and from pattern recognition to learning theory [13].We propose a theory of quantum Kolmogorov complexity based on classical de-scriptions and derive the results given in the abstract.A preliminary partial version appeared as [19].What are the problems and choices to be made develop-ing a theory of quantum Kolmogorov complexity?Quan-tum theory assumes that every complex vector of unit length represents a realizable pure quantum state [17].There arises the question of how to design the equipment that prepares such a pure state.While there are contin-uously many pure states in a finite-dimensional complexPartially supported by the EU fifth framework project QAIP,IST–1999–11234,the NoE QUIPROCONE IST–1999–29064,the ESF QiT Programmme,and the EU Fourth Framework BRA NeuroCOLT II Working Group EP 27150.Part of this work was done during the author’s 1998stay at Tokyo Institute of Technology,Tokyo,Japan,as Gaikoku-Jin Kenkyuin at INCOCSAT,and appeared in a preliminary version [19]archived as quant-ph/9907035.Address:CWI,Kruislaan 413,1098SJ Amsterdam,The Netherlands.Email:paulv@cwi.nlvector space—corresponding to all vectors of unit length—we can finitely describe only a countable subset.Imposing effectiveness on such descriptions leads to constructive pro-cedures.The most general such procedures satisfying uni-versally agreed-upon logical principles of effectiveness are quantum Turing machines,[3].To define quantum Kol-mogorov complexity by way of quantum Turing machines leaves essentially two options:1.We want to describe every quantum superposition ex-actly;or2.we want to take into account the number of bits/qubits in the specification as well the accuracy of the quantum state produced.We have to deal with three problems:•There are continuously many quantum Turing machines;•There are continuously many pure quantum states;•There are continuously many qubit descriptions.There are uncountably many quantum Turing machines only if we allow arbitrary real rotations in the definition of machines.Then,a quantum Turing machine can only be universal in the sense that it can approximate the compu-tation of an arbitrary machine,[3].In descriptions using universal quantum Turing machines we would have to ac-count for the closeness of approximation,the number of steps required to get this precision,and the like.In con-trast,if we fix the rotation of all contemplated machines to a single primitive rotation θwith cos θ=35,then there are only countably many Turing machines and the universal machine simulates the others exactly [1].Ev-ery quantum Turing machine computation,using arbitrary real rotations to obtain a target pure quantum state,can be approximated to every precision by machines with fixed rotation θbut in general cannot be simulated exactly—just like in the case of the simulation of arbitrary quantum Turing machines by a universal quantum Turing machine.Since exact simulation is impossible by a fixed universal quantum Turing machine anyhow,but arbitrarily close ap-proximations are possible by Turing machines using a fixed rotation like θ,we are motivated to fix Q 1,Q 2,...as a stan-dard enumeration of quantum Turing machines using only rotation θ.Our next question is whether we want programs (descrip-tions)to be in classical bits or in qubits?The intuitive no-tion of computability requires the programs to be ly,to prepare a quantum state requires a physical ap-paratus that “computes”this quantum state from classical specifications.Since such specifications have effective de-scriptions,every quantum state that can be prepared can be described effectively in descriptions consisting of classi-cal bits.Descriptions consisting of arbitrary pure quantumstates allows noncomputable(or hard to compute)informa-tion to be hidden in the bits of the amplitudes.In Defini-tion4we call a pure quantum state directly computable if there is a(classical)program such that the universal quan-tum Turing machine computes that state from the program and then halts in an appropriate fashion.In a computa-tional setting we naturally require that directly computable pure quantum states can be prepared.By repeating the preparation we can obtain arbitrarily many copies of the pure quantum state.If descriptions are not effective then we are not going to use them in our algorithms except possibly on inputs from an“unprepared”origin.Every quantum state used in a quantum computation arises from some classically prepa-ration or is possibly captured from some unknown origin. If the latter,then we can consume it as conditional side-information or an oracle.Restricting ourselves to an effective enumeration of quan-tum Turing machines and classical descriptions to describe by approximation continuously many pure quantum states is reminiscent of the construction of continuously many real numbers from Cauchy sequences of rational numbers,the rationals being effectively enumerable.Kolmogorov complexity:We summarize some basic definitions in Appendix A(see also this journal[20])in order to establish notations and recall the notion of short-est effective descriptions.More details can be found in the textbook[13].Shortest effective descriptions are“effective”in the sense that they are programs:we can compute the described objects from them.Unfortunately,[12],there is no algorithm that computes the shortest program and then halts,that is,there is no general method to compute the length of a shortest description(the Kolmogorov com-plexity)from the object being described.This obviously impedes actual use.Instead,one needs to consider com-putable approximations to shortest descriptions,for exam-ple by restricting the allowable approximation time.Apart from computability and approximability,there is another property of descriptions that is important to us.A set of descriptions is prefix-free if no description is a proper prefix of another description.Such a set is called a prefix code. Since a code message consists of concatenated code words, we have to parse it into its constituent code words to re-trieve the encoded source message.If the code is uniquely decodable,then every code message can be decoded in only one way.The importance of prefix-codes stems from the fact that(i)they are uniquely decodable from left to right without backing up,and(ii)for every uniquely decodable code there is a prefix code with the same length code words. Therefore,we can restrict ourselves to prefix codes.In our setting we require the set of programs to be prefix-free and hence to be a prefix-code for the objects being described.It is well-known that with every prefix-code there corresponds a probability distribution P(·)such that the prefix-code is a Shannon-Fano code1that assigns prefix code length l x=−log P(x)to x—irrespective of the regularities in x. 1In what follows,“log”denotes the binary logarithm.For example,with the uniform distribution P(x)=2−n on the set of n-bit source words,the Shannon-Fano code word length of an all-zero source word equals the code word length of a truly irregular source word.The Shannon-Fano code gives an expected code word length close to the en-tropy,and,by Shannon’s Noiseless Coding Theorem,it possesses the optimal expected code word length.But the Shannon-Fano code is not optimal for individual elements: it does not take advantage of the regularity in some ele-ments to encode those shorter.In contrast,one can view the Kolmogorov complexity K(x)as the code word length of the shortest program x∗for x,the set of shortest pro-grams consitituting the Shannon-Fano code of the so-called “universal distribution”m(x)=2−K(x).The code consist-ing of the shortest programs has the remarkable property that it achieves(i)an expected code length that is about optimal since it is close to the entropy,and simultaneously, (ii)every individual object is coded as short as is effectively possible,that is,squeezing out all regularity.In this sense the set of shortest programs constitutes the optimal effec-tive Shannon-Fano code,induced by the optimal effective distribution(the universal distribution).Quantum Computing:We summarize some basic def-initions in Appendix B in order to establish notations and briefly review the notion of a quantum Turing machine computation.See also this journal’s survey[2]on quan-tum information theory.More details can be found in the textbook[16].Loosely speaking,like randomized compu-tation is a generalization of deterministic computation,so is quantum computation a generalization of randomized computation.Realizing a mathematical random source to drive a random computation is,in its ideal form,presum-ably impossible(or impossible to certify)in practice.Thus, in applications an algorithmic random number generator is used.Strictly speaking this invalidates the analysis based on mathematical randomized computation.As John von Neumann[15]put it:“Any one who considers arithmetical methods of producing random digits is,of course,in a state of sin.For,as has been pointed out several times,there is no such thing as a random number—there are only meth-ods to produce random numbers,and a strict arithmetical procedure is of course not such a method.”In practice ran-domized computations reasonably satisfy theoretical anal-ysis.In the quantum computation setting,the practical problem is that the ideal coherent superposition cannot re-ally be maintained during computation but deteriorates—it decoheres.In our analysis we abstract from that problem and one hopes that in practice anti-decoherence techniques will suffice to approximate the idealized performance suffi-ciently.We view a quantum Turing machine as a generalization of the classic probabilistic(that is,randomized)Turing machine.The probabilistic Turing machine computation follows multiple computation paths in parallel,each path with a certain associated probability.The quantum Turing machine computation follows multiple computation paths in parallel,but now every path has an associated complex probability amplitude.If it is possible to reach the sameVIT´ANYI:QUANTUM KOLMOGOROV COMPLEXITY BASED ON CLASSICAL DESCRIPTIONS3state via different paths,then in the probabilistic case the probability of observing that state is simply the sum of the path probabilities.In the quantum case it is the squared norm of the summed path probability amplitudes.Since the probability amplitudes can be of opposite sign,the ob-servation probability can vanish;if the path probability amplitudes are of equal sign then the observation probabil-ity can get boosted since it is the square of the sum norm. While this generalizes the probabilistic aspect,and boosts the computation power through the phenomenon of inter-ference between parallel computation paths,there are extra restrictions vis-a-vis probabilistic computation in that the quantum evolution must be unitary.Quantum Kolmogorov Complexity:We define the Kolmogorov complexity of a pure quantum state as the length of the shortest two-part code consisting of a classical program to compute an approximate pure quantum state and the negative log-fidelity of the approximation to the target quantum state.We show that the resulting quantum Kolmogorov complexity coincides with the classical self-delimiting complexity on the domain of classical objects; and that certain properties that we love and cherish in the classical Kolmogorov complexity are shared by the new quantum Kolmogorov complexity:quantum Kolmogorov complexity of an n-qubit object is upper bounded by about 2n;it is not computable but can under certain conditions be approximated from above by a computable process;and with high probability a quantum object is incompressible. We may call this quantum Kolmogorov complexity the bit complexity of a pure quantum state|φ (using Dirac’s“ket”notation)and denote it by K(|φ ).From now on,we will denote by+<an inequality to within an additive constant, and by+=the situation when both+<and+>hold.For exam-ple,we will show that,for n-qubit states|φ ,the complexity satisfies K(|φ |n)+<2n.For certain restricted pure quan-tum states,quantum kolmogorov complexity satisfies the sub-additive property:K(|φ,ψ )+<K(|φ )+K(|ψ ||φ ). But,in general,quantum Kolmogorov complexity is not sub-additive.Although“cloning”of non-orthogonal states is forbidden in the quantum setting[21],[7],m copies of the same quantum state have combined complexity that can be considerable lower than m times the complexity of a single copy.In fact,quantum Kolmogorov complex-ity appears to enable us to express and partially quantify “non-clonability”and“approximate clonability”of individ-ual pure quantum states.Related Work:In the classical situation there are sev-eral variants of Kolmogorov complexity that are very mean-ingful in their respective settings:plain Kolmogorov com-plexity,prefix complexity,monotone complexity,uniform complexity,negative logarithm of universal measure,and so on[13].It is therefore not surprising that in the more com-plicated situation of quantum information several different choices of complexity can be meaningful and unavoidable in different settings.Following the preliminary version[19] of this work there have been alternative proposals:Qubit Descriptions:The most straightforward way to define a notion of quantum Kolmogorov complexity is to consider the shortest effective qubit description of a pure quantum state which is studied in[4].(This qubit com-plexity can also be formulated in terms of the conditional version of bit complexity as in[19].)An advantage of qubit complexity is that the upper bound on the complexity of a pure quantum state is immediately given by the number of qubits involved in the literal description of that pure quan-tum state.Let us denote the resulting qubit complexity of a pure quantum state|φ by KQ(|φ ).While it is clear that(just as with the previous aproach) the qubit complexity is not computable,it is unlikely that one can approximate the qubit complexity from above by a computable process in some meaningful sense.In particu-lar,the dovetailing approach we used in our approach now doesn’t seem applicable due to the non-countability of the potentential qubit program candidates.The quantitative incompressibility properties are much like the classical case (this is important for future applications).There are some interesting exceptions in case of objects consisting of multi-ple copies related to the“no-cloning”property of quantum objects,[21],[7].Qubit complexity does not satisfy the sub-additive property,and a certain version of it(bounded fidelity)is bounded above by the von Neumann entropy. Density Matrices:In classical algorithmic informa-tion theory it turns out that the negative logarithm of the “largest”probability distribution effectively approximable from below—the universal distribution—coincides with the self-delimiting Kolmogorov complexity.In[8]G´a cs defines two notions of complexities based on the negative loga-rithm of the“largest”density matrixµeffectively approx-imable from below.There arise two different complexi-ties of|φ based on whether we take the logarithm inside as KG(|φ )=− φ|logµ|φ or outside as Kg(|φ )=−log φ|µ|φ .It turns out that Kg(|φ )+<KG(|φ ). This approach serves to compare the two approaches above: It was shown that Kg(|φ )is within a factor four of K(|φ ); that KG(|φ )essentially is a lower bound on KQ(|φ )and an oracle version of KG is essentially an upper bound on qubit complexity KQ.Since qubit complexity is trivially+<n and it was shown that bit complexity is typically close to2n,atfirst glance this leaves the possibility that the two complexities are within a factor two of each other.This turns out to be not the case since it was shown that the Kg complexity can for some arguments be much smaller than the KG complexity,so that the bit complexity is in these cases also much smaller than the qubit complexity.As[8] states:this is due to the permissive way the bit complexity deals with approximation.The von Neumann entropy of a computable density matrix is within an additive constant (the complexity of the program computing the density ma-trix)of a notion of average complexity.The drawback of density matrix based complexity is that we seem to have lost the direct relation with a meaningful interpretation in terms of description length:a crucial aspect of classical Kolmogorov complexity in most applications[13].Real Descriptions:A version of quantum Kolmogorov4IEEE TRANSACTIONS ON INFORMATION THEORYcomplexity briefly considered in[19]uses computable real parameters to describe the pure quantum state with com-plex probability amplitudes.This requires two reals per complex probability amplitude,that is,for n qubits one requires2n+1real numbers in the worst case.A real num-ber is computable if there is afixed program that outputs consecutive bits of the binary expansion of the number for-ever.Since every computable real number may require a separate program,a computable n-qubit pure state may re-quire2n+1finite programs.Most n-qubit pure states have parameters that are noncomputable and increased preci-sion will require increasingly long programs.For exam-ple,if the parameters are recursively enumerable(the po-sitions of the“1”s in the binary expansion is a recursively enumerable set),then a log k length program per parame-ter,to achieve k bits precision per recursively enumerable real,is sufficient and for some recursively enumerable re-als also necessary.In certain contexts where the approx-imation of the real parameters is a central concern,such considerations may be useful.While this approach does not allow the development of a clean theory in the sense of the previous approaches,it can be directly developed in terms of algorithmic thermodynamics—an extension of Kolmogorov complexity to randomness of infinite sequences (such as binary expansions of real numbers)in terms of coarse-graining and sequential Martin-L¨o f tests,analogous to the classical case in[9],[13].But this is outside the scope of the present paper.II.Quantum Turing Machine ModelWe assume the notation and definitions in Appendices A, B.Our model of computation is a quantum Turing ma-chine equipped with a input tape that is one-way infinite with the classical input(the program)in binary left ad-justed from the beginning.We require that the input tape is read-only from left-to-right without backing up.This automatically yields a property we require in the sequel: The set of halting programs is prefix-free.Additionaly,the machine contains a one-way infinite work tape containing qubits,a one-way infinite auxiliary tape containing qubits, and a one-way infinite output tape containing qubits.Ini-tially,the input tape contains a classical binary program p, and all(qu)bits of the work tape,auxiliary tape,and out-put tape qubits are set to|0 .In case the Turing machine has an auxiliary input(classical or quantum)then initially the leftmost qubits of the auxiliary tape contain this in-put.A quantum Turing machine Q with classical program p and auxiliary input y computes until it halts with output Q(p,y)on its output tape or it computes forever.Halt-ing is a more complicated matter here than in the classical case since quantum Turing machines are reversible,which means that there must be an ongoing evolution with non-repeating configurations.There are various ways to resolve this problem[3]and we do not discuss this matter further. We only consider quantum Turing machine that do not modify the output tape after halting.Another—related—problem is that after halting the quantum state on the out-put tape may be“entangled”with the quantum state of the remainder of the machine,that is,the input tape,thefinite control,the work tape,and the auxilliary tape.This hasthe effect that the output state viewed in isolation may notbe a pure quantum state but a mixture of pure quantumstates.This problem does not arise if the output and the remainder of the machine form a tensor product so that theoutput is un-entangled with the remainder.The results inthis paper are invariant under these different assumptions,but considering output entangled with the remainder ofthe machine complicates formulas and calculations.Corre-spondingly,we restrict consideration to outputs that forma tensor product with the remainder of the machine,withthe understanding that the same results hold with aboutthe same proofs if we choose the other option—except inthe case of Theorem4item(ii),see the pertinent caveat there.Note that the Kolmogorov complexity based on en-tangled output tapes is at most(and conceivably less than)the Kolmogorov complexity based on un-entangled outputtapes.Definition1:Define the output Q(p,y)of a quantumTuring machine Q with classical program p and auxil-iary input y as the pure quantum state|ψ resulting of Q computing until it halts with output|ψ on its ouputtape.Moreover,|ψ doesn’t change after halting,andit is un-entangled with the remainder of Q’s configura-tion.We write Q(p,y)<∞.If there is no such|ψthen Q(p,y)is undefined and we write Q(p,y)=∞.By definition the input tape is read-only from left-to-rightwithout backing up:therefore the set of halting programsP y={p:Q(p,y)<∞}is prefix-free:no program in P y is a proper prefix of another program in P y.Put differ-ently,the Turing machine scans all of a halting program p but never scans the bit following the last bit of p:it isself-delimiting.Wefix the rotation of all contemplated machines to a sin-gle primitive rotationθwith cosθ=35.Thereare only countably many such Turing ing astandard ordering,wefix Q1,Q2,...as a standard enumer-ation of quantum Turing machines using only rotationθ. By[1],there is a universal machine U in this enumeration that simulates the others exactly:U(1i0p,y)=Q i(p,y), for all i,p,y.(Instead of the many-bit encoding1i0for i we can use a shorter self-delimiting code like i′in Ap-pendix A.)As noted in the Introduction,every quantum Turing machine computation using arbitrary real rotations can be approximated to arbitrary precision by machines withfixed rotationθbut in general cannot be simulated exactly.Remark1:There are two possible interpretations for the computation relation Q(p,y)=|x .In the narrow interpre-tation we require that Q with p on the input tape and y on the conditional tape halts with|x on the output tape.In the wide interpretation we can define pure quantum states by requiring that for every precision parameter k>0the computation of Q with p on the input tape and y on the conditional tape,with k on a special new tape where the precision is to be supplied,halts with|x′ on the output tape and|| x|x′ ||2≥1−1/2k.Such a notion of“com-VIT ´ANYI:QUANTUM KOLMOGOROV COMPLEXITY BASED ON CLASSICAL DESCRIPTIONS5putable”or “recursive”pure quantum states is similar to Turing’s notion of “computable numbers.”In the remain-der of this section we use the narrow interpretation.Remark 2:As remarked in [8],the notion of a quan-tum computer is not essential to the theory here or in [4],[8].Since the computation time of the machine is not limited in the theory of description complexity as de-veloped here,a quantum computer can be simulated by a classical computer to every desired degree of precision.We can rephrase everything in terms of the standard enu-meration of T 1,T 2,...of classical Turing machines.Let |x = N −1i =0αi |e i (N =2n )be an n -qubit state.We can write T (p )=|x if T either outputs(i)algebraic definitions of the coefficients of |x (in case these are algebraic),or(ii)a sequence of approximations (α0,k ,...,αN −1,k )for k =1,2,...where αi,k is an algebraic approximation of αi to within 2−k .III.Classical Descriptions of Pure QuantumStates The complex quantity x |z is the inner product of vec-tors x |and |z .Since pure quantum states |x ,|z have unit length,|| x |z ||=|cos θ|where θis the angle between vectors |x and |z .The quantity || x |z ||2,the fidelity between |x and |z ,is a measure of how “close”or “con-fusable”the vectors |x and |z are.It is the probability of outcome |x being measured from state |z .Essentially,we project |z on outcome |x using projection |x x |resulting in x |z |x .Definition 2:The (self-delimiting)complexity of |x with respect to quantum Turing machine Q with y as conditional input given for free isK Q (|x |y )=min p{l (p )+⌈−log || z |x ||2⌉:Q (p,y )=|z }(1)where l (p )is the number of bits in the program p ,auxiliary y is an input (possibly quantum)state,and |x is the target state that one is trying to describe.Note that |z is the quantum state produced by the com-putation Q (p,y ),and therefore,given Q and y ,completely determined by p .Therefore,we obtain the minimum of the right-hand side of the equality by minimizing over p only.We call the |z that minimizes the right-hand sidethe directly computed part of |x while ⌈−log || z |x ||2⌉is the approximation part .Quantum Kolmogorov complexity is the sum of two terms:the first term is the integral length of a binary pro-gram,and the second term,the minlog probability term,corresponds to the length of the corresponding code word in the Shannon-Fano code associated with that probabil-ity distribution,see for example [6],and is thus also ex-pressed in an integral number of bits.Let us consider this relation more closely:For a quantum system |z the quantity P (x )=|| z |x ||2is the probability that the system passes a test for |x ,and vice versa.The term ⌈−log || z |x ||2⌉can be viewed as the code word lengthto redescribe |x ,given |z and an orthonormal basis with |x as one of the basis vectors,using the Shannon-Fano pre-fix code.This works as follows:Write N =2n .For every state |z in (2n )-dimensional Hilbert space with basis vec-tors B ={|e 0 ,...,|e N −1 }we have N −1i =0|| e i |z ||2=1.If the basis has |x as one of the basis vectors,then we can consider |z as a random variable that assumes value |x with probability || x |z ||2.The Shannon-Fano code word for |x in the probabilistic ensemble B ,(|| e i |z ||2)iisbased on the probability || x |z ||2of |x ,given |z ,and haslength ⌈−log || x |z ||2⌉.Considering a canonical method of constructing an orthonormal basis B =|e 0 ,...,|e N −1 from a given basis vector,we can choose B such thatK (B )+=min i {K (|e i )}.The Shannon-Fano code is ap-propriate for our purpose since it is optimal in that it achieves the least expected code word length—the expec-tation taken over the probability of the source words—up to 1bit by Shannon’s Noiseless Coding Theorem.As in the classical case the quantum Kolmogorov complexity is an integral number.The main property required to be able to develop a meaningful theory is that our definition satisfies a so-called Invariance Theorem (see also Appendix A).Below we use “U ”to denote a special type of universal (quantum)Turing machine rather than a unitary matrix.Theorem 1(Invariance)There is a universal machine U ,such that for all machines Q ,there is a constant c Q (the length of the description of the index of Q in the enumera-tion),such that for all quantum states |x and all auxiliary inputs y we have:K U (|x |y )≤K Q (|x |y )+c Q .Proof:Assume that the program p that minimizes the right-hand side of (1)is p 0and the computed |z is |z 0 :K Q (|x |y )=l (p 0)+⌈−log || z 0|x ||2⌉.There is a universal quantum Turing machine U in the standard enumeration Q 1,Q 2,...such that for every quan-tum Turing machine Q in the enumeration there is a self-delimiting program i Q (the index of Q )and U (i Q p,y )=Q (p,y )for all p,y :if Q (p,y )=|z then U (i Q p,y )=|z .In particular,this holds for p 0such that Q with auxiliary input y halts with output |z 0 .But U with auxiliary input y halts on input i Q p 0also with output |z 0 .Consequently,the program q that minimizes the right-hand side of (1)with U substituted for Q ,and computes U (q,y )=|u for some state |u possibly different from |z ,satisfiesK U (|x |y )=l (q )+⌈−log || u |x ||2⌉≤l (i Q p 0)+⌈−log || z 0|x ||2⌉.Combining the two displayed inequalities,and setting c Q =l (i Q ),proves the theorem.。
信息论课程(英文)章节总结
Log sum inequality. For n positive numbers, a1, a2, . . . , an and b1, b2, . . . , bn,
n ai ∑a a log ≥ ∑ i b ∑ a i log ∑ b i= 1 i i= 1 ai =constant with equality iff bi
Properties of D and I I(X;Y) = H(X) – H(X|Y) = H(Y) – H(Y|X) = H(X) + H(Y) – H(X,Y)
Data processing inequality. The data-processing inequality can be used to show that no cleaver manipulation of the data can improve inferences that can be made from the data. If X → Y → Z forms a Markov chain,
I X ; Y = ∑
x ∈ X y ∈Y
∑ p x , y log p x p y
p x , y
Jensen's inequality. Jensen's inequality is one of the most widely used in mathematics and one that underlies many of the basic results in information theory. If f is a convex function, then
I X 1 , X 2 , . . , X n∣Y = ∑ I X i ; Y ∣ X 1, X 2, .. , X i− 1
几类非负矩阵特征值反问题
The Inverse Eigenvalue Problem for Several Classes of Nonnegative MatricesA DissertationSubmitted for the Degree of MasterOn computational mathematicsby Tian YuUnder the Supervision ofProf. Wang Jinlin(College of Mathematics and Information Sciences)Nanchang Hangkong University, Nanchang, ChinaJune, 2011摘 要非负矩阵理论一直是矩阵理论中最活跃的研究领域之一,在数学、自然科学的其他分支以及社会科学中都广泛涉及到,例如博弈论、Markov链(随机矩阵)、概率论、概率算法、数值分析、离散分布、群论、matrix scaling、小振荡弹性系统(振荡矩阵)和经济学等等.近年来,特征值反问题是矩阵理论研究的热点,本文将就非负矩阵特征值反问题(NIEP)这一问题进行研究.文章主要研究几类特殊形式的非负矩阵特征值反问题,得到了相关问题的充分必要条件和一些充分条件,进而给出了这几类特殊形式的非负矩阵特征值反问题数值算法,并通过数值算例来验证相关定理的正确性以及算法的准确性.主要工作如下: 第一章是绪论部分,阐述了非负矩阵特征值反问题的重要意义和发展历程,介绍国内外研究现状.第二章,研究非负三对角矩阵特征值反问题.首先对三阶非负三对角矩阵特征值反问题,分几种情形进行讨论,解决了三阶非负三对角矩阵特征值反问题,得到了三阶非负三对角矩阵特征值反问题有解的充分必要条件.然后对n阶非负三对角矩阵特征值反问题,通过非负三对角矩阵截断矩阵特征多项式,并结合Jacobi 矩阵特征值的关系,得到了非负三对角矩阵的特征值的相关性质,并最终解决了非负三对角矩阵特征值反问题.第三章,研究非负五对角矩阵特征值反问题.三阶非负五对角矩阵,即是三阶非负矩阵,文中给出了其特征值反问题有解的充分必要条件,而对于n阶非负五对角矩阵特征值反问题,由于其复杂性,文中仅给出了它的一些充分条件.第四章,研究非负循环矩阵特征值反问题.首先总结了NIEP近些年来取得的研究成果,提出实循环矩阵特征值反问题,并成功解决了实循环矩阵特征值反问题,得到其充分必要条件.最后在实循环矩阵特征值反问题的基础上提出非负循环矩阵特征值反问题,得到了充分条件和相关推论.第五章,根据第二、三、四章的结论给出相关算法和实例.第六章,在总结全文的同时,提出了需要进一步研究的问题.关键词:特征值,反问题,非负三对角矩阵,非负五对角矩阵,非负循环矩阵AbstractThe theory of nonnegative matrices has always been one of the most active research areas in the matrix theory and has been widely applied in mathematics and other branches of natural and social sciences. There are, for example, game theory, Markov chains (stochastic martices), theory of probability, probabilistic algorithms, numerical analysis, discrete distribution, group theory, matrix scaling, theory of small osillations of elastic systems (oscillation marrices), economics and so on. In recent years, the inverse eigenvalue problem comes to be the focus of the matrix theory. This thesis will study the inverse eigenvalue problem for nonnegative matrices (NIEP). The major researches of this theisis focus on the inverse eigenvalue problem for several special classes of nonnegative matrices, the necessary and sufficient conditions and some sufficient conditions of which are derived. Moreover, the numerical algorithms of the inverse eigenvalue problem for these special classes of nonnegative matrices are given, the accuracy of which together with the correcteness of related theories is testified by several numerical examples. The main procedures of this theisis are as follows:In the first chapter, the significance and the development of the inverse eigenvalue problem for nonnegative matrices are addressed, and the research situation home and abroad is introduced.In the second chapter, the inverse eigenvalue problem for nonnegative tridiagonal matrices is studied. First, the inverse eigenvalue problem for 33⨯ nonnegative tridiagonal matrices is solved by discussion of a variety of situations. Moveover ,the necessary and sufficient conditions of the solutions of the inverse eigenvalue problem for 33⨯ nonnegative tridiagonal matrices are derived. Then, the properties of eigenvalue of n n ⨯ nonnegative tridiagonal matrices are derived by characteristic polynomial of truncated matrices of nonnegative tridiagonal matrices, with the combination of the relationship between eigenvalues of Jacobi matrix. Finally, the inverse eigenvalue problem for nonnegative tridiagonal matrices is solved.In the third chapter, the inverse eigenvalue problem for nonnegative five-diagonal matrices is studied. 33⨯ nonnegative five-diagonal matrices is also 33⨯ nonnegative matrices, the necessary and sufficient conditions of the solutions of the inverse eigenvalue problem for which are given in this thesis. For the inverseeigenvalue problem for n nnonnegative five-diagonal matrices, only some sufficient conditions are given because of its complexity.In the fourth chapter, the inverse eigenvalue problem for nonnegative circulant matrices is studied. First, some remarkable conclusions of the inverse eigenvalue problem for nonnegative matrices in recent years are summarized. Then, the inverse eigenvalue problem for real circulant matrices is advanced and successfully solved, the necessary and sufficient conditions of which are given also. Finally, the inverse eigenvalue problem for nonnegative circulant matrices is advanced based on the inverse eigenvalue problem for real circulant matrices, whose sufficient conditions and some relevant conclusions are given.In the fifth chapter, some algorithms and numerical examples are given based on the conclusions derived in the previous three chapters.In the sixth chapter, the summary of the paper is given and the future research work is put forward.Key words: eigenvalue, inverse problem, nonnegative tridiagonal matrices, nonnegative five-diagonal matrices, nonnegative circulant matrices目录第一章 绪论 (1)1.1选题的依据与意义 (1)1.2非负矩阵特征值反问题的研究现状 (2)1.3研究的主要内容 (3)第二章 非负三对角矩阵特征值反问题 (5)2.1引言 (5)2.2三阶非负三对角矩阵特征值反问题 (6)n阶非负三对角矩阵特征值反问题 (24)2.3第三章 非负五对角矩阵特征值反问题 (33)3.1引言 (33)3.2非负五对角矩阵特征值反问题相关结论 (33)第四章 非负循环矩阵特征值反问题 (38)4.1引言 (38)4.2一类特殊矩阵的特征值反问题 (40)4.3非负循环矩阵特征值反问题 (42)第五章 算法设计及实例 (45)5.1非负三对角矩阵特征值反问题算法 (45)5.2非负五对角矩阵特征值反问题算法 (47)5.3实循环矩阵特征值反问题算法 (49)5.4非负循环矩阵特征值反问题算法 (50)第六章 总结与展望 (53)6.1全文总结 (53)6.2工作展望 (53)参考文献 (54)攻读硕士学位期间发表的论文 (57)致 谢 (58)IV第一章 绪论1.1 选题的依据与意义反问题,顾名思义是相对于正问题而言的,它是根据事物的演化结果,由可观测的现象来探求事物的内部规律或所受的外部影响,由表及里,索隐探秘.在数学中有着许多反问题,例如已知两个自然数的乘积,如何求这两个自然数;已知导数,如何求原函数;已知一个角的三角函数值,如何求这个角的度数,等等.近些年来,人们在生活、工业生产、科学探索中经常遇到反问题,对反问题的研究也越来越受到重视.事实上,对于一般问题来说,反问题要比正问题复杂.如前面提到的求角度数问题,已知一个角,求其三角函数值是唯一的,但如果只知道一个角的三角函数值而不对这个角加以约束,这样的角将会有无穷多个,因而反问题的解一般来说不唯一.另外,反问题的解也极不稳定.因此,对反问题的研究主要包括以下几个方面:存在性、唯一性、稳定性、数值方法和实际应用.矩阵特征值反问题(又称代数特征值反问题或逆特征值问题),就是根据已给定的特征值和/或特征向量等信息来确定一个矩阵,使得该矩阵满足所给的条件.矩阵特征值反问题的来源非常广泛.它不仅来自于数学物理反问题的离散化,而且来自固体力学、粒子物理、量子物理、结构设计、系统参数识别、自动控制等许多领域.由于矩阵特征值反问题的应用广泛性,因而自从此类问题被提出来的几十年里,受到了大量学者的深入研究,得到了一系列优秀成果.本文研究的非负矩阵特征值反问题正是在此期间提出来的,它作为矩阵特征值反问题的一个重要分支,尤其是在概率统计、随机分布、系统分析方面有着重要应用.所谓非负矩阵特征值反问题就是根据已给定的特征值信息来确定一个非负矩阵,使得该非负矩阵满足所给的条件.例如在概率统计中提出一类随机矩阵(矩阵的元素行和为1),这类矩阵在Markov链中有着重要应用,假如对矩阵的特征值有某些特殊要求,能否构造和如何构造出此类矩阵?非负矩阵特征值反问题从提出到现在的几十年间,虽然受到了大量学者的研究,但由于其复杂性,目前仍存在大量的疑难问题尚未解决,这也是它吸引众多学者研究的魅力所在.因而从以上可以看出对非负矩阵特征值的研究无论是对数学本身的发展还是对其它科学的发展都有着重要的意义及广阔的前景.1.2 非负矩阵特征值反问题的研究现状非负矩阵特征值反问题的提出始于上个世纪50年代,它是由矩阵特征值反问题抽离出来的一个子问题.1937年,Kolmogorov [1]首先提出了给定一个复数z 何时为某个非负矩阵特征值的问题.1949年,Suleimanova [2]扩展了Kolmogorov 提出的问题,称为非负矩阵特征值反问题(简称NIEP),即寻找以一组复数12{,,,}n σλλλ= 为特征值的n 阶非负矩阵A ,并且假若能够找到这样一个矩阵A ,就说矩阵A 实现了σ.Kolmogorov 问题显然很容易回答,Minc [3]给出了解答,即对于33⨯阶正循环矩阵,总可以找到一个这样的矩阵使得给定的复数z 作为它的特征值.然而NIEP 从提出至今仍未得到很好地解决,为此一些学者首先从NIEP 的必要条件开始研究.Loewy 和Londow [4]、Johnson [5]给出文献[6]中NIEP 的四个必要条件,其中最后一个条件称为JLL 条件.1998年,Laffey 和Meehan [7]又对奇数阶非负矩阵进行了讨论,给出了奇数阶非负矩阵迹为零的JLL 条件.由于一般的n 阶NIEP 无法直接解答,一批学者考虑了低阶矩阵的情形.1978年,Loewy 和Londow [4]完全解决3n =时的NIEP,给出了四个充分必要条件.45n =、时的NIEP,目前只解决了迹为零的情形.1996年,Reams [8]解决了4n =时迹为零的情形,即:令1234{,,,}σλλλλ=为一组复数,假若120,0,S S =≥30S ≥和2244S S ≤(这里的41k k i i S λ==∑),则必存在一个4阶非负矩阵能够实现σ.1999年,Laffey 和Meehan [9]解决了5n =时迹为零的情形.上面介绍了6n <的情形,然而当6n ≥时,NIEP 却是一个极大地挑战,到目前为止,未见任何形式的解答.虽然NIEP 未曾从正面给出很好的解答,但却吸引大批学者对12{,,}n σλλλ= ,的特殊形式作出深入探讨,这其中包括H.Suleimanova、H.Perfect、R.Kellogg、Salzman、Guo Wuwen 等等.Suleimanova [2]证明0(2,3,,)i i n λ≤= 的σ可被实现的充分必要条件是10ni i λ=≥∑.Kellogg [10]对σ的序列进行分块研究,给出了某些符合要求的分块可被实现.Guo Wuwen [11-12]对已可实现的σ修正做了研究,其中修正后可被实现与σ中最大的数有着密切关系,文献[12]定理3.1的结论尤为重要,它在研究扩展σ可被实现中被广泛引用.另外值得一提的是Ricardo.Soto、Alberto.Borobia、Julio.Moro 三位近十年来在非负矩阵特征值反问题上做了大量深入的研究,文献[13-18]集中反映了他们在这一块的研究成果.上面介绍了NIEP,如果把上面的非负矩阵换成非负对称矩阵,则称为非负对称矩阵特征值反问题(简称SNIEP);如果把上面的一组复数12{,,}n σλλλ= ,换成一组实数,则称为非负矩阵实特征值反问题(简称RNIEP).SNIEP和RNIEP都是NIEP的子问题,它们是研究NIEP的重要组成部分,虽然两者研究的都是实特征值,但它们并不完全等价.一般地,当5n≥时,这是两个完全不同的问题.目前,当4n≤时,SNIEP已被完全解决,当5n=时,R.Loewy和J.J.Mcdonald在文献[9]中做了详细的讨论.而当6n≥时,尚无人解决.文献[19-21]给出了SNIEP的相关结论.4n=时的RNIEP已被解决,事实上,Loewy和Londow在文献[4]中给出的NIEP四个必要条件也是4阶RNIEP的充分条件.当5n≥时,目前尚未有所突破.另外,文献[2,10,22,23,24,25]给出了RNIEP的相关结论.随机矩阵和双随机矩阵作为非负矩阵的两种特殊形式,在研究NIEP中有着极为重要的应用,这里把它们归为一类问题,即随机和双随机矩阵特征值反问题.Johnson[26]证明了如果一个非负矩阵A有正Perron根ρ,则存在一个随机矩阵与1Aρ同谱.1981年,Soules[27]给出了一种构造对称双随机矩阵的方法并得到构造对称双随机矩阵的充分条件.以上是NIEP的主要研究的方向,由于NIEP的复杂性和作者的水平限度,可能衍生出更多的小问题,本文没有一一涉及到,在此后面将不再叙述.此外,由于NIEP研究不够成熟,关于它的数值计算目前研究的不多.Robert.Orsi[28]利用交错射影的思想构造出一种迭代方法来计算非负矩阵特征值反问题,但需指出的是这种迭代并不一定会能得出好的结果,仍需要找到好的判定条件.O.Rojo等在文献[29-30]中通过快速Fourier变化巧妙地得到一种构造对称非负矩阵的方法,大大节省计算时间,这种方法通过在Matlab上实现,证明效率是非常高的.目前,国内尚无对此方面的研究的相关文献.从以上可以看出,虽然非负矩阵特征值反问题的研究得到了一定的成果,但仍有大量的问题需要解决,本文将从几类特殊矩阵来探讨此类问题,进一步促进此方向的研究.例如:能否给出非负(对称)三对角矩阵的特征值反问题的充要条件以及如何实现?如何实现非负循环矩阵的特征值反问题?等等.1.3 研究的主要内容本文研究几类特殊形式的非负矩阵特征值反问题,得到了相关问题的充分必要条件和一些充分条件,进而给出这几种特殊形式的非负矩阵特征值反问题算法,并通过数值算例来验证相关定理的正确性以及算法的准确性.主要工作如下:第一章是绪论部分,阐述了非负矩阵特征值反问题的重要意义和发展历程,介绍国内外研究现状.第二章,研究非负三对角矩阵特征值反问题.首先对三阶非负三对角矩阵特征值反问题,分几种情形进行讨论,解决了三阶非负三对角矩阵特征值反问题,得到了三阶非负三对角矩阵特征值反问题有解的充分必要条件.然后对n阶非负三对角矩阵特征值反问题,通过非负三对角矩阵截断矩阵特征多项式,并结合Jacobi矩阵特征值的关系,得到了非负三对角矩阵的特征值的相关性质,并最终解决了非负三对角矩阵特征值反问题.第三章,研究非负五对角矩阵特征值反问题.三阶非负五对角矩阵,即是三阶非负矩阵,文中给出了其特征值反问题有解的充分必要条件,而对于n阶非负五对角矩阵特征值反问题,由于其复杂性,文中仅给出了它的一些充分条件.第四章,研究非负循环矩阵特征值反问题.首先总结了NIEP近些年来取得的研究成果,提出实循环矩阵特征值反问题,并成功解决了实循环矩阵特征值反问题,得到其充分必要条件.最后在实循环矩阵特征值反问题的基础上提出非负循环矩阵特征值反问题,得到了充分条件和相关推论.第五章,根据第二、三、四章的结论给出相关算法和实例.第六章,在总结全文的同时,提出了需要进一步研究的问题.南昌航空大学硕士学位论文 第二章 非负矩阵特征值反问题第二章 非负三对角矩阵特征值反问题2.1 引言在控制论、振动理论、结构设计中经常要求根据已给的特征值/或特征向量来构造矩阵,即是特征值反问题(或特征值逆问题).三对角矩阵作为一类特殊矩阵,在实际问题中常出现,是研究矩阵理论的一个重要方面,因而有必要对其特征值反问题进行研究.文章的引言部分已给出了非负矩阵特征值反问题的研究现状,可以看出对于非负三对角矩阵的特征值反问题一直缺乏研究,本章将对这一问题进行研究.首先给出如下定义.定义 2.1.1 设n 阶实三对角矩阵形式如下:11112211100n n n n n n x y z x y T z x y z x ----⎡⎤⎢⎥⎢⎥⎢⎥=⎢⎥⎢⎥⎢⎥⎣⎦. (1)若0(1,2,,)i i y z i n =>= ,则称n T 为Jacobi 矩阵;(2)若0,0,0i i i x y z ≥≥≥,则称n T 为非负三对角矩阵;(3)若0,0i i i x y z ≥=≥,则称n T 为非负对称三对角矩阵;若0,0i i i x y z ≥=>,则称n T 为非负Jacobi 矩阵.非负三对角矩阵特征值反问题:给定一组复数12{,,,}n σλλλ= ,寻找非负三对角矩阵A 以σ为特征值,并且假设能够找到这样一个矩阵,就说矩阵A 实现了σ.下面再给出两个引理.引理 2.1.1[31](广义Perron 定理) 设A 是一个n n ⨯阶非负矩阵.定义Perron 根如下:()max{:()}A A ρλλσ=∈.则()A ρ为A 的特征值,并且其相应的特征向量0x ≥(即向量x 的每个元素均大于等于零).引理 2.1.2[4] 设123{,,}σλλλ=是一个由复数构成的序列,并且假设σ满足如下条件: (i)13max{:}i i i λλσσ≤≤∈∈; (ii)σσ=;(iii)11230s λλλ=++≥; (iv)2123s s ≤.则σ能被一个非负矩阵A 实现.2.2 三阶非负三对角矩阵特征值反问题设12{,,,}n σλλλ= 是一个由n 个复数构成的序列,文献[6]给出由Loewy 和Londow [4]、Johnson [5]得到的NIEP 四个必要条件,显然这四个条件对非负三对角矩阵特征值反问题也适用,即(i)Perron 根max{:}i i ρλλσσ=∈∈; (ii)σσ=;(iii)定义1(1,2,)nk k i i s k λ===∑ ,则有0k s ≥;(iv)(JLL 条件)1(,1,2,)m m kk m s n s k m -≤= .二阶非负矩阵特征值反问题有如下结论.引理2.2.1 给定两个数12,λλ,则12{,}σλλ=可以被非负矩阵实现的充分必要条件是12,λλ均为实数(不妨设12λλ≥)并且12λλ≥.证明 首先可以证明这两个数是实数.实矩阵的特征值如果是复数(虚部不为零),则会以共轭对的形式出现,不妨将12,λλ设为,(x yi x yi i +-=.假设σ可以被实现,则存在一个非负矩阵A 以12,λλ为特征值.令非负矩阵a c A d b ⎡⎤=⎢⎥⎣⎦(,,,a b c d 均大于等于零),则2()acI A a b ab cd d bλλλλλ---==-++---. (2-1) 由式(2-1)知,有1220a b x λλ+=+=≥, (2-2) 2212ab cd x y λλ=-=+. (2-3)由式(2-2)中20a b x +=≥,根据均值不等式的关系知ab 的最大值为2x .而由式(2-3)有222ab x y cd x =++≥,显然当12,λλ是复数时,20,y ab x ≠>,矛盾.故12,λλ不可能是复数.充分性.当12λλ≥时,可以分为两种情形讨论即20λ≥和20λ<.而120λλ==时,显然可以被零矩阵实现.当20λ≥时,σ可以被1200λλ⎡⎤⎢⎥⎣⎦实现.当20λ≤时,可以取定,a b 均大于等于零使得式(2-2)成立,这时120ab cd λλ=-≤,显然可以取无数个均大于等于零,c d 使得式(2-3)成立.这样就存在一个矩阵a c d b ⎡⎤⎢⎥⎣⎦实现σ. 必要性.由于12λλ≥,故只需证12λλ<时,σ不能被现实即可.当12λλ<时,由式(2-2)有120a b λλ+=+<,而,a b 均大于等于零,矛盾.证毕.引理 2.2.2 给定三个实数123,,λλλ,如果123(2,3),i i λλλλ≥=≥和1230λλλ++≥,则123{,,}σλλλ=可被非负矩阵A 实现.证明 分三种情形讨论.当0(1,2,3)i i λ≥=时,令123000000A λλλ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦,则A 可实现σ.当1230λλλ≥≥≥时,令13131313202202200A λλλλλλλλλ+-⎡⎤⎢⎥⎢⎥-+⎢⎥=⎢⎥⎢⎥⎢⎥⎢⎥⎣⎦,则A 可实现σ. 当1230λλλ≥≥≥时,令1231231231230A λλλλλλλλλλλλ⎡+++-⎢=+-++⎢⎢⎢⎥⎣⎦,则A 可实现σ.证毕.定理 2.2.3 给定一组实数12{,,,}n σλλλ= 12()n λλλ≥≥≥ ,1n 表示其中0(1,2,,)i i n λ>= 的个数,2n 表示0(1,2,,)i i n λ<= 的个数.如果12n n ≥且120(1,2,,)i n i i n λλ+-+≥= ,则12{,,,}n σλλλ= 可以被非负三对角矩阵实现.证明 由引理2.2.1知120(1,2,,)i n i i n λλ+-+≥= 时,1{,}(1,2,,i n i i σλλ+-==2)n 可以被一个二阶非负矩阵2(1,2,,)i A i n = 实现,而2210(1,2,,i i n n n λ≥=++),则22112{,,,}n n n σλλλ++= 可以被非负三对角矩阵22112{,,,}n n n diag λλλ++ 实现.因而12{,,}n σλλλ= ,可以被非负三对角矩阵22211212{,,,,,,,,n n n n diag A A A λλλ++ 12}n n n --0实现,其中12n n n --0表示12n n n --阶零矩阵.证毕.推论 2.2.4 给定一组实数12{,,,}n σλλλ= 12()n λλλ≥≥≥ ,1n 表示其中0(1,2,,)i i n λ>= 的个数,2n 表示0(1,2,,)i i n λ<= 的个数,11{1,2,,}n Γ= 对应的正特征值112222,,,,{1,2,,}n n n n n n λλλΓ=-+-+ 对应的负特征值2212,,,n n n n n λλλ-+-+ .如果12n n ≥,对于2Γ中的每个数j 都能在1Γ中找到一个数i 使得1220(1,2,,,1,2,,)i j i n j n n n n n λλ+≥==-+-+ 且每个i 对应一个j ,则12{,,,}n σλλλ= 可以被非负三对角矩阵实现.推论 2.2.5 给定一组实数12{,,,}n σλλλ= 12()n λλλ≥≥≥ ,1n 表示其中0(1,2,,)i i n λ>= 的个数,2n 表示0(1,2,,)i i n λ<= 的个数,11{1,2,,}n Γ= 对应的正特征值112,,,n λλλ ,222{1,2,,}n n n n n Γ=-+-+ 对应的负特征值2212,,,n n n n n λλλ-+-+ .如果12n n ≥,对于2Γ中的每个数j 都能在1Γ中找到一个数i 使得1220(1,2,,,1,2,,)i j i n j n n n n n λλ+≥==-+-+ 且每个i 对应一个j ,则12{,,,}n σλλλ= 可以被非负对称三对角矩阵实现.定理2.2.6 给定一组实数123123{,,}()σλλλλλλ=≥≥,如果3(1,2)i i λλ≥=和1230λλλ++>,假若123{,,}σλλλ=能被非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦实现,则A 中的13,a a 均不能为零.证明 设非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦能够实现123{,,}σλλλ=,即123,,λλλ是矩阵A 的三个特征值.首先给出矩阵A 的特征多项式.11122231232211133212312132312322111332123121323112212312231100()()()()()()()()()()()a b I A c a b c a a a a b c a b c a a a a a a a a a a a a a b c a b c a a a a a a a a a a b c b c a a a a b c a b c λλλλλλλλλλλλλλλλλ---=-----=-------=-+++++-----=-+++++---++.由根与系数的关系知,有下列成立,123123a a a λλλ++=++, (2-4) 1213231213231122a a a a a a b c b c λλλλλλ++=++--, (2-5)123123122311a a a a b c a b c λλλ=--. (2-6)令123112132321233111,,,d d d b c t λλλλλλλλλλλλ++=++===和222b c t =,显然由3(1,2)i i λλ≥=和1230λλλ++>知10,0(2,3),0(1,2)i i d d i t i ><=≥=,则式(2-4)、式(2-5)和式(2-6)可改写成如下:1123d a a a =++, (2-7) 212132312d a a a a a a t t =++--, (2-8) 31231231d a a a a t a t =--. (2-9) 下面用反证法证明13,a a 均不能为零.显然13,a a 不能同时为零,否则式(2-9)不成立.由于式(2-7)、式(2-8)和式(2-9)中的13,a a 是一个对称的关系,故不妨假设10a =.当130,0a a =≠时,有12322312331d a a d a a t t d a t=+⎧⎪=--⎨⎪=-⎩.(2-10) 由式(2-10)可得133223233//t d a t a a d d a =-⎧⎨=-+⎩. (2-11) 再来分析22313,,,,a a d d λ之间的关系.3(1,2)i i λλ≥=和1230λλλ++>,由123λλλ≥≥可知20λ>且1232λλλλ++≤.由式(2-10)有23120,a a d λ≤≤<和332d λ>.将213a d a =-带入式(2-11),有2133233()/t d a a d d a =--+. (2-12)将式(2-12)可以看做成2t 关于3a 的函数,对2t 关于3a 进行求导,可得'2213332/t d a d a =--. (2-13)显然'2t 在31(0,]a d ∈上有'20t >,而2t 又是关于31(0,]a d ∈上的连续函数,故2t在31a d =时取得最大值,这时3221d t d d =-+. (2-14) 将12311213232,d d λλλλλλλλλ++=++=和1233d λλλ=带入式(2-14)中,可得32211231213231231213231231231232221232133121231232212123123121232123312()()()()()()2()()()[(d t d d λλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλ=-+=-+++++-+++++=++-+-+-+-=++-+-+-+=++-+++=1212312332312123132312123)]()[()()]()()()().λλλλλλλλλλλλλλλλλλλλλλλλλ+++-++++=++-+++=++ (2-15)由式(2-15)可知,当130,0a a =≠时,20t <.因为123{,,}σλλλ=能被非负三对角矩阵111222300a b c a b c a ⎡⎤⎢⎥⎢⎥⎢⎥⎣⎦实现时,总有11122200t b c t b c =≥⎧⎨=≥⎩,矛盾.因而13,a a 均不能为零.证毕.定理 2.2.7 给定一组全不为零的实数123{,,}σλλλ=123()λλλ≥≥,如果3(1,2)i i λλ≥=和1230λλλ++=,则123{,,}σλλλ=不能被非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦实现.证明 设非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦能够实现123{,,}σλλλ=,即123,,λλλ是矩阵A 的三个特征值.同定理2.2.7的证明类似,可以给出矩阵A 的特征多项式,并由根与系数的关系可以得到式(2-4)、式(2-5)和式(2-6).对于式(2-4),当1230λλλ++=时,由于0,1,2,3i a i ≥=,可知123a a a ==0=.而123,,λλλ全不为零和3(1,2)i i λλ≥=可知0(1,2)i i λ>=和30λ<.对于式(2-6)左边=1230λλλ<,右边=1231223110a a a a b c a b c --=,左右不相等,矛盾.故123{,,}σλλλ=不能被非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦实现.证毕. 定理2.2.8 给定一组实数123123{,,}()σλλλλλλ=≥≥,如果3(1,2)i i λλ≥=和1230λλλ++>,则123{,,}σλλλ=不能被非负三对角矩阵111222000a b A c b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦实现. 证明 设非负三对角矩阵矩阵111222000a b A c b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦能够实现123{,,}σλλλ=,即123,,λλλ是矩阵A 的三个特征值.首先给出矩阵A 的特征多项式.11122212221112321212221112321212112212221100()()()()()()()()().a b I A c b c a a a b c a b c a a a a a b c a b c a a a a a b c b c a b c a b c λλλλλλλλλλλλλλλλλ---=----=------=-++----=-++--++由根与系数的关系知,有下列成立:12312a a λλλ++=+, (2-16) 121323121122a a b c b c λλλλλλ++=--, (2-17) 123122211a b c a b c λλλ=--. (2-18)令123112132321233111,,,d d d b c t λλλλλλλλλλλλ++=++===和222b c t =,显然由3(1,2)i i λλ≥=和1230λλλ++>知10,0(2,3),0(1,2)i i d d i t i ><=≥=,则式(2-16) 、式(2-17)和式(2-18)可改写成如下:112d a a =+, (2-19) 21212d a a t t =--, (2-20) 31221d a t a t =--. (2-21)先讨论12a a =的情形.当12a a =时,由式(2-19)可知1122da a ==,则式(2-20)和式(2-21)可化为:21212/4t t d d +=-, (2-22)1123()2dt t d +=-. (2-23)这里式(2-22)和式(2-23)两式中的12t t +必须相等,因而有231212/4d d d d -=-. (2-24) 将123112132321233,,d d d λλλλλλλλλλλλ++=++==代入式(2-24)中可得到只关于123,,λλλ的方程,即2123123121323123312312132312312333322222123123123123233222221231231232332()/4(),()4()()80,3()3()143()4(()(3)λλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλ-++-++=++++-+++++=+++++++++-++++++22333222212312312323322222211232322331232222112323123)0,()()()0,(())()()0,(())()(())0.λλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλ=++-+---+=--++-+--=---+--=上式最终可化为22123123()(())0λλλλλλ----=. (2-25)由式(2-25)知要使得式(2-22)和式(2-23)两式中的12t t +相等,就必须满足1230λλλ--=或22123()0λλλ--=,故可得123λλλ=+或123()λλλ=±-.已知1233,(1,2)i i λλλλλ≥≥≥=和1230λλλ++>,显然无论是123λλλ=+还是123()λλλ=±-均不满足已知条件,因而12a a ≠.下面讨论12a a ≠的情形.结合式(2-19)、式(2-20)和式(2-21)联解,可得 2111312111()2a d a d a d t a d -+-=-, (2-26)21113122111211()()2a d a d a d t a d a d a d -+-=----. (2-27)对于式(2-26)和式(2-27)可以看成12,t t 关于1111(0,)(,)22d da d ∈ 的函数,下面把式(2-26)和式(2-27)分在两个区间上讨论.(i)当11(0,2da ∈时,先讨论1t ,令21111312()H a d a d a d =-+-,实际上1H 就是1t 的分子部分.因为20d <,所以有21211113()2d dH a d a d <-+-.由式(2-14)3210d d d -+<知312d d d <,这样2121111()2d d H a d a <-+.令2122111()2d dH a d a =-+,则在11(0,]2d a ∈上有12H H <.对2H 关于1a 求导,可得'2211111123(23)H a d a a d a =-=-,显然在112(0,)3d a ∈上有'20H >,故在11(0,2d a ∈上有'20H >,因而2H 在11(0,2d a ∈上单调递增,又因2H 在112d a =处有定义,则当112da =时,2H 取得最大值,且22111212112(((4)2228d d d d dH d d d =-+=+. (2-28)由1233,(1,2)i i λλλλλ≥≥≥=和1230λλλ++>可知1232λλλλ++≤,则12d λ≤.将1231d λλλ++=和1213232d λλλλλλ++= 代入式(2-28)中,可得21212222121323222312312132321223121323(4)8[4()]8[()()3()]8[()()3()]80.d H d d λλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλλ=+≤+++=++++++=+++++< 这样,在11(0,)2d a ∈上就有20H <,故1H 在11(0,2da ∈上同样也有10H <.因为在11(0,)2d a ∈上1120a d -<,则有10t >. 下面再来讨论2t .将2t 通分化简,得3221111121232112()2a a d a d d d d d t a d -+-++-=-. (2-29)令32231111121232()H a a d a d d d d d =-+-++-,对3H 关于1a 进行求导,得到'2231111234H a a d d d =-+--.显然在11(0,2d a ∈上,'3H 单调递增,且当10a =时,'3H 取得最小值212d d --.将1231d λλλ++=和1213232d λλλλλλ++=代入212d d --中,前面已说明12d λ≤,因而可得到2212123121323221213232231231223()()()()()()()0.d d λλλλλλλλλλλλλλλλλλλλλλλλλλ--=-++-++>--++=-+-+=-++>由上面可以看到'3H 在11(0,)2d a ∈上有'30H >.因此3H 在11(0,)2da ∈上单调递增,又3H 在10a =处有定义,3123(0)0H d d d =->,故3H 在11(0,2da ∈上有30H >,则2t 在区间11(0,2da ∈有20t <.(ii)当111(,)2da d ∈时,同样先分析1t ,直接对1H 求导可得,'21111223H a d a d =--2111122()()a d a a d =--+. (2-30) 对于式(2-30)中右边第二项有221222221213231223()()()()()0.a d d λλλλλλλλλλλλ-+>-+=-+++=-++>因而在111(,)2d a d ∈上有'10H >,故1H 在此区间上单调递增,又1H 在11a d =处有定义,则在11a d =处取得最大值,即11312()0H d d d d =-<.因此,在区间111(,)2d a d ∈上,有10H <,又1120a d ->,则10t <. 下面再来分析2t .对3H 求导,可得 '2231111234()H a a d d d =-+-+211111123()()a d a a d d d =-+-+. (2-31) 对于式(2-31)中右边第三项有221222()()0d d d λ-+>-+>.因而在111(,)2d a d ∈上有'30H >,故3H 在此区间上单调递增,又3H 在112d a =处取得最小值,即322111*********1121232112123()(2(()2222()24()20.d d d dH d d d d d d d d d d d d da d d d d =-+-++-=-++->-+++-> 因此,在区间111(,)2d a d ∈上,有30H >,又1120a d ->,则20t >. 通过对式(2-26)和式(2-27)在1111(0,)(,)22d da d ∈ 上的分析,可以得出当11(0,)2d a ∈时,120,0t t ><;当111(,)2da d ∈时,120,0t t <>.因而当12a a ≠时,12,t t 无法满足同时大于等于零.这样,以上的推导就证明了不存在非负三对角矩阵111222000a b A c b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦能够实现123{,,}σλλλ=.证毕.定理2.2.9 给定一组实数123123{,,}()σλλλλλλ=≥≥,如果3(1,2)i i λλ≥=和1230λλλ++>,则123{,,}σλλλ=不能被非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦实现,其中123,,a a a 全不为零.证明 设非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦能够实现123{,,}σλλλ=,其中123,,a a a 均不为零,即123,,λλλ是矩阵A 的三个特征值.首先给出矩阵A 的特征多项式.11122231232211133212312132312322111332123121323112212312231100()()()()()()()()()()()a b I A c a b c a a a a b c a b c a a a a a a a a a a a a a b c a b c a a a a a a a a a a b c b c a a a a b c a b c λλλλλλλλλλλλλλλλλ---=-----=-------=-+++++-----=-+++++---++.由根与系数的关系知,有式(2-4)、式(2-5)和式(2-6)成立.令123112132321233111,,,d d d b c t λλλλλλλλλλλλ++=++===和222b c t =,由3(1,2)i i λλ≥=和1230λλλ++>知10,0(2,3),0(1,2)i i d d i t i ><=≥=,则式(2-4)、式(2-5)和式(2-6)可改写成式(2-7)、式(2-8)和式(2-9).下面分两种情形讨论:13a a =和13a a ≠.(i)当13a a =时.由式(2-8)和(2-9)式分别得到212121323212122t t a a a a a a d a a a d +=++-=+-, (2-32) 2123121a a d t t a -+=. (2-33)显然式(2-32)和式(2-33)都有120t t +>,下面证明两式不可能相等.令23221231111231121211()2a a d a a d a d d f a a a a d a a --+-=+--=-.对于上式中的分子部分,令3241111123()H a a a d a d d =-+-.对4H 求导可得'24111232H a a d d =-+.令'40H =得1a =4H 的两个极值点分别在(,0)-∞和12(,)3d+∞上,因而4H 在区间1(0,2d 单调.因为1212a a d +=,则显然112d a <.当10a =时,43(0)0H d =->.当112d a =时,333111211243(0)()024282d d d d d d d H d =-+->-->.因此在区间1(0,)2d 内无法找到1a 满足41()0H a =,即找不到1a 使得1()0f a =,则式(2-32)和式(2-33)不相等.故当13a a =时,无法找到非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦满足条件. (ii)当13a a ≠时.由式(2-7)、式(2-8)和式(2-9)联解,得 21233231121323231()a a a d d a t a a a a a a d a a ++-=++---, (2-34) 2123323231()a a a d d a t a a ++-=-. (2-35) 将式(2-34)通分可得:2121323*********131()()()a a a a a a d a a a a a d d a t a a ++---+-+=-212332131()a a a d d a a a -+-+=-. (2-36) ①当13a a <时.由式(2-36)可知21233211312121131321121123131()2(2)422.a a a d d a t a a d d a d a a d d d d d d a a a a -+-+=--->----+>=--因为221221213231223222()()()0d d d λλλλλλλλλλλ+<+++=+++<,则10t >. 由式(2-35)有212332323121113231321121123131()42(2)4240.a a a d d a t a a d d d d d a a d d d d d d a a a a ++-=-+-<-++<=<--②当13a a >时.对于式(2-36)分子部分有2221121123321112()(2)0424d d d da a a d d a d d d -+-+>--=-+>.因而10t <.对于式(2-33)分子部分有322112112332312()(2)0424d d d d a a a d d a d d ++-<+=+<.因而20t >.由以上的分析可以得出无论123,,a a a 如何取值均不能满足12,t t 均大于等于零.这样,就证明了找不到一个非负三对角矩阵能够实现123{,,}σλλλ=.证毕.由定理2.2.6、定理2.2.7、定理2.2.8和定理2.2.9可以得出下面的结论. 推论2.2.10 给定一组实数123123{,,}()σλλλλλλ=≥≥,如果3(1,2)i i λλ≥=和1230λλλ++≥,则123{,,}σλλλ=不能被非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦实现. 推论2.2.11 推论2.2.3、定理2.2.6、定理2.2.7、定理2.2.8、定理2.2.9和推论2.2.10的结论中非负三对角矩阵均可改为非负对称三对角矩阵,结论依然成立.注:推论2.2.10和2.2.11实际上也是对广义Perron 定理[31]一种验证. 定理 2.2.12 给定三个实数123,,λλλ,如果132(2,3),0i i λλλλ≥=≤<,和1230λλλ++≥,则123{,,}σλλλ=不能被非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦实现. 证明 设非负三对角矩阵111222300a b A c a b c a ⎡⎤⎢⎥=⎢⎥⎢⎥⎣⎦能够实现123{,,}σλλλ=,即123,,λλλ是矩阵A 的三个特征值.首先给出矩阵A 的特征多项式.11122231232211133212312132312322111332123121323112212312231100()()()()()()()()()()()a b I A c a b c a a a a b c a b c a a a a a a a a a a a a a b c a b c a a a a a a a a a a b c b c a a a a b c a b c λλλλλλλλλλλλλλλλλ---=-----=-------=-+++++-----=-+++++---++.由根与系数的关系知,有式(2-4)、式(2-5)和式(2-6)成立.。
生物序列比较的几种数学方法及其应用
生物序列比较的几种数学方法及其应用李玲;南旭莹;姚玉华【摘要】在生物信息学中,传统的序列比对算法在理论基础和计算上都具有局限性,因此近二十年人们提出和发展了很多序列比较的数学方法.本文综述了较有代表性的几种:图形表示及其矩阵不变量方法;研究生物大分子二级结构的拓扑图论方法;基于字出现频率的统计学方法;Kolmogorov及Lempel-Ziv复杂度方法.%In the bioinformatics, sequence alignment algorithms still have the limitations of theoretic foundation , and the computational load escalates as a power function of the length of the sequences. Therefore, in recent twenty years, many mathematic categories of sequence comparison are outlined and developed. Four main mathematic categories of sequence comparison are reviewed: graphical representation and matrix invariant methods ; topologic methods that applies to biological macromolecular secondary structure; statistics methods based on the word frequency and its distribution; kolmogorov complexity and L - Z Complexity methods.【期刊名称】《渤海大学学报(自然科学版)》【年(卷),期】2013(034)001【总页数】8页(P1-7,70)【关键词】DNA;RNA二级结构;图形表示;L-Z复杂度;序列比较【作者】李玲;南旭莹;姚玉华【作者单位】浙江树人大学基础部,浙江杭州310015;浙江理工大学生命科学学院,浙江杭州310018;浙江理工大学生命科学学院,浙江杭州310018【正文语种】中文【中图分类】Q710 引言二十世纪九十年代以来,伴随着各种基因组测序计划的展开和分子结构测定技术的突破,以及各种生物的基因和蛋白序列的研究,海量的生物序列数据如雨后春笋般迅速出现.如何处理、存储和分析这些数据?这已不是生物学家本身可以解决的问题,需要其他学科的介入,由此催生了一门新的学科——生物信息学,它是多种学科交叉、渗透的产物,涉及生物学、数学、统计学、物理学、化学、信息以及计算机科学等诸多学科的知识.生物信息学不仅具有重大的科学意义,而且具有巨大的经济效益.它既属于基础研究,以探索生物学自然学自然规律为己任;又属于应用研究,它的许多研究成果可以较快或立即产业化,成为价值很高的产品.生物信息学的这一特点在现有的许多学科中几乎是独一无二的.普遍认为,生物信息学和计算分子生物学是当前生命科学和自然科学领域中最关键的、最重要的部分,是21 世纪自然科学的核心领域之一〔1-3〕.在生物信息学中,传统的序列比对(Alignment)方法是通过将两个或多个核酸序列或蛋白质序列进行比对,通常用打分矩阵描述序列两两比对,序列比对问题变成在矩阵里寻找最佳对比路径.序列比对算法是一个动态规划算法,最早是Needleman-Wunsch 动态规划算法,在此基础上又改良产生了Smith-Waterman 算法和SIM 算法,其后又有许多算法被提出.所有这些比较相似性算法都是建立在字符串的比对上,它们的共同特点是给出插入、删除、替换的距离函数,通过计算结构之间的距离来比较相似性,最后用一个回溯技术来确定最优比对.在进行序列两两比对时,有两方面问题直接影响相似性分值:替代矩阵和空位罚分.粗糙的比对方法仅仅用相同或不同来描述两个残基的关系,显然这种方法无法描述残基取代对结构和功能的不同影响效果.空位罚分是为了补偿插入和缺失对序列相似性的影响,一般的处理方法是用两个罚分值,一个对插入的第一个空位罚分,另一个对空位的延伸罚分.对于具体的比对问题,采用不同的罚分方法会取得不同的效果.但是这些方法缺乏合适的理论模型而更多带有主观色彩,存在选取罚分函数的随意性,而选取罚分函数的好坏直接影响相似性分值;比对算法的时间和空间复杂度一直没有达到令人满意的效果,特别是多重序列比对,目前尚缺乏快速而又十分有效的算法.另外这些方法都忽略了组成基的化学性质和化学结构,而且难以在统计学上估计其置信区间〔4〕.鉴于上述序列比对方法的局限性,依据分子生物学的知识,各种数学工具已被广泛地应用于生物序列的数学建模和分析,并取得了一系列重要结果,有力地推动了生命科学的发展.数学的思想与方法已在物理学中得到广泛应用并获得成功,可以相信,二十一世纪其在分子生物学中的应用将会对整个生物学科产生极其深远的影响〔5〕.本文综述了几种应用于生物序列比较的数学方法.1 生物大分子的图形表示及其数值刻画的矩阵不变量方法生物大分子的图表示是生物学数据可视化的一条重要途径,是定性地分析生物学数据的一种强有力的工具.序列的数值刻画是对海量生物学数据进行定量分析的一种常见方法,它在本质上就是构造生物序列的特征/模式向量.DNA 序列的传统表示是由A,C,G,T 四个字母来表示的,这种字母形式具有最高的解像力,即序列的每个细节都排列得清清楚楚.但是它的解像力却不可降低,因而人们通过观察一段较长的DNA 序列时,往往不能留下总体印象.1983 年Hamori 和Ruskin 提出了DNA 序列图形表示的思想:将DNA 序列表示为一条平面或空间中的曲线〔6〕.国内外不少学者提出了众多的图形表示〔7-23〕,M.Randi c'等人基于他们的图形表示,将DNA 序列转化为矩阵等数学表示,进一步用矩阵不变量来研究DNA 序列,取得了很好的结果,图1 画出了这种应用于生物序列比较分析的图形表示方法的步骤.下面介绍几种比较有影响的图形表示方法.图1 应用于生物序列比较分析的图形方法示意图Hamori 于1983 年首先提出表示DNA 序列的图形方法——G-曲线和H-曲线.G-曲线是一种5 维空间表示,其中4 个坐标方向分别为四种核苷酸,另一个方向说明DNA 序列核苷酸的位置特征,当然这一方法不能实现可视化.当用两个坐标轴的四个方向表示四种基,另一个方向表示DNA 序列核苷酸的位置时,曲线就变成3 维空间的曲线,这样的曲线被称为H-曲线.但这种方法要实现最佳可视化效果就需要2 维投影.Hamori 和Ruskin 用H- 曲线发现,bacteriophage M13、human immunodeficiency virus(HIV)以及Epstein Barr virus (EBV)等几种病毒基含量有剧烈变化的区域〔6〕.之后,各种图形表示被提出:1)1986 年Gates 构造出最早的二维表示〔7〕,规定+x 轴单位方向为C,-x 轴方向为G,+y 轴方向为T,-y 轴方向为A;2)1994 年Nandy〔8〕将x 轴的正方向赋予G,负方向赋予A,y 轴的正负方向分别赋予C 和T;3)1995 年Leong 和Morgenthaler 提出另一种表示〔9〕:+x 轴单位方向为C,-x 轴方向为A,+y轴方向为G,-y 轴方向为T.此三种表示是由于四个核苷酸基的任意选择性产生的,可以根据后来的关于化学上核苷酸基的分类来解释,即:(1)按照弱、强氢键分类:W={A,T},S={C,G};(2)按照酮基、氨基分类:M={A,C},K={G,T};(3)按照嘌呤、嘧啶分类:R={A,G},Y={C,T}.这三种图形表示的曲线上点的坐标分别为:图2 以Nandy 的方法为例,画出了图形的坐标系统设定以及一条单链DNA 序列ATGGTGCACCTGACT 的平面图形.图2 (a)Nandy 的坐标轴系统;(b)DNA 序列ATGGTGCACCTGACT 的二维图形表示我国著名理论物理专家张春霆院士也提出了一种DNA 几何图形表示——Z 曲线,Z 曲线是表示DNA序列的一个等价的三维空间曲线.通过对Z 曲线的研究来对基因组序列进行研究是一种几何学的途径.他们还利用Z 曲线研究了真核和原核基因组中若干重要问题,证明这样的思路是切实可行的.例如,在基因识别问题中,传统的方法是分别计算编码和非编码序列中大量的概率和条件概率,通过对这些概率的比较来区别它们.而他们则通过对编码与非编码序列的Z 曲线的比较来区别它们,方法既简单,效果又好.原则上说,基因组中的许多问题都可以通过这种途径加以解决,这种独树一帜别开生面的研究思路已经得到国内外学术界的普遍好评和认可,越来越多的同行(主要是国外同行)加入到对Z 曲线研究的行列中来.可以预期,用几何学方法研究基因组将会有一个广阔的发展空间〔24-26〕.M.Randi c'等人富有开创性的工作提出将图形表示转换成数学的矩阵表示,包括:(1)E 矩阵.其(i,j)元由曲线上两个基对应点的欧氏距离得到.(2)M/M 矩阵.其(i,j)元由曲线上两个基对应点的欧氏距离与|j-i|(非退化情形为它们之间存在的单位线段数)之比得到.(3)L/L 矩阵.其主对角线元为零,所有元素都小于或等于1.这种矩阵来源于DNA 原始序列的二维几何图形表示.假设一个DNA 原始序列的长度为n,即它有n 个基构成.我们构造一个n×n 阶对称矩阵如下,它的(i,j)项为Eij/Gij,这里Eij是几何图形中第i 个点和第j 个点之间的Euclidean 距离,Gij是曲线上第i 个点和第j 个点之间的线段长的和.利用这种矩阵的最大特征值可以给出DNA 原始序列的几何图形的折叠度的一种结构性解释〔10〕.利用DNA 序列的矩阵不变量,可以对DNA 原始序列进行相似性比较.其做法是:对所要比较的几个DNA 基本序列先进行处理,即先求出它们的矩阵和相应的矩阵不变量,如矩阵的特征值、行列式值、矩阵所有项的平均值、最大(小)行和、矩阵的迹等等.把某种不变量作为一个指标,对相应的序列进行相似性比较.如同一序列的图形表示中,必须由几个图形才能完全表示出所有的序列信息.每一种图形表示提取一个矩阵,对应一个矩阵不变量,得到的k 个矩阵不变量构成一个k 维向量. 设分别是两个序列a、b 所对应的k 维向量.通常地说,两个向量之间的相似性可以通过计算它们的端点之间的欧氏距离来度量,另外如果两个向量所指向的方向越相似则我们就认为这两个向量越相似.对应的,我们有两种方法可以计算:(1)d (U1,U2),即两个向量终点的欧氏距离,d 越小,就认为这两个序列越相似;(2)θ (U1,U2),即两个向量的相关角,如果两个向量所成相关角越小,就认为这两个序列越相似.上述图形表示方法目前已经在应用于很多计算生物学领域:(1)鉴定整体同源和保守模式;(2)内含子、外显子差异与识别;(3)进化分歧和分子系统发育;(4)长程关联和不规则片段分析;(5)重复序列分析等等.不过已有的几何图形表示都还有各自的缺陷,主要表现在以下两点:首先有退化现象;其次是对完整序列而言,使用的数学变量计算太复杂,有的甚至还没有算法解决.另外,目前这个领域的发展方向在于:一是拓展生物信息学的应用领域,二是由DNA 转向蛋白质的研究.〔27-31〕2 研究生物大分子二级结构的拓扑图论方法研究DNA 的二级和三级结构双螺旋及轴线的立体形状行为以及其生物功能是非常重要的问题.拓扑学与几何学特别是纽结理论是分析此问题的有力武器.DNA 中的碱基序列决定蛋白质的一级结构即氨基酸序列,在合成后蛋白质便自发折叠成一精确的三级结构然后才能执行催化调控化学输运流动和结构支持等功能.人们把DNA 序列决定氨基酸序列称为生命的第1 密码而把蛋白质氨基酸序列决定其自然结构称为第2 密码.破译第2 密码的意义十分重大,其中必将用到几何学与拓扑学. Shapiro 等人〔32,33〕基于树结构的拓扑不变量提出和发展了RNA 二级结构的结构比较,方法是将这些二级结构的子结构抽象为点和线组成的树状结构.RNA 二级结构的子结构包括:由连续的基对组成的螺旋区(stems),环(loops:端环、内环和凸包环),连接(junctions).其中螺旋区做成线,其它结构都看作点,在图3 显示的是一个tRNA (NDB:TRNA12)的二级结构及其树图表示.基于上述RNA 二级结构的树图表示,可以利用光谱图理论来分析其拓扑性质.他们提取树结构不变量——拉普拉斯矩阵的第二个小的特征值λ2表达了树图的连通性的度量,可以用来分析不同二级结构的相似性.首先对二级结构的树图顶点标号,然后提取邻接矩阵矩阵A,如果i 和j 两个点右边相连,则矩阵元素(i,j)为1,否则为0;对角矩阵D 的对角线元素表达的是这个顶点的连通度,即该顶点连结的边的个数,其它元素为0.拉帕拉斯矩阵L=D-A.NDB:TRNA12 二级结构树图的上述矩阵如下:图3 TRNA (NDB: TRNA12)二级结构及其树图表示拉普拉斯矩阵的特征值:λ1=0,λ2=1,λ3=1,λ4=1,λ5=5,其中第二个特征值表达了树图的连通性的度量.3 基于字出现频率的统计学方法记X=X1X2…Xn为n 个字符组成的序列,其中的字符取自长度为r 的特定的字母表H,即Xi∈H,i=1,…n.从序列X 中抽取一长度为L 的子串(L≤n),定义为L-元组或长度为L 的字.所有可能的L-元组构成集合WL={wL,1,wL,2,…wL,K},显然有K=rL.取一尺寸为L 的滑动窗口,从序列X 的位置1 滑动到n-L+1,得到n-L+1 个L-元组,点数WL中各wL,i的个数,组成向量.通过计算各字的相对含量还可得到字的出现率,其中,i=1,…K.对于生物序列,都是一条有限字符序列.对于DNA,核苷酸有四种,四个基构成的字母表H={A,C,G,T },而蛋白质则是20 个氨基酸构成的字母表,即r 分别为4 和20.例如:对于一个简单的DNA序列X=AATATAC,取L=3,则W3={AAT,ATA,TAT,TAC,…},长度为K=43=64.取三字符的窗口滑动n- L+ 1 次得到5 个字AAT、ATA、TAT、ATA、TAC,W3中各字在这5 个字中出现的次数矢量为=(1,2,1,1,0,…),因此字的出现率矢量=(0.2,0.4,0.2,0.2,0,…).这一类方法通过计算每一个序列的L-元组,将字符序列转换成向量,接着使用现成的线性代数和统计理论分析这些向量之间的相似性(或差异),从而实现序列的比较.设分别是两条序列X,Y 的K 个L-元组出现的次数向量,目前主要有以下几种距离方法来表示序列间差异〔34〕:另外,美国数学家C.E.Shannon 从不确定性(即随机性)和概率测度的角度定义了熵的概念.应用在生物序列比较及分析中,p 和q 是两个生物序列中字出现的频率向量,则它们之间的差异可用相关熵来表示,Kullback- Leibler 偏差是其一种表示方法〔34〕:4 Kolmogorov 复杂性及Lempel-Ziv 复杂度方法复杂性是所有序列所具有的根本属性之一.20 世纪60 年代,Kolmogorov 最先从算法意义上给复杂性做出了理论描述,定义一个(0,1)序列的复杂度为能够产生这一序列的最短程序的bit 数,他描述的复杂性被称为Kolmogorov 复杂性.Li 等人则最早将Kolmogorov 复杂性用于DNA 序列间的距离测度.因Kolmogorov 复杂性是不可计算的,1976 年,Lempel 和Ziv 将数据压缩率作为Kolmogorov 复杂性的一种近似,他们将序列复杂性应用到有限序列,并给出了数学讨论及算法描述,简称为L-Z 复杂性.序列的L-Z 复杂性反映了给定序列随其序列长度的增长出现新的序列模式的速率,是序列复杂性的一个可计算的有效度量〔35,36〕.Lempel 和Ziv 定义一个非空有限序列S 的L-Z 复杂度c(S)为S 对应的最小生成过程中生成单元的个数,最小生成过程本质上只有两个步骤是被允许的:添加一个新的字符,确保每个生成部分的序列子串都有唯一性或者从已合成的序列中拷贝最长的子串.例如,序列S=0001101001000101 可以按下面步骤生成〔27〕:(a)从空串出发开始添加0:→0;(b)最长复制+额外添加一个字符1:→0 · 001;(c)最长复制+额外添加一个字符0:→0 · 001 · 10;(d)最长复制+额外添加一个字符0:→0 · 001 · 10 · 100;(e)最长复制+额外添加一个字符0:→0 · 001 · 10 · 100 · 1000;(f)最长复制:→0 · 001 · 10 · 100 · 1000 · 101.从而序列S=0001101001000101 的L-Z 复杂度为c(S)=6.给定序列S 和Q,SQ 是将序列Q 放到序列S 后连接起来得到的序列,由L-Z 复杂度的定义,则在序列S(Q)含有Q (S)的子序列越长或者越多,那么或者c(SQ)-c(S)的值就越小.Otu 等人由此推定〔103〕:c (QS)-c(Q)的值越小,则序列S 和序列Q 越相似.在此假设下,他们给出了下面几个基于L-Z 复杂度的序列相似性的公式〔37〕:不难看出,其中公式(2)和(4)分别是公式(1)和(3)的正则化形式,它们对序列的长度做出了正则化.Out 等人将上述复杂度方法应用于34 个物种线粒体基因组的进化树构建,李斌等在此基础上提出了条件复杂性的概念,并将之应用于DNA 序列的相似性和物种的系统发育分析〔26〕,李春等人扩展到对称(Symmetric)、反向互补(Inverted complementary)和正向互补(direct Complementary)等操作,并将这些应用于11 个物种的β-球蛋白基因序列的相似性研究,可以看作为广义的L-Z 复杂度方法〔38〕.总的来说,序列比较是生物信息学中最基本、最重要的操作,通过序列比较可以发现生物序列中的功能、结构和进化的信息.在分子生物学中,DNA 或蛋白质的相似性是多方面的,可能是核酸或氨基酸序列的相似,可能是结构的相似,也可能是功能的相似.目前的研究多集中于序列本身的相似性分析,而对结构和功能的相似性研究较少.可以预计,上述几种数学方法的无比对序列比较作为可靠的预测方法,将被广泛用于序列同源性搜寻、多重比对和亲缘树的构建、蛋白质结构预测、基因组序列分析和基因发现等生物序列分析的主要方面.参考文献:【相关文献】〔1〕陈润生.生物信息学〔J〕.生物物理学报,1999,15 (11):5-12.〔2〕张新生,王梓坤.生命信息遗传中的若干数学问题〔J〕.科学通报,2000,45 (2):l13-119. 〔3〕余波.生物信息学的现状与前景〔J〕.生物学教学,2003,28 (10):1-3.〔4〕Vinga S,Almeida1 J.Alignment-free sequence comparison-a review〔J〕.J Bioinformatics,2003,19 (4):513-523.〔5〕杜世平.隐马尔可夫模型在生物信息学中的应用〔J〕.大学数学,2004,20 (5):24-29. 〔6〕Hamori E,Ruskin J.H curves,a novel method of representation of nucleotide series especially suited for long DNA sequences〔J〕.J Bio Chem,1983,258(2):1318-1327. 〔7〕Gates M A.A simple way to look at DNA〔J〕.J Theor Biol,1986,119(3):319-328. 〔8〕Nandy A.A new graphical representation and analysis of DNA sequence structure:I.Methodology and Application to Globin Genes〔J〕.Curr Sci,1994,66:309-314.〔9〕Leong P M,Morgenthaler S.Random walk and gap plots of DNA sequences 〔J〕.Comput Applic Biosci,1995,11(5):503-511.〔10〕RandiM,Vrako M,LerN,et al.Novel 2-D graphical representation of DNA sequences and their numerical characterization〔J〕.Chem Phys Lett,2003,368(1):1-6. 〔11〕Guo X F,RandiM,Basak S C.A novel 2-D graphical representation of DNA sequences of low degeneracy〔J〕.Chem Phys Lett,2000,350(1):106-112.〔12〕Yao Y H,Wang T M.A class of new 2-D graphical representation of DNA sequences and their application〔J〕.Chem Phys Lett,2004,398(4):318-323.〔13〕Yao Y H,Nan X Y,Wang T M.Analysis of similarity/dissimilarity of DNA sequences based on a 3-D graphical representation〔J〕.Chem Phys Lett,2004,388(1):195-200. 〔14〕Yao Y H,Nan X Y,Wang T M.A new 2D graphical representation-classification curve and the analysis of similarity/dissimilarity of DNA sequences〔J〕.J MolStr:THEOCHEM,2006,764(1):101-108.〔15〕Liao B,Li R F,Zhu W,et al.On the similarity of dNA primary sequences based on 5-D representation〔J〕.J Math Chem,2007,42 (1):47-57.〔16〕Dai Q,Liu X Q,Wang T M.A novel 2D graphical representation of DNA sequences and its application.J Mol Graph Mode,2006,25(3):340-344.〔17〕Dai Q,Liu X Q,Xiu Z L,et al.PNN-curve:a new 2D graphical representation of DNA sequences and its application〔J〕.J The Bio,2006,243(4):555-561.〔18〕BielinskaWaz D.Four-component spectral representation of DNA sequences〔J〕.J Math Chem,2010,47(1):41-51.〔19〕Jayalakshmi R,Natarajan R,Vivekanandan M.Extension of molecular similarity analysis approach to classification of DNA sequences using DNA descriptors〔J〕.SAR QSAR Environ Res 2011,22(1-2):21-34.〔20〕Yao Y H,Dai Q,Nan X Y,et al.Analysis of similarity/dissimilarity of DNA sequences based on a class of 2D graphical representation〔J〕.J Comput Chem 2008,29(2):1632-1639.〔21〕Yu C L,Deng M,S.Yau S T.DNA sequence comparison by a novel probabilistic method〔J〕.Inform Sciences,2011,181(8):1484-1492.〔22〕Yu J F,Wang J H,Sun X.Analysis of similarities/dissimilarities of DNA sequences based on a novel graphical representation〔J〕.MATCH Commun Math Comput Chem,2010,63(2):493-512.〔23〕Yu H J,Huang D S.Novel 20-D descriptors of protein sequences and it's applications in similarity analysis〔J〕.Chem Phys Lett,2012,531(2):261-266.〔24〕Zhang C T,Zhang R.An intuitive tool for visualizing and analyzing the DNA sequences〔J〕.J Biomol str Dyn,1994,11(4):767-782.〔25〕张春霆.用几何学方法分析DNA 序列〔J〕.中国科学基金,1999,3:152-153.〔26〕张春霆.人与其他生物基因组若干重要问题的生物信息学研究〔J〕.自然科学进展,2004,14 (12):1367-1374.〔27〕Liao B,Liao B Y,Lu X G,et al.A novel graphical representation of protein sequences and its application.J Comput Chem,2011,32(12):2539-2544.〔28〕Li C,Xing L L,Wang X.2-D graphical representation of protein sequences and its application to coronavirus phylogeny〔J〕.Bmb Reports,2008,41(3):217-222.〔29〕He P A.A new graphical representation of similarity/dissimilarity studies of protein sequences〔J〕.SAR QSAR Environ Res,2010,21(5-6):571-580.〔30〕Randic M,Novic M,Vracko M,Plavsic D.Study of proteome maps using partial ordering〔J〕.J Theor Biol,2010,266(1):21-28.〔31〕Yao Y H,Dai Q,Li L,et al.Similarity/dissimilarity studies of protein sequences based on a new 2D graphical representation〔J〕.J Comput Chem,2010,31(15):1045-1052.〔32〕Shapiro B.An algorithm for comparing multiple RNA secondary structures〔J〕.Comput Appl Biosci,1988,4 (3):387- 393.〔33〕Shapiro B,Zhang paring multiple RNA secondary structures using tree comparisons〔J〕.Comput Appl Biosci,1990,6 (4):309-318.〔34〕符维娟,汪源源,卢大儒.无比对的生物分子序列比较方法〔J〕.生物医学工程学杂志,2005,22 (3):598-601.〔35〕李斌,何红波,李义兵.基于DNA 序列LZ 复杂性距离的系统进化树重构〔J〕.高技术通讯,2006,16 (5):506-510.〔36〕李春.生物大分子的数学描述及其应用〔D〕.博士学位论文.大连:大连理工大学,2006. 〔37〕Otu H H,Sayood,K.A new sequence distance measure for phylogenetic tree construction〔J〕.Bioinformatics,2003,19(2):32122-32130.〔38〕Li C,Wang J.Similarity analysis of DNA sequences based on the generalized LZ complexity of (0,1)-sequences〔J〕.J Math Chem,2008,43(1):26-31.。
基于EEG复杂度的脑疲劳检测研究进展
信is 与电ifiChina Computer & Communication算倣语咅2021年第5期基于EEG 复杂度的脑疲劳检测研究进展蔡娇英-李胜民1"赵春临"(1.武警工程大学装备管理与保障学院,陕西西安710000; 2.第一机动总队机动第六支队,河北保定 071000; 3.武警贵州总队参谋部通信大队,贵州贵阳550081 )摘 要:脑疲劳一般是长时间从事高强度的脑力活动造成的,此过程的大脑活动可以用脑电信号(EEG )进行描述. 脑电信号的复杂性特征一直是脑疲劳检测研究的重点。
基于此,笔者重点探讨了基于EEG 复杂度的脑疲劳检测研究进展, 全面梳理了脑疲劳检测方面的相关文献。
研究结果表明,非线性参数指标爛值分析和复杂度分析具有数据要求不高、抗干扰能力强等特点,能够用于检测脑电信号的复杂性.关键词:脑疲劳;EEG;爛;复杂度中图分类号:R318; TN911.7 文献标识码:A 文章编号:1003-9767 (2021) 05-072-03Research Progress of Brain Fatigue Detection Based on EEG ComplexityCAI Jiaoying 1,2, LI Shengmin 1,3, ZHAO Chunlin 1*(1. School of Equipment Management and Support, Armed Police Engineering University, Xi "an Shaanxi 710000, China;2. Mobile Sixth Detachment of the First Mobile Corps, Baoding Hebei 071000, China;3. Communications Brigade of the Armed Police Guizhou Corps Staff, Guiyang Guizhou 550081, China)Abstract: Brain fatigue is generally caused by engaging in high-intensity mental activity for a long time. The brain activity inthis process can be described by electroencephalogram (EEG). The complexity of EEG signals has always been the focus of research on brain fatigue detection. Based on this, the author focuses on the research progress of brain fatigue detection based on EEG complexity,and comprehensively combs the relevant literature on brain fatigue detection. The research results show that entropy analysis and complexity analysis of non-linear parameter indicators have the characteristics of low data requirements and strong anti-interference ability, which can be used to detect the complexity of EEG signals.Keywords : brain fatigue; EEG; entropy; complexity0引言精神疲劳是一个逐渐累积的过程,大多是由精神紧张时 间过长或者长期从事单调乏味的工作造成的,会出现反应迟 钝、失去协调性等症状,有时会造成非常严重的后果因此, 从职业风险防护、职业健康的角度来看,有必要对精神疲劳进行深入研究。
伪随机数与准随机数的比较
伪随机数与准随机数的比较王水花 张煜东 吴乐南(东南大学 信息科学与工程学院,江苏 南京 210096)摘 要传统的随机数生成法主要采用逆转法,首先生成[0, 1]区间上的均匀分布U,然后令X=F-1(U),则X满足分布F。
然而,这样生成的均匀分布具有明显的差异性,在小样本或高维空间的情况下尤其严重。
因此,引进一种新的随机数生成方法,即准随机数生成器。
实验验证了准随机数生成器得到的随机数的差异性优于传统方法。
最后,提出一种基于准随机数生成器的蒙特卡罗积分方法,结果优于传统的蒙特卡罗积分。
关键词伪随机数;准随机数;Kolmogorov- Smirnov假设检验1 引言随机数生成算法[1]是一类重要的算法,广泛应用于仿真技术等场合。
然而,目前的伪随机数生成器(Pseudo-random number generator, PRNG)[2]存在一个重要缺陷,即样本分布与真实分布不一致,这主要发生在以下两种情况:①抽样代价过高,样本数目较少;②空间维数较高[3]。
因此,有必要寻找一类新的随机数发生器。
准随机数发生器(Quasi-random number generator, QRNG)[4]能够生成稳定、低差异性的(low-discrepancy)样本,而与样本数目或空间维数无关[5]。
故针对蒙特卡罗积分结果不稳定的情况,提出一种基于QRNG的蒙特卡罗积分,发现比传统方法性能有所提升。
2 伪随机数介绍伪随机数是由确定的算法生成的,其分布函数与相关性均能通过统计测试。
与真实随机数的差别在于,它们是由算法产生的,而不是一个真实的随机过程。
一般地,伪随机数的生成方法主要有以下3种[6]:(1) 直接法(Direct Method),根据分布函数的物理意义生成。
缺点是仅适用于某些具有特殊分布的随机数,如二项式分布、泊松分布。
(2) 逆转法(Inversion Method),假设U服从[0,1]区间上的均匀分布,令X=F-1(U),则X的累计分布函数(CDF)为F。
分数阶近似熵算法
西安理工大学学报 Journal of Xi'an University of Technology (2020) Vol 36 No. 4575DOI 10. 19322/j. cnki. issn. 1006-4 710. 2020. 04. 020分数阶近似爛算法袁利国】,杨晓婷】,余荣忠2(.华南农业大学数学与信息学院,广东广州510640; 2.九江学院理学院,江西九江332005)摘要:熵是度量时间序列混沌、随机性与复杂性的重要指标。
基于分数阶微积分理论及熵理论,文章首次给出一类分数阶熵算法,并应用于标准正态分布与泊松分布随机变量。
进一步提出分数阶近似熵算法,它是一个新的熵定义与算法,将其应用到经典混沌系统----logistic 系统与Henon 系统,通过与分岔图、李雅普诺夫指数谱作比对,表明分数阶近似熵算法能很好地度量周期轨道与混 沌序列。
最后,给出分数阶近似熵算法的程序代码。
关键词:香农熵;近似熵;分数阶近似熵;混沌中图分类号:O19, N93文献标志码:A 文章编号:10()64710(2()2())()4057506Fractional-order approximate entropy algorithmYUAN IJguo 1 , YANG Xiaoting 1 , YU Rongzhong 2(1. College of Mathematics and Informatics , South China Agricultural University , Guangzhou 5106/0, China ;2. College of Science , Jiujiang University, Jiujiang 332005 , China.)Abstract : Entropy is an important, index for measuring the chaos , randomness and complexity oftime series. Based on the theory of fractional calculus and entropy , a kind of fractional entropy isf rstly def ned and analyzed , whichis appl ed to the random variables of the standard normal dis- tributionandthePoissondistribution.Furthermore ,thefractonal-orderapproximateentropyal- gorithmisproposed ,whichisanewdefnitionofandalgorithmforentropy.Itissuccessfu l yap- plied to the classical chaotic systems ——Logistic system and Henon system. By comparing with thebifurcation diagram and Lyapunov exponent, spectrum , the fractional-order approximate entropy algorithm can better measure periodic orbits and chaotic sequences. Finally , the program codes ofthe fractional-order approximate entropy algorithm are given.Key words : Shannon entropy ; approximate entropy ; fractional-order approximate entropy ; chaos熵是衡量系统宏观结构中随机性或无序程度的 一个量[1],熵是宏观量,是系统大量微观粒子集体表现的性态。
心率变异信号的复杂度分析
收稿日期:2008204203 作者简介:徐霞(19702),女,重庆人,副教授,学士,主要从事应用电工电子技术和生物医学信号处理方法的研究和科研工作。
文章编号:100422474(2008)0520638202心率变异信号的复杂度分析徐 霞1,2,杨 浩1(1.重庆大学电气工程学院,重庆400044;2.重庆工学院电子信息与自动化学院,重庆400050) 摘 要:心率变异(HRV )反映了交感神经和迷走神经对心血管系统的综合调节作用,是评价心血管系统功能的重要指标,在临床中可用作参考实现对心血管疾病辅助诊断及其康复过程的无创性监测。
复杂度是刻画时间序列信号信息量的一个重要参数。
该文设计了阿托品和倍他乐克药物对比实验,计算了心率变异信号的K olmogor 2ov 复杂度,并对结果进行了统计分析,其结果为临床提供了参考。
关键词:非线性;心率变异(HRV )信号;K olmogorov 复杂度中图分类号:R318 文献标识码:AAnalysis of HRV B ased on the Complexity MeasureXU Xia 1,2,YANG H ao 2(1.College of Electrical Engineering ,Chongqing University ,Chongqing 400044,China ;2.Dept.of Electronic Information and Automation ,Chongqing Institute of Technology ,Chongqing 400050,China ) Abstract :The heart rate variability (HRV )reflects the synthetical f unction of the parasympathetic 2sympathetic nerve to adjust and control the heart and blood system.It is an important index to character the heart and blood sys 2tem ,and is valuable to diagose heart and blood disease in clinical diagnosis.The complexity is a valuable parameter to represent the information contained in a time series.In this paper ,designing the contrastive medicine experiments between Atropine and Betaloc ,the K olmogorov complexity indexes are calculated and analyzed ,some valuable con 2clusions are expressed according to the result.K ey w ords :nonlinear property ;HRV signals ;K olmogorov complexity 心率变异(HRV )是指逐次心搏间期之间的微小变异,通常情况下指窦性心律的微小涨落,其体现了心率或心动周期的波动性。
Kolmogorov Complexity for Possibly Infinite Computations
Journal of Logic,Language and Information(2005)14:133–148C Springer2005 Kolmogorov Complexity for Possibly Infinite ComputationsVER´ONICA BECHER and SANTIAGO FIGUEIRADepartamento de Computaci´o n,Facultad de Ciencias Exactas y Naturales,Universidad de Buenos Aires,ArgentinaE-mail:vbecher@dc.uba.ar,sfigueir@dc.uba.ar(Received5August2003;infinal form8June2004)Abstract.In this paper we study the Kolmogorov complexity for non-effective computations,that is, either halting or non-halting computations on Turing machines.This complexity function is defined as the length of the shortest input that produce a desired output via a possibly non-halting computation. Clearly this function gives a lower bound of the classical Kolmogorov complexity.In particular,if the machine is allowed to overwrite its output,this complexity coincides with the classical Kolmogorov complexity for halting computations relative to thefirst jump of the halting problem.However,on machines that cannot erase their output–called monotone machines–,we prove that our complexity for non effective computations and the classical Kolmogorov complexity separate as much as we want. We also consider the prefix-free complexity for possibly infinite computations.We study several properties of the graph of these complexity functions and specially their oscillations with respect to the complexities for effective computations.Key words:infinite computations,Kolmogorov complexity,monotone machines,non-effective computations,program-size complexity,Turing machines1.IntroductionThe Kolmogorov or program-size complexity(Kolmogorov,1965)classifies strings with respect to a static measure for the difficulty of computing them:the length of the shortest program that computes the string.A low complexity string has a short algorithmic description from which one can reconstruct the string and write it down.Conversely,a string has maximal complexity if it has no algorithmic description shorter than its full length.Due to an easy but consequential theorem of invariance,program-size complexity is independent of the universal Turing machine (or programming language)being considered,up to an additive constant.Thus, program-size complexity counts as an absolute measure of complexity(see(Li and Vit´a nyi,1997)for a thorough exposition of the subject).The prefix-free version of program-size complexity,independently introduced by Chaitin(Chaitin,1975)and Levin(Levin,1974),also serves as a measure of quantity of information,being formally identical to Shanon’s information theory (Chaitin,1975).134V.BECHER AND S.FIGUEIRA In this paper we study the Kolmogorov complexity for non-effective com-putations,that is,either halting or non-halting computations on Turing ma-chines.This complexity function,notated with K∞,is defined as the length of the shortest inputs that produce a desired output via a possibly non-halting computation.The ideas behind K∞(more precisely its prefix-free variant H∞) have been treated by Chaitin(1976a)and Solovay in(1977),and later in (Becher et al.,2001).In a recent paper(Ferbus-Zanda and Grigorieff,2004) Grigorieff and Ferbus-Zanda give a machine-free mathematical formalization of K∞.They show that K∞coincides with the Kolmogorov complexity of MAX Rec, the class of functions obtained as the maximum of a sequence of total recursive functions{0,1}∗→N.Clearly this function K∞gives a lower bound of the classical Kolmogorov com-plexity.In particular,if the machine is allowed to overwrite its output,K∞coincides with the classical Kolmogorov complexity for halting computations relative to the first jump of the halting problem.However,on machines that cannot erase their out-put–called monotone machines–,we prove that K∞and the classical Kolmogorov complexity separate as much as we want.We also consider the prefix-free complexity for possibly infinite computations, notated H∞.This complexity function was defined in(Becher et al.,2001)without a detailed study of its properties.We study several properties of the graph of K∞and H∞,specially their oscil-lations with respect to the respective complexities for effective computations.We also consider the behaviour of the complexity function along the prefix ordering on {0,1}∗in the same vein as in(Katseff and Sipser,1981).2.DefinitionsN is the set of natural numbers,and we work with the binary alphabet{0,1}.As usual,a string is afinite sequence of elements of{0,1},λis the empty string and {0,1}∗is the set of all strings.{0,1}ωis the set of all infinite sequences of{0,1}, i.e.,the Cantor space.{0,1}≤ω{0,1}∗∪{0,1}ωis the set of allfinite or infinite sequences of{0,1}.For any n∈N,{0,1}n is the set of all strings of length n.For a∈{0,1}∗,|a|denotes the length of a.If a∈{0,1}∗and A∈{0,1}ωwe denote with a n the prefix of a of length min(n,|a|)and with A n the prefix of the infinite sequence A of length n.For a,b∈{0,1}∗,we write a b if a is a prefix of b.In this case,we also say that b is an extension of a.A set X⊆{0,1}∗is prefix-free if no a∈X has a proper prefix in X.X⊆{0,1}∗is closed under extensions when for every a∈X,all its extensions are also in X.We assume the recursive bijection str:N→{0,1}∗such that str(i)is the i-th string in the length-lexicographic order over{0,1}∗.We also assume the one to one recursive function·:{0,1}∗→{0,1}∗which for every string s=b1b2...b n−1b n, s=0b10b2...0b n−11b n.This function will be useful to code inputs to Turing machines which require more than one argument.KOLMOGOROV COMPLEXITY FOR POSSIBLY INFINITE COMPUTATIONS135 If f is any partial function then,as usual,we write f(p)↓when it is defined, and f(p)↑otherwise.2.1.P OSSIBLY I NFINITE C OMPUTATIONS ON M ONOTONE M ACHINESWe work with Turing machines with a one-way read-only input tape,some work tapes,and an output tape.The input tape contains afirst dummy cell(representing the empty input)followed by0’s and1’s representing the input,and then a special end-marker indicating the end of the input.Notice that the end-marker allows the machine to know exactly where the input ends.We shall refer to two architectures of Turing machines,regarding the input and output tapes.A monotone Turing machine has a one-way write-only output tape.A prefix machine is a Turing machine with a one-way input tape containing no blanks (just zeroes and ones).Since there is no external delimitation of the input tape, the machine may eventually read the entire input tape.A prefix monotone machine contains no blank end-marker in the input tape and it has a one-way write-only output tape.A computation on a machine starts with the input head scanning the leftmost dummy cell.The output tape is written one symbol at a time.In a(prefix)monotone machine,the output grows monotonically with respect to the prefix ordering in {0,1}∗as the computational time increases.A possibly infinite computation is either a halting or a non halting computation.If the machine halts,the output ofthe computation is thefinite string written on the output tape.Else,the output is either afinite string or an infinite sequence written on the output tape as a result of a never ending process.This leads to consider{0,1}≤ωas the output space.We introduce the following maps for the behaviour of machines at a given stage of the computation.DEFINITION2.1.Let M be a Turing machine.M(p)[t]is the current output of M on input p at stage t.Notice that M(p)[t]does not require that the computation on input p halts.DEFINITION2.2.Let M be a prefix machine.M(p)[t]is the current output of M on input p at stage t if it has not read beyond the end of p.Otherwise,M(p)[t]↑. Again,notice that M(p)[t]does not require that the computation on input p halts.Observe that depending on whether M is a prefix machine or not M(p)[t]refers to Definition2.1or2.2.In both cases M(p)[t]is a partial recursive function with recursive domain.136V.BECHER AND S.FIGUEIRA REMARK2.4.If M is monotone then M(p)[t] M(p)[t+1],in case M(p) [t+1]↓.If M is a prefix machine then:1.If M(p)[t]↑then M(q)[u]↑for all q p and u≥t.2.If M(p)[t]↓then M(q)[u]↓for any q p and u≤t.Also,if at stage t,Mreaches a halting state,then M(p)[u]↓=M(p)[t]for all u≥t.We introduce maps for the possibly infinite computations on a monotone machine (resp.prefix monotone machine).In this work we restrict ourselves to possibly infinite computations which read justfinitely many symbols from the input tape. DEFINITION2.41.Let M be a Turing machine(resp.prefix machine).The input/output behaviour of M for halting computations is the partial recursive map M:{0,1}∗→{0,1}∗given by the usual computation of M,i.e.M(p)↓iff M enters into a halting state on input p(resp.iff M enters into a halting state on input p without reading beyond p).If M(p)↓then M(p)=M(p)[t]for some stage t at which M entered a halting state.2.Let M be a monotone machine(resp.prefix monotone machine).The in-put/output behaviour of M for possibly infinite computations is the map M∞:{0,1}∗→{0,1}≤ωgiven by M∞(p)=lim t→∞M(p)[t],where M(p)[t] is as in Definition2.1(resp.Definition2.3).In case M∞(p)∈{0,1}∗we say M∞(p)↓and otherwise M∞(p)↑.Observe that M∞extends M,because if the machine M halts on input p,then M∞(p)=lim t→∞M(p)[t]=M(p).REMARK2.5.1.If U is any universal Turing machine with the ability of overwriting the out-put then by Shoenfield’s Limit Lemma(Shoenfield,1959)it follows that U∞computes all∅ -recursive functions.2.Although Shoenfield’s Limit Lemma insures that for any monotone machine M,M∞:{0,1}∗→{0,1}∗is recursive in∅ ,not every∅ -recursive function can be computed in the limit by a monotone machine.One counterexample is the characteristic function of the halting problem.3.An example of a non-recursive function that is obtainable via an infinite com-putation on a monotone machine is the Busy Beaver function in unary notation bb:N→1∗,where bb(n)is the maximum number of1’s produced by any Turing machine with n states which halts with no input.bb is∅ -recursive andKOLMOGOROV COMPLEXITY FOR POSSIBLY INFINITE COMPUTATIONS137 bb(n)is the output of a non halting computation which on input n,it simulates every Turing machine with n states and for each one that halts it updates,if necessary,the output with more1’s.PROPOSITION2.6Let M be a prefix monotone machine.1.domain(M)is closed under extensions and its syntactical complexity is 01.2.domain(M∞)is closed under extensions and its syntactical complexity is 01.Proof.Item1is trivial.For item2,observe that M∞(p)↓⇔∀t M on input p does not read p0and does not read p1at stage t.Clearly,domain(M∞)is closed under extensions since if M∞(p)↓then M∞(q)↓=M∞(p)for every q p.REMARK2.7.Let M be a prefix monotone machine.An alternative and equivalent definition of M and M∞would be to consider them with prefix-free domains (instead of closed under extensions).–M(p)↓iff at some stage t M enters a halting state having read exactly p.If M∞(p)↓then its value is lim t→∞M(p)[t].–M∞(p)↓iff∃t at which M has read exactly p and for every t >t,M does not read p0nor p1.If M∞(p)↓then its value is lim t→∞M(p)[t].All properties of the complexity functions we study in this paper hold for this alternative definition.Wefix an effective enumeration of all tables of instructions.This gives an effective (M i)i∈N.Wefix the usual(prefix)monotone universal machine U,which definesthe functions U(0i1p)=M i(p)and U∞(0i1p)=M∞i (p)for halting and possiblyinfinite computations respectively.Recall that U∞is an extension of U.We also fix U∅ a monotone universal machine with an oracle for∅ .2.2.P ROGRAM-S IZE C OMPLEXITIESLet us consider inputs as programs.The Kolmogorov or program-size complexity (Kolmogorov,1965)relative to a Turing machine M is the function K M:{0,1}∗→N which maps a string s to the length of the shortest programs that output s.Thatis,K M(s)=min{|p|:M(p)=s}if s is in the range of M ∞otherwiseSince the subscript M can be any machine,even one equipped with an oracle,this is a definition of program-size complexity for both effective or relative computability.138V.BECHER AND S.FIGUEIRA In case M is a prefix machine we denote it H M rather than K M and we call it prefix complexity.In general,these program-size complexities are not recursive.The invariance theorem(Kolmogorov,1965)states that the universal Turing machine U is asymptotically optimal for program-size complexity,i.e.,∀Turing machine M∃c∀s K U(s)≤K M(s)+c.For any pair of asymptotically optimal machines M and N there is a constant c such that|K M(s)−K N(s)|≤c for every string s.Thus,program-size complexity on asymptotically optimal machines counts as an absolute measure of complexity, up to an additive constant.The same holds for prefix machines(Chaitin,1975; Levi,1974).We shall write K(resp.H)for K U(resp.H U)where U is some universal Turing (resp.universal prefix)machine.The complexity for a universal machine(resp. prefix machine)with oracle A is notated as K A(resp.H A).As expected,the help of oracles leads to shorter programs up to an additive constant(cf.Propositions2.10and2.11).2.3.P ROGRAM-S IZE C OMPLEXITY FOR P OSSIBLY I NFINITE C OMPUTATIONSLet M be a monotone machine,and M,M∞the respective maps for input/output behaviour of M for halting computations and possibly infinite computations(see Definition2.4).DEFINITION2.8.K∞M:{0,1}≤ω→N is the program-size complexity for functions M∞:K∞M(x)=min{|p|:M∞(p)=x}if x is in the range of M∞∞otherwiseFor the universal U we drop subindexes and we simply write K∞(resp.H∞).Because the set of all tables of instructions is r.e.,the Invariance Theorem holds for K∞:for every monotone machine M there is a c such that∀s∈{0,1}≤ωK∞(s)≤K∞M(s)+c.The Invariance Theorem also holds for H∞.REMARK2.9.From Remark2.5it is immediate that if U is a Turing machine with the ability of overwriting the output,that is,U is not monotone,K∞coincides with K∅ ,up to an additive constant.We mention some known results that will be used in the next sections.KOLMOGOROV COMPLEXITY FOR POSSIBLY INFINITE COMPUTATIONS139 PROPOSITION2.10.1.∃c∀s∈{0,1}∗K(s)≤|s|+c.2.∃c∀s∈{0,1}∗K∅ (s)−c<K∞(s)<K(s)+c.3.∀n∃s∈{0,1}∗of length n such that K(s)≥n.The same holds for K∅ andK∞.Proof.Item1follows directly from definition and the Invariance Theorem for K.For thefirst inequality of item2,observe that any unending computation that out-puts justfinitely many symbols can be simulated on a universal machine equipped with oracle∅ ,by increasing number of steps.At each step,the simulation polls the oracle to determine whether the computation would output more symbols or not. The simulation halts when there is no more output left.Item3holds because there are2n strings of length n,but2n−1programs of length less than n.PROPOSITION2.11.1.(Chaitin,1975)∃c∀s∈{0,1}∗H(s)≤H(|s|)+|s|+c.In particular,∃c∀s∈{0,1}∗H(s)≤|s|+c=2|s|+c.2.Items2and3of Proposition2.10are still valid considering H,H∞and H∅(see Becher et al.,2001).3.Oscillations of K∞In this section we study some properties of the complexity function K∞and compare them with K and K∅ .We know K∅ ≤K∞≤K up to addi-tive constants.The following results show that K∞is really in between K∅ and K.There are strings that separate the three complexity functions K,K∅ and K∞arbitrarily:THEOREM3.1.For every c there is a string s∈{0,1}∗such thatK∅ (s)+c<K∞(s)<K(s)−c.Proof.We know that for every n there is a string s of length n such that K(s)≥n.Let d n be thefirst string of length n in the lexicographic order satisfying this inequality,i.e.d n=min{s∈{0,1}n:K(s)≥n}.Let f:N→{0,1}∗be any recursive function with infinite range,and consider a machine C which on input i does the following:140V.BECHER AND S.FIGUEIRA j:=0RepeatWrite f(j)Find a program p,|p|≤2i,such that U(p)=f(j)j:=j+1The machine C on input i outputs(in the limit)c i=f(0)f(1)...f(j i)where K(f(j i))>2i and∀z,0≤z<j i:K(f(z))≤2i.For each i,we define e i=d i c i.Let usfix k and see that there is an i1such that∀i≥i1:K∞(e i)−K∅ (e i)>k. On the one hand,we can compute d i from i and a minimal program p such that U∞(p)=e i by simulating U(p)until it outputs i bits.If we code the input as ip we obtaini≤K(d i)≤K∞(e i)+2|i|+O(1).(1)On the other hand,with the help of the∅ oracle,we can compute e i from i.HenceK∅ (e i)≤|i|+O(1).(2)From(1)and(2)we have K∞(e i)−K∅ (e i)+O(1)≥i−3|i|and then,there is i1such that for all i≥i1,K∞(e i)−K∅ (e i)>k.Let us see now that there is i2such that∀i≥i2:K(e i)−K∞(e i)>k.Given i and a shortest program p such that U(p)=e i we construct a machine that computes f(j i).Indeed,if we code the input as ip,the following machine does the work: Obtain iCompute e:=U(p)s:=e ij:=0Repeats:=s f(j)If s=e then write f(j)and haltj:=j+1Hence,for all i2i<K(f(j i))≤K(e i)+2|i|+O(1).(3)Using the machine C we can construct a machine which,via an infinite computation, computes e i from a minimal program p such that U(p)=d i.Then,for every iK∞(e i)≤K(d i)+O(1)≤i+O(1).(4)KOLMOGOROV COMPLEXITY FOR POSSIBLY INFINITE COMPUTATIONS141 From(3)and(4)we have K(e i)−K∞(e i)+O(1)>i−2|i|so the difference between K(e i)and K∞(e i)can grow arbitrarily as we increase i.Let i2be such that for all i≥i2,K(e i)−K∞(e i)>k.Taking i0=max{i1,i2},we obtain∀i≥i0:K∅ (e i)+k<K∞(e i)<K (e i)−k.The three complexity functions K,K∅ and K∞get close infinitely many times. THEOREM3.2.There is a constant c such that for every n:∃s∈{0,1}n:|K∅ (s)−K∞(s)|≤c∧|K∞(s)−K(s)|≤c.Proof.Let s n be of length n such that K∅ (s n)≥n.From Proposition2.10, there exist c1,c2and c3such thatn≤K∅ (s n)≤K∞(s n)+c1≤K(s n)+c1+c2≤n+c1+c2+c3. Take c=c1+c2+c3.For infinitely many strings,K and K∞get close but they separate from K∅ as much as we want.THEOREM3.3.There is a constant c such that for all m∃s∈{0,1}∗:K(s)−K∅ (s)>m∧|K∞(s)−K(s)|<c.Proof.We know that#{s∈{0,1}n+2|n|:K(s)<n}<2n and then#{s∈{0,1}n+2|n|:K(s)≥n}>2n+2|n|−2n.Let S n={|w|w:w∈{0,1}n}.Notice that,if s∈S n,|s|=n+2|n|.Clearly, #S n=2n.Assume by contradiction that there is n such that S n∩{s∈{0,1}n+2|n|: K(s)≥n}=∅.Then2n+2|n|≥#S n+#{s∈{0,1}n+2|n|:K(s)≥n}>2n+2|n| which is impossible.For every n,let us define s ns n=min{s∈S n:K(s)≥n}.(5) Given a minimal program p such that U∞(p)=s n,we can compute s n in an effective way.The idea is to take advantage of the structure of s n to know when U∞stops writing in its output tape:we simulate U∞(p)until we detect¯n and we continue the simulation of U∞until we see it writes exactly n more bits.Then for each n,K(s n)≤K∞(s n)+O(1)and from Proposition2.10we have that for all n the difference|K(s n)−K∞(s n)|is bounded by a constant.142V.BECHER AND S.FIGUEIRA Using the∅ oracle,we can compute s n from n.Hence K∅ (s n)≤|n|+O(1). From(5)we conclude K(s n)−K∅ (s n)+O(1)≥n−|n|.Thus,the difference between K(s n)and K∅ (s n)can be made arbitrarily large.Infinitely many times K∞and K∅ get close but they separate from K arbitrarily.THEOREM3.4.There is a constant c such that for each m∃s∈{0,1}∗:K(s)−K∞(s)>m∧|K∞(s)−K∅ (s)|<c.Proof.As in the proof of Theorem3.1,consider a recursive f with infinite range,let c n=n f(0)f(1)...f(j n),and slightly modify the machine C such that on input i,itfirst writes i and then it continues writing f(j)until itfinds a j i such that K(f(j i))>2i and∀z,0≤z<j i:K(f(z))≤2i.Thus,given str(n),we can compute n and then c n in the limit.Hence for every nK∞(c n)≤|str(n)|+O(1).(6)Given an∅ oracle minimal program for c n,we can compute str(n)in an oracle machine.Then for every nK∅ (str(n))≤K∅ (c n)+O(1).(7)We define m n=min{s∈{0,1}n:K∅ (s)≥n}and s n=c str−1(m n).From(7)we known≤K∅ (m n)≤K∅ (s n)+O(1)(8)and from(6)we haveK∞(s n)≤|m n|+O(1).(9)From(8)and(9)we obtain K∞(s n)−K∅ (s n)≤O(1)and by Proposition2.10 we conclude that for all n,|K∞(s n)−K∅ (s n)|≤O(1).In the same way as we did in Theorem3.1,we construct an effective machine that outputs f(j n)from a shortest program such that U(p)=c n,but in this case the machine gets n from the input itself(we do not need to pass it as a distinct parameter).Hence for all n,2n<K(f(j n))≤K(c n)+O(1)and in particular for n=str−1(m n) we have2str−1(m n)<K(s n)+O(1).Since for each string s,|s|≤str−1(s)we have2|m n|<K(s n)+O(1).From(9)and recalling that|m n|=n,we have K(s n)−K∞(s n)+O(1)>n.Thus,the difference between K(s n)and K∞(s n) grows as n increases.KOLMOGOROV COMPLEXITY FOR POSSIBLY INFINITE COMPUTATIONS143 It is known that the complexity function K is smooth in the length and lexico-graphic order on{0,1}∗,i.e.|K(str(n))−K(str(n+1))|=O(1).The following result holds for K∞.PROPOSITION3.5.For all n|K∞(str(n))−K∞(str(n+1))|≤2K(|str(n)|)+O(1).Proof.Consider the following monotone machine M with input pq:Obtain y=U(p)Simulate z=U∞(q)till it outputs y bitsWrite str(str−1(z)+1)Let p,q∈{0,1}∗be such that U(p)=|str(n)|and U∞(q)=str(n).Then, M∞(pq)=str(n+1)and K∞(str(n+1))≤K∞(str(n))+2K(|str(n)|)+O(1).Similarly,if M above instead of writing str(str−1(z)+1),it writes str(str−1(z)−1),we concludeK∞(str(n))≤K∞(str(n+1))+2K(|str(n+1)|)+O(1).Since,|K(str(n))−K(str(n+1))|≤O(1),we have|K∞(str(n))−K∞(str(n+1))|≤2K(|str(n)|)+O(1).Loveland and Meyer(1969)have given a necessary and sufficient condition to characterize recursive sequences,based on the program-size complexity of their initial segments.They showed that a sequence A∈{0,1}ωis recursive iff ∃c∀n K(A n)≤K(n)+c.In this sense,the recursive sequences are those whose initial segments have minimal K complexity.We show that the advan-tage of K∞over K can be seen along the initial segments of every recursive sequence:if A∈{0,1}ωis recursive then there are infinitely many n’s such that K(A n)−K∞(A n)>c,for an arbitrary c.PROPOSITION3.6.Let A∈{0,1}ωbe a recursive sequence.ThenK(A n)−K∞(A n)=∞.lim supn→∞Proof.Let f:N→{0,1}be a total recursive function such that f(n)is the n-th bit of A.Let us consider the following monotone machine M with input p:144V.BECHER AND S.FIGUEIRA Obtain n:=U(p)Write A (str−1(0n)−1)For s:=0n to1n in lexicographic orderWrite f(str−1(s))Search for a program p such that|p|<n and U(p)=sIf U(p)=n,then M∞(p)outputs A k n for some k n such that2n≤k n<2n+1, since for all n there is a string of length n with K-complexity greater than or equal to n.Let usfix n.Then,K∞(A k n)≤|n|+O(1).However,K(A k n)+O(1)≥n, because we can compute thefirst string of length n in the lexicographic order with K-complexity≥n from a program for A k n.Hence,for each n,K(A k n)−K∞(A k n)+O(1)≥n−|n|.4.Program-Size Complexity for Possibly Infinite Computations on PrefixMonotone MachinesWe show that Theorems3.1and3.3are valid for H∞.THEOREM4.1.For every c there is a string s such thatH∅ (s)+c<H∞(s)<H(s)−c.Proof.The proof is essentially the same as that of Theorem3.1but using prefix monotone machines.Let c n=f(0)f(1)...f(j n)and slightly change the instruc-tions of machine C putting H(f(j n))>3n and∀z,0≤z<j n:H(f(z))≤3n.Let d n=min{s∈{0,1}n:H(s)≥n}and e n=d n c n.Assume p is a shortest program such that U∞(p)=e i.Consider the effective machine which on input ip does the following:Obtain iSimulate x i=U∞(p)until it outputs i bitsPrint x and haltThen,we havei≤H(d i)≤H∞(e i)+2|i|+O(1).(10) If we code the input of the oracle computation of the proof of Theorem3.1by duplicating the bits(now we cannot use just|i|bits to code i),inequality(2) becomesH∅ (e i)≤2|i|+O(1).(11) From(10)and(11)we have H∞(e i)−H∅ (e i)+O(1)≥i−4|i|,and so we can make the difference between H∞(e i)and H∅ (e i)as large as we want.To show thatKOLMOGOROV COMPLEXITY FOR POSSIBLY INFINITE COMPUTATIONS145 the difference between H(e i)and H∞(e i)can also be made arbitrarily large,we replace(3)by3i<H(f(j i))≤H(e i)+2|i|+O(1)(12)and recalling that for each string s,H(s)≤2|s|+O(1),inequality(4)is replaced byH∞(e i)≤H(d i)+O(1)≤2i+O(1).(13)From(12)and(13)we get H(e i)−H∞(e i)+O(1)>i−2|i|.For infinitely many strings,H and H∞get close but they separate from H∅ as much as we want:THEOREM4.2.There is a constant c such that for all m∃s∈{0,1}∗:H(s)−H∅ (s)>m∧|H∞(s)−H(s)|≤c.Proof.The idea of the proof is the same as the one in Theorem3.3.We redefine s n(see(5)):s n=min{s∈S n:H(s)≥n}.(14) We consider the same program as in the proof of Theorem3.3but using prefix monotone machines.Identically we obtain H(s n)≤H∞(s n)+O(1)and from Proposition2.11we have|H(s n)−H∞(s n)|≤O(1).Instead of K∅ (s n)≤|n|+O(1) we obtain H∅ (s n)≤2|n|+O(1)and from(14)we conclude H(s n)−H∅ (s n)≥n−2|n|+O(1).Thus,the difference between H(s n)and H∅ (s n)grows as n increases.We can show the following weaker version of Theorem3.4for H∞. PROPOSITION4.3.There is a sequence(s n)n∈N such thatH(s n)−H∞(s n)=∞and|H∞(s n)−H∅ (s n)|≤H(n)+O(1).limn→∞Proof.The idea is similar to the proof of Theorem3.4,but making j i such that H(f(j i))>3i and∀z,0≤z<j i:H(f(z))≤3i.We replace(6)byH∞(c n)≤H(str(n))+O(1)(15)146V.BECHER AND S.FIGUEIRA since there is a machine that via an infinite computation computes n and c n from a shortest program p such that U(p)=str(n).There is a machine with oracle∅ that computes str(n)from a minimal oracle program for c n.Then,restating(7),we have for every nH∅ (str(n))≤H∅ (c n)+O(1).(16)Let m n=min{s∈{0,1}n:H∅ (s)≥n}and s n=c str−1(m n).From(15)and(16) we have H∞(s n)−H∅ (s n)≤H(m n)−H∅ (m n)+O(1)≤H(m n)−n+O(1) and,since H(m n)≤H(|m n|)+|m n|+O(1)we conclude H∞(s n)−H∅ (s n)≤H(n)+O(1).We can construct an effective machine that computes f(j n)from a minimal program for U which outputs c n.From(15)we have H(s n)−H∞(s n)+ O(1)>3n−H(m n).Since for all n,H(m n)≤2|m n|+O(1)=2n+O(1),we get H(s n)−H∞(s n)+O(1)>n and hence the difference can be made arbitrarily large.Proposition3.5for H∞is still valid considering H(|str(n)|)+O(1)as the upper bound.It is easy to see that the recursive sequences in{0,1}ωhave minimal H complexity,i.e.,for any recursive A∈{0,1}ω∃c∀n H(A n)≤H(n)+c.It is easy to see that the analog of Proposition3.6is also true for H∞.Wefinally prove some properties that are only valid for H∞. PROPOSITION4.4.For all strings s and t1.H(s)≤H∞(s)+H(|s|)+O(1).2.H∞(ts)≤H∞(s)+H(t)+O(1).3.H∞(s)≤H∞(st)+H(|t|)+O(1).4.H∞(s)≤H∞(st)+H∞(|s|)+O(1).Proof.1.Let p,q∈{0,1}∗be such that U∞(p)=s and U(q)=|s|.Then there isa machine thatfirst simulates U(q)to obtain|s|,then it starts a simulation ofU∞(p)writing its output on the output tape,until it has written|s|symbols,and then halts.2.Let p,q∈{0,1}∗be such that U∞(p)=s and U(q)=t.Then thereis a machine thatfirst simulates U(q)until it halts and prints U(q)on the output tape.Then,it starts a simulation of U∞(p)writing its output on the output tape.3.Let p,q∈{0,1}∗be such that U∞(p)=st and U(q)=|t|.Then there isa machine thatfirst simulates U(q)until it halts to obtain|t|.Then it starts aKOLMOGOROV COMPLEXITY FOR POSSIBLY INFINITE COMPUTATIONS147 simulation of U∞(p)such that at each stage n of the simulation it writes the symbols needed to print U(p)[n] (|U(p)[n]|−|t|)on the output tape.4.Consider the following monotone machine:t:=1;v:=λ;w:=λRepeatif U(v)[t]asks for reading then v:=v bif U(w)[t]asks for reading then w:=w bwhere b is the next bit in the inputextend the actual output to U(w)[t] (U(v)[t])t:=t+1If p and q are shortest programs such that U∞(p)=|s|and U∞(q)=st respectively,then we can interleave p and q in a way that at each stage t,v p and w q(notice that eventually v=p and w=q).Thus,this machine will compute s and will never read more than H∞(st)+H∞(|s|)bits. AcknowledgementsThis work is supported by Agencia Nacional de Promoci´o n Cient´ıfica y Tecnol´o gica (V.B.),and by a grant of Fundaci´o n Antorchas(S.F.).ReferencesBecher,V.,Daicz,S.,and Chaitin,G.,2001,“A highly random number,”pp.55–68in Combina-torics,Computability and Logic:Proceedings of the Third Discrete Mathematics and Theoretical Computer Science Conference(DMTCS’01),C.S.Calude,M.J.Dineen,and S.Sburlan,eds., London:Springer-Verlag.Chaitin,G.J.,1975,A theory of program-size formally identical to information theory,Journal of the ACM22,329–340.Chaitin,G.,1976a,“Algorithmic entropy of sets,”Computers&Mathematics with Applications2, 233–245.Chaitin,G.J.,1976b,“Information-theoretical characterizations of recursive infinite strings,”Theoretical Computer Science2:45–48.Ferbus-Zanda,M.and Grigorieff,S.,2004,“Kolmogorov complexities K max,K min”(submitted). Katseff,H.P.and Sipser,M.,1981,“Several results in program-size complexity,”Theoretical Computer Science15,291–309.Kolmogorov,A.N.,1965,“Three approaches to the quantitative definition of information,”Problems of Information Transmission1,1–7.Levin,L.A.,1974,“Laws of information conservation(non-growth)and aspects of the foundations of probability theory,”Problems of Information Transmission10,206–210.Li,M.and Vit´a nyi,P.1997,An Introduction to Kolmogorov Complexity and its Applications (2nd edition),Amsterdam:Springer.。
kolmogorov准则
Kolmogorov准则1. 简介Kolmogorov准则是一种数学工具,用于判断一个字符串是否具有随机性。
它由俄罗斯数学家Andrey Kolmogorov在20世纪60年代提出,被广泛应用于信息论、复杂性理论和计算机科学等领域。
Kolmogorov准则基于一个重要的观点:随机性的字符串没有规律可言,因此它们无法通过简短的程序生成。
根据这个观点,Kolmogorov准则提出了一个度量随机性的概念:Kolmogorov复杂度。
2. Kolmogorov复杂度Kolmogorov复杂度是指生成一个字符串所需要的最短程序长度。
换句话说,对于任意给定的字符串,Kolmogorov复杂度就是能够生成该字符串的最短程序的长度。
举个例子,考虑一个简单的字符串”1010101010”。
如果我们可以使用一个循环来生成这个字符串,那么它的Kolmogorov复杂度就很低。
但如果需要编写一段非常长的代码才能生成该字符串,那么它的Kolmogorov复杂度就会很高。
需要注意的是,Kolmogorov复杂度并不依赖于某个特定的编程语言或计算机模型。
它是一个理论上的概念,用于衡量字符串的随机性。
3. 判断随机性根据Kolmogorov准则,一个字符串被认为具有随机性,当且仅当它的Kolmogorov复杂度与字符串长度接近。
换句话说,如果一个字符串很短但却需要非常长的程序才能生成,那么它就被认为是具有随机性的。
这个判断方法可以通过计算字符串长度和最短程序长度之间的差值来实现。
如果这个差值很小(比如小于某个预先设定的阈值),那么我们可以说该字符串具有较高的随机性。
4. 应用领域Kolmogorov准则在信息论、复杂性理论和计算机科学等领域有广泛应用。
以下是一些具体应用场景:4.1 数据压缩Kolmogorov准则可以用来评估数据压缩算法的效果。
如果一个数据压缩算法能够将一个字符串压缩成很短的程序,并且解压缩时能够还原原始字符串,那么这个算法就是有效的。
柯尔莫哥洛夫复杂度不可计算定理
柯尔莫哥洛夫复杂度不可计算定理下载提示:该文档是本店铺精心编制而成的,希望大家下载后,能够帮助大家解决实际问题。
文档下载后可定制修改,请根据实际需要进行调整和使用,谢谢!本店铺为大家提供各种类型的实用资料,如教育随笔、日记赏析、句子摘抄、古诗大全、经典美文、话题作文、工作总结、词语解析、文案摘录、其他资料等等,想了解不同资料格式和写法,敬请关注!Download tips: This document is carefully compiled by this editor. I hope that after you download it, it can help you solve practical problems. The document can be customized and modified after downloading, please adjust and use it according to actual needs, thank you! In addition, this shop provides you with various types of practical materials, such as educational essays, diary appreciation, sentence excerpts, ancient poems, classic articles, topic composition, work summary, word parsing, copy excerpts, other materials and so on, want to know different data formats and writing methods, please pay attention!柯尔莫哥洛夫复杂度不可计算定理引言在计算理论中,柯尔莫哥洛夫复杂度不可计算定理是一项重要的理论结果。
基于Kolmogorov复杂性的异常检测算法研究
基于Kolmogorov复杂性的异常检测算法研究随着人类社会越来越依赖计算机和互联网,各种类型的异常事件也越来越多地发生。
这些异常事件可能是网络攻击、欺诈、异常行为等。
因此,异常检测成为了许多领域的重要问题。
异常检测技术可以在数据集中识别异常点,并帮助我们了解潜在的问题。
基于 Kolmogorov 复杂性的异常检测算法是一种被广泛研究和应用的方法,本文将对其进行详细介绍。
Kolmogorov 复杂性是一个用于衡量一个字符串的复杂程度的概念。
简单来说,一个字符串的 Kolmogorov 复杂性越高,表示它越难以被简单的描述。
如果一个字符串可以用一个很短的程序描述,那么它的 Kolmogorov 复杂性就很低。
反之,如果一个字符串没有任何简单的描述方式,那么它的 Kolmogorov 复杂性就很高。
基于 Kolmogorov 复杂性的异常检测方法将数据集中的每个数据点都看作一个字符串,并计算它们的 Kolmogorov 复杂性,然后将复杂度高于某个阈值的数据点视为异常点。
Kolmogorov 复杂度的计算是不可计算的,但我们可以使用某些启发式算法来估计它。
一种常见的估算方法是使用最短的编码长度来表示一个字符串。
具体地,我们可以利用 Lempel Ziv 算法或者 Huffman 编码等算法来实现这个过程。
这些算法都会生成一个被压缩并能保持原始数据点的信息的二进制字符串。
这种基于 Kolmogorov 复杂性的异常检测算法的优点在于:它可以自适应地适用于各种不同的数据类型,且不需要对数据进行过多的前期处理或改变。
另外,这种算法对于噪声数据的鲁棒性非常好,能够自动消除噪声的影响。
Kolmogorov 复杂度也能够从理论上证明它的结果是最优的,这也是算法广泛应用的原因之一。
然而,这种异常检测算法也存在一些不足之处。
由于 Kolmogorov 复杂性计算的不可能性,实际中我们只能通过启发式算法对其进行估计。
数学k是什么意思数学q是什么意思
数学k是什么意思数学q是什么意思数学k是什么意思数学q是什么意思数学,作为一门学科,涵盖了广泛的知识领域,其中包括了各种符号、符号组合和特定的概念。
在数学中,我们经常会遇到各种字母代表特定的数或概念,其中包括了数学中常见的字母k和q。
那么,在数学中,k和q分别代表了什么意思呢?1. 数学中的k的意思在数学中,字母k可以代表多种含义,具体取决于上下文和特定的数学领域。
下面是一些常见的k的诠释:a. 常数 (Constant):在代数中,字母k经常用于表示一个常数。
常数是指一个固定的、无限不变的数值。
b. 无穷大 (Infinity):在解析几何和极限理论中,字母k有时也用来表示无穷大(Infinity)。
c. 科尔莫哥罗夫复杂度 (Kolmogorov Complexity):字母k还可以代表科尔莫哥罗夫复杂度,它是对一个对象的信息内容进行衡量的一种度量方式。
d. 指数 (Exponent):在指数函数中,字母k有时用来表示指数。
e. 控制变量 (Control Variable):在统计学中,字母k也可用于表示一个控制变量。
这些只是字母k在数学中的一些表示方法,具体含义还应根据上下文和特定领域来进行解读。
2. 数学中的q的意思同样,字母q在数学中也有多种不同的含义,以下是其中几个常见的解释:a. 平方根(Square Root):在代数学中,字母q有时用于表示平方根。
b. 分位数 (Quantile):在统计学中,字母q常用于表示分位数。
c. 圆 (Circle):在几何学中,字母q有时代表圆。
d. 圈 (Quadrant):在坐标系中,字母q表示第四象限。
e. 素数 (Prime Number):在数论中,字母q有时用于表示素数。
这些只是字母q在数学中的几种表示方式,具体含义还需根据具体情况和数学领域来进行解读。
综上所述,数学中的字母k和q代表了多种不同的含义,具体使用取决于上下文和特定的数学领域。
基于Kolmogorov复杂性的数据压缩技术
基于Kolmogorov复杂性的数据压缩技术Kolmogorov复杂性是一种用来描述信息的复杂性的理论模型,而数据压缩是一种利用这种理论来实现的技术。
在这篇文章中,我们将探讨基于Kolmogorov复杂性的数据压缩技术,包括其原理、应用和未来发展方向。
什么是Kolmogorov复杂性?在信息学领域中,Kolmogorov复杂性是一个重要的理论概念。
简单来说,它是描述一个信息的最短可能的实现方法所需的大小的度量。
也就是说,如果我们知道了如何以最简单的方式来表示一个信息,那么我们就能够计算出这个信息的Kolmogorov复杂性。
举个例子,假设我们有一个简单的字符串“ABABABAB”。
我们可以把它表示为“8个A和B的交替排列”,也可以表示为“AB出现4次”。
前者需要8个字符,后者只需要10个字符(包括ABB)。
因此,我们可以说这个字符串的Kolmogorov复杂性为10,因为这是它最简单的表示方式。
基于Kolmogorov复杂性的数据压缩技术原理基于Kolmogorov复杂性的数据压缩技术的原理很简单,就是通过找到数据的最短可表示方式,从而压缩数据的大小。
具体来说,它通过以下两个步骤实现:1. 找到数据的最短可表示方式这一步相当于寻找数据的Kolmogorov复杂性。
一旦找到了这个值,我们就可以知道最短可能的表示方式是什么。
为了达到这个目标,我们需要使用一些特殊的算法,例如Kolmogorov–Chaitin复杂度算法。
2. 将数据表示为最短可表示方式一旦找到了数据的最短可表示方式,我们就可以使用压缩算法将数据表示为这个方式。
这种算法可以用多种方式实现,例如Lempel-Ziv算法、Huffman编码等。
基于Kolmogorov复杂性的数据压缩技术应用基于Kolmogorov复杂性的数据压缩技术在多个领域都有广泛的应用。
以下是其中的一些例子:文本压缩在文本压缩方面,基于Kolmogorov复杂性的数据压缩技术可以显著地减少文本文件的大小。
Minimum Message Length criteria. Several theoretical
This effort is funded by DARPA ISO contract number F33615-00-C-1629. Accepted for publication at the DARPA Information Survivability Conference and Exposition II (DISCEX-II 2001) to be held 12-14 June 2001 in Anaheim, Information Assurance through Kolmogorov ComplexityScott Evans, Stephen F. Bush, and John Hershey GE Corporate Research and Developmentevans@AbstractThe p roblem of Information Assurance is ap p roached from the p oint of view of Kolmogorov Comp lexity and Minimum Message Length criteria. Several theoretical results are obtained, p ossible ap p lications are discussed and a new metric for measuring complexity is introduced. Utilization of Kolmogorov Comp lexity like metrics as conserved parameters to detect abnormal system behavior is exp lored. Data and p rocess vulnerabilities are p ut forward as two different dimensions of vulnerability that can be discussed in terms of Kolmogorov Comp lexity. Finally, these results are utilized to conduct comp lexity-based vulnerability analysis.1. IntroductionInformation security (or lack thereof) is too often dealt with after security has been lost. Back doors are opened, Trojan horses are placed, passwords are guessed and firewalls are broken down – in general, security is lost as barriers to hostile attackers are breached and one is put in the undesirable position of detecting and patching holes. In fact many holes go undetected. Breaches in other complex systems that people care about are not handled in such an inept manner. Thermodynamic systems, for example, can be assured of their integrity by the pressure, heat or mass the system contains. Hydrostatic tests can be performed to ensure that there are no “holes”, and the general health of the system can be ascertained by measuring certain parameters. One doesn’t wait, for example, for all the water to drain out of a heat exchanger and a rat to come inside to announce that there is a problem. A problem is identified as soon as the temperature or pressure drops and immediately one can take action to both correct the problem and to isolate other areas of the system from harm. But does one perform a hydrostatic test of an information system? What conserved parameters exist to measure the health or vulnerability of the system? How can one couple the daunting task of providing a system where vulnerabilities are readily measurable with the required need for simplicity of use for authorized users? This paperexplores these issues and proposes that only through monitoring objective quantities inherently related to information itself can the science of information assurance move beyond patching holes.Kolmogorov Complexity is proposed as a fundamental property of information that has properties of conservation that may be exploited to provide information assurance. In this paper, Kolmogorov Complexity is reviewed and current work in this area explored for possible applications in providing information assurance. The concept of Minimum Message Length is explored and applied to information assurance, yielding examples of possible benefits for system optimization as well as security that can be achieved through the use of Kolmogorov Complexity based ideas. Finally, complexity based vulnerability analysis is demonstrated through simulation.2. BackgroundCurrently information security is achieved through the use of multiple techniques to prevent unauthorized use. E ncryption, authentication/password protection, and policies all provide some level of security against unauthorized use. But other than simply relying on these secure barriers, how does one measure the health of their security system. If a password or encryption key is compromised, what indication will be available? The degree to which a system is compromised is difficult to ascertain. For example, if one password has been guessed, or two encryption keys determined, how secure is the information system? Are all detectable security issues equal, or are some more important than others? These difficulties reflect the fact that there is no objective, fundamental set of parameters that can be evaluated to determine if security is maintained. Insecurity may not be detected until an absurd result (rat in a tank) discloses the presence of an attacker. An inherent property of information itself is desired that can be monitored to ensure the security of an information system. The descriptive complexity of the information itself – the Kolmogorov complexity is a strong candidate for this purpose. Kolmogorov Complexity is reviewed in the following section.2.1. Kolmogorov ComplexityKolmogorov Complexity is a measure of descriptive complexity contained in an object. It refers to the minimum length of a program such that a universal computer can generate a specific sequence. A good introduction to Kolmogorov Complexity is contained in [3] with a solid treatment in [4]. Kolmogorov Complexity is related to Shannon entropy, in that the expected value of K(x) for a random sequence is approximately the entropy of the source distribution for the process generating the sequence [3]. However, Kolmogorov Complexity differs from entropy in that it relates to the specific string being considered rather than the source distribution. Kolmogorov Complexity can be described as follows, where ϕ represents a universal computer, p represents a program, and x represents a string:{})(min)()(p l x Kxp ==ϕϕ.Random strings have rather high Kolmogorov Complexity – on the order of their length, as patterns cannot be discerned to reduce the size of a program generating such a string. On the other hand, strings with a large amount of structure have fairly low complexity. Universal computers can be equated through programs of constant length, thus a mapping can be made between universal computers of different types, and the Kolmogorov Complexity of a given string on two computers differs by known or determinable constants. The Kolmogorov Complexity K(y|x) of a string y given string x as input is described by the equation below:þýüîíì=∞==y x p that such p no is there if p l x y K y x p ),(,)(min )|(),(ϕϕϕ ,where l(p) represents program length p and ϕ is a particular universal computer under consideration. Thus, knowledge or input of a string x may reduce the complexity or program size necessary to produce a new string y. The major difficulty with Kolmogorov Complexity is that you can’t compute it. Any program that produces a given string is an upper bound on the Kolmogorov Complexity for this string, but you can’t compute the lower bound [4]. A best estimate of Kolmogorov Complexity may be useful in determining and providing information assurance due to links between Kolmogorov Complexity and information security that will be discussed later. Various estimates have been considered, including compressibility, or pseudo-randomness, whichmeasure the degree to which strings have patterns or structure. A new metric that is related to the power spectral density of the sequence auto-correlation is introduced in Section 4. However, all metrics are at best crude estimates. The inability to compute Kolmogorov Complexity persists as the major impediment to widespread utilization.Despite the problems with measurement, Kolmogorov Complexity and information assurance are related in many ways. Cryptography, for example attempts to take strings that have structure and make them appear random. The quality of a cryptographic system is related to the systems ability to raise the apparent complexity of the string, an idea discussed in detail later, while keeping the actual complexity of the string relatively the same (within the bounds of the encryption algorithm). In other words, cryptography achieves its purpose by making a string appear to have a high Kolmogorov Complexity through the use of a difficult or impossible to guess algorithm or key.Security vulnerabilities may also be analyzed from the viewpoint of Kolmogorov Complexity. One can even relate insecurity fundamentally to the incomputibility of Kolmogorov Complexity and show why security vulnerabilities exist in a network. Vulnerabilities can be thought of as the identification of methods to accomplish tasks on an information system that are easier than intended by the system designer. Essentially the designer intends for something to be hard for an unauthorized user and the attacker identifies an easier way of accomplishing this task. Measuring and keeping track of a metric for Kolmogorov Complexity in an information system provides a method to detect such short-circuiting of the intended process.2.2. Minimum Message Length PrincipleSince it is not computable, few applications exist for Kolmogorov Complexity. One growing application is a statistical technique with strong links to information theory known as Minimum Message Length (MML) coding [8]. MML coding encodes information as a hypothesis that identifies the presumptive distribution, from which data originated, appended with a string of data, coded in an optimal way. The length of an MML message is determined as follows:#M = #H + #D,where #M is the message length, #H is the length of the specification of the hypothesis regarding the data, and #D is the length of the data, encoded in an optimal manner given hypothesis H. As discussed in [8], MML coding approaches the Kolmogorov Complexity or actual bound on the minimum length required for representing a string of data.3. Conserved VariablesConserved variables enable one to deduce parameters from the presence or absence of other parameters. The Law of Conservation of Matter and E nergy [1] for example allows one to deduce how well a thermodynamic system is functioning without knowing every parameter in the system. Heat gain in one part of the system was either produced by some process or traveled from (and was lost from) another part of the system. One knows that if the thermal efficiency of a thermodynamic system falls below certain thresholds then there is problem. On the other hand, if more heat is produced by a system than expected, some unintended process is at work. A similar situation is desirable for information systems – the ability to detect lack of assurance by the presence of something unexpected, or the absence of something that is expected. This seems to be far from reach, given that information is easily created and destroyed with little residual evidence or impact.One possible candidate for a conserved variable in an information system is Kolmogorov Complexity. Suppose you could easily know the exact Kolmogorov Complexity K(S) of a string of data S. You would essentially have a conserved parameter that could be used to detect, resolve or infer events that occur in the system, just as tracking heat in a thermodynamic system enables monitoring of that system. Operations that affect string S and cause it to gain or lose complexity can be accounted for, and an expected change in complexity should be resolvable with the known (secured) operations occurring in the information system to produce expected changes in complexity. Complexity changes that occur in a system that cannot be accounted for by known system operations are indications of unauthorized processes taking place. Thus, in the ideal case where Kolmogorov Complexity is known, a check and balance on an information system that enables assurance of proper operation and detection of unauthorized activity is possible. Unfortunately (as previously discussed) a precise measure of Kolmogorov Complexity is not computable. We can, however, bound the increase in Kolmogorov Complexity as shown in the theorems below.3.1. Theorems of ConservationKolmogorov Complexity, K(x), can be thought of as a conserved parameter that changes through computational operations conducted upon strings. In order for K(x) to be a conserved parameter one must account for changes in K(x). Two theorems are presented below that enable bounds to be placed on the changes in K(x) that occur due to computational operations occurring in an information system. The two theorems below show bounds on the amount of complexity that can exist due to knowledge of other strings or computational operations. 3.1.1. Theorem 1: Bound on Conditional Complexity)()|(yKxyKϕϕ≤ProofSince K(y) is the minimal length program that produces string y with no input, input x can only reduce the length of the program required to produce y. At worst, x can be ignored completely, in which case K(y|x)=K(y). However, knowing x may reduce the program to produce y, depending on the extent that string x contributes towards generating string y or enables a more efficient generation of y. QED3.1.2. Theorem 2: Bound on Complexity Increase Dueto Computational Operation)()(),|(pLxKpxyK+≤ϕϕProofProgram p of length L(p) takes input string x to produce output string y. Proof by contradiction: consider a program p that could be run on input string x to produce string y. Assume that the complexity of y = K(y|x,p) > K(x) + L(p). But one could produce string y by first forming string x with program of length K(x), then running program p of length L(p), thus producing y with a program of length K(x) + L(p). But this violates the definition of Kolmogorov Complexity as being the minimum length program, since a program of smaller length has been found. Thus the assumption is false and K(y|x,p) must be <= K(x) + L(p). QED3.3. Conservation of ComplexityAs shown above, while not computable from below, upper bounds on the increase in Kolmogorov Complexity can be crudely known by keeping track of the size of programs that affect data. This bound may be incredibly loose, as it is quite possible to operate on a string and make it much less complex than the input. One would need a method to recognize this simplification. However, these results provide an intuitively attractive method for quantifying the “work” performed by a computational operation on information – the change in complexity introduced by the operation. A thorough treatment of bounds related to K(y|x) and the “Information Distance”between strings is contained in Bennett et al.[9].4. A Measure for Binary String ComplexityAs previously discussed, due to its non-computable nature, estimates of K(x) are difficult. Numeroustechniques for estimating K(x) are discussed in [4]. The task of estimating K(x) is related to the task of assessing string structure. A new primitive approach to this related issue is introduced based on the power spectral density of a string’s auto-correlation. This approach highlights the ability to gain knowledge of K(x) without any higher knowledge about the system producing string x or the meaning of the information.Recognizing that the complexity of a binary string may be defined in many ways. A useful complexity measure may be related to properties of the string’s non-cyclic auto-correlation. Specifically, given an n-bit binary string, S, where:nii sS<≤=0)},({,andii s∀±∋}1{)(.Define the non-cyclic auto-correlation, R, as:nii rR<≤=0)},({whereå−−=+=1)()()(injji s jsi r.From R, calculate the sequence’s non-negative powerspectral density, Φi , by multiplying the Fourier transformof R by its conjugate. The measure for binary string complexity that is formed is denoted by Ψ and is defined asåΦΦ=Ψi iifactornorm log.1.The motivation to this approach is found in the rich and venerable field of synchronization sequence design. Sequences that have an auto-correlation who’s side-lobes are of very low magnitude provide good defense against ambiguity in time localization. Such an auto-correlation function will approximate a “thumbtack” and its Fourier transform will approximate that of band-limited white noise.The authors of this paper expect that Ψ will be of utility in assessing complexity as it relates to the compressibility of a binary string. To begin the testing of this hypothesis, strings are generated from the Markov process diagrammed in Figure 1. A series of binary sequences of 8000 bits were generated, each for different values of p. Ψwas computed for each of these strings and also packed into 1000-kilobyte files. These were subjected to the UNIX compress routine. The Inverse Compression Ratio (ICR) was computed which is the size of the compressed file normalized to its uncompressed size, 1000 kilobytes in these cases.Figure 1. Markov model for string generation.The hypothesis is that Ψ and the ICR should vary in a similar manner and that Ψ might be a useful measure of sequence compressibility and hence complexity. The graph in Figure 2 seems to endorse this hypothesis and further research is motivated.Psi& ICR Versus p0.90.910.920.930.940.950.960.970.980.99Figure 2. Variation of Psi and ICR with p.The above results show that fundamental parameters such as power spectral density of sequence auto-correlation and compressibility are related and follow similar trends. These fundamental metrics are possible candidates for measuring trend of increase or decrease in K(x). However, also illustrated by these results (the unequal rate of change between the two metrics) are the loose bounds within which estimates of K(x) are related. Other methods of estimating K(x) are described in [4]. In the next section we introduce a method for attacking the issue of loose bounds in order to make complexity metrics useful for the purposes of assessing and providing information assurance.5. Apparent ComplexityThe results in Section 3 give an upper bound on complexity increase due to computational operations, but perhaps one can do better. In fact, the size of the shortest program one can find to produce a particular string is the best estimate for K(S). Since Kolmogorov Complexity is unknowable, the best that we can do is estimate well. This motivates the idea of apparent complexity: the best attainable estimate of K(S) given the limited information available to a particular party. The benefit or possible way to exploit the idea of apparent complexity is that a user generating a string should have the best idea of how hard it is to generate the string. There are many reasons why a user may not choose to generate a string using the minimal size program. Perhaps a longer program can execute faster, or perhaps the generator is unknowingly using an inefficient process. However, the generator of a string of data is presumed to have knowledge of the process used to generate that data. This may in fact make the non-computability of Kolmogorov Complexity an asset: a good candidate for use in providing information assurance, for the follow reason. The information system designer or an authorized user generating data should have better knowledge of the data process than an attacker. An attacker cannot simply compute the optimal process. Additionally, conservation of apparent complexity enables abnormalities to be tracked when the expected number of computational operations is not utilized in transforming string x into string y. Thus, even if one cannot know or compute the most efficient process for creating a string of data, one can at least gain benefit from ensuring through monitoring resources that the expected process is used. This type of assurance has in fact been used informally to detect network security problems for many years. Discrepancies in computer account charges have lead to detection of attack [10]. The idea of using Kolmogorov Complexity provides the possibility of using this type of technique on a more fundamental level where knowledge about the information content would not be required to determine unauthorized activity. The term apparent complexity will be used to reflect the best measurement of Kolmogorov complexity available to the party undertaking the measurement.5.1 Process vs. Data ComplexityApparent complexity can be applied to the problem of information assurance in two ways. As discussed above, conservation of apparent complexity may enable detecting and correcting abnormal behavior. Another method of using apparent complexity for information assurance is in the identification of weak areas or vulnerabilities in the system. Consider the postulate that the more apparently complex the data, the more difficult for an attacker to understand the data and exploit the system. Thus, the more apparently complex, the less vulnerable and vice versa. One proposed metric for vulnerability relates to evaluating the apparent complexity of the concatenated input and output K(X.Y). This relates to the joint complexity of the data input and output from a certain process (Black Box). The lower the complexity of K(X.Y) the easier the data is for an attacker to understand, thus we will regard K(X.Y) as a measure of data vulnerability. A competing metric is the relative complexity K(Y|X) of the process. This is the “work” done on the information by the process, or the complexity added to or removed from X to produce Y. Thus K(Y|X) is a measure of process vulnerability. The relationships between these respective complexity metrics and the black box process is shown in Figure 3.Figure 3. Process vs. data vulnerabilities.Data vulnerability relates to how vulnerable a system is to an attacker knowing information. This type is perhaps best measured by K(X.Y), where the cumulative complexity of input and output data is observed to measure the difficulty an attacker would face in decrypting or identifying messages contained in input and output. For example, hopefully K(encrypted message) appears >> K(decrypted message) to the casual observer and is only recognized to be on the order of K(decrypted message) to an authorized user with the correct key after the decryption algorithm has been run.Process vulnerability relates a system’s susceptibility to an attacker understanding the processes that manipulate information. This vulnerability is best quantified by thecomplexity injected or removed from the data by the process at work. For example, a copy or pass-through process adds little complexity, K(Y|X) is zero. But if encrypted data is sent through the copy process, K(X.Y) will be high. The attacker will be unable to discern the messages that are sent, but can learn to perhaps simulate this particular black box quite effectively. Whereas if plain text data is sent through the copy process K(X.Y) will be low, and in addition to understanding the process at work, and attacker may be able to know the particular messages that are sent. Both vulnerabilities are undesirable and represent two different dimensions of vulnerability to be avoided. To make systems secure one must maximize both process and data complexity to a non-authorized user while keeping the systems simple to authorized users. Proper accounting of K(Y|X) and K(X.Y) throughout the system will enable both identification of weak areas as well as identification of foul play through the conservation principles discussed earlier.6. Vulnerability Red uction by means of System OptimizationIn this section issues related to system optimization that can be achieved through Kolmogorov Complexity and various related tradeoffs are discussed. Compression and security are strongly linked in that they are bounded optimally by the most random sequence that can be produced. But smallest program size is not the only or even most important performance metric. Execution time is and example of another metric that must be considered. The tradeoff is indicated in Figure 4, where the possibility of having programs with large K(X) and small execution time and vice versa are highlighted.Through the use of active network techniques [11] the tradeoff indicated may be dynamically addressed using a concept called Active Packet Morphing for network optimization. As shown in Figure 5, changing the form of information from data to code as information flows through a system can optimize CPU resources and bandwidth resources.optimization.This idea can be extended to optimize or prevent adverse effects from critical resources in addition to bandwidth and CPU. Memory, time of execution or buffer space could be use to trade off forms of data representation to optimize certain system parameters. The ability of data to change form within a system opens up multiple optimization paths that were previously invariant in the system. Rigorous security quantification resulting from this work allows active packets to morph by adding the required security overhead along specific communication links such that the security of the link along with the security of the morphed packet yield the proper level of security required by a given policy. Thus, security overhead is minimized.Another parameter that can optimize system resources is the knowledge of how a piece of data is used. MP3 audio is a good example of how leaving out information (specifically that which is undetectable by the human ear) can optimize data size. We introduce here the idea of “necessary” data to augment the idea of “sufficient” data or sufficient statistics that represent all information contained in the original data [3]. Sufficient representation of data contains all the information that the source data contains. Necessary data contains only the information that the source data contains that the destination instrument can effectively use. If one efficiently encapsulates all the information in source data in a statistical parameter one may have achieved a minimum sufficient statistic. If one further reduces this statistic such that one encapsulates only the information that is usable by the end node, one has obtained the minimal necessary sufficient statistic. Thus, KolmogorovComplexity related ideas have tremendous impact for system optimization as well as security.7. Automated Discovery of Vulnerabilities without a priori Knowledge of Vulnerability TypesAn information system can be designed in such a manner that the apparent complexity of the system under attack can be determined with respect to the attacker and that information used to maximize the distance in the apparent complexity between the attacker and defenders in an automatically reconstituted system. An Active Network [11] is an ideal environment in which to experiment withan implementation of automated system reconstitution because it provides extreme flexibility in fine-grain code movement and composition of code. Apparent complexityis used to reconstitute the system such that the complexity difference is maximized between legitimate users and attackers of the system. In this section, the discussion is limited to the automated hardening of a system based upon information about an attacker and a new form of vulnerability analysis, called complexity-based vulnerability analysis.The motivation for complexity-based vulnerability analysis comes from the fact that vulnerability analysis tools today require types of vulnerabilities to be known a priori. This is unacceptable, but understandable given the challenge of finding all potential vulnerabilities in a system. Information assurance is a hard problem in part because it involves the application of the scientific method by a defender to determine a means of evaluating and thwarting the scientific method applied by an attacker. This self-reference of scientific methods would seem to imply a non-halting cycle of hypotheses and experimental validation being applied by both offensive and defensive entities. Information assurance depends upon the ability to discover the relationships governing this cycle and then quantifying and measuring the progress made by both an attacker and defender. This work attempts to lay the foundation for quantifying information assurance.Quantification is necessary because tools have been developed to measure and analyze security assuming rigorously defined security metrics exist. See [12] for an example of such a tool whose sample vulnerability chain output is shown in Figure 6. The numbers shown in Figure 6 are opportunities for an attacker to move across vulnerabilities. More precisely, tools such as these rely onan “insecurity flow” metric. However, a rigorously defined metric has not yet been derived. One focus of this paper is upon mathematically quantifying and refining insecurity flows. It is extremely important that a proper metric space is chosen, because the entire foundation of Information Assurance will rest upon this space. Particularly notable is the fact that relationships involving physics of information are being developed whose operations will be facilitated by the choice of metric.Figure 6. Vulnerability results from an analysis tool.Any vulnerability analysis technique for information assurance must account for the innovation of an attacker. Such a metric was suggested about 700 years ago by William of Ockham [13]. Ockham’s Razor has been the basis of much of this paper and the complexity-based vulnerability method to be presented. The salient point of Ockham’s Razor and complexity-based vulnerability analysis is that the better one understands a phenomenon, the more concisely the phenomenon can be described. This is the essence of the goal of science: to develop theories that require a minimal amount of irrelevant information, all the knowledge required to describe a phenomenon should be algorithmically contained in formulae.7.1 Mozart and Vulnerability AnalysisScience is art and art is science. One of the most mathematical of art forms is the composition of music. Music is compressed and transported over the Internet very frequently and most listeners of such music probably have little interest in the compression ratio of a particular piece of music. However, this piece of information can be very interesting and informative with regard to the complexity of a piece of music. One would expect an incompressible piece of music to be highly complex; perhaps bordering on random noise; while a highly compressible piece of music would have a very simple repetitive nature. Most people would probably prefer music that falls in a mid-level range of complexity; sounds that are not repetitious and boring yet not random and annoying, but follow an internal pattern in the listeners’ minds. Music is a mathematical sequence that the composer is posing to the listener; the more easily the listener can extrapolate the sequence without being too challenged or too bored the more pleasing the music sounds. Carrying the music analogy forward in a more。
心率变异信号的高阶复杂性分析
心率变异信号的高阶复杂性分析朱家富;杨浩【期刊名称】《西南师范大学学报(自然科学版)》【年(卷),期】2005(030)001【摘要】心率变异性反映了交感神经和迷走神经对心血管系统的综合调节作用,是评价心血管系统功能的重要指标.复杂度是刻画时间信号序列信息量的一个重要参数,但其传统算法中的过分粗略化会丢失大量的有用信息,而高阶复杂度的引入可较大程度地避免这一问题.分别对25例正常人样本和25例充血性心力衰竭病人样本的心率变异信号的1~10阶Kolmogorov复杂度进行了计算与对比分析,结果表明,5阶Kolmogorov复杂度在临床医学上可以为分析心率变异信号获得最为理想的效果.%The heart rate variability(HRV) reflects the synthetical function of the parasympathetic-sympathetic nerve to adjust and control the heart and blood system and is an important index to characterize the heart and blood system.The complexity is a valuable parameter to represent the information contained in a time series,because the algorithm to measure complexity is so rough that part of the information contained in the time series is lost.To avoid loss of information,higher order complexity measure is used in this research and the Kolmogorov complexity indexes of two groups of samples consisting of 25 normal persons and 25 slight congestive heart failure patients are calculated and analyzed.One important conclusion obtained in this research is that the 5th order complexity is the most suitable order to represent the HRV.【总页数】5页(P59-63)【作者】朱家富;杨浩【作者单位】重庆大学,电气工程学院,重庆,400044;渝西学院,物理与电子信息工程系,重庆,402168;重庆大学,电气工程学院,重庆,400044【正文语种】中文【中图分类】R318.04【相关文献】1.多尺度熵在心率变异信号复杂性分析中的应用 [J], 蔡瑞;卞春华;宁新宝2.基于高阶复杂性测度的心率变异信号分析 [J], 朱家富;杨浩;何为3.基于心率变异性信号的睡眠呼吸暂停检测方法 [J], 牛艳玲;刘健;朱一荻;李翔;李锦;凤飞龙4.力竭运动后女子运动员脑电信号与心率变异性的变化研究 [J], 杜维平;庞耀辉;张越;池爱平5.心率变异性与脑白质高信号严重程度的相关性分析 [J], 康丽媛;徐辉;姚燕雯;白宏英因版权原因,仅展示原文概要,查看原文内容请购买。
库洛米折法
库洛米折法库洛米折法(Kolmogorov Complexity)是一种用于衡量信息的压缩程度的数学概念。
它由苏联数学家安德烈·尼古拉耶维奇·库洛莫戈洛夫于1965年提出,是信息论和计算复杂性理论的重要组成部分。
1. 什么是库洛米折法?库洛米折法衡量的是一个对象的信息内容的大小。
它定义为将该对象通过某种算法进行压缩所需的最小码长。
换句话说,库洛米折法是一个对象的最短描述长度。
这个概念可以用来衡量信息的复杂性和随机性。
2. 库洛米折法的计算方法计算一个对象的库洛米折法并不是一件容易的事情,因为它涉及到找到该对象的最短描述。
理论上,库洛米折法是无法计算的,因为我们无法找到一个通用的算法来计算任意对象的最短描述。
然而,在实际应用中,我们可以使用一些启发式算法来近似计算一个对象的库洛米折法。
这些算法通常基于某种压缩算法,如Lempel-Ziv算法,通过对对象进行压缩来估计其库洛米折法。
3. 库洛米折法的应用库洛米折法在计算机科学和信息论中有广泛的应用。
下面是一些库洛米折法的应用示例:3.1 数据压缩库洛米折法可以用来评估不同压缩算法的效果。
一个好的压缩算法应该能够将对象的库洛米折法尽可能地减小,从而达到更高的压缩比。
3.2 数据挖掘库洛米折法可以用来衡量数据集的复杂性。
一个复杂的数据集可能需要更长的描述长度,而一个简单的数据集则可以用更短的描述来表示。
通过计算数据集的库洛米折法,我们可以评估数据集的复杂性,并选择合适的数据挖掘算法来处理它。
3.3 算法复杂性库洛米折法可以用来评估算法的复杂性。
一个复杂的算法可能需要更长的描述长度来表示,而一个简单的算法则可以用更短的描述来表示。
通过计算算法的库洛米折法,我们可以比较不同算法的复杂性,并选择合适的算法来解决问题。
4. 库洛米折法的局限性尽管库洛米折法在理论上是一个很有用的概念,但它也存在一些局限性。
4.1 不可计算性如前所述,计算一个对象的库洛米折法是不可计算的。
水文系统复杂性研究
:x(t
+ 1)
<
x(t )
两种排列。
11⎥ ⎥
3 ⎥⎦
我们可以看出,在原序列中,排列 π 1 出现 4 次,排列 π 2 出现 2 次。则该序列相应的排
列熵为:
∑ H (2)
=
2
−
i =1
p(π k ) log[p(π k )] =
−
4 6
log
2
⎜⎛ ⎝
4 6
⎟⎞ ⎠
−
2 6
log
2
⎜⎛ ⎝
2 6
1 排列熵
设一维时间序列为 {x(t),t = 1,2,L,T}。将此时间序列延拓为 n 维相空间的一个相
型分布为[8-9]:
X i = [x(i), x(i + L),L, x(i + (n −1)L)] ; i = 1,2,L,T − n + 1
(1)
式中, n 为相空间维数; L 为延迟时间,一般取为 L = 1 。
和 n = 3 ,说明排列熵的计算。
当 n = 2 时,序列可延拓为T − n + 1 = 7 − 2 + 1 = 6 个矢量组成的相型,即:
X1 ⎡ 4
X
2
⎢ ⎢
7
X X
3 4
⎢9 ⎢⎢10
X
5
⎢ ⎢
6
X 6 ⎢⎣11
7⎤
9
⎥ ⎥
10⎥
6
⎥ ⎥
。每个矢量
Xi
有π1
:x(t )
<
x(t
+ 1)
和π2
算法复杂度 C0 、Pincus近似熵A pEn和徐京华复杂度C , 研究了河川径流序列的复杂特性,
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
Kolmogorov Complexity with ErrorLance Fortnow1,Troy Lee2,and Nikolai Vereshchagin31University of Chicago,1100E.58th Street,Chicago,IL60637.fortnow@,/~fortnow2CWI and University of Amsterdam,413Kruislaan,1098SJ Amsterdam,TheNetherlands.tlee@cwi.nl,http://www.cwi.nl/~tlee3Moscow State University,Leninskie Gory,Moscow,Russia119992.ver@mccme.ru,http://lpcs.math.msu.su/~verSupported in part by the RFBR grants03-01-00475,358.20003.1.Work done whilevisiting CWI.Abstract.We introduce the study of Kolmogorov complexity with er-ror.For a metric d,we define C a(x)to be the length of a shortest programp which prints a string y such that d(x,y)≤a.We also study a condi-tional version of this measure C a,b(x|y)where the task is,given a stringy′such that d(y,y′)≤b,print a string x′such that d(x,x′)≤a.This def-inition admits both a uniform measure,where the same program shouldwork given any y′such that d(y,y′)≤b,and a nonuniform measure,where we take the length of a program for the worst case y′.We studythe relation of these measures in the case where d is Hamming distance,and show an example where the uniform measure is exponentially largerthan the nonuniform one.We also show an example where symmetry ofinformation does not hold for complexity with error under either notionof conditional complexity.1IntroductionKolmogorov complexity measures the information content of a string typically by looking at the size of a smallest program generating that string.Suppose we received that string over a noisy or corrupted channel.Such a channel could change random bits of a string,possibly increasing its Kolmogorov complexity without adding any real information.Alternatively,suppose that we do not have much memory and are willing to sacrificefidelity to the original data in order to save on compressed size.What is the cheapest approximation to a string within our level of tolerance to distortion? Such compression where some,less important we hope,information about the original data is lost is known as lossy compression.Intuitively,these scenarios are in some sense complementary to one another: we expect that if we lossy compress a string received over a corrupted channel2with our level of tolerance equal to the number of expected errors,then thecheapest string within the level of tolerance will be the one with the high com-plexity noise removed.Ideally we would get back our original string.For certain compression schemes and models of noise this intuition can be made precise[8].In this paper we explore a variation of Kolmogorov complexity designed tohelp us measure information in these settings.We define the Kolmogorov com-plexity of a string x with error a as the length of a smallest program generating a string x′that differs from x in at most a bits.We give tight bounds(up tologarithmic factors)on the maximum complexity of such strings and also look at time-bounded variations.We also look at conditional Kolmogorov complexity with errors.Traditionalconditional Kolmogorov complexity looks at the smallest program that convertsa string y to a string x.In our context both x and y could be corrupted.We want the smallest program that converts a string close to y to a string close to x.We consider two variations of this definition,a uniform version where we have asingle program that that converts any y′close to y to a string x′close to x and a nonuniform version where the program can depend on y′.We show examplesgiving a large separation between the uniform and nonuniform definitions.Finally we consider symmetry of information for Kolmogorov complexity witherror.Traditionally the complexity of the concatenation of strings x,y is roughly equal to the sum of the complexity of x and the complexity of y given x.Weshow that for any values of d and a the complexity of xy with error d is at most the sum of the complexity of x with error a and the complexity of converting astring y with d−a error given x with a bits of error.We show the other directionfails in a strong sense—we do not get equality for any a.2PreliminariesWe use|x|to denote the length of a string x,and A to denote the cardinalityof a set A.All logarithms are base2.We use d H(x,y)to denote the Hamming distance between two binary strings x,y,that is the number of bits on which they differ.For x∈{0,1}n we letB n(x,R)denote the set of n-bit strings within Hamming distance R from x, and V(n,R)= R i=0 n i denote the volume of a Hamming ball of radius R over n-bit strings.For0<λ≤1/2the binary entropy ofλis H(λ)=−λlogλ−(1−λ)log(1−λ).The binary entropy is useful in the following approximation of V(n,R)which we will use on several occasions(a proof can be found in[1]).Lemma1.Suppose that0<λ≤1/2andλn is an integer.Then2nH(λ)≤V(n,λn)≤2nH(λ).8nλ(1−λ)3 3Defining Kolmogorov Complexity with ErrorWe consider several possible ways of defining Kolmogorov complexity with error. In this section we present these alternatives in order to evaluate their relative merits in the coming sections.First,we review the standard definition of Kol-mogorov complexity.More details can be found in[6].For a Turing machine T,the Kolmogorov complexity C T(x|y)of x given y is the length of a shortest program p such that T(p,y)=x.The theory of Kolmogorov complexity begins from the following invariance theorem:there is a universal machine U such that for any other Turing machine T,there exists a constant c T such that C U(x|y)≤C T(x|y)+c T,for all x,y.We nowfix such a U and drop the subscript.Now we define also the unconditional Kolmogorov complexity C(x)=C(x|empty string).Definition1.Let d:({0,1}n)2→R be a metric,and a∈R.The complexity of x with error a,denoted C a(x)is C a(x)=min x′{C(x′):d(x′,x)≤a}.We will also consider a time bounded version of this definition,C t a(x)= min x′{C t(x′|empty string):d(x,x′)≤a},where C t(x|y)is the length of a shortest program p such that U(p,y)prints x in less than t(|x|+|y|)time steps. Here we assume that the machine U is universal in the following sense:for any other Turing machine T,there exists a constant c T and a polynomial q such thatC q(|x|,|y|,t) U (x|y)≤C t T(x|y)+c T,for all x,y,t.A relative version of Kolmogorov complexity with error is defined by Im-pagliazzo,Shaltiel and Wigderson[4].That is,they use the definition Cδ(x)= min{C(y):d H(x,y)≤δ|x|}.We prefer using absolute distance here as it be-haves better with respect to concatenations of strings—using relative distance has the disadvantage of severe nonmonotonicity over prefixes.Take,for example, x∈{0,1}n satisfying C(x)≥n.Let y=02n.Then C1/3(x)≥n−log V(n,n/3) while C1/3(xy)≤log n+O(1).Using absolute error we have that C a(xy)≥C a(x)−O(log n),that is it only suffers from logarithmic dips as with standard definition.Defining conditional complexity with error is somewhat more subtle.We introduce both uniform and nonuniform versions of conditional complexity with error.Definition2.For a Turing machine T,the uniform conditional complexity, denoted(C u a,b)T(x|y),is the length of a shortest program p such that,for any y′satisfying d(y,y′)≤b it holds that T(p,y′)outputs a string whose distance from x is less than a.The invariance theorem remains true:there is a universal machine U such that for any other Turing machine T,there exists a constant c T such that (C u a,b)U(x|y)≤(C u a,b)T(x|y)+c T,for all x,y,a,b.Wefix such a U and drop the subscript.Definition3.Nonuniform conditional complexity,which we denote C a,b(x|y) is defined as C a,b(x|y)=max y′min x′{C(x′|y′):d(x′,x)≤a and d(y′,y)≤b}.In section6we study the difference between these two measures.44Strings of Maximal ComplexityOne of the most famous applications of Kolmogorov complexity is the incom-pressibility method(see[6],Chapter6).To prove there exists an object with a certain property,we consider an object with maximal Kolmogorov complexity and show that it could be compressed if it did not possess this property.This method relies on a simple fact about strings of maximal complexity: for every length n,there is a string x of complexity at least n.This follows from simple counting.It is also easy to see that,up to an additive constant, every string has complexity at most its length.What is the behavior of maximal complexity strings in the error case?In this paper we restrict ourselves to the Hamming distance case.Again by a counting argument,we see that for every n there is an x of length n with C a(x)≥log2n/V(n,a)=n−log V(n,a).Upper bounding the complexity of strings in the error case requires a bit more work,and has a close connection with the construction of covering codes.A covering code C of radius a is a set of strings such that for every x∈{0,1}n there is an element y∈C such that d H(x,y)≤a.Thus an upper bound on the maximum complexity strings will be given by the existence of covering codes of small size.The following Lemma is well known in the covering code literature,(see[1]or[5]).Lemma2.For any n and integer R≤n,there exists a set C⊆{0,1}n with the following properties:1. C ≤n2n/V(n,R)2.for every x∈{0,1}n,there exists c∈C with d H(x,c)≤R3.The set C can be computed in time poly(2n)Proof:For thefirst two items we argue by the probabilistic method.Fix a point x∈{0,1}n.We uniformly at random choose k elements x1,...,x k of{0,1}n. The probability P x that x is not contained in∪k i=1B(x i,R)is preciselyP x=(1−V(n,R)/2n)k≤e−kV(n,R)/2n.For the inequality we have used the fact that e z≥1+z for any z.Taking k to be n2n/V(n,R)makes this probability strictly less than2−n.Thus the probability of the union of the events P x over x∈{0,1}n is,by the union bound,less than 1and there exists a set of n2n/V(n,R)centers which cover{0,1}n.This gives items1and2.For item3we now derandomize this argument using the method of condi-tional probabilities.The argument is standard as found in[7],and omitted here.To achieve part3of Lemma2one could alternatively apply a general theorem that the greedy algorithm alwaysfinds a covering of a set X of size at most a ln X multiplicative factor larger than the optimal covering(see Corollary37.5 in[2]).This would give the slightly worse bound of O(n22n/V(n,R)).5 Theorem1.For every n,a and x∈{0,1}n,C a(x)≤n−log V(n,a)+O(log n).Proof:Use the lexicographicallyfirst covering code of radius a whose existence is given by Lemma2.One nice property of covering codes is that they behave very well under concatenation.Let C1be a covering code of{0,1}n1of radius R1and C2be a covering code of{0,1}n2of radius R2.Now let C={cc′:c∈C1,c′∈C2}be the set of all ordered concatenations of codewords from C1with codewords from C2. Then C is a covering code over{0,1}n1+n2of radius R1+R2.We can use this idea in combination with item3of Lemma2to efficiently construct near-optimal covering codes.This construction has already been used for a complexity-theoretic application in[3].Theorem2.There is a polynomial time bound p(n)such that C p(n)a(x)≤n−log V(n,a)+O(n log log n/log n)for every x∈{0,1}n and every a.Proof:We construct a covering code over{0,1}n with radius a such that the i th element of the covering can be generated in time polynomial in n.Letℓ=log n and divide n into n/ℓblocks of lengthℓ.Let r=(a/n)ℓ.Now by item3of Lemma2we can in time polynomial in n construct a covering code over{0,1}ℓof radius r and of cardinalityℓ2ℓ/V(ℓ,r).Call this covering Cℓ.Our covering code C over{0,1}n will be the set of codewords{c1c2···c n/ℓ:c i∈Cℓ}.The size of this code will be:C ≤(2ℓ−log V(ℓ,r)+logℓ)n/ℓ=(2ℓ−ℓH(a/n)+O(logℓ))n/ℓ(1)=2n−nH(a/n)+O(n logℓ/ℓ)=2n−log V(n,a)+O(n logℓ/ℓ).The second and last inequalities hold by Lemma1.In this proof we assumed that log n,n/log n,and a log n/n are all integer. The general case follows with simple modifications.5Dependence of Complexity on the Number of Allowed ErrorsBoth the uniform and the non-uniform conditional complexities C u a,b and C a,b are decreasing functions in a and increasing in b.Indeed,if b decreases and a increases then the number of y′’s decreases and the number of x′’s increases, thus the problem to transform every y′to some x′becomes easier.What is the maximal possible rate of this decrease/increase?For the uniform complexity, we have no non-trivial bounds.For the non-uniform complexity,we have the followingTheorem3.For all x,y of length n and all a≤a′,b′≤b it holdsC a,b(x|y)≤C a′,b′(x|y)+log(V(n,a)/V(n,a′))+log(V(n,b′)/V(n,b))+O(log n).6Proof:Let y′be a string at distance b from y.We need tofind a short programmapping it to a string at distance a from x.To this end we need the followinglemma from[9].Lemma3.For all d≤d′≤n having the form i/n,every Hamming ball of radius d′in the set of binary strings of length n can be covered by at mostO(n4V(n,d′)/V(n,d))Hamming balls of radius d.Apply the lemma to d′=b,d=b′and to the ball of radius b centered at y′.Let B1,...,B N,where N=O(n4V(n,b)/V(n,b′)),be the covering balls.Let B i be a ball containing the string y and let y′′be its center.There is a program,call it p,of length at most C a′,b′(x|y)mapping y′′to a string at distance a′from x.Again apply the lemma to d=a,d′=a′and to the ball of radiusd′centered at x′.Let C1,...,C M,where M=O(n4V(n,a′)/V(n,a)),be thecovering balls.Let C j be a ball containing the string x and let x′′be its center. Thus x′′is at distance a from x and can be found from y′,p,i,j.This implies that K(x′′|y′)≤|p|+log N+log M+O(log n)(extra O(log n)bits are needed to separate p,i and j).In the above proof,it is essential that we allow the program mapping y′to a string close to x depend on y′.Indeed,the program is basically the triple(p,i,j) where both i and j depend on y′.Thus the proof is not valid for the uniform conditional complexity.And we do not know whether the statement itself is true for the uniform complexity.By using Theorem2one can prove a similar inequality for time bounded complexity with the O(log n)error term replaced by O(n log log n/log n).6Uniform vs.Nonuniform Conditional ComplexityIn this section we show an example where the uniform version of conditional complexity can be exponentially larger than the nonuniform one.Our example will be for C0,b(x|x).This example is the standard setting of error correction: given some x′such that d H(x,x′)≤b,we want to recover x exactly.An obvious upper bound on the nonuniform complexity C0,b(x|x)is log V(n,b)+O(1)—as we can tailor our program for each x′we can simply say the index of x in the ball of radius b around x′.In the uniform case the same program must work for every x′in the ball of radius b around x and the problem is not so easy.The following upper bound was pointed out to us by a referee.(x|x)≤log V(n,2b)+O(1).Proposition1.C u0,bProof:Let C⊆{0,1}n be a set with the properties:1.For every x,y∈C:B n(x,b)∩B n(y,b)=∅.2.For every y∈{0,1}n∃x∈C:d H(x,y)≤2b.7 We can greedily construct such a set as if there is some string y with no string x∈C of distance less than2b,then B n(y,b)is disjoint from all balls of radius b around elements of C and so we can add y to C.Now for a given x,let x∗be the closest element of C to x,with ties broken by lexicographical order.Let z=x⊕x∗.By the properties of C this string has Hamming weight at most2b and so can be described with log V(n,2b)bits. Given input x′with d H(x,x′)≤b,our program does the following:computes the closest element of C to x′⊕z,call it w,and then outputs w⊕z=w⊕x∗⊕x. Thus for correctness we need to show that w=x∗or in other words that d H(x′⊕z,x∗)≤b.Notice that d H(α⊕β,β)=d H(α,0),thusd H(x′⊕z,x∗)=d H(x′⊕x⊕x∗,x∗)=d H(x′⊕x,0)=d H(x,x′)≤b.We now turn to the separation between the uniform and nonuniform mea-sures.The intuition behind the proof is the following:say we have some com-putable family S of Hamming balls of radius b,and let x be the center of one of these balls.Given any x′such that d H(x,x′)≤b,there may be other centers of the family S which are also less than distance b from x′.Say there are k of them. Then x has a nonuniform description of size about log k by giving the index of x in the k balls which are of distance less than b from x′.In the uniform case,on the other hand,our program can no longer be tailored for a particular x′,it must work for any x′such that d H(x,x′)≤b.That is, intuitively,the program must be able to distinguish the ball of x from any other ball intersecting the ball of x.To create a large difference between the nonuniform and uniform conditional complexity measures,therefore,we wish to construct a large family of Hamming balls,every two of which intersect,yet that no single point is contained in the intersection of too many balls.Moreover,we can show the stronger statement that C0,b(x|x)is even much smaller than C u a,b(x|x),for a non-negligible a.For this,we further want that the contractions of any two balls to radius a are disjoint.The next lemma shows the existence of such a family.Lemma4.For every length m of strings and a,b,and N satisfying the inequal-itiesN2V(m,2a)≤2m−1,N2V(m,m−2b)≤2m−1,NV(m,b)≥m2m+1(2) there are strings x1,...,x N such that the balls of radius a centered at x1,...,x N are pairwise disjoint,and the balls of radius b centered at x1,...,x N are pairwise intersecting but no string belongs to more than NV(m,b)21−m of them.Proof:The proof is by probabilistic arguments.Take N independent random strings x1,...,x N.We will prove that with high probability they satisfy the statement.First we estimate the probability that there are two intersecting balls of radius a.The probability that twofixed balls intersect is equal to V(m,2a)/2m.8The number of pairs of balls is less than N2/2,and by union bound,there are two intersecting balls of radius a with probability at most N2V(m,2a)/2m+1≤1/4 (use thefirst inequality in(2)).Let us estimate now the probability that there are two disjoint balls of radius b.If the balls of radius b centered at x j and x i are disjoint then x j is at distance at most m−2b from the string¯x i,that is obtained from x i byflipping all bits. Therefore the probability that for afixed pair(i,j)the balls are disjoint is at most V(m,m−2b)/2m.By the second inequality in(2),there are two disjoint balls with probability at most1/4.It remains to estimate the probability that there is a string that belongs to more than NV(m,b)21−m balls of radius b.Fix x.For every i the probability that x lands in B i,the ball of radius b centered at x i,is equal to p=|B i|/2m= V(m,b)/2m.So the average number of i with x∈B i is pN=NV(m,b)/2m.By Chernoffinequality the probability that the number of i such that x lands in B i exceeds twice the average is at mostexp(−pN/2)=exp(−NV(m,b)/2m+1)≤exp(−m)≪2−m(use the third inequality in(2)).Thus even after multiplying it by2m the number of different x’s we get a number close to0.Using this lemma wefind x with exponential gap between C0,b(x|x)and C u0,b(x|x)and even between C0,b(x|x)and C u a,b(x|x)for a,b linear in the length n of x.Theorem4.Fix rational constantsα,β,γsatisfyingγ≥1and0<α<1/4<β<1/2,2H(β)>1+H(2α),2H(β)>1+H(1−2β)(3)Notice that ifβis close to1/2andαis close to0then these inequalities are satisfied.Then for all sufficiently large m there is a string x of length n=γm with C0,βm(x|x)=O(log m)while C uαm,βm(x|x)≥m(1−H(β))−O(log m).Proof:Given m let a=αm,b=βm and N=m2m+1/V(m,b).Let us verify that for large enough m the inequalities(2)in the condition of Lemma4are fulfilled.Taking the logarithm of thefirst inequality(2)and ignoring all terms of order O(log m)we obtain2(m−mH(β))+mH(2α)<mThis is true by the second inequality in(3).Here we used that,ignoring loga-rithmic terms,log V(m,b)=mH(β)and log V(m,2a)=mH(2α)as bothβ,2αare less than1/2.Taking the logarithm of the second inequality(2)we obtain2(m−mH(β))+mH(1−2β)<m.This is implied by the third inequality in(3).Finally,the last inequality(2) holds by the choice of N.9 Find thefirst sequence x1,...,x N satisfying the lemma.This sequence has complexity at most C(m)=O(log m).Append0n−m to all strings x1,...,x N. Obviously the resulting sequence also satisfies the lemma.For each string x i we have C0,b(x i|x i)=O(log m),as given any x′at distance at most b from x i we can specify x i by specifying its index among centers of the balls in the family containing x′in log(NV(m,b)21−m)=log4m bits and specifying the family itself in O(log m)bits.It remains to show that there is x i with C u a,b(x i|x i)≥log N.Assume the contrary and choose for every x i a program p i of length less than log N such that U(p,x′)is at distance a from x i for every x′at distance at most b from x i. As N is strictly greater than the number of strings of length less than log N,by the Pigeon Hole Principle there are different x i,x j with p i=p j.However the balls of radius b with centers x i,x j intersect and there is x′at distance at most b both from x i,x j.Hence U(p,x′)is at distance at most a both from x i,x j,a contradiction.Again,at the expense of replacing O(log m)by O(m log log m/log m)we can prove an analog of Theorem4for time bounded complexity.We defer the proof to thefinal version.Theorem5.There is a polynomial p such that for all sufficiently large m thereis a string x of length n=γm with C p(n)0,βm (x|x)=O(m log log m/log m)whileC uαm,βm(x|x)≥m(1−H(β))−O(m log log m/log m).(Note that C u has no time bound;this makes the statement stronger.)7Symmetry of InformationThe principle of symmetry of information,independently proven by Kolmogorov and Levin[10],is one of the most beautiful and useful theorems in Kolmogorov complexity.It states C(xy)=C(x)+C(y|x)+O(log n)for any x,y∈{0,1}n. The direction C(xy)≤C(x)+C(y|x)+O(log n)is easy to see—given a program for x,and a program for y given x,and a way to tell these programs apart,we can print xy.The other direction of the inequality requires a clever proof.Looking at symmetry of information in the error case,the easy direction is again easy:The inequality C d(xy)≤C a(x)+C d−a,a(y|x)+O(log n)holds for any a—let p be a program of length C a(x)which prints a string x∗within Hamming distance a of x.Let q be a shortest program which,given x∗,prints a string y∗within Hamming distance d−a of y.By definition,C d−a,a(y|x)= max x′min y′C(y′|x′)≥min y′C(y′|x∗)=|q|.Now given p and q and a way to tell them apart,we can print the string xy within d errors.For the converse direction we would like to have the statementFor every d,x,y there exists a≤d such thatC d(xy)≥C a(x)+C d−a,a(y|x)−O(log n).(∗)We do not expect this statement to hold for every a,as the shortest program for xy will have a particular pattern of errors which might have to be respected10in the programs for x and y given x.We now show,however,that even the formulation(∗)is too much to ask.Theorem6.For every n and all d≤n/4there exist x,y∈{0,1}n such that for all a≤d the difference∆(a)=(C a(y)+C d−a,a(x|y))−C d(xy)is more than bothlog V(n,d)−log V(n,a),log V(n,d+a)−log V(n,d−a)−log V(n,a), up to an additive error term of the order O(log n).Since C u d−a,a(x|y)≥C d−a,a(x|y),Theorem6holds for uniform conditional com-plexity as well.Before proving the theorem let us show that in the case,say,d=n/4it implies that for some positiveεwe have∆(a)≥εn for all a.Letα<1/4be the solution to the equationH(1/4)=H(1/4+α)−H(1/4−α).Note that the function in the right hand side increases from0to1asαincreases from0to1/4.Thus this equation has a unique solution.Corollary1.Let d=n/4and let x,y be the strings existing by Theorem6. Then we have∆(a)≥n(H(1/4)−H(α))−O(log n)for all a.The proof is simply a calculation and is omitted.Now the proof of Theorem6. Proof:Coverings will again play an important role in the proof.Let C be the lexicographicallyfirst minimal size covering of radius d.Choose y of length n with C(y)≥n,and let x be the lexicographically least element of the covering within distance d of y.Notice that C d(xy)≤n−log V(n,d),as the string xx is within distance d of xy,and can be described by giving a shortest program for x and a constant many more bits saying“repeat”.(In the whole proof we neglect additive terms of order O(log n)).Let us provefirst that C(x)=n−log V(n,d) and C(y|x)=log V(n,d1)=log V(n,d),where d1stands for the Hamming distance between x and y.Indeed,n≤C(y)≤C(x)+C(y|x)≤n−log V(n,d)+C(y|x)≤n−log V(n,d)+log V(n,d1)≤n.Thus all inequalities here are equalities,hence C(x)=n−log V(n,d)and C(y|x)=log V(n,d1)=log V(n,d).Let us prove now thefirst lower bound for∆(a).As y has maximal complex-ity,for any0≤a≤d we have C a(y)≥n−log V(n,a).Summing the inequalities−C d(xy)≥−n+log V(n,d),C a(y)≥n−log V(n,a),C d−a,a(x|y)≥0,11 we obtain the lower bound∆(a)≥log V(n,d)−log V(n,a).To prove the secondlower bound of the theorem,we need to show thatC d−a,a(x|y)≥log V(n,d+a)−log V(n,d−a)−log V(n,d).(4)To prove that C d−a,a(x|y)exceeds a certain value v we need tofind a y′at distance at most a from y such that C(x′|y′)≥v for all x′at distance at mostd−a from x.Let y′be obtained from y by changing a random set of a bits on which x and y agree.This means that C(y′|y,x)≥log V(n−d1,a).It sufficesto show thatC(x|y′)≥log V(n,d+a)−log V(n,d).Indeed,then for all x′at distance at most d−a from x we will haveC(x′|y′)+log V(n,d−a)≥C(x|y′)(knowing x′we can specify x by its index in the ball of radius d−a centered atx′).Summing these inequalities will yield(4).We use symmetry of information in the nonerror case to turn the task of lower bounding C(x|y′)into the task of lower bounding C(y′|x)and C(x).Thisworks as follows:by symmetry of information,C(xy′)=C(x)+C(y′|x)=C(y′)+C(x|y′).As C(y′)is at most n,using the second part of the equality we have C(x|y′)≥C(x)+C(y′|x)−n.Recall that C(x)=n−log V(n,d).Thus to complete the proof we need to show the inequality C(y′|x)≥log V(n,d+a),that is,y′isa random point in the Hamming ball of radius d+a with the center at x.Tothis end wefirst note that log V(n,d+a)=log V(n,d1+a)(up to a O(log n) error term).Indeed,as a+d≤n/2we have log V(n,d+a)=log n d+a and log V(n,d)=log n d .The same holds with d1in place of d.Now we will show that log V(n,d)−log V(n,d1)=O(log n)implies that log V(n,d+a)−log V(n,d1+a)=O(log n).It is easy to see that n d+1 / n d1+1 ≤ n d / n d1 provided d1≤d. Using the induction we obtain n d+a / n d1+a ≤ n d / n d1 .Thus we havelog V(n,d+a)−log V(n,d1+a)=log n d+a / n d1+a≤log n d / n d1 =log V(n,d)−log V(n,d1)=O(log n).Again we use(the conditional form of)symmetry of information:C(y′y|x)=C(y|x)+C(y′|y,x)=C(y′|x)+C(y|y′,x).The string y differs from y′on a bits out of the d1+a bits on which y′and x differ.Thus C(y|y′,x)≤log d1+a a .Now using the second part of the equality12we haveC(y′|x)=C(y|x)+C(y′|y,x)−C(y|y′,x)≥log V(n,d1)+log V(n−d1,a)− d1+a a . We have used that log V(n−d1,a)=log n−d1a ,as a≤(n−d1)/2.Hence, C(y′|x)≥log n d1 +log n−d1a −log d1+a a =log V(n,d+a).Again,at the expense of replacing O(log n)by O(n log log n/log n)we can prove an analog of Theorem6for time bounded complexity.AcknowledgmentWe thank Harry Buhrman for several useful discussions and the anonymous referees for valuable remarks and suggestions.References1.G.Cohen,I.Honkala,S.Litsyn,and A.Lobstein.Covering Codes.North-Holland,Amsterdam,1997.2.T.Cormen,C.Leiserson,and R.Rivest.Introduction to Algorithms.MIT Press,1990.3. E.Dantsin,A.Goerdt,E.Hirsch,and U.Sch¨o ning.Deterministic algorithms fork-SAT based on covering codes and local search.In Proceedings of the27th Inter-national Colloquium On Automata,Languages and Programming,Lecture Notes in Computer Science,pages236–247.Springer-Verlag,2000.4.R.Impagliazzo,R.Shaltiel,and A.Wigderson.Extractors and pseudo-randomgenerators with optimal seed length.In Proceedings of the32nd ACM Symposium on the Theory of Computing,pages1–10.ACM,2000.5.M.Krivelevich,B.Sudakov,and V.Vu.Covering codes with improved density.IEEE Transactions on Information Theory,49:1812–1815,2003.6.M.Li and P.Vit´a nyi.An Introduction to Kolmogorov Complexity and its Applica-tions.Springer-Verlag,New York,second edition,1997.7.R.Motwani and P.Raghavan.Randomized Algorithms.Cambridge UniversityPress,1997.8. B.Natarajan.Filtering random noise from deterministic signals via data compres-sion.IEEE transactions on signal processing,43(11):2595–2605,1995.9.N.Vereschagin and P.Vit´a nyi.Algorithmic rate-distortion theory./abs/cs.IT/0411014,2004.10. A.Zvonkin and L.Levin.The complexity offinite objects and the algorithmicconcepts of information and randomness.Russian Mathematical Surveys,25:83–124,1970.。