Information Theory



Information is uncertainty. Information theory tells us how to measure information and how to quantify the possibility of transmitting it, which can be counted in bits if desired. Quantum information offers intriguing new possibilities for information processing and computation.
Historical Notes


Claude E. Shannon (1916–2001) established in 1948, essentially by himself, almost everything we will talk about today. He was concerned with the communication aspects of information, and he was the first to use the term "bit."

Qubit vs. bit
Measurement and collapse; the no-cloning property
Parallel vs. sequential access
Aspects of Quantum Information: transmitting
Conclusion

The Channel Coding Theorem



The channel is characterized by its input X and output Y, with capacity C = max I(X;Y), the mutual information maximized over input distributions. If the coding rate is less than C, we can transmit with arbitrarily small error; if the coding rate is greater than C, errors are bound to occur. This limits the ability of a channel to convey information.
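As an illustration, consider the binary symmetric channel with crossover probability p, for which the optimal input is uniform and the capacity is C = 1 - H2(p). A minimal sketch (the function names are chosen just for this example):

    import math

    def h2(p):
        """Binary entropy in bits; h2(0) = h2(1) = 0 by convention."""
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    def bsc_capacity(p):
        """Capacity in bits per channel use of a binary symmetric channel
        with crossover probability p: C = 1 - h2(p)."""
        return 1.0 - h2(p)

    print(bsc_capacity(0.0))    # 1.0  -- a noiseless binary channel
    print(bsc_capacity(0.11))   # ~0.5 -- rates below ~0.5 bit/use are achievable
    print(bsc_capacity(0.5))    # 0.0  -- output independent of input; nothing gets through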
Digression: Information and Data Mining


Data mining is the process of making data meaningful, i.e., of extracting the statistical correlations in the data along certain aspects. In some sense, we can view it as a process of generating bases in which to represent the information content of the data.
Digression: Entropy in thermodynamics or statistical mechanics



Entropy is the measure of disorder of a thermodynamic system. Its definition is identical to that of information entropy, except that the summation now runs over all possible physical states. In fact, entropy was first introduced in thermodynamics, and Shannon found that his measure is precisely the entropy of physics!
Classical and Beyond

Quantum entanglement and its prospective application: quantum computing
How do they relate to classical information theory?
Quantum vs. Classical

Thanks for your attention!
Conditional Entropy and Mutual Information




If the objects (for example, random variables) are not independent of each other, then the total entropy does not equal the sum of the individual entropies.
Conditional entropy: H(Y|X) = ∑x p(x) H(Y|X=x), where H(Y|X=x) = -∑y p(y|x) log p(y|x)
Clearly, H(X,Y) = H(X) + H(Y|X)
Mutual information: I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y); it represents the information common to X and Y.
Mutual information is the overlap!
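A minimal numerical sketch of these identities for an example joint distribution (the specific probabilities below are arbitrary):

    import math

    def H(dist):
        """Entropy in bits of a distribution given as {outcome: probability}."""
        return sum(-p * math.log2(p) for p in dist.values() if p > 0)

    # Example joint distribution p(x, y) of two correlated binary variables.
    p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

    # Marginals p(x) and p(y).
    p_x = {x: sum(p for (a, _), p in p_xy.items() if a == x) for x in (0, 1)}
    p_y = {y: sum(p for (_, b), p in p_xy.items() if b == y) for y in (0, 1)}

    # Conditional entropy from the definition: H(Y|X) = sum_x p(x) H(Y|X=x).
    H_y_given_x = sum(
        p_x[x] * H({y: p_xy[(x, y)] / p_x[x] for y in (0, 1)}) for x in (0, 1)
    )

    # Check the identities stated above.
    assert abs(H(p_xy) - (H(p_x) + H_y_given_x)) < 1e-9       # H(X,Y) = H(X) + H(Y|X)
    I_xy = H(p_x) + H(p_y) - H(p_xy)                          # I(X;Y)
    assert abs(I_xy - (H(p_y) - H_y_given_x)) < 1e-9          # I(X;Y) = H(Y) - H(Y|X)

    print(I_xy)   # ~0.278 bits of information shared by X and Y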
Motivation: What is information?


By definition, information is knowledge about certain things, which may or may not be perceived by an observer.
Examples: music, stories, news, etc.
Information has meaning. Information propagates. Information corrupts…

Information is not always meaningful!
Motivation: Information Theory tells us…

What exactly information is
How it is measured and represented
Implications, limitations, and applications
A Brief Introduction to
Information Theory
12/2/2004 陳冠廷
Outline

Motivation and Historical Notes
Information Measure
Implications and Limitations
Classical and Beyond
Conclusion
Applications

The Source Coding Theorem
The Channel Coding Theorem
The Source Coding Theorem


To encode a random information source X into bits, we need at least H(X) bits per symbol on average (with the logarithm in base 2). That is why H(X) in base 2 carries the unit of bits. This establishes the possibility of lossless compression.
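As an illustration, a source with dyadic probabilities admits a prefix-free code that meets the H(X) bound exactly. A minimal sketch (the symbols, probabilities, and codewords are just an example):

    import math

    # A source with dyadic symbol probabilities and a matching prefix-free code.
    probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
    code  = {"a": "0", "b": "10", "c": "110", "d": "111"}

    H = sum(-p * math.log2(p) for p in probs.values())       # source entropy in bits
    avg_len = sum(probs[s] * len(code[s]) for s in probs)    # average codeword length

    # The source coding theorem says avg_len >= H for any uniquely decodable code.
    print(H, avg_len)   # 1.75 1.75 -- this code meets the H(X) lower bound exactly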
Information Measure: behavioral definitions




The information measure H should be additive for independent objects; i.e., for two information sources that have no relation to each other, H = H1 + H2.
H(X) = 0 if X is completely determined.
H should be maximized when the object is most unknown.
H is the information entropy!
Information Measure: Entropy



The entropy H(X) of a random variable X is defined by H(X) = -∑ p(x) log p(x). We can verify that this measure satisfies the three criteria stated above. If we choose the logarithm in base 2, then the entropy may be said to be in units of bits; the use of this unit will be clarified later.
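A short sketch verifying the three criteria numerically for a few example distributions (the specific numbers are arbitrary):

    import math
    from itertools import product

    def entropy(probs):
        """Shannon entropy in bits: H = -sum p log2 p (terms with p = 0 dropped)."""
        return sum(-p * math.log2(p) for p in probs if p > 0)

    # Criterion: H(X) = 0 when X is completely determined.
    print(entropy([1.0, 0.0, 0.0]))              # 0.0

    # Criterion: H is largest for the uniform (most unknown) distribution.
    print(entropy([0.25] * 4))                   # 2.0 = log2(4)
    print(entropy([0.7, 0.1, 0.1, 0.1]))         # ~1.357 < 2.0

    # Criterion: additivity for independent sources, H(X,Y) = H(X) + H(Y).
    px, py = [0.5, 0.5], [0.9, 0.1]
    joint = [a * b for a, b in product(px, py)]  # independence: p(x,y) = p(x) p(y)
    print(entropy(joint), entropy(px) + entropy(py))   # both ~1.469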
How do we define information?

The information within a source is the uncertainty of the source.
Example




Every time we roll a die, we get a number from 1 through 6. The information we get is larger than when we toss a coin or roll an unfair die. The less we know beforehand, the more information is contained!
For a source (random process) known to generate the sequence 01010101…, with 0 right after 1 and 1 right after 0, either symbol appears 50% of the time on average, yet the information is not that of a fair coin: once the rule is known, each next symbol is certain. If we know something for sure, we gain nothing.
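The entropies in this example can be computed directly; a minimal sketch (a biased coin is included for comparison):

    import math

    def entropy(probs):
        """Shannon entropy in bits."""
        return sum(-p * math.log2(p) for p in probs if p > 0)

    print(entropy([1/2, 1/2]))     # fair coin: 1.0 bit per toss
    print(entropy([1/6] * 6))      # fair die: ~2.585 bits per roll
    print(entropy([0.9, 0.1]))     # unfair coin: ~0.469 bits -- less surprise, less information
    print(entropy([1.0]))          # a symbol known for sure: 0 bits -- nothing is gained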