A Game Theoretic Formulation for Intrusion Detection in Mobile Ad Hoc Networks
Strategic Game Theory For Managers
R.E.Marks © 2003
Lecture 1-7
1. Strategic Decision Making
Business is war and peace.
➣ Cooperation in creating value.
➣ Competition in dividing it up.
➣ No cycles of War, Peace, War, … but simultaneously war and peace.
“You have to compete and cooperate at the same time.” — Ray Noorda of Novell.
It’s no good sticking to your knitting if there’s no demand for jumpers.
Question: High or low? You can choose Left or Right.

Profits:        You      Rival
  Left        $40 m     $20 m
  Right       $80 m    $160 m
❝Conventional economics takes the structure of markets as fixed.
People are thought of as simple stimulus-response machines. Sellers and buyers assume that products and prices are fixed, and they optimize production and consumption accordingly. Conventional economics has its place in describing the operation of established, mature markets, but it doesn’t capture people’s creativity in finding new ways of interacting with one another. Game theory is a different way of looking at the world. In game theory, nothing is fixed. The economy is dynamic and evolving. The players create new markets and take on multiple roles. They innovate. No one takes products or prices as given. If this sounds like the free-form and rapidly transforming marketplace, that’s why game theory may be the kernel of a new economics for the new economy.❞ — Brandenburger & Nalebuff Foreword to Co-opetition
Chongqing No. 8 Middle School, 2023–2024 Academic Year, First-Semester Midterm Examination, Senior One English

Part I: Listening Comprehension (two sections, 30 points)

Section 1 (5 questions, 1.5 points each; … points in total): Listen to the following 5 conversations.
After each conversation there is one question; choose the best answer from the three options A, B and C.
After each conversation you will have 10 seconds to answer the question and to read the next one.
Each conversation is played only once.

1. When will the charity party start?  A. At 5 p.m.  B. At 7 p.m.  C. At 9 p.m.
2. How is the weather now?  A. Rainy.  B. Cloudy.  C. Sunny.
3. Which is the best way for the man to get to the airport?  A. Taking the subway.  B. Catching a bus.  C. Getting a taxi.
4. What did Mary do last night?  A. She didn't return.  B. She went out late.  C. She held a fancy-dress party.
5. What are the speakers discussing?  A. A national holiday.  B. A TV programme.  C. The president.

Section 2 (15 questions; … points each, … points in total): Listen to the following 5 conversations or monologues.
After each conversation or monologue there are several questions; choose the best answer from options A, B and C and mark it in the corresponding place on the answer sheet.
Before listening to each conversation or monologue you will have time to read the questions, 5 seconds per question; after listening, you will have 5 seconds to answer each question.
Each conversation or monologue is played twice.

Listen to Material 6 and answer Questions 6 and 7.
6. What was the woman's father in China?  A. A traveller.  B. A businessman.  C. A teacher.
7. How old was the woman when she went to Beijing for the first time?  A. Five.  B. Six.  C. Eight.

Listen to Material 7 and answer Questions 8 to 10.
Primary School English Game Teaching Methods
… primary school students, encouraging them to participate more actively in English learning.
Easy to understand and remember
Through games, dull knowledge points are made vivid and easy to remember.
The content of this research includes: the application of the game teaching method in primary school English teaching; the classification and characteristics of the game teaching method; the effect of the game teaching method on students' English ability and interest; and the challenges of applying the game teaching method in primary school English teaching, together with their solutions.
Research purpose and significance
This research aims to explore the application and effect of game teaching method in primary school English teaching
Through the research, we can provide a basis for teachers to select appropriate game teaching methods and to improve the quality of English teaching.
Probability Theory and Mathematical Statistics: An English-Language Text
Introduction to Probability Theory and Mathematical Statistics

Probability theory and mathematical statistics together form the branch of mathematics that studies, by deduction and induction, the statistical regularities of random phenomena from a quantitative point of view. The subject divides into two branches: probability theory and mathematical statistics. Probability quantifies how likely a random event is to occur. The main content of probability theory includes the computation of classical probabilities, the distributions and numerical characteristics of random variables, and the limit theorems. Mathematical statistics is one of the branches of mathematics most directly and most widely connected with practice; it covers the elements and principles of estimation (the method of moments, maximum-likelihood estimation), parametric hypothesis testing, non-parametric hypothesis testing, analysis of variance, multiple regression analysis, reliability analysis, and so on, so that students gain a thorough understanding of how statistical principles work. Through such a course, students come to understand and master the ideas and methods of probability and statistics, to command the basic and commonly used analytical and computational techniques, and to apply the probabilistic and statistical viewpoint to practical problems in economics and management.

Random phenomena

In nature and in everyday life, things are interrelated and continuously developing. According to whether a causal relationship is present, phenomena can be divided into two very different categories. The first is the deterministic phenomenon.
A deterministic phenomenon is one in which, under given conditions, a definite result is bound to follow. For example, under normal atmospheric pressure, water heated to 100 degrees Celsius is bound to boil. Such a link between things is a matter of necessity, and the natural sciences typically study and come to know this necessity, seeking out such inevitable phenomena. The other category is the phenomenon of uncertainty: under given conditions, the result is not determined. For example, when the same worker machines a number of parts of the same kind on the same machine tool, their dimensions will always differ slightly. Likewise, in an artificial accelerated-germination test of a wheat variety carried out under identical conditions, each seed germinates differently, some vigorously and some weakly, some earlier and some later. Why do such uncertain results appear under the "same" conditions? Because the "same conditions" refer only to certain principal conditions; besides these there are many secondary conditions and accidental factors that people cannot grasp one by one in advance. For this reason, in such phenomena we cannot use deterministic cause and effect to give a sure answer in advance for the result of an individual occurrence. The relationship between things here is a matter of chance, and such a phenomenon is called an accidental phenomenon, or a random phenomenon. In nature, in production and in daily life, random phenomena are very common: the winning numbers of a sports lottery, or the lifetimes of light bulbs produced on the same production line, are random phenomena. So we say: a random phenomenon is one in which, when the same test or survey is repeated many times under the same conditions, the results are not identical, and the result of the next trial cannot be predicted exactly.
The uncertainty in the outcomes of a random phenomenon is caused by secondary, accidental factors. On the surface, a random phenomenon seems disordered and without regularity. But practice has shown that if the same kind of random phenomenon is repeated a great many times, the aggregate exhibits a definite regularity, which becomes more and more apparent as the number of observations grows. Flip a coin, for example: each individual toss is hard to predict, but if the coin is tossed many times, it becomes increasingly clear that heads comes up in roughly half of the tosses. The regularity exhibited collectively by a large number of similar random phenomena is called statistical regularity. Probability theory and mathematical statistics are the mathematical disciplines that study the statistical regularity of large numbers of similar random phenomena.

The emergence and development of probability theory

Probability theory was created in the 17th century. Its development was pushed along by the insurance business, but the source of the problems that set mathematicians thinking about probability was a gambler's request. As early as 1654, a gambler, the Chevalier de Méré, put to the mathematician Pascal a question that had long troubled him: "Two gamblers bet over a series of rounds; whoever first wins m rounds takes all the stakes. But when one of them has won a rounds (a < m) and the other has won b rounds (b < m), the game is interrupted. How should the stakes be divided fairly?"
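The coin-tossing illustration above can be checked by simulation; the following is a minimal sketch (the function name and the seed are arbitrary choices, not from the text):

```python
import random

def heads_frequency(n_flips, seed=0):
    """Simulate n_flips fair-coin tosses and return the relative
    frequency of heads."""
    rng = random.Random(seed)
    heads = sum(rng.randint(0, 1) for _ in range(n_flips))
    return heads / n_flips

# The relative frequency stabilises near 0.5 as the number of
# tosses grows -- the statistical regularity described above.
for n in (10, 1000, 100_000):
    print(n, heads_frequency(n))
```

With a small number of tosses the frequency fluctuates widely; with 100,000 tosses it settles very close to 1/2.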
Pascal, who in 1642 had invented the world's first mechanical adding machine, took up the problem; three years later, in 1657, the famous Dutch astronomer, physicist and mathematician Huygens, trying to solve the same problem, turned his results into a book on the calculus of games of chance, the earliest work on probability theory. In recent decades, with the vigorous development of science and technology, probability theory has been applied to the national economy, to industrial and agricultural production, and across interdisciplinary fields. Many branches of applied mathematics, such as information theory, game theory, queueing theory and cybernetics, are based on probability theory. Probability theory and mathematical statistics are closely linked branches of mathematics dealing with similar random subject matter, but it should be pointed out that probability theory, mathematical statistics and statistical methods each have their own distinct content. Probability theory starts from the statistical regularity of large numbers of similar random phenomena, makes objective, scientific judgments about the possibility that a random phenomenon produces a given result, gives a quantitative description of the size of that possibility, compares these possibilities and studies the connections between them, thereby forming a body of mathematical theory and methods. Mathematical statistics applies probability theory to study the regularity of large numbers of random phenomena: through the scientific arrangement of a number of experiments, it gives statistical methods a rigorous theoretical justification, and it determines the conditions under which the various methods apply, as well as the reliability and limitations of the methods, formulas and conclusions.
From a set of samples we can decide whether a judgment can be guaranteed correct with sufficiently high probability, and we can control the probability of error. Statistical methods, by contrast, supply the procedures used on various concrete problems, without dwelling on the theoretical basis or mathematical reasoning behind them. It should be pointed out that probability and statistics have their own particular research methods; the main differences from other mathematical subjects are the following. First, because statistical regularity is a collective regularity that shows itself only in large numbers of similar random phenomena, observation and experiment are the cornerstone of the research methods of probability and statistics. Still, as a branch of mathematics the subject has its own definitions, axioms and theorems; these are derived from the random regularities of nature, but the definitions, axioms and theorems themselves are deterministic, with no randomness about them. Second, the study of probability and statistics uses statistical inference, "concluding about the whole from a part". This is because the range of random phenomena under study is very large, and at the time of an experiment or observation it is impossible, or unnecessary, to examine them all; instead, conclusions obtained from a portion of the data are extended, with a stated reliability, to the whole range. Third, the randomness of a random phenomenon refers to the situation before the experiment or investigation is carried out; after each actual trial, one definite result is obtained.
When we study such a phenomenon, it should be noted that the inherent law of the phenomenon can only be found in the ensemble of trials, before any individual test.

The content of probability theory

As a branch of mathematics, probability theory studies the probabilities of random events, statistical independence, and regularities at deeper levels. Probability is a quantitative index of the possibility of a random event. In independent repetitions of an experiment, if the frequency of an event stabilises, over a large range of trials, around a fixed constant, that constant can be taken as the probability of the event. The probability of any event must lie between 0 and 1. There is one type of random experiment with two characteristics: first, there are only finitely many possible outcomes; second, all outcomes are equally likely. A random phenomenon with these two characteristics is called a "classical scheme". In the objective world there are a great many random phenomena, and the outcome of a random phenomenon constitutes a random event. If a variable is used to describe the result of a random phenomenon, it is called a random variable. Random variables may take finitely or infinitely many values, and according to the values taken they are usually divided into discrete and continuous random variables. If all possible values can be listed in some order, the random variable is called discrete; if the possible values fill an interval and cannot be listed in order, the random variable is called continuous.

The content of mathematical statistics

Mathematical statistics includes sampling, the curve-fitting ("optimum line") problem, hypothesis testing, analysis of variance, correlation analysis, and so on. Sampling inspection infers the overall situation from an investigation of samples.
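A probability under the classical scheme can be computed directly by enumerating the equally likely outcomes. The following sketch uses two fair dice, an example chosen for illustration (it is not from the text):

```python
from fractions import Fraction
from itertools import product

def classical_probability(event, outcomes):
    """Classical scheme: finitely many equally likely outcomes,
    so P(A) = |A| / |Omega|."""
    favourable = sum(1 for w in outcomes if event(w))
    return Fraction(favourable, len(outcomes))

# Sample space: all ordered results of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

# Probability that the two dice sum to 7: 6 favourable outcomes
# out of 36, i.e. 1/6.
p_seven = classical_probability(lambda w: w[0] + w[1] == 7, omega)
print(p_seven)  # 1/6
```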
Exactly how large the sample should be is a very important question; out of sampling inspection grew the "small-sample theory", the theory of analysis and judgment when the sample is small. The optimum line problem is also called curve fitting: some problems require finding a theoretical distribution curve from empirical data, so that the problem as a whole can be understood. But on what principles should the theoretical curve be fitted? How should several different candidate curves for the same problem be compared? Once a good curve has been selected, how is its error determined? Such questions belong to the scope of the optimum line problem of mathematical statistics. Hypothesis testing is used when inspecting products by statistical methods: first a hypothesis is made, and then, according to the result of sampling, the null hypothesis is accepted or rejected with a stated degree of reliability. Analysis of variance, also called deviation analysis, uses the concept of variance to analyse what judgments can be made from a handful of experiments. Since random phenomena abound in human practical activity, probability and statistics have developed continuously alongside modern industry and agriculture and modern science and technology, forming many important branches, such as stochastic processes, information theory, experimental design, limit theory and multivariate analysis.
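The least-squares idea behind the "optimum line" problem can be sketched for the simplest case, a straight line fitted to data with one predictor; the helper `fit_line` and its data are purely illustrative:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, via the closed-form
    normal equations for a single predictor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)          # spread of x
    sxy = sum((x - mx) * (y - my)                  # co-variation of x and y
              for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Data generated exactly on the line y = 1 + 2x is recovered exactly.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 1.0 2.0
```

Comparing candidate curves and quantifying the fitting error, the questions raised above, would then be done with the residual sum of squares.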
Technical English, Class 2: Game Theory
Nash equilibrium
Nash equilibrium, also known as non-cooperative game equilibrium. A Nash equilibrium, named after John Nash, is a set of strategies, one for each player, such that no player has an incentive to unilaterally change her action.
Game theory was pioneered by Princeton mathematician John von Neumann.
More representative examples may lead to common-gain games and common-loss games; the same situation also arises in other kinds of conflict.
When we describe the outcome of a game as an equilibrium, we cannot assume that each player's individually best strategy will bring about the collectively optimal result.
Nash’s notion of equilibrium remains an incomplete solution to the problem of circular reasoning in simultaneous-move games.
Prisoners’ dilemma
Two suspects are arrested by the police. The police have insufficient evidence for a conviction, and, having separated both prisoners, visit each of them to offer the same deal. If one testifies (defects from the other) for the prosecution against the other and the other remains silent (cooperates with the other), the betrayer goes free and the silent accomplice receives the full 8-year sentence. If both remain silent, both prisoners are sentenced to only one year in jail for a minor charge. If each betrays the other, each receives a five-year sentence. Each prisoner must choose to betray the other or to remain silent. Each one is assured that the other would not know about the betrayal before the end of the investigation. If we assume that each player cares only about minimizing his or her own time in jail, how should the prisoners act?
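The dilemma above can be checked mechanically by enumerating all action profiles and testing the no-profitable-deviation condition from the definition of Nash equilibrium. A minimal sketch, using the jail terms from the text (1, 5 and 8 years):

```python
from itertools import product

ACTIONS = ("silent", "betray")

# Years in jail (to be minimised) for (prisoner 1, prisoner 2),
# indexed by the pair of actions, as given in the text.
YEARS = {
    ("silent", "silent"): (1, 1),
    ("silent", "betray"): (8, 0),
    ("betray", "silent"): (0, 8),
    ("betray", "betray"): (5, 5),
}

def is_nash(a1, a2):
    """Nash condition: neither player can reduce his own jail time
    by deviating unilaterally."""
    best1 = all(YEARS[(a1, a2)][0] <= YEARS[(d, a2)][0] for d in ACTIONS)
    best2 = all(YEARS[(a1, a2)][1] <= YEARS[(a1, d)][1] for d in ACTIONS)
    return best1 and best2

equilibria = [p for p in product(ACTIONS, repeat=2) if is_nash(*p)]
print(equilibria)  # [('betray', 'betray')]
```

Betraying is each prisoner's dominant strategy, so mutual betrayal (5 years each) is the unique Nash equilibrium, even though mutual silence (1 year each) is better for both.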
How She Solved Her Problem of Being Bad at Math (an English essay)

Growing up, math was never my strong suit. The abstract concepts and complex equations always seemed to elude me, and I often found myself struggling to keep up with my peers in class. However, my lack of aptitude for math did not deter me from finding a solution to my problem. Here's how I tackled the challenge:

1. Acceptance and Acknowledgment: The first step was acknowledging that I had a weakness in math. It was important for me to accept this reality without letting it affect my self-esteem.
2. Seeking Help: I reached out to my teachers after school for extra help. They were more than willing to provide additional resources and one-on-one tutoring sessions to help me grasp the concepts.
3. Study Groups: Joining a study group was a game-changer. My peers and I would meet regularly to work through problems together. This collaborative approach not only helped me understand the material but also made learning more enjoyable.
4. Online Resources: I utilized online platforms like Khan Academy and educational YouTube channels that break down complex math problems into simpler, understandable steps.
5. Consistent Practice: I made it a point to practice math problems daily. Consistency is key when it comes to learning, and the more I practiced, the more comfortable I became with the subject.
6. Understanding the 'Why': Instead of just memorizing formulas, I tried to understand the logic behind each mathematical concept. This deeper understanding made it easier for me to apply what I learned to different types of problems.
7. Positive Mindset: Maintaining a positive attitude was crucial. I reminded myself that it's okay to make mistakes and that each mistake was a learning opportunity.
8. Setting Goals: I set small, achievable goals for myself. Each time I reached a goal, it boosted my confidence and motivated me to keep going.
9. Time Management: I learned to manage my study time effectively.
I would allocate specific hours to math and ensure that I was fully focused during those times.
10. Celebrating Progress: No matter how small the improvement, I made sure to celebrate my progress. This helped to keep me motivated and to appreciate the journey of learning.

In conclusion, overcoming my difficulties with math was a process that involved a combination of seeking help, utilizing resources, practicing consistently, and maintaining a positive outlook. It wasn't easy, but with determination and the right strategies, I was able to improve my math skills significantly.
Pricing Differentiated Services: A Game-Theoretic Approach

Eitan Altman, Dhiman Barman, Rachid El Azouzi, David Ros, Bruno Tuffin

Abstract—The goal of this paper is to study the pricing of differentiated services and its impact on the choice of service priority at equilibrium. We consider both TCP connections as well as non-controlled (real-time) connections. The performance measures (such as throughput and loss rates) are determined according to the operational parameters of a RED buffer management. The latter is assumed to be able to give differentiated services to applications according to their choice of service class. We consider a best-effort type of service differentiation in which the QoS of connections is not guaranteed, but by choosing a better (more expensive) service class, the QoS parameters of a session can improve (as long as the service classes of the other sessions are fixed). The choice of a service class by an application will depend both on the utility and on the cost it has to pay. We first study the performance of the system as a function of the connections' parameters and their choice of service classes. We then study the decision problem of how to choose the service classes. We model the problem as a noncooperative game. We establish conditions for an equilibrium to exist and to be uniquely defined. We further provide conditions for convergence to equilibrium from non-equilibrium initial states. We finally study the pricing problem of how to choose prices so that the resulting equilibrium maximizes the network's benefit.

Keywords: TCP, Buffer Management, RED/AQM, Nash equilibrium, Pricing, Mathematical programming/optimization, Economics

I. INTRODUCTION

We study in this paper the performance of competing connections that share a bottleneck link. Both TCP connections with controlled rate as well as CBR

Address: INRIA, B.P. 93, 2004 Route des Lucioles, 06902 Sophia-Antipolis Cedex. The work of these authors was supported by a research contract with France Telecom
R&D 001B001.

111 Cummington Street, Dept. of Computer Science, Boston University, Boston, MA 02215, USA. The work of this author was performed during an internship at INRIA, financed by INRIA's PrixNet ARC collaboration project.

GET/ENST Bretagne, Rue de la châtaigneraie, CS 17607, 35567 Cesson Sévigné Cedex, France

IRISA/INRIA, Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France

(Constant Bit Rate) connections are considered. A RED buffer management is used for early drop of packets. We allow for service differentiation between the connections through the rejection probability (as a function of the average queue size), which may depend on the connection (or on the connection class). More specifically, we consider a buffer management scheme that uses a single averaged queue length to determine the rejection probabilities (similar to the way it is done in the RIO-C (coupled RIO) buffer management, see [9]); for any given averaged queue size, packets belonging to connections with higher priority have a smaller probability of being rejected than those belonging to lower-priority classes. To obtain this differentiation in loss probabilities, we assume that the loss curve of RED is scaled by a factor that represents the priority level of the application.
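The piecewise-linear RED drop curve that this per-priority scaling acts on can be sketched as follows. The parameter values (q_min, q_max and the maximum drop probability p_max) are hypothetical, and scaling p_max per class stands in for the paper's priority factor; this is a sketch, not the paper's exact model:

```python
def red_drop_probability(avg_q, q_min, q_max, p_max):
    """Piecewise-linear RED drop curve: 0 below q_min, rising
    linearly to p_max just below q_max, and 1 at or above q_max."""
    if avg_q < q_min:
        return 0.0
    if avg_q >= q_max:
        return 1.0
    return p_max * (avg_q - q_min) / (q_max - q_min)

# A higher-priority flow gets a smaller slope (smaller effective
# p_max), hence fewer drops at the same average queue length.
print(red_drop_probability(15, q_min=10, q_max=20, p_max=0.1))   # low priority
print(red_drop_probability(15, q_min=10, q_max=20, p_max=0.02))  # high priority
```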
We obtain various performance measures of interest, such as the throughput, the average queue size and the average drop probability.

We then address the question of the choice of priorities. Given utilities that depend on the performance measures on one hand and on the cost for a given priority on the other, the sessions in the system are faced with a non-cooperative game in which the choice of priority of each session has an impact on the quality of service of the other sessions. For the case of CBR traffic, we establish conditions for an equilibrium to exist. We further provide conditions for convergence to equilibrium from non-equilibrium initial states. We shall finally study numerically the pricing problem of how the network should choose prices so that the resulting equilibrium maximizes its benefit.

We briefly mention some recent work in the area. Reference [5] considered a related problem in which the traffic generated by each session was modeled as a Poisson process and the service time was exponentially distributed. The decision variables were the input rates and the performance measure was the goodput (output rate). The paper restricted itself to symmetric users and symmetric equilibria, and the pricing issue was not considered. In this framework, with a common RED buffer, it was shown that an equilibrium does not exist.
An equilibrium was obtained and characterized for an alternative buffer management scheme that was proposed, called VLRED. We note that, in contrast to [5], since we also include in the utility of CBR traffic a penalty for losses (which is supported by studies of voice quality in packet-based telephony [6]), we do obtain an equilibrium when using RED. For other related papers, see for instance [8] (in which a priority game is considered for competing connections sharing a drop-tail buffer), [1], as well as the survey [2]. In [13], the authors present mechanisms (e.g., the AIMD of TCP) to control end-user transmission rates into a differentiated-services Internet through potential functions and corresponding convergence to Nash equilibrium.

The approach of our pricing problem is related to the Stackelberg methodology for hierarchical optimization: for a fixed pricing strategy one seeks the equilibrium among the users (the optimization level corresponding to the "follower"), and then the network (considered as the "leader") optimizes the pricing strategy. This type of methodology has been used in other networking contexts in [3], [7].

The structure of this paper is as follows. In Section II we describe the model of RED; in Section III we compute the throughputs and the loss probabilities of TCP and of CBR connections for given priorities chosen by the connections. In Section IV we introduce the model for competition between connections at given prices. In Section V we focus on the game in the case of only CBR connections or only TCP connections and provide properties of the equilibrium: existence, uniqueness and convergence. In Section VI we provide an algorithm for computing the Nash equilibrium in the symmetric case. Optimal pricing is then discussed in Section VII. We present numerical examples in Section VIII to validate the model.

II. THE MODEL

RED is based on the following idea: there are two thresholds, q_min and q_max, such that the drop probability is 0 if the average queue length is less than q_min, 1 if it is above q_max, and grows linearly between the two when the average queue length lies in [q_min, q_max]; the latter is the congestion-avoidance mode of operation. This is illustrated in Figure 1.

[Fig. 1. Drop probability p(i) in RED as a function of the average queue length, rising linearly between q_min and q_max.]

We consider a set of TCP flows (or aggregates of flows) and a set of real-time flows that can be differentiated by RED; they all share a common buffer, yet RED treats them differently.¹ We assume that they all have common values of q_min and q_max, but each flow may have a different value of the drop probability as the average queue length tends to q_max (from the left). In other words, the slope of the linear part of the curve in Figure 1 depends on the flow: (1), where … and … are the TCP flow's round-trip time and drop probability, respectively, and … is typically taken as … (when the delayed-ack option is disabled) or … (when it is enabled). We shall assume throughout the paper that the queueing delay is negligible with respect to the round-trip time for the TCP connections. In contrast, the rates of the real-time flows are not controlled and are assumed to be fixed. We assume throughout the paper that … (unless otherwise specified); otherwise the RED buffer is not a bottleneck. Similarly, we assume that the TCP senders are not limited by the receiver window. In general, since the bottleneck queue is seen as a fluid queue, we can write a balance equation. If we operate in the linear part of the RED curve, this leads to a system of equations whose unknowns are the average queue length and the per-flow drop probabilities and rates, with the TCP rates given by (1). Substituting (1) and (2) into the first equation of the set, we obtain a single equation (3) for the average queue length, which can be written as a cubic equation (4), whose coefficients are given in terms of the system parameters. Note that, in the case of only real-time connections operating in the linear region, we have (6). (Recall that, throughout the paper, when considering this case we shall assume that ….) In the case of only TCP connections operating in the linear region, we have (7) and (8).

¹RED punishes aggressive flows more by dropping more packets from those flows.

IV. UTILITY, PRICING AND EQUILIBRIUM

We denote a strategy vector by t for
all flows, such that the i-th entry is the strategy of flow i. We also define the strategy vector in which flow i uses a given action and all the other flows use their actions from t. We associate with flow i a utility. The utility is a function of the QoS parameters and of the price paid by flow i, and is determined by the actions of all flows. More precisely, the utility is given by an expression in which the first term stands for the utility of the goodput, the second term stands for the disutility of the loss rate, and the last term corresponds to the price to be paid by the flow to the network. In particular, we find it natural to assume that the loss-disutility term vanishes for a TCP flow (as lost packets are retransmitted anyhow, and their impact is already taken into account in the throughput). Moreover, since the TCP throughput already includes the loss term, the utility function of TCP is assumed to consist only of the goodput term and the price term.

We assume that the strategies or actions available to a session are given by a compact set of the form …. Each flow in the network strives to find its best strategy so as to maximize its own objective function. Nevertheless, its objective function depends not only upon its own choice but also upon the choices of the other flows. In this situation, the solution concept widely accepted is that of the Nash equilibrium.

Definition 1: A Nash equilibrium of the game is a strategy profile from which no flow has any incentive to deviate. More precisely, a strategy profile is a Nash equilibrium if, for every flow, its strategy is the best it can do when the other flows keep their strategies.

Note that the network income is given by the sum of the prices paid by the flows.
Since the prices are functions of the throughputs and loss rates, pricing per volume of traffic successfully transmitted can be included. In particular, we allow the prices to depend on the uncontrolled arrival rates of the real-time sessions (but since these are constants, we do not make them appear as arguments of the functions). We shall sometimes find it more convenient to represent the control action of a connection in an alternative form. Clearly, properties such as existence or uniqueness of equilibrium in terms of one representation directly imply the corresponding properties with respect to the other.

V. EQUILIBRIUM FOR ONLY REAL-TIME SESSIONS OR ONLY TCP CONNECTIONS

We assume throughout that the actions of all connections are bounded, the bound being chosen so that the drop probabilities remain valid. From (2) we see that equality is obtained only in the limiting case.² In our analysis we are interested mainly in the linear region. For only real-time sessions or only TCP connections, we state the assumptions, describe the conditions for operation in the linear region, and show the existence of a Nash equilibrium.

²Note that if the assumption does not hold, then for some value saturation would already occur, so one could redefine the bound accordingly. An important feature of our model is that the queue length beyond which packets are always dropped is the same for all flows.

Theorem 1: A sufficient condition for the system to operate in the linear region is that, for all flows, condition (9) holds in the case of only real-time connections, and condition (10) in the case of only TCP connections.

Proof: Condition (9) (resp. (10)) ensures that the value of the average queue length obtained in the linear region (see (5), resp. (7)) is not larger than q_max. Indeed, for real-time connections this follows from (9).

The following result establishes the existence of a Nash equilibrium for only real-time sessions or only TCP connections.

Theorem 2: Assume that the pricing functions are convex in the flow's own action. Then a Nash equilibrium exists.

Proof: See Appendix X-B.

A. Supermodular Games

In Theorem 3 (resp. Theorem 5) we present alternative conditions that provide sufficient conditions for a supermodular structure for real-time connections (resp.
for only TCP connections). This implies in particular the existence of an equilibrium. Another implication of supermodularity is that a simple, so-called tâtonnement or round-robin scheme of best responses converges to the equilibrium. To describe it, we introduce the following asynchronous dynamic greedy algorithm (GA).

Greedy Algorithm: Assume a given initial choice of action for all flows. At some strictly increasing times, flows update their actions; the actions at an update time are obtained as follows. A single flow updates its action at that time so as to optimize its utility, given the current vector of actions of the other flows. We assume that each flow updates its actions infinitely often. In particular, for the case of only real-time sessions, the update is given by (11), where the queue length in (11) is given by (6). For the TCP-only case, the drop probability is updated first, which leads to the corresponding update of the rate according to the utility function of the session; the best response is then given by a piecewise definition over the feasible interval.

Theorem 3: For the case of only real-time connections, we assume the stated boundedness and monotonicity conditions.
Then there is a smallest equilibrium , and the GA dynamic algorithm converges to , leading to

It is non-positive if and only if . A sufficient condition is that

Thus the game is supermodular. The result then follows from the standard theory of supermodular games [11], [12]. (13) Then the game is supermodular.
Proof: See Appendix X-D.

where denotes (with some abuse of notation) the strategy in which all flows use , and where the maximization is taken with respect to . Then is a symmetric equilibrium if

Theorem 6: Consider real-time connections operating in the linear region. The symmetric equilibrium satisfies: (15) where and

which gives, when taking the derivative,

and we obtain (15). To ensure that the symmetric TCP flows operate in the linear region, we satisfy the condition on .

E. Real-time connections and TCP flows

In this experiment, we combine both real-time and TCP connections. We have , , Mbps, RTT = 10 ms, Mbps. The highest network revenue is achieved at . In the simulations, we

IX. CONCLUSIONS AND FUTURE WORK

We have studied in this paper a fluid model of the RED buffer management algorithm with different drop probabilities applied to both UDP and TCP traffic. We first computed the performance measures for fixed drop policies. We then investigated how the drop policies affect the equilibrium (we also established convergence properties of best-response dynamics). The equilibrium depends on the pricing strategy of the network provider. We finally addressed the problem of optimizing the revenue of the network provider. Concerning future work, we are working on deriving sufficient and necessary conditions for operating in the linear region when there are both real-time and TCP connections. Other versions of RED will also be considered (such as the gentle-RED variant). We will also examine how well the fluid model approximates the packet-level system it models.

REFERENCES

T. Alpcan and T. Basar, "A game-theoretic framework for congestion control in general topology networks", 41st IEEE Conference on Decision and Control, Las Vegas, Nevada, Dec.
10-13, 2002.
E. Altman, T. Boulogne, R. El Azouzi, T. Jimenez and L. Wynter, "A survey on networking games", Telecommunication Systems, 2000, under revision. Available at http://www-sop.inria.fr/mistral/personnel/Eitan.Altman/ntkgame.html
T. Basar and R. Srikant, "A Stackelberg network game with a large number of followers", J. Optimization Theory and Applications, 115(3):479-490, December 2002.
F. Bernstein and A. Federgruen, "A general equilibrium model for decentralized supply chains with price- and service-competition". Available at http://faculty./˜fernando/bio/
D. Dutta, A. Goel and J. Heidemann, "Oblivious AQM and Nash Equilibria", IEEE Infocom, 2003.
J. Janssen, D. De Vleeschauwer, M. Büchli and G. H. Petit, "Assessing voice quality in packet-based telephony", IEEE Internet Computing, pp. 48-56, May-June 2002.
Y. A. Korilis, A. A. Lazar and A. Orda, "Achieving network optima using Stackelberg routing strategies", IEEE/ACM Transactions on Networking, 5(1), pp. 161-173, 1997.
M. Mandjes, "Pricing strategies under heterogeneous service requirements", Computer Networks 42, pp. 231-249, 2003.
P. Pieda, J. Ethridge, M. Baines and F. Shallwani, A Network Simulator Differentiated Services Implementation, Open IP, Nortel Networks, July 2000. Available at http://www.isi.edu/nsnam/ns
J. B. Rosen, "Existence and uniqueness of equilibrium points for concave N-person games", Econometrica, 33:153-163, 1965.
D. Topkis, "Equilibrium points in nonzero-sum n-person submodular games", SIAM J. Control and Optimization, 17:773-787, Nov. 1979.
D. D. Yao, "S-modular games with queueing applications", Queueing Systems, 21:449-475, 1995.
Youngmi Jin and George Kesidis, "Nash equilibria of a generic networking game with applications to circuit-switched networks", IEEE INFOCOM'03.
Numerical Recipes in C: The Art of Scientific Computing, 2nd Edition, Section 5.6. /webRoot/Books/Numerical_Recipes/bookc.html

X. APPENDIX

Proof of part 2 of Theorem 1

For only TCP connections, we have , . From equation (7), we get the following sufficient and necessary condition for :

or equivalently,

A sufficient condition for the latter is

which is convex in . Hence are concave in and continuous in . Existence then follows from [10]. For TCP connections, we have (18) where . On the other hand, (1) implies

and

Then (18) becomes (19). Since the function is convex in , then from (19) it suffices to show that the second derivative of with respect to is non-positive. We have

where and . Now, we must prove that the second derivatives of the functions and are non-positive for all and . We begin by taking the second derivative of . After some simplification, we obtain

which is positive. For the second function, since the function is positive, it suffices to show that the second derivative of the function is non-positive. We have

which is non-positive.

C. Proof of Theorem 4

Under the supermodularity condition, to show uniqueness of the Nash equilibrium it suffices to show that [4]: (20) or equivalently, (21). For the case of only real-time sessions, . We have

This leads to the sufficient condition: . It follows that

Thus a sufficient condition for supermodularity
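The Greedy Algorithm above is a round-robin best-response iteration, and supermodularity is what guarantees its convergence. The paper's utility, drop-probability, and pricing functions are not recoverable from this extract, so the following is only a generic sketch of best-response dynamics on a small two-player matrix game with hypothetical payoffs:

```python
import numpy as np

# Round-robin best-response ("tatonnement") dynamics, a generic sketch of the
# Greedy Algorithm idea. The payoff matrices are illustrative (a prisoners'-
# dilemma-like game), not taken from the paper.
U1 = np.array([[3, 0], [5, 1]])   # row player's payoffs
U2 = np.array([[3, 5], [0, 1]])   # column player's payoffs

def best_response_dynamics(U1, U2, max_rounds=100):
    i, j = 0, 0                          # arbitrary initial pure actions
    for _ in range(max_rounds):
        ni = int(np.argmax(U1[:, j]))    # player 1 best-responds to j
        nj = int(np.argmax(U2[ni, :]))   # player 2 best-responds to ni
        if (ni, nj) == (i, j):           # fixed point = pure Nash equilibrium
            return i, j
        i, j = ni, nj
    return i, j

print(best_response_dynamics(U1, U2))  # (1, 1): both players defect
```

In this example each round of updates strictly improves the responding player's payoff until the mutual best response (1, 1) is reached; in the paper's setting the supermodular structure plays the role that the small action space plays here.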
lecture_notes_ch1-4
1 Introduction

This chapter introduces the concept of a game and encourages the reader to begin thinking about the formal analysis of strategic situations. The chapter contains a short history of game theory, followed by a description of "non-cooperative theory" (which the book emphasizes), a discussion of the notion of contract and the related use of "cooperative theory," and comments on the science and art of applied theoretical work. The chapter explains that the word "game" should be associated with any well-defined strategic situation, not just adversarial contests. Finally, the format and style of the book are described.

Lecture Notes

The non-administrative segment of a first lecture in game theory may run as follows.
• Definition of a strategic situation.
• Examples (have students suggest some): chess, poker, and other parlor games; tennis, football, and other sports; firm competition, international trade, international relations, firm/employee relations, and other standard economic examples; biological competition; elections; and so on.
• Competition and cooperation are both strategic topics. Game theory is a general methodology for studying strategic settings (which may have elements of both competition and cooperation).
• The elements of a formal game representation.
• A few simple examples of the extensive form representation (point out the basic components).

Examples and Experiments

1. Clap game. Ask the students to stand and then, if they comply, ask them to clap. (This is a silly game.) Show them how to diagram the strategic situation as an extensive form tree. The game starts with your decision about whether to ask them to stand. If you ask them to stand, then they (modeled as one player) have to choose between standing and staying in their seats. If they stand, then you decide between saying nothing and asking them to clap. If you ask them to clap, then they have to decide whether to clap. Write the outcomes at terminal nodes in descriptive terms such as "professor happy, students confused." Then show how these outcomes
can be converted into payoff numbers.

Instructors' Manual for Strategy: Copyright 2002, 2008 by Joel Watson

2. Auction the textbook. Many students will probably not have purchased the textbook by the first class meeting. These students may be interested in purchasing the book from you, especially if they can get a good deal. However, quite a few students will not know the price of the book. Without announcing the bookstore's price, hold a sealed-bid, first-price auction (using real money). This is a common-value auction with incomplete information. The winning bid may exceed the bookstore's price, giving you an opportunity to talk about the "winner's curse" and to establish a fund to pay students in future classroom experiments.

2 The Extensive Form

This chapter introduces the basic components of the extensive form in a non-technical way. Students who learn about the extensive form at the beginning of a course are much better able to grasp the concept of a strategy than are students who are taught the normal form first. Since strategy is perhaps the most important concept in game theory, a good understanding of this concept makes a dramatic difference in each student's ability to progress. The chapter avoids the technical details of the extensive form representation in favor of emphasizing the basic components of games. The technical details are covered in Chapter 14.

Lecture Notes

The following may serve as an outline for a lecture.
• Basic components of the extensive form: nodes, branches. Nodes are where things happen. Branches are individual actions taken by the players.
• Example of a game tree.
• Types of nodes: initial, terminal, decision.
• Build trees by expanding, never converging back on themselves. At any place in a tree, you should always know exactly how you got there. Thus, the tree summarizes the strategic possibilities.
• Player and action labels. Try not to use the same label for different places where decisions are
made.
• Information sets. Start by describing the tree as a diagram that an external observer creates to map out the possible sequences of decisions. Assume the external observer sees all of the players' actions. Then describe what it means for a player not to know what another player did. This is captured by dashed lines indicating that a player cannot distinguish between two or more nodes.
• We assume that the players know the game tree, but that a given player may not know where he is in the game when he must make any particular decision.
• An information set is a place where a decision is made.
• How to describe simultaneous moves.
• Outcomes and how payoff numbers represent preferences.

Examples and Experiments

Several examples should be used to explain the components of an extensive form. In addition to some standard economic examples (such as firm entry into an industry and entrant/incumbent competition), here are a few I routinely use:

1. Three-card poker. In this game, there is a dealer (player 1) and two potential betters (players 2 and 3). There are three cards in the deck: a high card, a middle card, and a low card. At the beginning of the game, the dealer looks at the cards and gives one to each of the other players. Note that the dealer can decide which of the cards goes to player 2 and which goes to player 3. (There is no move by Nature in this game. The book does not deal with moves of Nature until Part IV. You can discuss moves of Nature at this point, but it is not necessary.) Player 2 does not observe the card dealt to player 3, nor does player 3 observe the card dealt to player 2. After the dealer's move, player 2 observes his card and then decides whether to bet or to fold. After player 2's decision, player 3 observes his own card and also whether player 2 folded or bet. Then player 3 must decide whether to fold or bet. After player 3's move, the game ends. Payoffs indicate that each player prefers winning to folding and folding
to losing. Assume the dealer is indifferent between all of the outcomes (or specify some other preference ordering).

2. Let's Make a Deal game. This is the three-door guessing game that was made famous by Monty Hall and the television game show Let's Make a Deal. The game is played by Monty (player 1) and a contestant (player 2), and it runs as follows. First, Monty secretly places a prize (say, $1000) behind one of three doors. Call the doors a, b, and c. (You might write Monty's actions as a′, b′, and c′, to differentiate them from those of the contestant.) Then, without observing Monty's choice, the contestant selects one of the doors (by saying "a," "b," or "c"). After this, Monty must open one of the doors, but he is not allowed to open the door that is in front of the prize, nor is he allowed to open the door that the contestant selected. Note that Monty does not have a choice if the contestant chooses a different door than Monty chose for the prize. The contestant observes which door Monty opens. Note that she will see no prize behind this door. The contestant then has the option of switching to the other unopened door (S for "switch") or staying with the door she originally selected (D for "don't switch"). Finally, the remaining doors are opened and the contestant wins the prize if it is behind the door she chose. The contestant obtains a payoff of 1 if she wins, zero otherwise. Monty is indifferent between all of the outcomes.

For a bonus question, you can challenge the students to draw the extensive form representation of the Let's Make a Deal game or the Three-Card Poker game. Students who submit a correct extensive form can be given points for the class competition. The Let's Make a Deal extensive form is pictured on the next page.

3 Strategies and the Normal Form

As noted
already, introducing the extensive form representation at the beginning of a course helps the students appreciate the notion of a strategy. A student who does not understand the concept of a "complete contingent plan" will fail to grasp the sophisticated logic of dynamic rationality that is so critical to much of game theory. Chapter 3 starts with the formal definition of strategy, illustrated with some examples. The critical point is that strategies are more than just "plans." A strategy prescribes an action at every information set, even those that would not be reached because of actions taken at other information sets. Chapter 3 proceeds to the construction of the normal-form representation, starting with the observation that each strategy profile leads to a single terminal node (an outcome) via a path through the tree. This leads to the definition of a payoff function. The chapter then defines the normal form representation as comprising a set of players, strategy spaces for the players, and payoff functions. The matrix form, for two-player, finite games, is illustrated. The chapter then briefly describes seven classic normal form games. The chapter concludes with a few comments on the comparison between the normal and extensive forms.

Lecture Notes

The following may serve as an outline for a lecture.
• Formal definition of strategy.
• Examples of strategies.
• Notation: strategy space S_i, individual strategy s_i ∈ S_i. Example: S_i = {H, L} and s_i = H.
• Refer to Appendix A for more on sets.
• Strategy profile: s ∈ S, where S = S_1 × S_2 × ··· × S_n (product set).
• Notation: i and -i, s = (s_i, s_-i).
• Discuss how finite and infinite strategy spaces can be described.
• Why we need to keep track of a complete contingent plan: (1) it allows the analysis of games from any information set, (2) it facilitates exploring how a player responds to his belief about what the other players will do, and (3) it prescribes a contingency plan if a player makes a mistake.
• Describe how a strategy implies a path through the tree, leading to a terminal node and
payoff vector.
• Examples of strategies and implied payoffs.
• Definition of payoff function, u_i : S → R, u_i(s). Refer to Appendix A for more on functions.
• Example: a matrix representation of players, strategies, and payoffs. (Use any abstract game, such as the centipede game.)
• Formal definition of the normal form.
• Note: The matrix representation is possible only for two-player, finite games. Otherwise, the game must be described by sets and equations.
• The classic normal form games and some stories. Note the different strategic issues represented: conflict, competition, coordination, cooperation.
• Comparing the normal and extensive forms (translating one to the other).

Examples and Experiments

1. Ultimatum-offer bargaining game. Have students give instructions to others as to how to play the game. Those who play the role of "responder" will have to specify under what conditions to accept and under what conditions to reject the other player's offer. This helps solidify that a strategy is a complete contingent plan.

2. The centipede game (like the one in Figure 3.1(b) of the textbook). As with the bargaining game, have some students write their strategies on paper and give the strategies to other students, who will then play the game as their agents. Discuss mistakes as a reason for specifying a complete contingent plan. Then discuss how strategy specifications help us develop a theory about why players make particular decisions (looking ahead to what they would do at various information sets).

3. Any of the classic normal forms.

4. The Princess Bride poison scene. Show the "poison" scene (and the few minutes leading to it) from the Rob Reiner movie The Princess Bride. In this scene, protagonist Wesley matches wits with the evil Vizzini. There are two goblets filled with wine. Away from Vizzini's view, Wesley puts poison into one of the goblets. Then Wesley sets the goblets on a table, one goblet near himself and the other near
Vizzini. Vizzini must choose from which goblet to drink. Wesley must drink from the other goblet. Several variations of this game can be diagrammed for the students, first in the extensive form and then in the normal form.

5. A 3×3 dominance-solvable game, such as the following. The payoffs are in dollars. It is very useful to have the students play a game such as this before you lecture on dominance and best response. This will help them begin thinking about rationality, and their behavior will serve as a reference point for formal analysis. Have the students write their strategies and their names on slips of paper. Collect the slips and randomly select a player 1 and a player 2. Pay these two students according to their strategy profile. Calculate the class distribution over the strategies, which you can later use when introducing dominance and iterated dominance.

6. Repeated Prisoners' Dilemma. Describe the k-period, repeated prisoners' dilemma. For a bonus question, ask the students to compute the number of strategies for player 1 when k = 3. Challenge the students to find a mathematical expression for the number of strategies as a function of k.

4 Beliefs, Mixed Strategies, and Expected Payoffs

This chapter describes how a belief that a player has about another player's behavior is represented as a probability distribution. It then covers the idea of a mixed strategy, which is a similar probability distribution. The appropriate notation is defined. The chapter defines expected payoff and gives some examples of how to compute it. At the end of the chapter, there are a few comments about cardinal versus ordinal utility (although it is not put in this language) and about how payoff numbers reflect preferences over uncertain outcomes. Risk preferences are discussed in Chapter 25.

Lecture Notes

The following may serve as an outline for a lecture.
• Example of
belief in words: "Player 1 might say 'I think player 2 is very likely to play strategy L.'"
• Translate into probability numbers.
• Other examples of probabilities.
• Notation: μ_j ∈ ΔS_j, μ_j(s_j) ∈ [0,1], and Σ_{s_j ∈ S_j} μ_j(s_j) = 1.
• Examples and alternative ways of denoting a probability distribution: for S_j = {L, R} and μ_j ∈ Δ{L, R} defined by μ_j(L) = 1/3 and μ_j(R) = 2/3, we can write μ_j = (1/3, 2/3).
• Mixed strategy. Notation: σ_i ∈ ΔS_i.
• Refer to Appendix A for more on probability distributions.
• Definition of expected value. Definition of expected payoff.
• Examples: computing expected payoffs.
• Briefly discuss how payoff numbers represent preferences over random outcomes and risk. Defer elaboration until later.

Examples and Experiments

1. Let's Make a Deal game again. For the class competition, you can ask the following two bonus questions: (a) Suppose that, at each of his information sets, Monty randomizes by choosing his actions with equal probability. Is it optimal for the contestant to select "switch" or "don't switch" when she has this choice? Why? (b) Are there conditions (a strategy for Monty) under which it is optimal for the contestant to make the other choice?

2. Randomization in sports. Many sports provide good examples of randomized strategies. Baseball pitchers may desire to randomize over their pitches, and batters may have probabilistic beliefs about which pitch will be thrown to them. Tennis serve and return play is another good example.1

1 See Walker, M., and Wooders, J., "Minimax Play at Wimbledon," American Economic Review 91 (2001): 1521-1538.

Instructors' Manual for Strategy: An Introduction to Game Theory. Copyright 2002, 2008 by Joel Watson. For instructors only; do not distribute.
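Bonus question (a) above can be checked by simulation. The sketch below assumes the setup exactly as described in the text (Monty never opens the prize door or the selected door, and randomizes uniformly whenever he has a choice); the function names are illustrative:

```python
import random

# Simulation of the Let's Make a Deal bonus question (a): if Monty randomizes
# uniformly at each of his information sets, is switching optimal?
# Payoff convention follows the text: win = 1, lose = 0.
def play(switch, rng=random):
    doors = ['a', 'b', 'c']
    prize = rng.choice(doors)   # Monty hides the prize
    pick = rng.choice(doors)    # contestant picks without observing Monty
    # Monty opens a door that is neither the prize door nor the picked door,
    # choosing uniformly when two doors qualify.
    opened = rng.choice([d for d in doors if d != prize and d != pick])
    if switch:
        # Switch to the unique remaining unopened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return 1 if pick == prize else 0

n = 100_000
print(sum(play(True) for _ in range(n)) / n)   # ~ 2/3: switching wins
print(sum(play(False) for _ in range(n)) / n)  # ~ 1/3: staying wins
```

The simulated win rates make the expected-payoff calculation concrete: the contestant's original pick is right with probability 1/3, so switching wins exactly when the original pick was wrong, i.e. with probability 2/3.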
behaviors
1 Introduction
The last decade has witnessed increasing interest in the area of formal methods for the specification and analysis of probabilistic systems [22,5,3,20,26,7]. In [28] van Glabbeek et al. classified probabilistic models into reactive, generative, and stratified
Preprint submitted to Elsevier Science
6 December 2006
[Figure: example probabilistic transition systems of the three kinds, (1) reactive, (2) generative, and (3) stratified, with a- and b-labelled transitions annotated with probabilities such as 1/2, 1/3, and 2/3.]
The work of Yuxin Deng was done when he was doing his PhD study at INRIA and Université Paris 7, France, under the support of the EU project PROFUNDIS. The work of Catuscia Palamidessi was partially supported by the Project Rossignol of the ACI Sécurité Informatique (Ministère de la recherche et nouvelles technologies). An extended abstract of this paper appeared at FOSSACS 2005. Email addresses: deng-yx@ (Yuxin Deng), catuscia@lix.polytechnique.fr (Catuscia Palamidessi).
Academic English for Science and Engineering: Abstracts 1, 3, 5, 6, 8, 9, 11, 15, 18
Nanchang University, Academic English for Science and Engineering: Abstract

Computer vulnerabilities are often exploited by hackers or crackers, and securing each computer is a challenge. This paper first redefines the terms "hacker," "cracker," and "getting inside" a computer, and describes the intrusion procedure in detail. The term "unauthorized user" (UU) is a better choice for describing the intruder group. UUs take advantage of known and unknown vulnerabilities, ranging from poor password protection to leaving a computer turned on and physically accessible to visitors in the office. The first step in employing technical exploits is determining the specifications of the target system. There are two ways of attacking: through capabilities inherent in the hypertext transfer protocol (http), and through attacks preprogrammed against specific vulnerabilities and launched without any specific target. The variability of hacking actions, against weak and strong systems alike, warns users to choose the right way to protect their computers and not to grant access to others easily.
Lastly, solutions for avoiding vulnerabilities are given, including installing patches promptly, creating complex passwords, obtaining information only from reliable websites or services, updating anti-virus software, and backing up data, so that the computer is not hacked.

Nanchang University, Academic English for Science and Engineering: Abstract

This article aims to account for the advantages of cloud computing. At the beginning, it states what cloud computing is and describes its overall features. Then the author lists the various kinds of cloud computing in order to reflect its virtues, the most basic ones being remote accessibility, lower costs, and quick re-provisioning. In the section on green computing, it discusses the energy-efficient use of computing resources, which is an important advantage of cloud computing. To draw a conclusion, cloud computing combines remote accessibility, easy expansion, security, and environmental friendliness; it is changing the whole world.

Nanchang University, Academic English for Science and Engineering: Abstract

With the development of society and technology, artificial intelligence may replace human jobs in the future. Much news has reported that artificial intelligence plays an important role in our lives. For decades, people have written about how machines might replace humans, for better or ill, but what they expected did not come to pass. Around the time of the Revolution, most Americans worked on farms. They farmed to keep themselves alive. With the development of transportation, farming increasingly became a cash business. But as the agricultural industry grew, fewer and fewer workers were needed for farming and ranching. Today agriculture provides fewer than two million jobs, because automation happened: it brought better plows and planting and sowing machines, and agriculture became more and more scientific. The farmers' children found new kinds of jobs in the city; they did not want to stay on the farm.
The early water- and steam-powered factories also displaced millions of craftsmen, because machine-tending factory workers made goods better than the craftsmen did, so the number of factory jobs grew rapidly at that time. The automation of farming, craft work, and manufacturing made products, food among them, cheaper and cheaper, so people could save money on food and spend it on other, more expensive goods. Will A.I. machines take over the best occupations? The author is an optimist and does not agree that machines will replace human jobs.

Text 6

The article gives a detailed explanation of game theory. First the author states the research object of game theory: recent research has focused on games that are neither zero-sum nor purely cooperative, but games in which players must make choices that allow for both competition and cooperation. The essence of a game is the interdependence of player strategies. Some games, like tic-tac-toe, can be "solved" completely, while others, such as chess, are too complex to solve in practice. In a game with simultaneous moves, the logical circle involved is squared using the concept of equilibrium developed by John Nash, but this notion remains an incomplete solution to the problem of circular reasoning in simultaneous-move games. Then the author gives several examples, including the prisoners' dilemma, mixing moves, strategic moves, bargaining, and concealing and revealing information, to illustrate some of the fundamentals of game theory. Though game theory has recently made progress in analyzing several situations of conflict and cooperation, it remains far from complete, and people still need to learn more about the design of successful strategies.

Nanchang University, Academic English for Science and Engineering: Abstract

Text 8

At the beginning, this article states that there is an unprecedented multidisciplinary convergence of scientists dedicated to the study of a world so small that we cannot see it, even with a light microscope, and it tells us the importance of nanotechnology. Then, in
order to understand the unusual world of nanotechnology, we need to get an idea of the units of measure involved. One nanometer is extremely small, yet when we consider the atomic scale, an atom is still small compared to the nanometer. In a lecture called "Small Wonders: The World of Nanoscience," Nobel Prize winner Dr. Horst Stormer said that the nanoscale is more interesting than the atomic scale because the nanoscale is the first point where we can assemble something; it is not until we start putting atoms together that we can make anything useful. The article then states some predictions about nanotechnology, such as applications of the rules of quantum mechanics and nanorobots in the future.

Nanchang University, Academic English for Science and Engineering: Abstract

Text 11

The principal risks associated with nuclear power arise from the health effects of radiation. The radiation mainly comes from radioactive material. Radiation can penetrate deep inside the human body, where it can damage biological cells and thereby initiate a cancer; if it strikes sex cells, it can cause genetic diseases in progeny, though the rate of the latter is far less than that of the former. Reactor accidents are also one of the risks of nuclear power, but the nuclear power plant design strategy of "defence in depth", preventing accidents, providing back-up systems, and mitigating their potential effects, makes the probability that they happen exceedingly small. If everything fails, very high radiation doses can destroy body functions and lead to death within 60 days. The radioactive waste products of the nuclear industry must be isolated from contact with people for very long time periods. The bulk of the radioactivity is contained in the spent fuel, which is quite small in volume and therefore easily handled with great care.
Other radiation problems, for example the mining of materials and the transport of radioactive materials, also produce radiation exposure. The effects of routine releases of radioactivity from nuclear plants depend somewhat on how the spent fuel is handled.

Nanchang University, Academic English for Science and Engineering: Abstract

Text 15

Genetically modified food has caused a fierce debate in modern society, especially in countries with long agrarian traditions and vocal green lobbies.
Finite variable logics in descriptive complexity theory
FINITE VARIABLE LOGICS IN DESCRIPTIVE COMPLEXITY THEORY
MARTIN GROHE
Throughout the development of finite model theory, the fragments of firstorder logic with only finitely many variables have played a central role. This survey gives an introduction to the theory of finite variable logics and reports on recent progress in the area. For each k ≥ 1 we let Lk be the fragment of first-order logic consisting of all formulas with at most k (free or bound) variables. The logics Lk are the simplest finite-variable logics. Later, we are going to consider infinitary variants and extensions by so-called counting quantifiers. Finite variable logics have mostly been studied on finite structures. Like the whole area of finite model theory, they have interesting model theoretic, complexity theoretic, and combinatorial aspects. For finite structures, first-order logic is often too expressive, since each finite structure can be characterized up to isomorphism by a single first-order sentence, and each class of finite structures that is closed under isomorphism can be characterized by a first-order theory. The finite variable fragments seem to be promising candidates with the right balance between expressive power and weakness for a model theory of finite structures. This may have motivated Poizat [67] to collect some basic model theoretic properties of the Lk . Around the same time Immerman [45] showed that important complexity classes such as polynomial time (PTIME) or polynomial space (PSPACE) can be characterized as collections of all classes of (ordered) finite structures definable by uniform sequences of first-order formulas with a fixed number of variables and varying quantifier-depth. Although these early results from descriptive complexity theory have been put in much more elegant forms later using so-called fixed-point logics, the importance of the number of variables as a complexity measure remained. In 1990, Kolaitis and Vardi [52] proved a 0-1 law for the infinitary finite variable logics. 
As a corollary, they re-proved a result of Blass, Gurevich, and Kozen [7] that there is a 0-1 law for least-fixed point logic. The fact that makes Kolaitis’ and Vardi’s paper so remarkable is that it uses finite variable logics as a technical tool to obtain results concerning fixed-point logics, which are
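As a small illustration of how variable reuse gives the logics Lk their expressive power (a standard textbook example, not taken from this survey's text): in graphs with edge relation E, "there is a path of length 3 starting at x" can be written with only two variables, because a variable may be re-quantified once its old value is no longer needed:

```latex
% An L^2 formula: a path of length 3 from x, using only the variables x and y.
\varphi(x) \;=\; \exists y\,\Bigl(E(x,y)\wedge \exists x\,\bigl(E(y,x)\wedge \exists y\,E(x,y)\bigr)\Bigr)
```

The same device expresses paths of any fixed length in L2, whereas a naive translation of a length-3 path would use four distinct variables.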
Industrial Organization Theory, Lecture Notes, Part 3
Types of games (2)
• In dynamic games we can distinguish between games of perfect information, where all players know the entire history of the game when it is their turn to move, and games of imperfect information in which at least some players have only a partial idea of the history of the game when it is their turn to move. • In a game of complete information, players know not only their own payoffs, but also the payoffs of all the other players. In a game of incomplete information, players know their own payoffs, but there are some players who do not know the payoffs of some of the other players. • We can distinguish between 4 types of games: (a) static games of complete information; (b) dynamic games of complete information; (c) static games of incomplete information; (d) dynamic games of incomplete information.
Approximating Game-Theoretic Optimal Strategies for Full-scale Poker

D. Billings, N. Burch, A. Davidson, R. Holte, J. Schaeffer, T. Schauenberg, and D. Szafron
Department of Computing Science, University of Alberta, Edmonton, Alberta, T6G 2E8, Canada
Email: {darse,burch,davidson,holte,jonathan,terence,duane}@cs.ualberta.ca

Abstract

The computation of the first complete approximations of game-theoretic optimal strategies for full-scale poker is addressed. Several abstraction techniques are combined to represent the game of 2-player Texas Hold’em using closely related models of much smaller size. Despite the reduction in size by a factor of 100 billion, the resulting models retain the key properties and structure of the real game. Linear programming solutions to the abstracted game are used to create substantially improved poker-playing programs, able to defeat strong human players and be competitive against world-class opponents.

1 Introduction

Mathematical game theory was introduced by John von Neumann in the 1940s, and has since become one of the foundations of modern economics [von Neumann and Morgenstern, 1944]. Von Neumann used the game of poker as a basic model for 2-player zero-sum adversarial games, and proved the first fundamental result, the famous minimax theorem. A few years later, John Nash added results for n-player non-cooperative games, for which he later won the Nobel Prize [Nash, 1950]. Many decision problems can be modeled using game theory, and it has been employed in a wide variety of domains in recent years.

Of particular interest is the existence of optimal solutions, or Nash equilibria. An optimal solution provides a randomized mixed strategy, basically a recipe of how to play in each possible situation. Using this strategy ensures that an agent will obtain at least the game-theoretic value of the game, regardless of the opponent’s strategy. Unfortunately, finding exact optimal solutions is limited to relatively small problem sizes, and is not practical for most real domains. This paper explores the
use of highly abstracted mathematical models which capture the most essential properties of the real domain, such that an exact solution to the smaller problem provides a useful approximation of an optimal strategy for the real domain. The application domain used is the game of poker, specifically Texas Hold’em, the most popular form of casino poker and the poker variant used to determine the world champion at the annual World Series of Poker.

Due to the computational limitations involved, only simplified poker variations have been solved in the past (e.g. [Kuhn, 1950; Sakaguchi and Sakai, 1992]). While these are of theoretical interest, the same methods are not feasible for real games, which are too large by many orders of magnitude ([Koller and Pfeffer, 1997]).

[Shi and Littman, 2001] investigated abstraction techniques to reduce the large search space and complexity of the problem, using a simplified variant of poker. [Takusagawa, 2000] created near-optimal strategies for the play of three specific Hold’em flops and betting sequences. [Selby, 1999] computed an optimal solution for the abbreviated game of preflop Hold’em.

Using new abstraction techniques, we have produced viable “pseudo-optimal” strategies for the game of 2-player Texas Hold’em. The resulting poker-playing programs have demonstrated a tremendous improvement in performance.
Whereas the previous best poker programs were easily beaten by any competent human player, the new programs are capable of defeating very strong players, and can hold their own against world-class opposition.

Although some domain-specific knowledge is an asset in creating accurate reduced-scale models, analogous methods can be developed for many other imperfect information domains and generalized game trees. We describe a general method of problem reformulation that permits the independent solution of sub-trees by estimating the conditional probabilities needed as input for each computation.

This paper makes the following contributions:
1. Abstraction techniques that can reduce the poker search space to a manageable size, without losing the most important properties of the game.
2. A poker-playing program that is a major improvement over previous efforts, and is capable of competing with world-class opposition.

2 Game Theory

Game theory encompasses all forms of competition between two or more agents. Unlike chess or checkers, poker is a game of imperfect information and chance outcomes. It can be represented with an imperfect information game tree having chance nodes and decision nodes, which are grouped into information sets.

Since the nodes in this tree are not independent, divide-and-conquer methods for computing sub-trees (such as the alpha-beta algorithm) are not applicable. For a more detailed description of imperfect information game tree structure, see [Koller and Megiddo, 1992].

A strategy is a set of rules for choosing an action at every decision node of the tree. In general, this will be a randomized mixed strategy, which is a probability distribution over the various alternatives. A player must use the same policy across all nodes in the same information set, since from that player’s perspective they are indistinguishable from each other (differing only in the hidden information component).
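The notion of a mixed strategy at an information set can be sketched in a few lines: it is just a probability distribution over the legal actions, and the same distribution must be used at every node of that set. The distribution and names below are invented for illustration, not taken from the paper:

```python
import random

# A mixed strategy at one information set: probabilities over legal actions.
mixed_strategy = {"fold": 0.10, "call": 0.55, "raise": 0.35}

def sample_action(strategy, rng=random):
    """Draw one action according to the strategy's probabilities."""
    actions = list(strategy)
    weights = [strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

# Empirical check: over many samples the observed frequencies
# approach the strategy's probabilities.
rng = random.Random(0)  # fixed seed for reproducibility
counts = {a: 0 for a in mixed_strategy}
for _ in range(10_000):
    counts[sample_action(mixed_strategy, rng)] += 1
print(counts)
```

Because the player cannot distinguish nodes within an information set, conditioning the distribution on anything other than the set itself would leak hidden information, which is exactly what the uniform-policy requirement forbids.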
The conventional method for solving such a problem is to convert the descriptive representation, or extensive form, into a system of linear equations, which is then solved by a linear programming (LP) system such as the Simplex algorithm. The optimal solutions are computed simultaneously for all players, ensuring the best worst-case outcome for each player.

Traditionally, the conversion to normal form was accompanied by an exponential blow-up in the size of the problem, meaning that only very small problem instances could be solved in practice. [Koller et al., 1994] described an alternate LP representation, called sequence form, which exploits the common property of perfect recall (wherein all players know the preceding history of the game) to obtain a system of equations and unknowns that is only linear in the size of the game tree. This exponential reduction in representation has re-opened the possibility of using game-theoretic analysis for many domains. However, since the game tree itself can be very large, the LP solution method is still limited to moderate problem sizes (normally less than a billion nodes).

3 Texas Hold’em

A game (or hand) of Texas Hold’em consists of four stages, each followed by a round of betting:

Preflop: Each player is dealt two private cards face down (the hole cards).
Flop: Three community cards (shared by all players) are dealt face up.
Turn: A single community card is dealt face up.
River: A final community card is dealt face up.

After the betting, all active players reveal their hole cards for the showdown. The player with the best five-card poker hand formed from their two private cards and the five public cards wins all the money wagered (ties are possible).

The game starts off with two forced bets (the blinds) put into the pot. When it is a player’s turn to act, they must either bet/raise (increase their investment in the pot), check/call (match what the opponent has bet or raised), or fold (quit and surrender all money contributed to the pot).

The best-known non-commercial
Texas Hold’em program is Poki. It has been playing online since 1997 and has earned an impressive winning record, albeit against generally weak opposition [Billings et al., 2002]. The system’s abilities are based on enumeration and simulation techniques, expert knowledge, and opponent modeling. The program’s weaknesses are easily exploited by strong players, especially in the 2-player game.

Figure 1: Branching factors for Hold’em and abstractions.

4 Abstractions

Texas Hold’em has an easily identifiable structure, alternating between chance nodes and betting rounds in four distinct stages. A high-level view of the imperfect information game tree is shown in Figure 1.

Hold’em can be reformulated to produce similar but much smaller games. The objective is to reduce the scale of the problem without severely altering the fundamental structure of the game, or the resulting optimal strategies. There are many ways of doing this, varying in the overall reduction and in the accuracy of the resulting approximation.

Some of the most accurate abstractions include suit equivalence isomorphisms (offering a reduction of at most a constant factor), rank equivalence (only under certain conditions), and rank near-equivalence. The optimal solutions to these abstracted problems will either be exactly the same or will have a small bounded error, which we refer to as near-optimal solutions. Unfortunately, the abstractions which produce an exact or near-exact reformulation do not produce the very large reductions required to make full-scale poker tractable.

A common method for controlling the game size is deck reduction. Using less than the standard 52-card deck greatly reduces the branching factor at chance nodes. Other methods include reducing the number of cards in a player’s hand (e.g. from a 2-card hand to a 1-card hand), and reducing the number of board cards (e.g. a 1-card flop), as was done by [Shi and Littman, 2001] for the game of Rhode Island Hold’em.
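A suit equivalence isomorphism can be sketched concretely for 2-card preflop hands: two hands are strategically identical if one maps to the other by permuting the four suits (there are 4! = 24 such permutations), so every hand can be replaced by a canonical representative. The encoding below is a minimal illustration, not the representation used in the paper:

```python
from itertools import permutations

SUITS = "cdhs"

def canonical(hand):
    """Map a 2-card hand, given as (rank, suit) pairs like ('A', 'h'),
    to its lexicographically smallest image under all suit permutations."""
    best = None
    for perm in permutations(SUITS):
        mapping = dict(zip(SUITS, perm))
        mapped = tuple(sorted(r + mapping[s] for r, s in hand))
        if best is None or mapped < best:
            best = mapped
    return best

# Ah-Kh and Ad-Kd are the same suited hand once suits are relabeled:
print(canonical([("A", "h"), ("K", "h")]) == canonical([("A", "d"), ("K", "d")]))
# ...while a suited and an offsuit ace-king remain distinct:
print(canonical([("A", "h"), ("K", "d")]) == canonical([("A", "h"), ("K", "h")]))
```

Grouping hands by this canonical key is lossless: the solutions of the reduced game map back exactly onto the original, which is why the paper classifies such abstractions as producing exact or near-optimal solutions.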
[Koller and Pfeffer, 1997] used such parameters to generate a wide variety of tractable games to solve with their Gala system. We have used a number of small and intermediate sized games, ranging from eight cards (two suits, four ranks) to 24 cards (three suits, eight ranks), for the purpose of studying abstraction methods, comparing the results with known exact or near-optimal solutions. However, these smaller games are not suitable for use as an approximation for Texas Hold’em, as the underlying structures of the games are different. To produce good playing strategies for full-scale poker, we look for abstractions of the real game which do not alter that basic structure.

The abstraction techniques used in practice are powerful in terms of reducing the problem size, and subsume those previously mentioned. However, since they are also much cruder, we call their solutions pseudo-optimal, to emphasize that there is no guarantee that the resulting approximations will be accurate, or even reasonable. Some will be low-risk propositions, while others will require empirical testing to determine if they have merit.

4.1 Betting round reduction

The standard rules of limit Hold’em allow for a maximum of four bets per player per round. Thus in 2-player limit poker there are 19 possible betting sequences, of which two do not occur in practice. Of the remaining 17 sequences, 8 end in a fold (leading to a terminal node in the game tree), and 9 end in a call (carrying forward to the next chance node). Using lowercase letters for the first player and capital letters for the second player, the tree of possible betting sequences for each round is:

kK kBf kBc kBrF kBrC kBrRf kBrRc kBrRrF kBrRrC
bF bC bRf bRc bRrF bRrC bRrRf bRrRc

We call this local collection of decision nodes a betting tree, and represent it diagrammatically with a triangle.

With betting round reduction, each player is allowed a maximum of three bets per round, thereby eliminating the last two sequences in each line. The effective branching factor of the betting tree is reduced from nine to
seven. This does not appear to have a substantial effect on play, or on the expected value (EV) for each player. This observation has been verified experimentally. In contrast, we computed the corresponding postflop models with a maximum of two bets per player per round, and found radical changes to the optimal strategies, strongly suggesting that that level of abstraction is not safe.

4.2 Elimination of betting rounds

Large reductions in the size of a poker game tree can be obtained by elimination of betting rounds. There are several ways to do this, and they generally have a significant impact on the nature of the game. First, the game may be truncated, by eliminating the last round or rounds. In Hold’em, ignoring the last board card and the final betting round produces a 3-round model of the actual 4-round game. The solution to the 3-round model loses some of the subtlety involved in the true optimal strategy, but the degradation applies primarily to advanced tactics on the turn. There is a smaller effect on the flop strategy, and the strategy for the first betting round may have no significant changes, since it incorporates all the outcomes of two future betting rounds. We use this particular abstraction to define an appropriate strategy for play in the first round, and thus call it a preflop model (see Figure 2).

The sub-trees for our postflop models can be computed in isolation, provided that the appropriate preconditions are given as input. Unfortunately, knowing the correct conditional probabilities would normally entail solving the whole game, so there would be no advantage to the decomposition.

For simple postflop models, we dispense with the prior probabilities. For the postflop models used in PsOpti0 and PsOpti1, we simply ignore the implications of the preflop betting actions, and assume a uniform distribution over all possible hands for each player. Different postflop solutions were computed for initial pot sizes of two, four, six, and eight bets (corresponding to
preflop sequences with zero, one, two, or three raises, but ignoring which player initially made each raise). In PsOpti1, the four postflop solutions are simply appended to the Selby preflop strategy (Figure 2). Although these simplifying assumptions are technically wrong, the resulting play is still surprisingly effective.

A better way to compose postflop models is to estimate the conditional probabilities, using the solution to a preflop model. With a tractable preflop model, we have a means of estimating an appropriate strategy at the root, and thereby determine the consequent probability distributions.

In PsOpti2, a 3-round preflop model was designed and solved. The resulting pseudo-optimal strategy for the preflop (which was significantly different from the Selby strategy) was used to determine the corresponding distribution of hands for each player in each context. This provided the necessary input parameters for each of the seven preflop betting sequences that carry over to the flop stage. Since each of these postflop models has been given (an approximation of) the perfect recall knowledge of the full game, they are fully compatible with each other, and are properly integrated under the umbrella of the preflop model (Figure 2). In theory, this should be equivalent to computing the much larger tree, but it is limited by the accuracy and appropriateness of the proposed preflop betting model.

4.4 Abstraction by bucketing

The most important method of abstraction for the computation of our pseudo-optimal strategies is called bucketing. This is an extension of a natural and intuitive concept that has been applied many times in previous research (e.g. [Sklansky and Malmuth, 1994], [Takusagawa, 2000], [Shi and Littman, 2001]). The set of all possible hands is partitioned into equivalence classes (also called buckets or bins). A many-to-one mapping function determines which hands will be grouped together. Ideally, the hands should be grouped according to strategic similarity, meaning that they can all
be played in a similar manner without much loss in EV. If every hand was played with a particular pure strategy (i.e. only one of the available choices), then a perfect mapping function would group all hands that follow the same plan.

Figure 3: Transition probabilities (six buckets per player).

The number of buckets that can be used in conjunction with a 3-round model is very small, typically six or seven for each player (i.e. 36 or 49 pairs of bucket assignments). Obviously this results in a very coarse-grained abstract game, but it may not be substantially different from the number of distinctions an average human player might make. Regardless, it is the best we can currently do given the computational constraints of this approach.

The final thing needed to sever the abstract game from the underlying real game tree are the transition probabilities. The chance node between the flop and turn represents a particular card being dealt from the remaining stock of 45 cards. In the abstract game, there are no cards, only buckets. The effect of the turn card in the abstract game is to dictate the probability of moving from one pair of buckets on the flop to any pair of buckets on the turn. Thus the collection of chance nodes in the game tree is represented by a transition network, as shown in Figure 3. For postflop models, this can be estimated by walking the entire tree, enumerating all transitions for a small number of characteristic flops. For preflop models, the full enumeration is more expensive (encompassing all possible flops), so it is estimated either by sampling, or by (parallel) enumeration of a truncated tree.

For a 3-round postflop model, we can comfortably solve abstract games with up to seven buckets for each player in each round. Changing the distribution of buckets, such as six for the flop, seven for the turn, and eight for the river, does not appear to significantly affect the quality of the solutions, better or worse.

The final linear programming solution produces a large table of mixed
strategies (probabilities for fold, call, or raise) for every reachable scenario in the abstract game. To use this, the poker-playing program looks for the corresponding situation based on the same hand strength and potential measures, and randomly selects an action from the mixed strategy.

The large LP computations typically take less than a day (using CPLEX with the barrier method), and use up to two Gigabytes of RAM. Larger problems will exceed available memory, which is common for large LP systems. Certain LP techniques such as constraint generation could potentially extend the range of solvable instances considerably, but this would probably only allow the use of one or two additional buckets per player.

5 Experiments

5.1 Testing against computer players

A series of matches between computer programs was conducted, with the results shown in Table 1. Win rates are measured in small bets per hand (sb/h). Each match was run for at least 20,000 games (and over 100,000 games in some cases). The variance per game depends greatly on the styles of the two players involved, but is typically +/- 6 sb. The standard deviation for each match outcome is not shown, but is normally less than +/- 0.03 sb/h.

The “bot players” were:

PsOpti2, composed of a hand-crafted 3-round preflop model, providing conditional probability distributions to each of seven 3-round postflop models (Figure 2). All models in this prototype used six buckets per player per round.
PsOpti1, composed of four 3-round postflop models under the naive uniform distribution assumption, with 7 buckets per player per round. Selby’s optimal solution for preflop Hold’em is used to play the preflop ([Selby, 1999]).

PsOpti0, composed of a single 3-round postflop model, wrongly assuming uniform distributions and an initial pot size of two bets, with seven buckets per player per round. This program used an always-call policy for the preflop betting round.

Poki, the University of Alberta poker program. This older version of Poki was not designed to play the 2-player game, and can be defeated rather easily, but is a useful benchmark.

Anti-Poki, a rule-based program designed to beat Poki by exploiting its weaknesses and vulnerabilities in the 2-player game. Any specific counter-strategy can be even more vulnerable to adaptive players.

Aadapti, a relatively simple adaptive player, capable of slowly learning and exploiting persistent patterns in play.

Always Call, a very weak benchmark strategy.

Always Raise, a very weak benchmark strategy.

It is important to understand that a game-theoretic optimal player is, in principle, not designed to win. Its purpose is to not lose. An implicit assumption is that the opponent is also playing optimally, and nothing can be gained by observing the opponent for patterns or weaknesses.

In a simple game like RoShamBo (also known as Rock-Paper-Scissors), playing the optimal strategy ensures a break-even result, regardless of what the opponent does, and is therefore insufficient to defeat weak opponents, or to win a tournament ([Billings, 2000]). Poker is more complex, and in theory an optimal player can win, but only if the opponent makes dominated errors. Any time a player makes any choice that is part of a randomized mixed strategy of some game-theoretic optimal policy, that decision is not dominated.
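The RoShamBo break-even property can be verified by a direct computation: the uniform mix (1/3, 1/3, 1/3) has expected value exactly zero against any opponent strategy whatsoever, which is precisely why it cannot exploit weak play. A minimal sketch:

```python
from fractions import Fraction

ACTIONS = ["rock", "paper", "scissors"]

def payoff(a, b):
    """Payoff to player 1: +1 for a win, -1 for a loss, 0 for a tie."""
    if a == b:
        return 0
    wins = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}
    return 1 if (a, b) in wins else -1

def expected_value(p1, p2):
    """Expected payoff to player 1 when both play mixed strategies."""
    return sum(p1[a] * p2[b] * payoff(a, b) for a in ACTIONS for b in ACTIONS)

uniform = {a: Fraction(1, 3) for a in ACTIONS}

# Even against a heavily biased opponent, the uniform mix only breaks even:
biased = {"rock": Fraction(4, 5), "paper": Fraction(1, 10), "scissors": Fraction(1, 10)}
print(expected_value(uniform, biased))  # 0
```

Exact rational arithmetic (`Fraction`) makes the zero exact rather than a floating-point approximation; the result holds for every choice of the opponent distribution, since each opponent action wins, loses, and ties against exactly one third of the uniform mix.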
In other words, it is possible to play in a highly sub-optimal manner, but still break even against an optimal player, because those choices are not strictly dominated.

Since the pseudo-optimal strategies do no opponent modeling, there is no guarantee that they will be especially effective against very bad or highly predictable players. They must rely only on these fundamental strategic errors, and the margin of victory might be relatively modest as a result.

Table 1: Results of the computer vs computer matches (win rates in sb/h).

Table 2: Human vs PsOpti2 matches.

Figure 4: Progress of “thecount” vs PsOpti1 (+0.046 sb/h).

Players would sometimes remark that the program had them “figured out now”, indicating that its opponent model was accurate, when in fact the pseudo-optimal player is oblivious and does no modeling at all. It was also evident that these programs do considerably better in practice than might be expected, due to the emotional frailty of their human opponents. Many players commented that playing against the pseudo-optimal opponent was an exasperating experience. The bot routinely makes unconventional plays that confuse and confound humans. Invariably, some of these “bizarre” plays happen to coincide with a lucky escape, and several of these bad beats in quick succession will often cause strong emotional reactions (sometimes referred to as “going on tilt”). The
level of play generally goes down sharply in these circumstances. This suggests that a perfect game-theoretic optimal poker player could perhaps beat even the best humans in the long run, because any EV lost in moments of weakness would never be regained.

However, the win rate for such a program could still be quite small, giving it only a slight advantage. Thus it would be unable to exert its superiority convincingly over the short term, such as the few hundred hands of one session, or over the course of a world championship tournament. Since even the best human players are known to have biases and weaknesses, opponent modeling will almost certainly be necessary to produce a program that surpasses all human players.

5.3 Testing against a world-class player

The elite poker expert was Gautam Rao, who is known as “thecount” or “CountDracula” in the world of popular online poker rooms. Mr. Rao is the #1 all-time winner in the history of the oldest online game, by an enormous margin over all other players, both in total earnings and in dollar-per-hand rate. His particular specialty is in short-handed games with five or fewer players. He is recognized as one of the best players in the world in these games, and is also exceptional at 2-player Hold’em. Like many top-flight players, he has a dynamic ultra-aggressive style.

Mr. Rao agreed to play an exhibition match against PsOpti1, playing more than 7000 hands over the course of several days. The graph in Figure 4 shows the progression of the match.

The pseudo-optimal player started with some good fortune, but lost at a rate of about -0.2 sb/h over the next 2000 hands. Then there was a sudden reversal, following a series of fortuitous outcomes for the program. Although “thecount” is renowned for his mental toughness, an uncommon run of bad luck can be very frustrating even for the most experienced players. Mr. Rao believes he played below his best level during that stage, which contributed to a dramatic drop where he lost 300 sb in less than 400 hands. Mr. Rao resumed
play the following day, but was unable to recover the losses, slipping further to -200 sb after 3700 hands. At this point he stopped play and did a careful reassessment.

It was clear that his normal style for maximizing income against typical human opponents was not effective against the pseudo-optimal player. Whereas human players would normally succumb to a lot of pressure from aggressive betting, the bot was willing to call all the way to the showdown with as little as a Jack or Queen high card. That kind of play would be folly against most opponents, but is appropriate against an extremely aggressive opponent. Most human players fail to make the necessary adjustment under these atypical conditions, but the program has no sense of fear.

Mr. Rao changed his approach to be less aggressive, with immediate rewards, as shown by the +600 sb increase over the next 1100 hands (some of which he credited to a good run of cards). Mr. Rao was able to utilize his knowledge that the computer player did not do any opponent modeling. Knowing this allows a human player to systematically probe for weaknesses, without any fear of being punished for playing in a methodical and highly predictable manner, since an oblivious opponent does not exploit those patterns and biases.

Although he enjoyed much more success in the match from that point forward, there were still some “adventures”, such as the sharp decline at 5400 hands. Poker is a game of very high variance, especially between two opponents with sharp styles, as can be seen by the dramatic swings over the course of this match. Although 7000 games may seem like a lot, Mr. Rao’s victory in this match was still not statistically conclusive.

We now believe that a human poker master can eventually gain a sizable advantage over these pseudo-optimal prototypes (perhaps +0.20 sb/h or more is sustainable). However, it requires a good understanding of the design of the program and its resulting weaknesses. That knowledge is difficult to learn during normal play, due to
the good information hiding provided by an appropriate mixture of plans and tactics. This “cloud of confusion” is a natural barrier to opponent learning. It would be even more difficult to learn against an adaptive program with good opponent modeling, since any methodical testing by the human would be easily exploited. This is in stark contrast to typical human opponents, who can often be accurately modeled after only a small number of hands.

6 Conclusions and Future Work

The pseudo-optimal players presented in this paper are the first complete approximations of a game-theoretic optimal strategy for a full-scale variation of real poker. Several abstraction techniques were explored, resulting in a reasonably accurate representation of the large imperfect information game tree by a small collection of much smaller models. Despite these massive reductions and simplifications, the resulting programs play respectably. For the first time ever, computer programs are not completely outclassed by strong human opposition in the game of 2-player Texas Hold’em.

Useful abstractions included betting tree reductions, truncation of betting rounds combined with EV leaf nodes, and bypassing betting rounds. A 3-round model anchored at the root provided a pseudo-optimal strategy for the preflop round, which in turn provided the proper contextual information needed to determine conditional probabilities for postflop models. The most powerful abstractions for reducing the problem size were based on bucketing, a method for partitioning all possible holdings according to strategic similarity.
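The bucketing idea summarized above can be sketched in miniature. Here hands are stood in for by a single hand-strength number in [0, 1], and the mapping function (invented for illustration, not the one used in the paper) partitions them into k classes of roughly equal population via sample quantiles:

```python
def make_buckets(strengths, k):
    """Return a function mapping a strength value to a bucket index 0..k-1,
    using quantile boundaries estimated from a sample of strength values."""
    s = sorted(strengths)
    # boundaries at the (i/k)-quantiles of the sample
    cuts = [s[min(len(s) - 1, (i * len(s)) // k)] for i in range(1, k)]

    def bucket(x):
        b = 0
        for c in cuts:
            if x >= c:
                b += 1
        return b

    return bucket

# Stand-in for an enumeration of hand strengths; six buckets, as in PsOpti2.
sample = [i / 99 for i in range(100)]
bucket = make_buckets(sample, 6)
print(bucket(0.05), bucket(0.5), bucket(0.98))  # weak, middling, and strong hands
```

Real bucketing would group hands by strategic similarity (strength plus drawing potential), which is a many-to-one mapping of the same shape; only the features feeding the mapping differ.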
Although these methods exploit the particular structure of the Texas Hold’em game tree, the principles are general enough to be applied to a wide variety of imperfect information domains.

Many refinements and improvements will be made to the basic techniques in the coming months. Further testing will also continue, since accurate assessment in a high variance domain is always difficult.

The next stage of the research will be to apply these techniques to obtain approximations of Nash equilibria for n-player Texas Hold’em. This promises to be a challenging extension, since multi-player games have many properties that do not exist in the 2-player game.

Finally, having reasonable approximations of optimal strategies does not lessen the importance of good opponent modeling. Learning against an adaptive adversary in a stochastic game is a challenging problem, and there will be many ideas to explore in combining the two different forms of information. That will likely be the key difference between a program that can compete with the best, and a program that surpasses all human players. Quoting “thecount”: “You have a very strong program. Once you add opponent modeling to it, it will kill everyone.”

Acknowledgments

The authors would like to thank Gautam Rao, Sridhar Mutyala, and the other poker players for donating their valuable time. We also wish to thank Daphne Koller, Michael Littman, Matthew Ginsberg, Rich Sutton, David McAllester, Mason Malmuth, and David Sklansky for their valuable insights in past discussions.

This research was supported in part by grants from the Natural Sciences and Engineering Research Council of Canada (NSERC), the Alberta Informatics Circle of Research Excellence (iCORE), and an Izaak Walton Killam Memorial postgraduate scholarship.

References

[Billings et al., 2002] D. Billings, A. Davidson, J. Schaeffer, and D. Szafron. The challenge of poker. Artificial Intelligence, 134(1-2):201-240, 2002.

[Billings, 2000] D. Billings. The first international RoShamBo programming
competition. International Computer Games Association Journal, 23(1):3-8, 42-50, 2000.

[Koller and Megiddo, 1992] D. Koller and N. Megiddo. The complexity of two-person zero-sum games in extensive form. Games and Economic Behavior, 4(4):528-552, 1992.

[Koller and Pfeffer, 1997] D. Koller and A. Pfeffer. Representations and solutions for game-theoretic problems. Artificial Intelligence, pages 167-215, 1997.

[Koller et al., 1994] D. Koller, N. Megiddo, and B. von Stengel. Fast algorithms for finding randomized strategies in game trees. STOC, pages 750-759, 1994.

[Kuhn, 1950] H. W. Kuhn. A simplified two-person poker. Contributions to the Theory of Games, 1:97-103, 1950.

[Nash, 1950] J. Nash. Equilibrium points in n-person games. National Academy of Sciences, 36:48-49, 1950.

[Sakaguchi and Sakai, 1992] M. Sakaguchi and S. Sakai. Solutions of some three-person stud and draw poker. Mathematica Japonica, pages 1147-1160, 1992.

[Selby, 1999] A. Selby. Optimal heads-up preflop poker. /simplex.

[Shi and Littman, 2001] J. Shi and M. Littman. Abstraction models for game theoretic poker. In Computers and Games, pages 333-345. Springer-Verlag, 2001.

[Sklansky and Malmuth, 1994] D. Sklansky and M. Malmuth. Texas Hold’em for the Advanced Player. Two Plus Two Publishing, 2nd edition, 1994.

[Takusagawa, 2000] K. Takusagawa. Nash equilibrium of Texas Hold’em poker, 2000. Undergraduate thesis, Computer Science, Stanford University.

[von Neumann and Morgenstern, 1944] J. von Neumann and O. Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, 1944.
Game Theory 2
GAME THEORY
Thomas S. Ferguson

Part II. Two-Person Zero-Sum Games

1. The Strategic Form of a Game.
  1.1 Strategic Form.
  1.2 Example: Odd or Even.
  1.3 Pure Strategies and Mixed Strategies.
  1.4 The Minimax Theorem.
  1.5 Exercises.
2. Matrix Games. Domination.
  2.1 Saddle Points.
  2.2 Solution of All 2 by 2 Matrix Games.
  2.3 Removing Dominated Strategies.
  2.4 Solving 2 × n and m × 2 Games.
  2.5 Latin Square Games.
  2.6 Exercises.
3. The Principle of Indifference.
  3.1 The Equilibrium Theorem.
  3.2 Nonsingular Game Matrices.
  3.3 Diagonal Games.
  3.4 Triangular Games.
  3.5 Symmetric Games.
  3.6 Invariance.
  3.7 Exercises.
4. Solving Finite Games.
  4.1 Best Responses.
  4.2 Upper and Lower Values of a Game.
  4.3 Invariance Under Change of Location and Scale.
  4.4 Reduction to a Linear Programming Problem.
  4.5 Description of the Pivot Method for Solving Games.
  4.6 A Numerical Example.
  4.7 Exercises.
5. The Extensive Form of a Game.
  5.1 The Game Tree.
  5.2 Basic Endgame in Poker.
  5.3 The Kuhn Tree.
  5.4 The Representation of a Strategic Form Game in Extensive Form.
  5.5 Reduction of a Game in Extensive Form to Strategic Form.
  5.6 Example.
  5.7 Games of Perfect Information.
  5.8 Behavioral Strategies.
  5.9 Exercises.
6. Recursive and Stochastic Games.
  6.1 Matrix Games with Games as Components.
  6.2 Multistage Games.
  6.3 Recursive Games.
-Optimal Strategies.6.4Stochastic Movement Among Games.6.5Stochastic Games.6.6Approximating the Solution.6.7Exercises.7.Continuous Poker Models.7.1La Relance.7.2The von Neumann Model.7.3Other Models.7.4Exercises.References.Part II.Two-Person Zero-Sum Games1.The Strategic Form of a Game.The individual most closely associated with the creation of the theory of games is John von Neumann,one of the greatest mathematicians of this century.Although others preceded him in formulating a theory of games-notably´Emile Borel-it was von Neumann who published in1928the paper that laid the foundation for the theory of two-person zero-sum games.Von Neumann’s work culminated in a fundamental book on game theory written in collaboration with Oskar Morgenstern entitled Theory of Games and Economic Behavior,1944.Other more current books on the theory of games may be found in the text book,Game Theory by Guillermo Owen,2nd edition,Academic Press,1982,and the expository book,Game Theory and Strategy by Philip D.Straffin,published by the Mathematical Association of America,1993.The theory of von Neumann and Morgenstern is most complete for the class of games called two-person zero-sum games,i.e.games with only two players in which one player wins what the other player loses.In Part II,we restrict attention to such games.We will refer to the players as Player I and Player II.1.1Strategic Form.The simplest mathematical description of a game is the strate-gic form,mentioned in the introduction.For a two-person zero-sum game,the payofffunction of Player II is the negative of the payoffof Player I,so we may restrict attention to the single payofffunction of Player I,which we call here L.Definition1.The strategic form,or normal form,of a two-person zero-sum game is given by a triplet(X,Y,A),where(1)X is a nonempty set,the set of strategies of Player I(2)Y is a nonempty set,the set of strategies of Player II(3)A is a real-valued function defined on X×Y.(Thus,A(x,y)is a real number for every 
x ∈ X and every y ∈ Y.)
The interpretation is as follows. Simultaneously, Player I chooses x ∈ X and Player II chooses y ∈ Y, each unaware of the choice of the other. Then their choices are made known and I wins the amount A(x, y) from II. Depending on the monetary unit involved, A(x, y) will be cents, dollars, pesos, beads, etc. If A is negative, I pays the absolute value of this amount to II. Thus, A(x, y) represents the winnings of I and the losses of II.
This is a very simple definition of a game; yet it is broad enough to encompass the finite combinatorial games and games such as tic-tac-toe and chess. This is done by being sufficiently broadminded about the definition of a strategy. A strategy for a game of chess, for example, is a complete description of how to play the game, of what move to make in every possible situation that could occur. It is rather time-consuming to write down even one strategy, good or bad, for the game of chess. However, several different programs for instructing a machine to play chess well have been written. Each program constitutes one strategy. The program Deep Blue, that beat then world chess champion Gary Kasparov in a match in 1997, represents one strategy. The set of all such strategies for Player I is denoted by X. Naturally, in the game of chess it is physically impossible to describe all possible strategies since there are too many; in fact, there are more strategies than there are atoms in the known universe. On the other hand, the number of games of tic-tac-toe is rather small, so that it is possible to study all strategies and find an optimal strategy for each player. Later, when we study the extensive form of a game, we will see that many other types of games may be modeled and described in strategic form.
To illustrate the notions involved in games, let us consider the simplest non-trivial case when both X and Y consist of two elements. As an example, take the game called Odd-or-Even.
1.2 Example: Odd or Even. Players I and II simultaneously call out one of the numbers one or two. Player I's
name is Odd; he wins if the sum of the numbers is odd. Player II's name is Even; she wins if the sum of the numbers is even. The amount paid to the winner by the loser is always the sum of the numbers in dollars. To put this game in strategic form we must specify X, Y and A. Here we may choose X = {1, 2}, Y = {1, 2}, and A as given in the following table.

                II (even) y
                  1     2
I (odd) x   1    −2    +3
            2    +3    −4

A(x, y) = I's winnings = II's losses.

It turns out that one of the players has a distinct advantage in this game. Can you tell which one it is?
Let us analyze this game from Player I's point of view. Suppose he calls 'one' 3/5ths of the time and 'two' 2/5ths of the time at random. In this case,
1. If II calls 'one', I loses 2 dollars 3/5ths of the time and wins 3 dollars 2/5ths of the time; on the average, he wins −2(3/5) + 3(2/5) = 0 (he breaks even in the long run).
2. If II calls 'two', I wins 3 dollars 3/5ths of the time and loses 4 dollars 2/5ths of the time; on the average he wins 3(3/5) − 4(2/5) = 1/5.
That is, if I mixes his choices in the given way, the game is even every time II calls 'one', but I wins 20¢ on the average every time II calls 'two'. By employing this simple strategy, I is assured of at least breaking even on the average no matter what II does. Can Player I fix it so that he wins a positive amount no matter what II calls?
Let p denote the proportion of times that Player I calls 'one'. Let us try to choose p so that Player I wins the same amount on the average whether II calls 'one' or 'two'. Then since I's average winnings when II calls 'one' is −2p + 3(1 − p), and his average winnings when II calls 'two' is 3p − 4(1 − p), Player I should choose p so that
−2p + 3(1 − p) = 3p − 4(1 − p)
3 − 5p = 7p − 4
12p = 7
p = 7/12.
Hence, I should call 'one' with probability 7/12, and 'two' with probability 5/12. On the average, I wins −2(7/12) + 3(5/12) = 1/12, or 8 1/3 cents every time he plays the game, no matter what II does. Such a strategy that produces the same average winnings no matter what the opponent does is called an equalizing strategy.
Therefore, the game is clearly in I's favor. Can he do better
than 8 1/3 cents per game on the average? The answer is: Not if II plays properly. In fact, II could use the same procedure: call 'one' with probability 7/12, call 'two' with probability 5/12. If I calls 'one', II's average loss is −2(7/12) + 3(5/12) = 1/12. If I calls 'two', II's average loss is 3(7/12) − 4(5/12) = 1/12.
Hence, I has a procedure that guarantees him at least 1/12 on the average, and II has a procedure that keeps her average loss to at most 1/12. 1/12 is called the value of the game, and the procedure each uses to insure this return is called an optimal strategy or a minimax strategy.
If instead of playing the game, the players agree to call in an arbitrator to settle this conflict, it seems reasonable that the arbitrator should require II to pay 8 1/3 cents to I. For I could argue that he should receive at least 8 1/3 cents since his optimal strategy guarantees him that much on the average no matter what II does. On the other hand II could argue that he should not have to pay more than 8 1/3 cents since she has a strategy that keeps her average loss to at most that amount no matter what I does.
1.3 Pure Strategies and Mixed Strategies. It is useful to make a distinction between a pure strategy and a mixed strategy. We refer to elements of X or Y as pure strategies. The more complex entity that chooses among the pure strategies at random in various proportions is called a mixed strategy. Thus, I's optimal strategy in the game of Odd-or-Even is a mixed strategy; it mixes the pure strategies one and two with probabilities 7/12 and 5/12 respectively. Of course every pure strategy, x ∈ X, can be considered as the mixed strategy that chooses the pure strategy x with probability 1.
In our analysis, we made a rather subtle assumption. We assumed that when a player uses a mixed strategy, he is only interested in his average return. He does not care about his maximum possible winnings or losses, only the average. This is actually a rather drastic assumption. We are evidently assuming that a player is indifferent between receiving 5 million dollars
outright, and receiving 10 million dollars with probability 1/2 and nothing with probability 1/2. I think nearly everyone would prefer the $5,000,000 outright. This is because the utility of having 10 megabucks is not twice the utility of having 5 megabucks.
The main justification for this assumption comes from utility theory and is treated in Appendix 1. The basic premise of utility theory is that one should evaluate a payoff by its utility to the player rather than on its numerical monetary value. Generally a player's utility of money will not be linear in the amount. The main theorem of utility theory states that under certain reasonable assumptions, a player's preferences among outcomes are consistent with the existence of a utility function and the player judges an outcome only on the basis of the average utility of the outcome.
However, utilizing utility theory to justify the above assumption raises a new difficulty. Namely, the two players may have different utility functions. The same outcome may be perceived in quite different ways. This means that the game is no longer zero-sum. We need an assumption that says the utility functions of two players are the same (up to change of location and scale). This is a rather strong assumption, but for moderate to small monetary amounts, we believe it is a reasonable one.
A mixed strategy may be implemented with the aid of a suitable outside random mechanism, such as tossing a coin, rolling dice, drawing a number out of a hat and so on. The seconds indicator of a watch provides a simple personal method of randomization provided it is not used too frequently. For example, Player I of Odd-or-Even wants an outside random event with probability 7/12 to implement his optimal strategy. Since 7/12 = 35/60, he could take a quick glance at his watch; if the seconds indicator showed a number between 0 and 35, he would call 'one', while if it were between 35 and 60, he would call 'two'.
1.4 The Minimax Theorem. A two-person zero-sum game (X, Y, A) is said to be a finite game if both strategy
sets X and Y are finite sets. The fundamental theorem of game theory due to von Neumann states that the situation encountered in the game of Odd-or-Even holds for all finite two-person zero-sum games. Specifically,
The Minimax Theorem. For every finite two-person zero-sum game,
(1) there is a number V, called the value of the game,
(2) there is a mixed strategy for Player I such that I's average gain is at least V no matter what II does, and
(3) there is a mixed strategy for Player II such that II's average loss is at most V no matter what I does.
This is one form of the minimax theorem to be stated more precisely and discussed in greater depth later. If V is zero we say the game is fair. If V is positive, we say the game favors Player I, while if V is negative, we say the game favors Player II.
1.5 Exercises.
1. Consider the game of Odd-or-Even with the sole change that the loser pays the winner the product, rather than the sum, of the numbers chosen (who wins still depends on the sum). Find the table for the payoff function A, and analyze the game to find the value and optimal strategies of the players. Is the game fair?
2. Player I holds a black Ace and a red 8. Player II holds a red 2 and a black 7. The players simultaneously choose a card to play. If the chosen cards are of the same color, Player I wins. Player II wins if the cards are of different colors. The amount won is a number of dollars equal to the number on the winner's card (Ace counts as 1). Set up the payoff function, find the value of the game and the optimal mixed strategies of the players.
3. Sherlock Holmes boards the train from London to Dover in an effort to reach the continent and so escape from Professor Moriarty. Moriarty can take an express train and catch Holmes at Dover. However, there is an intermediate station at Canterbury at which Holmes may detrain to avoid such a disaster. But of course, Moriarty is aware of this too and may himself stop instead at Canterbury. Von Neumann and Morgenstern (loc. cit.)
estimate the value to Moriarty of these four possibilities to be given in the following matrix (in some unspecified units).

                          Holmes
                     Canterbury  Dover
Moriarty Canterbury     100       −50
         Dover            0       100

What are the optimal strategies for Holmes and Moriarty, and what is the value? (Historically, as related by Dr. Watson in "The Final Problem" in Arthur Conan Doyle's The Memoires of Sherlock Holmes, Holmes detrained at Canterbury and Moriarty went on to Dover.)
4. The entertaining book The Compleat Strategyst by John Williams contains many simple examples and informative discussion of strategic form games. Here is one of his problems.
"I know a good game," says Alex. "We point fingers at each other; either one finger or two fingers. If we match with one finger, you buy me one Daiquiri. If we match with two fingers, you buy me two Daiquiris. If we don't match I let you off with a payment of a dime. It'll help pass the time."
Olaf appears quite unmoved. "That sounds like a very dull game, at least in its early stages." His eyes glaze on the ceiling for a moment and his lips flutter briefly; he returns to the conversation with: "Now if you'd care to pay me 42 cents before each game, as a partial compensation for all those 55-cent drinks I'll have to buy you, then I'd be happy to pass the time with you."
Olaf could see that the game was inherently unfair to him so he insisted on a side payment as compensation. Does this side payment make the game fair? What are the optimal strategies and the value of the game?

2. Matrix Games. Domination.
A finite two-person zero-sum game in strategic form, (X, Y, A), is sometimes called a matrix game because the payoff function A can be represented by a matrix. If X = {x1, ..., xm} and Y = {y1, ..., yn}, then by the game matrix or payoff matrix we mean the m × n matrix A = (aij), where aij = A(xi, yj). In this form, Player I chooses a row, Player II chooses a column, and II pays I the entry in the chosen row and column. Note that the entries of the matrix are the winnings of the row chooser
and losses of the column chooser.
A mixed strategy for Player I may be represented by an m-tuple, p = (p1, p2, ..., pm), of probabilities that add to 1. If I uses the mixed strategy p = (p1, p2, ..., pm) and II chooses column j, then the (average) payoff to I is Σi pi aij. Similarly, a mixed strategy for Player II is an n-tuple q = (q1, q2, ..., qn). If II uses q and I uses row i, the payoff to I is Σj aij qj. More generally, if I uses the mixed strategy p and II uses the mixed strategy q, the (average) payoff to I is pᵀAq = Σi Σj pi aij qj.
Note that the pure strategy for Player I of choosing row i may be represented as the mixed strategy ei, the unit vector with a 1 in the ith position and 0's elsewhere. Similarly, the pure strategy for II of choosing the jth column may be represented by ej.
In the following, we shall be attempting to 'solve' games. This means finding the value, and at least one optimal strategy for each player. Occasionally, we shall be interested in finding all optimal strategies for a player.
2.1 Saddle Points. Occasionally it is easy to solve the game. If some entry aij of the matrix A has the property that
(1) aij is the minimum of the ith row, and
(2) aij is the maximum of the jth column,
then we say aij is a saddle point. If aij is a saddle point, then Player I can then win at least aij by choosing row i, and Player II can keep her loss to at most aij by choosing column j. Hence aij is the value of the game.
Example 1.
A = | 4  1  −3 |
    | 3  2   5 |
    | 0  1   6 |
The central entry, 2, is a saddle point, since it is a minimum of its row and maximum of its column. Thus it is optimal for I to choose the second row, and for II to choose the second column. The value of the game is 2, and (0, 1, 0) is an optimal mixed strategy for both players.
For large m × n matrices it is tedious to check each entry of the matrix to see if it has the saddle point property. It is easier to compute the minimum of each row and the maximum of each column to see if there is a match. Here is an example of
the method.row min A =⎛⎜⎝3210012010213122⎞⎟⎠0001col max 3222row min B =⎛⎜⎝3110012010213122⎞⎟⎠0001col max 3122In matrix A ,no row minimum is equal to any column maximum,so there is no saddle point.However,if the 2in position a 12were changed to a 1,then we have matrix B .Here,the minimum of the fourth row is equal to the maximum of the second column;so b 42is a saddle point.2.2Solution of All 2by 2Matrix Games.Consider the general 2×2game matrix A = a b d c.To solve this game (i.e.to find the value and at least one optimal strategy for each player)we proceed as follows.1.Test for a saddle point.2.If there is no saddle point,solve by finding equalizing strategies.We now prove the method of finding equalizing strategies of Section 1.2works when-ever there is no saddle point by deriving the value and the optimal strategies.Assume there is no saddle point.If a ≥b ,then b <c ,as otherwise b is a saddle point.Since b <c ,we must have c >d ,as otherwise c is a saddle point.Continuing thus,we see that d <a and a >b .In other words,if a ≥b ,then a >b <c >d <a .By symmetry,if a ≤b ,then a <b >c <d >a .This shows thatIf there is no saddle point,then either a >b ,b <c ,c >d and d <a ,or a <b ,b >c ,c <d and d >a .In equations (1),(2)and (3)below,we develop formulas for the optimal strategies and value of the general 2×2game.If I chooses the first row with probability p (es the mixed strategy (p,1−p )),we equate his average return when II uses columns 1and 2.ap +d (1−p )=bp +c (1−p ).Solving for p ,we findp =c −d (a −b )+(c −d ).(1)Since there is no saddle point,(a−b)and(c−d)are either both positive or both negative; hence,0<p<1.Player I’s average return using this strategy isv=ap+d(1−p)=ac−bda−b+c−d.If II chooses thefirst column with probability q(es the strategy(q,1−q)),we equate his average losses when I uses rows1and2.aq+b(1−q)=dq+c(1−q)Hence,q=c−ba−b+c−d.(2)Again,since there is no saddle point,0<q<1.Player II’s average loss using this 
strategy is
aq + b(1 − q) = (ac − bd) / (a − b + c − d) = v,     (3)
the same value achievable by I. This shows that the game has a value, and that the players have optimal strategies (something the minimax theorem says holds for all finite games).
Example 2.
A = | −2   3 |
    |  3  −4 |
p = (−4 − 3) / (−2 − 3 − 4 − 3) = 7/12, q = same, v = (8 − 9) / (−2 − 3 − 4 − 3) = 1/12.
Example 3.
A = | 0  −10 |
    | 1    2 |
p = (2 − 1) / (0 + 10 + 2 − 1) = 1/11, q = (2 + 10) / (0 + 10 + 2 − 1) = 12/11.
But q must be between zero and one. What happened? The trouble is we "forgot to test this matrix for a saddle point, so of course it has one". (J. D. Williams, The Compleat Strategyst, Revised Edition, 1966, McGraw-Hill, page 56.) The lower left corner is a saddle point. So p = 0 and q = 1 are optimal strategies, and the value is v = 1.
2.3 Removing Dominated Strategies. Sometimes, large matrix games may be reduced in size (hopefully to the 2 × 2 case) by deleting rows and columns that are obviously bad for the player who uses them.
Definition. We say the ith row of a matrix A = (aij) dominates the kth row if aij ≥ akj for all j. We say the ith row of A strictly dominates the kth row if aij > akj for all j. Similarly, the jth column of A dominates (strictly dominates) the kth column if aij ≤ aik (resp. aij < aik) for all i.
Anything Player I can achieve using a dominated row can be achieved at least as well using the row that dominates it. Hence dominated rows may be deleted from the matrix. A similar argument shows that dominated columns may be removed.
To be more precise, removal of a dominated row or column does not change the value of a game. However, there may exist an optimal strategy that uses a dominated row or column (see Exercise 9). If so, removal of that row or column will also remove the use of that optimal strategy (although there will still be at least one optimal strategy left). However, in the case of removal of a strictly dominated row or column, the set of optimal strategies does not change.
We may iterate this procedure and successively remove several rows and columns. As an example, consider the matrix A. The last column is dominated by the middle column. Deleting the last
column we obtain the reduced matrix shown on the right:
A = | 2  0  4 |         | 2  0 |
    | 1  2  3 |   →     | 1  2 |
    | 4  1  2 |         | 4  1 |
Now the top row is dominated by the bottom row. (Note this is not the case in the original matrix.) Deleting the top row we obtain:
| 1  2 |
| 4  1 |
This 2 × 2 matrix does not have a saddle point, so p = 3/4, q = 1/4 and v = 7/4. I's optimal strategy in the original game is (0, 3/4, 1/4); II's is (1/4, 3/4, 0).
A row (column) may also be removed if it is dominated by a probability combination of other rows (columns). If for some 0 < p < 1, p·a(i1,j) + (1 − p)·a(i2,j) ≥ akj for all j, then the kth row is dominated by the mixed strategy that chooses row i1 with probability p and row i2 with probability 1 − p. Player I can do at least as well using this mixed strategy instead of choosing row k. (In addition, any mixed strategy choosing row k with probability pk may be replaced by the one in which k's probability is split between i1 and i2. That is, i1's probability is increased by p·pk and i2's probability is increased by (1 − p)·pk.) A similar argument may be used for columns.
Consider the matrix
A = | 0  4  6 |
    | 5  7  4 |
    | 9  6  3 |
The middle column is dominated by the outside columns taken with probability 1/2 each. With the central column deleted, the middle row is dominated by the combination of the top row with probability 1/3 and the bottom row with probability 2/3. The reduced matrix,
| 0  6 |
| 9  3 |
is easily solved. The value is V = 54/12 = 9/2.
Of course, mixtures of more than two rows (columns) may be used to dominate and remove other rows (columns). For example, the mixture of columns one, two and three with probabilities 1/3 each in matrix
B = | 1  3  5  3 |
    | 4  0  2  2 |
    | 3  7  3  5 |
dominates the last column, and so the last column may be removed.
Not all games may be reduced by dominance. In fact, even if the matrix has a saddle point, there may not be any dominated rows or columns. The 3 × 3 game with a saddle point found in Example 1 demonstrates this.
2.4 Solving 2 × n and m × 2 Games. Games with matrices of size 2 × n or m × 2 may be solved with the aid of a graphical interpretation. Take the following example.
p      | 2  3  1  5 |
1 − p  | 4  1  6  0 |
Suppose Player I
chooses the first row with probability p and the second row with probability 1 − p. If II chooses Column 1, I's average payoff is 2p + 4(1 − p). Similarly, choices of Columns 2, 3 and 4 result in average payoffs of 3p + (1 − p), p + 6(1 − p), and 5p respectively. We graph these four linear functions of p for 0 ≤ p ≤ 1. For a fixed value of p, Player I can be sure that his average winnings is at least the minimum of these four functions evaluated at p. This is known as the lower envelope of these functions. Since I wants to maximize his guaranteed average winnings, he wants to find p that achieves the maximum of this lower envelope. According to the drawing, this should occur at the intersection of the lines for Columns 2 and 3. This essentially involves solving the game in which II is restricted to Columns 2 and 3. The value of the game
| 3  1 |
| 1  6 |
is v = 17/7, I's optimal strategy is (5/7, 2/7), and II's optimal strategy is (5/7, 2/7). Subject to the accuracy of the drawing, we conclude therefore that in the original game I's optimal strategy is (5/7, 2/7), II's is (0, 5/7, 2/7, 0) and the value is 17/7.
[Fig 2.1: the four column lines graphed against p for 0 ≤ p ≤ 1; the lower envelope is maximized at p = 5/7.]
The accuracy of the drawing may be checked: Given any guess at a solution to a game, there is a sure-fire test to see if the guess is correct, as follows. If I uses the strategy (5/7, 2/7), his average payoff if II uses Columns 1, 2, 3 and 4 is 18/7, 17/7, 17/7, and 25/7 respectively. Thus his average payoff is at least 17/7 no matter what II does. Similarly, if II uses (0, 5/7, 2/7, 0), her average loss is (at most) 17/7. Thus, 17/7 is the value, and these strategies are optimal. We note that the line for Column 1 plays no role in the lower envelope (that is, the lower envelope would be unchanged if the line for Column 1 were removed from the graph).
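The lower-envelope maximization just described is easy to check numerically. Below is a minimal sketch (not code from the text) that scans a grid of exact fractions for p in the 2 × 4 example and recovers the optimal p = 5/7 and the value 17/7:

```python
from fractions import Fraction

# Payoff matrix of the 2x4 example: rows are Player I's pure strategies.
A = [[2, 3, 1, 5],
     [4, 1, 6, 0]]

def lower_envelope(p):
    """Player I's guaranteed average payoff when he plays (p, 1 - p)."""
    return min(p * A[0][j] + (1 - p) * A[1][j] for j in range(4))

# Maximize the lower envelope over a fine grid of exact rational p-values.
best_p = max((Fraction(k, 700) for k in range(701)), key=lower_envelope)

print(best_p, lower_envelope(best_p))  # p = 5/7 gives the value 17/7
```

Because the envelope is piecewise linear, a grid containing the breakpoint 5/7 finds the exact maximizer; a production solver would intersect the column lines or use linear programming instead.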
This is a test for domination. Column 1 is, in fact, dominated by Columns 2 and 3 taken with probability 1/2 each. The line for Column 4 does appear in the lower envelope, and hence Column 4 cannot be dominated.
As an example of an m × 2 game, consider the matrix associated with Figure 2.2. If q is the probability that II chooses Column 1, then II's average loss for I's three possible choices of rows is given in the accompanying graph. Here, Player II looks at the largest of her average losses for a given q. This is the upper envelope of the function. II wants to find q that minimizes this upper envelope. From the graph, we see that any value of q between 1/4 and 1/3 inclusive achieves this minimum. The value of the game is 4, and I has an optimal pure strategy: row 2.
[Fig 2.2: the matrix, with column probabilities q and 1 − q,
| 1  5 |
| 4  4 |
| 6  2 |
and the three row lines graphed against q; the upper envelope is minimized on an interval.]
These techniques work just as well for 2 × ∞ and ∞ × 2 games.
2.5 Latin Square Games. A Latin square is an n × n array of n different letters such that each letter occurs once and only once in each row and each column. The 5 × 5 array below is an example. If in a Latin square each letter is assigned a numerical value, the resulting matrix is the matrix of a Latin square game. Such games have simple solutions. The value is the average of the numbers in a row, and the strategy that chooses each pure strategy with equal probability 1/n is optimal for both players. The reason is not very deep. The conditions for optimality are satisfied.
| a  b  c  d  e |
| b  e  a  c  d |
| c  a  d  e  b |
| d  c  e  b  a |
| e  d  b  a  c |
With a = 1, b = 2, c = d = 3, e = 6:
| 1  2  3  3  6 |
| 2  6  1  3  3 |
| 3  1  3  6  2 |
| 3  3  6  2  1 |
| 6  3  2  1  3 |
In the example above, the value is V = (1 + 2 + 3 + 3 + 6)/5 = 3, and the mixed strategy p = q = (1/5, 1/5, 1/5, 1/5, 1/5) is optimal for both players. The game of matching pennies is a Latin square game. Its value is zero and (1/2, 1/2) is optimal for both players.
2.6 Exercises.
1. Solve the game with matrix
| −1  −3 |
| −2   2 |
that is, find the value and an optimal (mixed) strategy for both players.
2. Solve the game with matrix
| 0  2 |
| t  1 |
for an arbitrary real number t. (Don't forget to check for a saddle point!) Draw the
graph of v(t), the value of the game, as a function of t, for −∞ < t < ∞.
3. Show that if a game with m × n matrix has two saddle points, then they have equal values.
4. Reduce by dominance to 2 × 2 games and solve.
(a) | 5   4  1   0 |        (b) | 10  0  7  1 |
    | 4   3  2  −1 |            |  2  6  4  7 |
    | 0  −1  4   3 |            |  6  3  3  5 |
    | 1  −2  1   2 |
5. (a) Solve the game with matrix
|  3  2   4  0 |
| −2  1  −4  5 |
(b) Reduce by dominance to a 3 × 2 matrix game and solve:
|  0   8  5 |
|  8   4  6 |
| 12  −4  3 |
6. Players I and II choose integers i and j respectively from the set {1, 2, ..., n} for some n ≥ 2. Player I wins 1 if |i − j| = 1. Otherwise there is no payoff. If n = 7, for example, the game matrix is
| 0  1  0  0  0  0  0 |
| 1  0  1  0  0  0  0 |
| 0  1  0  1  0  0  0 |
| 0  0  1  0  1  0  0 |
| 0  0  0  1  0  1  0 |
| 0  0  0  0  1  0  1 |
| 0  0  0  0  0  1  0 |
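The two-step recipe of Section 2.2 (test for a saddle point, otherwise equalize) can be written out directly. The sketch below is an illustrative translation of formulas (1)-(3), not code from the text; it reproduces both the Odd-or-Even solution and the saddle-point trap of Example 3:

```python
from fractions import Fraction as F

def solve_2x2(a, b, d, c):
    """Solve the 2x2 game [[a, b], [d, c]]: return (p, q, value).

    Step 1: look for a saddle point (an entry that is a row minimum
    and a column maximum).  Step 2: otherwise use the equalizing
    formulas (1)-(3).
    """
    M = [[a, b], [d, c]]
    for i in range(2):
        for j in range(2):
            if M[i][j] == min(M[i]) and M[i][j] == max(M[0][j], M[1][j]):
                # Pure optimal strategies: I plays row i, II plays column j.
                p = F(1) if i == 0 else F(0)
                q = F(1) if j == 0 else F(0)
                return p, q, F(M[i][j])
    denom = (a - b) + (c - d)
    p = F(c - d, denom)            # formula (1)
    q = F(c - b, denom)            # formula (2)
    v = F(a * c - b * d, denom)    # formula (3)
    return p, q, v

print(solve_2x2(-2, 3, 3, -4))   # Odd-or-Even: p = q = 7/12, value 1/12
print(solve_2x2(0, -10, 1, 2))   # Example 3: saddle point, value 1
```

Running the second call shows why the saddle-point test must come first: applying formula (2) blindly to Example 3 would give q = 12/11, which is not a probability.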
Chapter 3. Nash Equilibrium: Illustrations
is qi, then the price is P(q1 + · · · + qn), so that firm i's revenue is qi P(q1 + · · · + qn). Thus its profit is
πi(q1, . . . , qn) = qi P(q1 + · · · + qn) − Ci(qi).     (54.1)
To find firm 1's best response to any given output q2 of firm 2, we need to study firm 1's profit as a function of its output q1 for given values of q2. If q2 = 0 then firm 1's profit is π1(q1, 0) = q1(α − c − q1) for q1 ≤ α, a quadratic function that is zero when q1 = 0 and when q1 = α − c. This function is the black curve in Figure 55.1. Given the symmetry of quadratic functions (Section 17.4), the output q1 of firm 1 that maximizes its profit is q1 = (α − c)/2. (If you know calculus, you can reach the same conclusion by setting the derivative of firm 1's profit with respect to q1 equal to zero and solving for q1.) Thus firm 1's best response to an output of zero for firm 2 is b1(0) = (α − c)/2.
As the output q2 of firm 2 increases, the profit firm 1 can obtain at any given output decreases, because more output of firm 2 means a lower price. The gray curve in Figure 55.1 is an example of π1(q1, q2) for q2 > 0 and q2 < α − c. Again this function is a quadratic up to the output q1 = α − q2 that leads to a price of zero. Specifically, the quadratic is π1(q1, q2) = q1(α − c − q2 − q1), which is zero when q1 = 0 and when q1 = α − c − q2. From the symmetry of quadratic functions (or some calculus) we conclude that the output that maximizes π1(q1, q2) is q1 = (α − c − q2)/2. (When q2 = 0, this is equal to (α − c)/2, the best response to an output of zero that we found in the previous paragraph.)
[Figure 55.1: π1(q1, q2) as a function of q1, for q2 = 0 (black curve) and q2 > 0 (gray curve).]
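The best-response function b1(q2) = (α − c − q2)/2 derived above can be explored numerically. In the sketch below, the linear inverse demand P(Q) = α − Q and constant unit cost c follow the chapter's setup, while the specific numbers α = 13 and c = 1 are illustrative assumptions; iterating best responses converges to the symmetric Nash equilibrium output (α − c)/3:

```python
# Cournot duopoly with inverse demand P(Q) = alpha - Q and unit cost c.
# alpha = 13, c = 1 are made-up numbers for illustration.
alpha, c = 13.0, 1.0

def best_response(q_other):
    """Profit-maximizing output against the rival's output q_other."""
    return max(0.0, 0.5 * (alpha - c - q_other))

# Iterating best responses from (0, 0) converges to the Nash equilibrium
# q1* = q2* = (alpha - c)/3 = 4.
q1 = q2 = 0.0
for _ in range(100):
    q1, q2 = best_response(q2), best_response(q1)

print(round(q1, 6), round(q2, 6))  # both approach 4.0
```

The deviation from equilibrium halves (with alternating sign) at each round, so the iteration settles quickly; the fixed point satisfies q* = (α − c − q*)/2, i.e. q* = (α − c)/3.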
Grade 8 Mathematics in English: 20 Reading Comprehension Questions
1. <Background passage>
Tom is a student in Grade Eight. One day, he went to the supermarket with his mother. They wanted to buy some fruits. When they came to the fruit section, Tom saw that apples were sold at 5 yuan per kilogram and oranges were sold at 8 yuan per kilogram. Tom's mother wanted to buy 3 kilograms of apples and 2 kilograms of oranges. Tom quickly calculated the total cost in his mind. He thought that 3 kilograms of apples cost 3 times 5 yuan, which is 15 yuan. And 2 kilograms of oranges cost 2 times 8 yuan, which is 16 yuan. So the total cost is 15 yuan plus 16 yuan, which is 31 yuan.
After buying the fruits, they went to the cashier to pay. The cashier told them that there was a promotion. If they spent more than 30 yuan, they could get a discount of 5 yuan. Tom was very happy because they could save some money. He quickly calculated the new total cost. After deducting the discount, the new total cost is 31 yuan minus 5 yuan, which is 26 yuan.
Tom and his mother left the supermarket happily. Tom realized that mathematics is very useful in daily life. It can help us solve many problems.
1. Apples are sold at ___ yuan per kilogram.
A. 3  B. 4  C. 5  D. 6
Answer: C.
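Tom's running calculation can be mirrored in a few lines (an illustration, not part of the test itself):

```python
# Prices and quantities from the passage.
apple_price, orange_price = 5, 8    # yuan per kilogram
apples, oranges = 3, 2              # kilograms bought

total = apples * apple_price + oranges * orange_price   # 15 + 16 = 31 yuan

# Promotion: spending more than 30 yuan earns a 5 yuan discount.
if total > 30:
    total -= 5

print(total)  # 26 yuan, matching Tom's answer
```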
Evolutionary Game Theory
Glossary
Deterministic evolutionary dynamic: A deterministic evolutionary dynamic is a rule for assigning population games to ordinary differential equations describing the evolution of behavior in the game. Deterministic evolutionary dynamics can be derived from revision protocols, which describe choices (in economic settings) or births and deaths (in biological settings) on an agent-by-agent basis.
Evolutionarily stable strategy (ESS): In a symmetric normal form game, an evolutionarily stable strategy is a (possibly mixed) strategy with the following property: a population in which all members play this strategy is resistant to invasion by a small group of mutants who play an alternative mixed strategy.
Normal form game: A normal form game is a strategic interaction in which each of n players chooses a strategy and then receives a payoff that depends on all agents' choices of strategy. In a symmetric two-player normal form game, the two players choose from the same set of strategies, and payoffs only depend on own and opponent's choices, not on a player's identity.
Population game: A population game is a strategic interaction among one or more large populations of agents. Each agent's payoff depends on his own choice of strategy and the distribution of others' choices of strategies. One can generate a population game from a
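As one concrete deterministic evolutionary dynamic, the sketch below runs a discretized replicator dynamic on a symmetric 2 × 2 game. The Hawk-Dove payoffs used here (V = 4, C = 6) are a standard illustration chosen for this sketch, not taken from the glossary; the interior rest point x = V/C = 2/3 is the game's ESS:

```python
# Replicator dynamic for a symmetric 2x2 game: x is the population share
# playing strategy 0 (Hawk); its growth rate is its fitness advantage
# over the population average.
A = [[-1.0, 4.0],   # Hawk vs Hawk, Hawk vs Dove  (V=4, C=6 assumed)
     [ 0.0, 2.0]]   # Dove vs Hawk, Dove vs Dove

def step(x, dt=0.01):
    f0 = A[0][0] * x + A[0][1] * (1 - x)   # fitness of Hawk
    f1 = A[1][0] * x + A[1][1] * (1 - x)   # fitness of Dove
    avg = x * f0 + (1 - x) * f1            # population average fitness
    return x + dt * x * (f0 - avg)         # Euler step of x' = x(f0 - avg)

x = 0.1
for _ in range(20000):
    x = step(x)

print(round(x, 4))  # converges to the ESS share of Hawks, 2/3
```

From any interior starting share, the trajectory approaches 2/3, where Hawk and Dove earn equal payoffs; this illustrates the link between an ESS and stable rest points of evolutionary dynamics.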
On Model Theory for Intuitionistic Bounded Arithmetic with Applications to Independence Results
Samuel R. Buss*
*Supported in part by NSF Grant DMS-8902480.

Abstract
IPV+ is IPV (which is essentially IS^1_2) with polynomial-induction on Σ^{b+}_1-formulas disjoined with arbitrary formulas in which the induction variable does not occur. This paper proves that IPV+ is sound and complete with respect to Kripke structures in which every world is a model of CPV (essentially S^1_2). Thus IPV is sound with respect to such structures. In this setting, this is a strengthening of the usual completeness and soundness theorems for first-order intuitionistic logic. Using Kripke structures, a conservation result is proved for PV1 over IPV.
Cook-Urquhart and Krajíček-Pudlák have proved independence results stating that it is consistent with IPV and PV that extended Frege systems are super. As an application of Kripke models for IPV, we give a proof of a strengthening of Cook and Urquhart's theorem using the model-theoretic construction of Krajíček and Pudlák.

1 Introduction
An equational theory PV of polynomial time functions was introduced by Cook [4]; a classical first-order theory S^1_2 for polynomial time computation was developed in Buss [1]; and intuitionistic theories IS^1_2 and IPV for polynomial time computation have been discussed by Buss [2] and by Cook and Urquhart [5]. This paper discusses (a) model theory for the intuitionistic fragments IPV and IPV+ of Bounded Arithmetic (IPV is essentially IS^1_2 enlarged to the language of PV) and (b) the relationship between two recent independence results for IPV and CPV. The theories IPV and CPV have the same axioms but are intuitionistic and classical, respectively. Our model theory for IPV and IPV+ is a strengthening of the usual Kripke semantics for intuitionistic first-order logic: we consider Kripke structures in which each "world" is a classical model of CPV. The use of these so-called CPV-normal Kripke structures is in contrast to the usual Kripke semantics which instead require each world to
intuitionistically satisfy(or“force”)the axioms;the worlds of a CPV-normal Kripke structure must classically satisfy the axioms. The main new results of this paper establish the completeness and soundness of IPV+with respect to CPV-normal Kripke structures.The outline of this paper is as follows:in section2,the definitions of PV1, IPV and CPV are reviewed and the theory IPV+is introduced;in section3, we develop model theory for IPV and IPV+and prove the soundness of these theories with respect to CPV-normal Kripke structures;in section4we apply the usual intuitionistic completeness theorem to prove a conservation result of PV1over IPV.Section5contains the completeness theorem for IPV+with respect to CPV-normal Kripke models.In section6,we apply the soundness theorem to prove a strengthening of Cook and Urquhart’s independence result for IPV and show that this strengthened result implies Kraj´ıˇc ek and Pudl´a k’s independence result.2The Feasible TheoriesCook[4]defined an equational theory PV for polynomial time computation. 
Buss[1]introduced afirst-order theory S12with proof-theoretic strength corresponding to polynomial time computation and in which precisely the polynomial time functions could beΣb1-defined.There is a very close connection between S12and PV:let S12(PV)(also called CPV)be the theory defined conservatively over S12by adding function symbols for polynomial time functions and adding defining equations(universal axioms)for the new function symbols;then S12(PV)is conservative over PV[1].Buss[2]defined an intuitionistic theory IS12for polynomial time compu-tation and Cook and Urquhart[5]gave similarly feasible,intuitionistic proof systems PVωand IPVωfor feasible,higher-type functionals.This paper will deal exclusively with the following theories,which are defined in more detail in the next paragraphs:(1)PV1is PV conservatively extended tofirst-order classical logic—PV1is defined by Kraj´ıˇc ek-Pudl´a k-Takeuti[12]and should not be confused Cook’s propositional expansion P V1 of PV[4],(2)IPV is an intuitionistic theory in the language of PV and isessentially equivalent to IS12,(3)CPV is S12(PV),and(4)the intuitionistic theory IPV+is an extension of IPV and is defined below.We now review the definitions of these four theories—it should be noted that our definitions are based on Bounded Arithmetic and not all of them are the historical definitions.Recall that S12is a classical theory of arithmetic with language0,S,+,·,⌊1x⌋,|x|,#and≤where|x|=⌈log2(x+1)⌉is the length of the binary representation of x and x#y=2|x|·|y|.A bounded quantifier is of the form (Qx≤t)where t is a term not involving x;a sharply bounded quantifier is one of the form(Qx≤|t|).A bounded formula is afirst-order formula in which every quantifier is bounded.The bounded formulas are classified in a syntactic hierarchyΣb i,Πb i by counting alternations of bounded quantifiers, ignoring sharply bounded quantifiers.There is a close connection between this hierarchy of bounded formulas and the polynomial time 
hierarchy; namely, a set of integers is in the class Σ^p_i of the polynomial time hierarchy if and only if it is definable by a Σ^b_i-formula.

The theory S^1_2 is axiomatized by some purely universal formulas defining basic properties of the non-logical symbols and by PIND (polynomial induction) on Σ^b_1-formulas:

  A(0) ∧ (∀x)(A(⌊x/2⌋) ⊃ A(x)) ⊃ (∀x)A(x)

for A any Σ^b_1-formula. A function f is Σ^b_1-definable in S^1_2 if and only if it is provably total in S^1_2 with a Σ^b_1-formula defining the graph of f. In [1] it is shown that a function is Σ^b_1-definable in S^1_2 if and only if it is polynomial time computable. Let S^1_2(PV) denote the conservative extension of S^1_2 obtained by adjoining a new function symbol for each polynomial time (Σ^b_1-defined) function. These new function symbols may be used freely in terms in induction axioms. Another name for the theory S^1_2(PV) is CPV and we shall use the latter name for most of this paper. We use Σ^b_1(PV) and Π^b_1(PV) to denote the hierarchy of classes of bounded formulas in the language of CPV.

PV is the equational theory consisting of all (intuitionistic) sequents of atomic formulas provable in S^1_2(PV); i.e., PV is the theory containing exactly those formulas of the form (r_1 = s_1 ∧ ··· ∧ r_k = s_k) ⊃ t_1 = t_2 which are consequences of S^1_2(PV). PV1 is the classical, first-order theory axiomatized by the formulas in PV and is conservative over PV. Equivalently, PV1 is the theory axiomatized by the ∆^b_1(PV)-consequences of S^1_2(PV) (where ∆^b_1(PV) means provably equivalent to a Σ^b_1(PV)- and to a Π^b_1(PV)-formula). Since S^1_2(PV) has a function symbol for each polynomial time function, the use of sharply bounded quantifiers is not necessary; in particular, every Σ^b_1(PV)-formula is equivalent to a formula of the form (∃x ≤ t)(r = s). Hence CPV = S^1_2(PV) may be axiomatized by PIND on formulas of this latter form.

IS^1_2 is an intuitionistic theory of arithmetic. A hereditarily Σ^b_1-formula, or HΣ^b_1-formula, is defined to be a formula in which every subformula is a Σ^b_1-formula. IS^1_2 is axiomatized like S^1_2 except with PIND restricted to
HΣ^b_1-formulas. Any function definable in IS^1_2 is polynomial time computable and, conversely, every polynomial time computable function is HΣ^b_1-definable in IS^1_2. Let IPV = IS^1_2(PV) be the conservative extension of IS^1_2 obtained by adjoining every polynomial time function with an HΣ^b_1-defining equation. Note IPV and CPV have the same language. An alternative definition of IPV is that it is the intuitionistic theory axiomatized by PV plus PIND for formulas of the form (∃x ≤ t)(r = s). In this way, IPV and CPV can be taken to have precisely the same axioms; the former is intuitionistic and the latter is classical. The theories IPV and IS^1_2 have the law of the excluded middle for atomic formulas; that is to say, the law of the excluded middle holds for polynomial time computable predicates. This restricted law of excluded middle also applies to the theory IPV+ defined next.

Definition. IPV+ is the intuitionistic theory which includes PV and has the PIND axioms for formulas ψ(b, c) of the form

  φ(c) ∨ (∃x ≤ t(b, c))[r(x, b, c) = s(x, b, c)]

where r, s and t are terms and φ(c) is an arbitrary formula in which the variable b does not occur. The induction axiom is with respect to the variable b and is:

  ψ(0, c) ∧ (∀z)(ψ(⌊z/2⌋, c) ⊃ ψ(z, c)) ⊃ (∀z)ψ(z, c).

Note that IPV+ ⊇ IPV since φ can be taken to be 0 = 1, for instance. In [3] a theory IS^{1+}_2 was defined by allowing PIND on HΣ^{b*}_1-formulas, where HΣ^{b*}_1-formulas are HΣ^b_1-formulas disjoined with an arbitrary formula in which the induction variable does not occur. It is readily checked that IPV+ is equivalent to the theory IS^{1+}_2 extended to the language of PV1 by introducing symbols for all polynomial time functions via HΣ^b_1-definitions.

We use ⊢_c and ⊢_i for classical and intuitionistic provability, respectively; thus we shall (redundantly) write CPV ⊢_c φ, IPV ⊢_i φ and IPV+ ⊢_i φ.
Whenever we writeΓ⊢iϕorΓ⊢cϕ,we require thatΓbe a set of sentences;†however,ϕmay be a formula and may also involve constant symbols not occuring in any formula inΓ.†By convention,afirst-order theory is identified with the set of sentences provable in that theory.Definition A positive formula is one in which no negation signs(¬)and no implication symbols(⊃)appear.Ifθis a positive formula andϕis an arbitrary formula,thenθϕis the formula obtained fromθby replacing every atomic subformulaχofθby(χ∨ϕ).We do not allow free variables inϕto become bound inθϕ:this can be done either by using the conventions of the sequent calculus which has distinct sets of free and bound variables or by renaming bound variables inθto be distinct from the free variables inϕ.Theorem1Letθbe a positive formula.If CPV⊢c¬θthen IPV⊢i¬θ.Theorem2Letθbe a positive formula andϕbe an arbitrary formula.If CPV⊢c¬θthen IPV+⊢iθϕ⊃ϕ.These theorems follow readily from the corresponding facts for S12and IS1+2which are proved in Buss[3].Theorem2can be obtained as a corollary to Theorem1via Lemma3.5.3(a)of[15].3Kripke structures for intuitionistic logicA classical model for PV1or CPV is defined as usual for classicalfirst-order logic using Tarskian semantics.The corresponding semantic notion for intuitionisticfirst-order logic is that of a Kripke model.We briefly define Kripke models for IPV and IPV+,a slightly more general definition of Kripke models can be found in the textbook by Troelstra and van Dalen[15].(Kripke models for IPV are slightly simpler than in the general case since IPV has the law of the excluded middle for atomic formulas.)A Kripke model K for the language of IPV is an ordered pair({M i}i∈I, ) where{M i}i∈I is a set of(not necessarily distinct)classical structures for the language of IPV indexed by elements of the set I and where is a reflexive and transitive binary relation on{M i}i∈I.‡Furthermore,whenever M i M j then M i is a substructure of M j in that M i is obtainable from M j by 
restricting functions and predicates to the domain|M i|of M i.The M i’s are called worlds.Ifϕis a formula and if c∈|M i|then we define M i|=ϕ( c),M i classically satisfiesϕ( c),as usual,ignoring the rest of the worlds in the Kripke structure. To define the intuitionistic semantics,M i ϕ( c),M i forcesϕ( c),is defined inductively on the complexity ofϕas follows:§‡Strictly speaking, should be a relation on I since the M i’s may not be distinct. However,we follow standard usage and write as a relation on worlds.§A more proper notation would be(K,M i) ϕ( c)or even(K,i) ϕ( c)but we use the simpler notation M i ϕ( c)when K is specified by the context.(1)Ifϕis atomic,M i ϕif and only if M i|=ϕ.(2)Ifϕisψ∧χthen M i ϕif and only if M i ψand M i χ.(3)Ifϕisψ∨χthen M i ϕif and only if M i ψor M i χ.(4)Ifϕisψ⊃χthen M i ϕif and only if for all M j M i,if M j ψthen M j χ.(5)Ifϕis¬ψthen M i ϕif and only if for all M j M i,M j ψ.Alternatively one may define¬ψto meanψ⊃⊥where⊥is always false(not forced).(6)Ifϕis(∃x)ψ(x)then M i ϕif and only if there is some b∈|M i|suchthat M i ψ(b).(7)Ifϕis(∀x)ψ(x)then M i ϕif and only if for all M j M i and allb∈|M j|,M j ϕ(b).An immediate consequence of the definition of forcing is that if M i ϕand M i M j then M j ϕ;this is proved by induction on the complexity ofϕ. 
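On a finite structure the seven forcing clauses above can be evaluated directly. The following Python sketch is our own illustration, not part of the paper: the tuple-based formula encoding and all names are our assumptions. It checks the standard phenomenon that ¬¬(∃x)P(x) can be forced at a world that does not itself force (∃x)P(x).

```python
# Forcing over a finite Kripke structure, following clauses (1)-(7) above.
# Worlds are indexed 0..n-1; le[i] lists the worlds j with M_i <= M_j
# (reflexive and transitive "reachability"). Each world carries a domain and
# a set of true ground atoms; monotonicity (domains and atoms only grow along
# le) is the caller's responsibility, as in the substructure requirement.

def forces(i, phi, dom, atoms, le):
    op = phi[0]
    if op == "atom":                 # (1) defer to classical satisfaction
        return phi[1:] in atoms[i]
    if op == "and":                  # (2)
        return forces(i, phi[1], dom, atoms, le) and forces(i, phi[2], dom, atoms, le)
    if op == "or":                   # (3)
        return forces(i, phi[1], dom, atoms, le) or forces(i, phi[2], dom, atoms, le)
    if op == "imp":                  # (4) quantify over all reachable worlds
        return all(not forces(j, phi[1], dom, atoms, le)
                   or forces(j, phi[2], dom, atoms, le) for j in le[i])
    if op == "not":                  # (5) no reachable world forces the body
        return all(not forces(j, phi[1], dom, atoms, le) for j in le[i])
    if op == "exists":               # (6) witness in the current domain
        return any(forces(i, phi[1](b), dom, atoms, le) for b in dom[i])
    if op == "forall":               # (7) all elements of all reachable worlds
        return all(forces(j, phi[1](b), dom, atoms, le)
                   for j in le[i] for b in dom[j])
    raise ValueError(op)

# Two-world chain M_0 <= M_1: the atom P(1) holds only in the later world M_1.
dom = {0: [0], 1: [0, 1]}
atoms = {0: set(), 1: {("P", 1)}}
le = {0: [0, 1], 1: [1]}

ex = ("exists", lambda b: ("atom", "P", b))
print(forces(0, ex, dom, atoms, le))                    # False: no witness yet
print(forces(1, ex, dom, atoms, le))                    # True
print(forces(0, ("not", ("not", ex)), dom, atoms, le))  # True: double negation forced
```

The last two lines exhibit a world forcing ¬¬(∃x)P(x) without forcing (∃x)P(x), which is exactly why the law of the excluded middle can fail in Kripke semantics even though, as noted above, it is forced for atomic formulas.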
Also,the law of the excluded middle for atomic formulas will be forced at every world M i because we required M i to be a substructure of M j whenever M i M j¶.In other words,both truth and falsity of atomic formulas are preserved in“reachable”worlds.Consequently,the law of the excluded middle for quantifier-free formulas is also forced at each world.Hence,ifϕis quantifier-free,then M i ϕif and only if M i ϕ.A formulaϕ( x)is valid in K,denoted K ϕ( x),if and only if for all worlds M i and all c∈|M i|,M i ϕ( c).A set of formulasΓis valid in K, K Γ,if and only if every formula inΓis valid in K.Γ ϕ,ϕis a Kripke consequence ofΓ,if and only if for every Kripke structure K,if K Γthen K ϕ.A Kripke model for IPV is one in which the axioms of IPV are valid. Likewise,a Kripke model for IPV+is one in which the axioms of IPV+are valid.The usual strong soundness and completeness theorems for intuitionistic logic state that for any set of sentencesΓand any sentenceϕ,Γ ϕif and only ifΓ⊢iϕ(see Troelstra and van Dalen[15]for a proof).Hence validity in Kripke models corresponds precisely to intuitionistic provability.A countable Kripke model is one in which there are countably many worlds each with a countable domain.The usual strong completeness theorem further states that ifΓis a countable set of formulas andΓ iψthen there is a countable Kripke structure in whichΓis valid butψis not.¶This differs from the usual definition of Kripke models for intuitionistic logic.The usual strong soundness and completeness theorems give a semantics for the theory IPV in that for any formulaϕ,IPV⊢iϕif and only if for all K,if K IPV then K ϕ.It is,however,a little difficult to interpret directly what it means for K IPV to hold;and we feel that it is more natural to consider CPV-normal Kripke structures instead:Definition A Kripke model K=({M i}i∈I, )is CPV-normal if and only if for all i∈I,the world M i is a classical model of CPV.Theorem3(Soundness of IPV and IPV+for CPV-normal Kripke models.)(a)If 
K is a CPV-normal Kripke structure then K IPV.Hence for allϕ,if IPV⊢iϕthen K ϕ.(b)If K is a CPV-normal Kripke structure then K IPV+.Hence for allϕ,if IPV+⊢iϕthen K ϕ.The converse to Theorem3(b)is proved in section5below.Proof It will clearly suffice to prove only(b)since IPV+⊇IPV.Suppose K is a CPV-normal Kripke structure.Since every world M i is a classical model of CPV and hence of PV1,it follows immediately from the definition for forcing and from the fact that P V1is axiomatized by universal formulas that K PV1.So it will suffice to show that the PIND axioms of IPV+are valid in K.Let M i be a world and consider a formulaϕ(b, c)of the form ψ( c)∨χ(b, c)whereχ(b, c)is a formula of the form(∃x≤t(b, c))(r(x,b, c)=s(x,b, c))and where b is a variable, c∈|M i|andψ( c)is an arbitrary formula not involving b.We must show thatM i ϕ(0, c)∧(∀z)(ϕ(⌊12z⌋, c)⊃ϕ(z, c))⊃(∀x)ϕ(x, c).To prove this,suppose that M i M j and thatM j ϕ(0, c)∧(∀z)(ϕ(⌊12z⌋, c)⊃ϕ(z, c));we must show M j (∀x)ϕ(x, c).If M j ψ( c)then this is clear,so suppose M j ψ( c).Note that for any b∈|M j|,M j χ(b, c)if and only if M j χ(b, c).Hence,since M j ϕ(0, c)and M j ψ( c),M j χ(0, c).And similarly,by reflexivity of ,for each b∈|M j|,if M j χ(⌊12b⌋, c)thenM j χ(b, c).In other words,M j (∀z)(χ(⌊1z⌋, c)⊃χ(z, c)).But now since M j CPV and CPV has PIND forχ(b, c),M j (∀z)χ(z, c).We have established that either M j χ(b, c)for every b∈|M j|or M j ψ( c).The same reasoning applies to any world M k M i and in particular,for any M k M j,either M k χ(b, c)for every b∈|M k|or M k ψ( c).Hence by the definition of forcing,M j (∀z)ϕ(z, c).We have shown that if K is a CPV-normal Kripke model then every axiom of IPV+is valid in K.It now follows by the usual soundness theorem for intuitionistic logic that every intuitionistic consequence of IPV+is valid in K. 
Q.E.D. Theorem 3

4 A conservation theorem

The usual Gödel-Kolmogorov "negative translations" don't seem to apply to IPV since we don't know whether the negative translations of the PIND axioms of IPV are consequences of IPV. However, the usual completeness theorem for Kripke models of IPV does allow us to prove the following substitute:

Theorem 4. Let φ be a quantifier-free formula.
(a) If ψ is a sentence of the form ¬(∃x)(∀y)¬(∀z)φ and PV1 ⊢_c ψ then IPV ⊢_i ψ.
(b) If ψ is a sentence of the form ¬(∃x_1)(∀y_1)¬¬(∃x_2)(∀y_2)¬¬···¬¬(∃x_r)(∀y_r)φ and PV1 ⊢_c ψ then IPV ⊢_i ψ.

This theorem is a statement about how strong IPV is; although IPV has stronger axioms than PV1, it uses intuitionistic logic instead of classical logic, so it makes sense to establish a conservation result for PV1 over IPV. Of course, at least some of the negation signs in ψ are required for Theorem 4 to be true; for example, PV1 proves (∀x)(∃y)(∀z)(|z| = x ⊃ |y| = x) but IPV cannot prove this since otherwise, by the polynomial time realizability of IPV-provable formulas, y would be polynomial time computable in terms of x, which is false since y must be greater than or equal to 2^(x-1).

Proof. Let's prove (a) first. Suppose IPV ⊬_i ¬(∃x)(∀y)¬(∀z)φ; we must show PV1 ⊬_c (∀x)(∃y)(∀z)φ. By the usual completeness theorem for Kripke models for IPV, there is a Kripke model K = ({M_i}_{i∈I}, ⪯) of IPV such that K does not validate ¬(∃x)(∀y)¬(∀z)φ and such that each M_i is countable. Hence there is a world, say M_0, such that M_0 ⊩ (∃x)(∀y)¬(∀z)φ. Our strategy is to find a chain of worlds M_0 ⪯ M_1 ⪯ M_2 ⪯ ··· such that their union is a model of PV1 and of (∃x)(∀y)(∃z)¬φ. First of all note that each M_i ⊨ PV1 since IPV includes the (purely universal) axioms of PV1. Hence ⋃_{i=0,1,2,...} M_i is a model of PV1, again because PV1 has universal axioms. Let x_0 ∈ |M_0| be such that M_0 ⊩ (∀y)¬(∀z)φ(x_0, y, z). It will suffice to find the M_i's so that ⋃_{i∈N} M_i ⊨ (∀y)(∃z)¬φ(x_0, y, z). Suppose we have already picked worlds M_0, ..., M_{k-1} and that y_k ∈ |M_{k-1}|; we pick M_k ⪰ M_{k-1} so that for some z_k ∈ |M_k|, M_k ⊩ ¬φ(x_0, y_k, z_k), or equivalently, M_k ⊨ ¬φ(x_0, y_k, z_k). Such an M_k and z_k must
exist since M0 (∀y)¬(∀z)ϕ(x0,y k,z k)andM0 M k−1and thus M k−1 (∀z)ϕ(x0,y k,z).Since each M i is countable,we may choose the y k’s in the right order so that y1,y2,...enumerates everyelement in the union of the M i’s.Thus for every y in the union i∈N M i there is a z such that¬ϕ(x0,y,z)holds.That gives a model of PV1in whichψis false,proving(1).The proof of(2)is similar but with more complicated bookkeeping.Let Kbe a Kripke model of IPV such thatψis not valid in K.Here if M0,...M k−1have already been chosen and if x k,1,...,x k,i−1and y k,1,...,y k,i−1are inM k−1so thatM k−1 ¬¬(∃x i)(∀y i)···¬¬(∃x r)(∀y r)ϕ(x k,1,...,x k,i−1,x i,...,x r,y k,1,...,y k,i−1,y i,...,y r) then we may pick M k M k−1and x k,i∈|M k|so thatM k (∀y i)···¬¬(∃x r)(∀y r)ϕ(x k,1,...,x k,i,x i+1,...,x r,y k,1,...,y k,i−1,y i,...,y r) By appropriately diagonalizing through the countably many choices for i and x and y we may ensure that k∈N M k is a model of P V1∪{¬ψ}.We omit the details.25A completeness theorem for IPV+We next establish the main theorem of this paper.Theorem5(Completeness Theorem for IPV+with respect to CPV-normalKripke models)Letϕbe any sentence.If IPV+ iϕthen there is a CPV-normal Kripkemodel K such that K IPV+and K ϕ.Note that the conclusion“K IPV+”is superfluous as this is already aconsequence of Theorem3.The proof of this theorem will proceed along thelines of the proof of the usual strong completeness theorem for intuitionisticlogic as exposited in section2.6of Troelstra and van Dalen[15].The new ingredient and the most difficult part in our proof is Lemma7below which is needed to ensure that the Kripke model is CPV-normal.Although we shall not prove it here,Theorem5can be strengthened to require K to be countable.Definition Let C be a set of constant symbols.A C-formula or C-sentence is a formula or sentence in the language of PV1plus constant symbols in C. 
All sets of constants are presumed to be countable.Definition A set of C-sentencesΓis C-saturated provided the following hold:(1)Γis intuitionistically consistent,(2)For all C-sentencesϕandψ,ifΓ⊢iϕ∨ψthenΓ⊢iϕorΓ⊢iψ.(3)For all C-sentences(∃x)ϕ(x),ifΓ⊢i(∃x)ϕ(x)then for some c∈C,Γ⊢iϕ(c).The next,well-known lemma shows that C-saturated sets can be readily constructed.Lemma6LetΓbe a set of sentences andϕbe a sentence such thatΓ iϕ. If C is a set of constant symbols containing all constants inΓplus countably infinitely many new constant symbols,then there is a C-saturated setΓ∗containingΓsuch thatΓ∗ iϕ.The proof of Lemma6is quite simple,merely enumerate with repetitions all C-sentences which either begin with an existential quantifier or are a disjunction and then formΓ∗by adding new sentences toΓso that(2)and(3) of the definition of C-saturated are satisfied.This can be done so thatϕis still not an intuitionistic consequence.(For a full proof,refer to lemma2.6.3 of[15].)In the proof of the usual completeness theorem for Kripke models and intuitionistic logic,the C-saturated sets of sentences constructed with Lemma6specify worlds in a canonical Kripke model.However,Lemma6 is not adequate for the proof of Theorem5and Lemma7below is needed instead.A C-saturated setΓdefines a world with domain C in which an atomic formulaϕis forced if and onlyΓ⊢iϕ.For the proof of Theorem5,we shall only consider setsΓwhich contain IPV+and hence imply the law of the excluded middle for atomic formulas;the C-saturation ofΓthus implies that for any atomic C-sentenceϕ,eitherΓ⊢iϕorΓ⊢i¬ϕ.ThusΓspecifies a classical structure MΓdefined as follows:Definition SupposeΓ⊃IPV,Γis a C-saturated set,and for all distinct c,c′∈C,Γ⊢i c=c′.Then MΓis the classical structure in the language of PV plus constant symbols in C such that the domain of MΓis C itself(so c MΓ=c)and such that for every atomic C-sentenceϕ,MΓ ϕif and only ifΓ⊢iϕ.It is straightforward to check that MΓis a classical structure:the only thing 
to check is that the equality axioms hold (it suffices to do this for atomic formulas). Note the equality relation =_{MΓ} in MΓ is true equality, in that MΓ ⊨ c = c′ if and only if c = c′, because of the restriction that Γ ⊢_i c ≠ c′ if c and c′ are distinct. This restriction is not very onerous as we will be able to make it hold by eliminating duplicate constant symbols.

In order to prove Theorem 5 we must construct sets Γ so that the structures MΓ are classical models of CPV; Lemma 7 is the crucial tool for this:

Lemma 7. Suppose Γ is a set of C-sentences, φ is a C-sentence and Γ ⊇ IPV+. Further suppose Γ ⊬_i φ and Γ ⊢_i c ≠ c′ for distinct c, c′ ∈ C. Then there is a set Γ* of sentences and a set C* of constants such that
(a) Γ* ⊃ Γ,
(b) Γ* is C*-saturated,
(c) Γ* ⊬_i φ,
(d) Γ* ⊢_i c ≠ c′ for all distinct c, c′ ∈ C*,
(e) MΓ* ⊨ CPV.

Proof. Γ* and MΓ* are constructed by a technique similar to Henkin's proof of Gödel's completeness theorem. We pick C+ to be C plus countably infinitely many new constant symbols and enumerate the C+-formulas as α_1, α_2, α_3, ..., with each C+-formula appearing infinitely many times in the enumeration. We shall form classically consistent sets of sentences Π_0, Π_1, Π_2, ... so that Π_0 ⊇ CPV and so that, for all k, Π_k ⊇ Π_{k-1} and either α_k ∈ Π_k or ¬α_k ∈ Π_k. Furthermore, if α_k = (∃x)β(x) and α_k ∈ Π_{k-1} then for some constant symbol c, Π_k ⊢_c β(c). Thus, as usual in a Henkin-style model construction, the union of the Π_k's will specify a classical model M of CPV with domain formed of equivalence classes of constants in C+. This M will become MΓ* after elimination of duplicate constant names. If we did not adopt this restriction, then the domain of MΓ would have to be equivalence classes of constants in C instead of just the set C. But this would cause some inconveniences later on in the definition of the canonical CPV-normal Kripke structure.

While defining the sets Π_k we also define sets Π′_k, Γ_k, C_k and C′_k so that Π_{k-1} ⊆ Π′_k ⊆ Π_k and Γ = Γ_0 ⊆ Γ_1 ⊆ Γ_2 ⊆ ··· and such that C_0 is C, C_k ⊇ C′_k ⊇ C_{k-1}, and C+ = ⋃_k C_k. Γ* will be the union of the Γ_i's after elimination of duplicate constant names.

Definition. Let D be a set
of constants andΛbe a set of D-sentences.Then T h+ϕ[Λ,D]is the set{θ:θis a positive D-sentence andΛ⊢iθϕ}For us,the formulaϕisfixed,so we also denote this set by T h+[Λ,D].If ∆is a classical theory then the[Λ,D]-closure of∆is the classical theory axiomatized by∆∪T h+[Λ,D].Definition We defineΓ0to beΓ,C0to be C andΠ0to be the[Γ,C]-closure of CPV.For k>0,Πk,Π′k,Γk,C k and C′k are inductively defined by: (1)Supposeαk∈Πk andαk is of the form(∃x)βk(x).Then C′k is C k−1plus an additional new constant symbol c∈C+\C k−1.AndΠ′k is the [Γk−1,C′k]-closure ofΠk−1∪{βk(c)}.(2)If Case(1)does not apply then C′k is C k−1plus the constant symbolsinαk and:(a)LetΠ′k beΠk−1∪{αk}∪T h+[Γk−1,C′k]if this theory is classicallyconsistent,(b)Otherwise,letΠ′k beΠk−1∪{¬αk}∪T h+[Γk−1,C′k](3)Ifαk is of the form(∃x)βk(x)andΓk−1⊢iαk then C k is C′k∪{d}whered is a new constant symbol from C+\C′k,Γk isΓk−1∪{βk(d)}andΠkis the[Γk,C k]-closure ofΠ′k.(4)Ifαk is of the formβk∨γk andΓk−1⊢iαk then C k is C′k and:(a)If the[Γk−1∪{βk},C k]-closure ofΠ′k is classically consistent thenΠk defined to be equal to this theory andΓk isΓk−1∪{βk}.(b)Otherwise,Γk isΓk−1∪{γk}andΠk is the[Γk,C k]-closure ofΠ′k. 
DefineΠω= kΠk andΓk= kΓk.Note C+= k C k.The point of cases(1)and(2)above is to makeΠωa complete theory with witnesses for existential consequences.The point of cases(3)and(4)is to forceΓωto be C+-saturated.The requirement thatΠk contain T h+[Γk,C k] andΠ′k contain T h+[Γk−1,C′k]serves to maintain the condition thatΓk iϕ.Claim:For k=0,1,2,(1)Πk is classically consistent for all k.(2)Γk iϕ(soΓk is intuitionistically consistent).Note that ifΓk⊢iϕ,thenΓk⊢i(0=1)ϕand henceΠk⊢c0=1and Πk is inconsistent.So to prove the claim,it suffices to showΠk is consistent which we do by induction on k.The base case is k=0.Suppose for a contradiction thatΠ0is inconsistent.Then CPV⊢c¬θ1∨¬θ2∨···∨¬θs for positive C-sentencesθj such thatΓ⊢iθϕj.By taking the conjunction of theθj’s there is a single positive C-sentenceθsuch that CPV⊢c¬θand Γ⊢iθϕ.But,by Theorem2,IPV+⊢iθϕ⊃ϕand thus,sinceΓ⊇IPV+,Γ⊢iϕ;which is a contradiction.For the induction step,wefirst assumeΠk−1is consistent and show that Π′k is consistent.Referring to Case(1)of the definition ofΠ′k,suppose αk=(∃x)βk(x)and thatΠ′k is inconsistent.This means that there is a positive C′k-sentenceθ(c)such thatΠk−1⊢cβk(c)⊃¬θ(c)andΓk−1⊢iθ(c)ϕ. 
Then, since c was a new constant symbol, Π_{k-1} ⊢_c (∃x)β_k(x) ⊃ (∃x)¬θ(x) and so Π_{k-1} ⊢_c ¬(∀x)θ(x); also, Γ_{k-1} ⊢_i [(∀x)θ(x)]^φ. But (∀x)θ(x) is a positive C_{k-1}-sentence and Π_{k-1} contains Th+[Γ_{k-1}, C_{k-1}], so Π_{k-1} contains (∀x)θ(x), which contradicts our assumption that Π_{k-1} is consistent. Now suppose Case (2) of the definition applies. Let α_k = α_k(e), where e denotes all the constant symbols in α_k that are not in C_{k-1} (so C′_k = C_{k-1} ∪ {e}). Let Π^a_k and Π^b_k be the [Γ_{k-1}, C′_k]-closures of Π_{k-1} ∪ {α_k} and Π_{k-1} ∪ {¬α_k}, respectively. We need to show that at least one of these theories is classically consistent, so suppose that both are inconsistent. Then there are positive C′_k-sentences θ_a(e) and θ_b(e) such that Γ_{k-1} ⊢_i θ_a(e)^φ, Γ_{k-1} ⊢_i θ_b(e)^φ, Π_{k-1} ⊢_c α_k(e) ⊃ ¬θ_a(e) and Π_{k-1} ⊢_c ¬α_k(e) ⊃ ¬θ_b(e). Then Π_{k-1} ⊢_c ¬(∃x)(θ_a(x) ∧ θ_b(x)) and Γ_{k-1} ⊢_i [(∀x)(θ_a(x) ∧ θ_b(x))]^φ. Since Π_{k-1} contains Th+[Γ_{k-1}, C_{k-1}], (∀x)(θ_a(x) ∧ θ_b(x)) is in Π_{k-1}, contradicting the consistency of Π_{k-1}.

To finish the induction step and prove the claim, we assume Π′_k is consistent and show that Π_k is consistent. First suppose Case (3) of the definition of Γ_k and Π_k applies and that Π_k is inconsistent. Then there is a positive C_k-sentence θ(d) such that Π′_k ⊢_c ¬θ(d) and Γ_{k-1} ⊢_i β_k(d) ⊃ θ(d)^φ. Since d is a new constant symbol, Π′_k ⊢_c (∀x)¬θ(x) and likewise, since Γ_{k-1} ⊢_i (∃x)β_k(x), Γ_{k-1} ⊢_i [(∃x)θ(x)]^φ. Hence (∃x)θ(x) is in Π′_k, which contradicts the consistency of Π′_k. Second, suppose Case (4) of the definition applies. Let Π^c_k and Π^d_k be the [Γ_{k-1} ∪ {β_k}, C_k]-closure and the [Γ_{k-1} ∪ {γ_k}, C_k]-closure of Π′_k, respectively. Suppose, for the sake of a contradiction, that both Π^c_k and Π^d_k are inconsistent. Then there are positive C_k-sentences θ_c and θ_d such that Γ_{k-1} ⊢_i β_k ⊃ θ_c^φ and Γ_{k-1} ⊢_i γ_k ⊃ θ_d^φ and such that Π′_k ⊢_c ¬θ_c
A Game Theoretic Formulation for Intrusion Detection in Mobile Ad Hoc Networks*

Animesh Patcha and Jung-Min Park
Bradley Department of Electrical and Computer Engineering
Virginia Polytechnic Institute and State University
Blacksburg, VA 24061

* A preliminary version of portions of this material was presented at the Fifth Annual IEEE Information Assurance Workshop, United States Military Academy, West Point, New York, June 2004.

Abstract

Nodes in a mobile ad hoc network need to thwart various attacks and malicious activities. This is especially true for the ad hoc environment, where there is a total lack of centralized or third-party authentication and security architectures. This paper presents a game-theoretic model to analyze intrusion detection in mobile ad hoc networks. We use game theory to model the interactions between the nodes of an ad hoc network. We view the interaction between an attacker and an individual node as a two-player non-cooperative game, and construct models for such a game.

Keywords: Intrusion Detection, Mobile Ad hoc Networks, Game Theory

1 Introduction

In the past couple of years, considerable interest has developed in creating new kinds of network applications that fully exploit distributed mobile computing, particularly for military and defence purposes. The key underlying technology for such applications is the mobile ad hoc network (MANET). MANETs, as the name suggests, have no supporting infrastructure. They are autonomous distributed systems comprised of a number of mobile nodes connected by wireless links, forming arbitrary time-varying wireless network topologies. Mobile nodes function both as hosts and routers. As hosts, they represent source and destination nodes in the network, while as routers, they represent intermediate nodes between a source and a destination, providing store-and-forward services to neighboring nodes. Store-and-forward services are needed due to the limited range of each individual mobile host's wireless transmission. Nodes that constitute the
wireless network infrastructure are free to move randomly and organize themselves arbitrarily.Applications such as military exercises and disaster relief will benefit from ad hoc networking,but secure and reliable communication is a necessary prerequisite for such applications.Flexibility and adaptability,which are the strengths of MANET,are unfortunately accompanied by increased security risks.Security in the MANET environment is particularly difficult to achieve,notably because of the limited physical protection to each of the nodes,the sporadic nature of connectivity,the absence of a certification authority,and the lack of a centralized monitoring or management unit.Intrusion prevention is not guaranteed to work all the time,and this clearly underscores theneed for intrusion detection as a front–line security research area under the umbrella of ad hoc network security.In traditional wireless networks,mobile devices associate themselves with an access point which is in turn connected to other wired machines such as a gateway or a name server which handle network management functions. 
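The attacker-versus-node interaction that this paper formulates as a two-player non-cooperative game can be illustrated with a toy normal-form example. The strategy names and payoff numbers below are invented purely for illustration (the paper's actual model is developed in its Section 4); the sketch just shows the basic solution concept, a search for pure-strategy Nash equilibria.

```python
# Toy attacker-vs-node game in normal form (illustrative numbers only).
# Row player: attacker (Attack / Wait); column player: the node's IDS
# (Monitor / Idle). Payoffs are (attacker, defender): monitoring has a cost,
# a missed attack costs the defender far more.
from itertools import product

A = {("Attack", "Monitor"): (-2,  1),   # attack detected and punished
     ("Attack", "Idle"):    ( 3, -4),   # attack succeeds unnoticed
     ("Wait",   "Monitor"): ( 0, -1),   # monitoring overhead wasted
     ("Wait",   "Idle"):    ( 0,  0)}   # nothing happens

rows, cols = ("Attack", "Wait"), ("Monitor", "Idle")

def nash_equilibria():
    """Pure-strategy Nash equilibria: profiles where neither player gains
    by deviating unilaterally."""
    eqs = []
    for r, c in product(rows, cols):
        best_r = all(A[(r, c)][0] >= A[(r2, c)][0] for r2 in rows)
        best_c = all(A[(r, c)][1] >= A[(r, c2)][1] for c2 in cols)
        if best_r and best_c:
            eqs.append((r, c))
    return eqs

print(nash_equilibria())  # [] : no pure-strategy equilibrium
```

With these numbers no pure-strategy equilibrium exists: each profile gives one player a profitable deviation. That is the typical situation in such inspection games, and it is why models of this kind are solved in mixed strategies, with the attacker randomizing over attacking and the node randomizing over monitoring.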
Ad hoc networks, on the other hand, do not use such access points and form a completely distributed architecture. The absence of an infrastructure, and consequently the absence of authorization facilities, impedes the usual practice of establishing a line of defense by distinguishing nodes as trusted or non-trusted. There may be no ground for an a priori classification, since all nodes are required to cooperate in supporting the network operation and no prior security association (SA) can be assumed for all the network nodes. Freely roaming nodes form transient associations with their neighbors: they join and leave sub-domains independently, with and without notice.

In MANETs, compromised nodes may cause potential Byzantine failures in the routing protocols. In a Byzantine failure, a set of the nodes could be compromised in such a way that incorrect and malicious behavior cannot be detected. Malicious nodes can inflict a Byzantine failure on a system by creating new routing messages, advertising non-existent links or providing incorrect link state information. It can therefore be seen that intrusion prevention measures, like a firewall, are not enough in MANETs.

Intrusion detection techniques are widely used in wired networks to protect networked systems. Intrusion detection techniques geared towards wired networks cannot, however, be applied directly to MANETs. This is especially because of the latter's lack of a fixed infrastructure, mobility, the vulnerability of wireless transmissions to eavesdropping, and the lack of a clear separation between normal and abnormal behavior of the nodes. In addition, the ad hoc networking paradigm does not allow for the presence of traffic concentration points in the network, whereas most conventional intrusion detection systems (IDS) geared towards wired networks depend on such an architecture.

In this paper, we concentrate on providing a mathematical framework for intrusion detection in mobile ad hoc networks. We describe how game theory can be used to find
strategies for both the malicious node and the administrator of the target node.The organization of the paper is as follows.In Section2we briefly describe the related work.Section2is followed by Section3,where we give a brief introduction to the concepts of game theory and introduce the formal model for non-cooperative games.We also relate the elements in the model to the problem at hand.In Section4, we present our model for intrusion detection in a MANET.We conclude the paper in Section5.2Related WorkIDS are classified as anomaly detection systems or misuse detection systems.The research in IDS’s began with a report by Anderson[1]followed by a seminal paper by Denning[2].Since then various models for intrusion detection have been proposed for wired networks.All existing approaches take into consideration domain specific knowledge to build suitable detection systems.Research in IDSs for wireless networks,especially MANETs,is an emerging areathat has a relatively short history.Marti et al.[3]introduced the concept of Watchdog and Pathrater to snoop promiscuously in the neighborhood of a given wireless node to identify routing misbehavior.Buchegger and Boudec[4]extended the work of Marti et al.by replacing the Watchdog with a Neighborhood Watch paradigm.In this paradigm,a node monitors on its downstream neighbor.They also introduce a Trust Manager,a Reputation System and a Path Manager.The basic premise in their system is that each node runs afinite state machine to calculate the“trust”it has in its neighbor,which in turn is used to rank the other node’s reputation.The path with the highest security metric is always chosen,and nodes with low reputation values are ignored and/or isolated from the system.Zhang and Lee[5]put forth the basic requirements for an IDS in the MANET environment.They also proposed a general intrusion detection and response mech-anism for MANETs,in which each IDS agent participates in the intrusion detection and response tasks 
independently. Huang et al. [6] extended the work done by Zhang and Lee. They use cross-feature analysis to analyze the routing activities and improve the anomaly detection process by providing more details about the attack types and attack sources. Sun et al. [7] proposed a Markov chain-based anomaly detection approach for MANETs.

The use of mobile agents in the context of IDSs has also been proposed in the last couple of years. Kachirski and Guha [8] have proposed a distributed IDS for the MANET environment. By efficiently merging audit data from multiple network sensors, their bandwidth-conscious scheme analyzes the entire ad hoc wireless network for intrusions at multiple levels, thwarts intrusion attempts, and provides a lightweight, low-overhead mechanism based on the mobile agent concept. Two other notable research projects in the application area of mobile agents for intrusion detection are the LIDS project [9] and the SPARTA project [10]. For further information on intrusion detection for MANETs, the reader is directed to the survey article in [11].

Game theory has been used extensively in computer and communication networks to model a variety of problems. The relevant body of work includes the work of Shenker [12] for modeling service disciplines, the work of Akella et al. [13] for TCP performance, and the work of Başar et al. [14] for modeling power control in a multi-cell wireless network. Bencsáth et al. [15] applied game theory and client puzzles to devise a defense against denial of service (DoS) attacks. In the area of MANETs, Michiardi et al. [16] used cooperative and non-cooperative game theoretic constructs to develop a reputation-based architecture for enforcing cooperation.

Modeling intrusion detection using game theory, however, is a relatively new approach. Kodialam et al. [17] used a game theoretic framework to model intrusion detection via sampling in communication networks and developed sampling schemes that are optimal in the game theoretic setting. Our work is more closely related
to the model proposed by Alpcan et al. [18]. We have extended the model proposed in [18] to include MANETs, and have analyzed the interaction between an attacker and a host-based IDS as a dynamic two-player non-cooperative game.

3 Game Theory

Game theory is a branch of applied mathematics that uses models to study interactions with formalized incentive structures ("games"). It has applications in a variety of fields, including economics, international relations, evolutionary biology, political science, and military strategy. Game theory provides us with tools to study situations of conflict and cooperation. Such a situation exists when two or more decision makers who have different objectives act on the same system or share the same set of resources. Game theory is therefore concerned with finding the best actions for individual decision makers in such situations and recognizing stable outcomes. Some of the assumptions that one makes while formulating a game are:

1. There are at least two players in a game, and each player has, available to him/her, two or more well-specified choices or sequences of choices.

2. Every possible combination of plays available to the players leads to a well-defined end-state (win, loss, or draw) that terminates the game.

3. Associated with each possible outcome of the game is a collection of numerical payoffs, one to each player. These payoffs represent the value of the outcome to the different players.

4. All decision makers are rational; that is, each player, given two alternatives, will select the one that yields the greater payoff.

Game theory has been traditionally divided into cooperative game theory and non-cooperative game theory. The two branches of game theory differ in how they formalize interdependence among the players. In non-cooperative game theory, a game is a detailed model of all the moves available to the players. In contrast, cooperative game theory abstracts away from this level of detail and describes only the outcomes that result when the players come together
in different combinations. In this paper, we consider non-cooperative games.

3.1 Non-Cooperative Game Theory

Non-cooperative game theory studies situations in which a number of nodes/players are involved in an interactive process whose outcome is determined by the nodes' individual decisions and, in turn, affects the well-being of each node in a possibly different way. Non-cooperative games can be classified into a few categories based on several criteria.

Non-cooperative games can be classified as static or dynamic based on whether the moves made by the players are simultaneous or not. In a static game, players make their strategy choices simultaneously, without knowledge of what the other players are choosing. Static games are generally represented diagrammatically using a game table, which is called the normal form or strategic form of a game. In contrast, in a dynamic game there is a strict order of play. Players take turns to make their moves, and they know the moves played by players who have gone before them. Game trees are used to depict dynamic games. This representation is generally referred to as the extensive form of a game. A game tree illustrates all of the possible actions that can be taken by all of the players. It also indicates all of the possible outcomes at each step of the game.

Non-cooperative games can also be classified as complete information games or incomplete information games, based on whether the players have complete or incomplete information about their adversaries in the game. Here, information denotes the payoff-relevant characteristics of the adversaries. In a complete information game, each player has complete knowledge about his/her adversary's characteristics, strategy spaces, payoff functions, and so on. For further details on game theory, the reader is directed to [19, 20].

In this paper, we model the interaction between an attacker and an intrusion detection system as a basic signaling game, which falls under the gamut of multi-stage dynamic non-cooperative
games with incomplete information. As mentioned above, in a non-cooperative game with incomplete information we model situations in which some players have some private information before the beginning of the game. This initial private information is called the type of a player, and it fully describes any information the player has that is not common knowledge. A player may have several types, one for each possible state of his/her private information. It is also assumed that each player knows his/her own type with complete certainty.

3.2 Basic Signaling Game

A basic signaling game, in its simplest form, has two players: Player 1, who is the sender, and Player 2, who is the receiver. For the sake of convenience, we treat Player 1 as masculine and Player 2 as feminine. Nature¹ draws the type of the sender from a type set Θ, whose typical element is θ. The type information is private to each sender. Player 1 observes information about his type θ and chooses an action a1 from his action space A1. Player 2, whose type is known to everyone, observes a1 and chooses an action a2 from her action space A2. Player 2 has prior beliefs, before the start of the game, about Player 1's type. In other words, before observing the sender's message, the receiver believes that the probability that the sender is some type θ ∈ Θ is p(θ). The spaces of mixed actions are A1 and A2, with elements α1 and α2 respectively. Player i's payoff is denoted by ui(α1, α2, θ). Player 1's strategy is a probability distribution σ1(·|θ) over actions a1 for each type θ. A strategy for Player 2 is a probability distribution σ2(·|a1) over actions a2 for each action a1.

¹ We often want to include in our model some extrinsic uncertainty, that is, some random event not under the control of the players. We indicate this by allowing nodes to be owned by an artificial player that we call "Nature" and sometimes index as Player 0. Nature's moves are not labeled in the same way as the moves of the strategic players; rather, we associate probabilities with each of Nature's moves.

After both the
players have taken their actions, the payoffs are awarded according to the message sent by the sender, the action taken by the receiver in response, and the type θ of the sender chosen by Nature. Type θ's payoff to strategy σ1(·|θ) when Player 2 plays σ2(·|a1) is

    u1(σ1, σ2, θ) = Σ_{a1} Σ_{a2} σ1(a1|θ) σ2(a2|a1) u1(a1, a2, θ).   (1)

Player 2's payoff to strategy σ2(·|a1) when Player 1 plays σ1(·|θ) is

    Σ_θ p(θ) ( Σ_{a1} Σ_{a2} σ1(a1|θ) σ2(a2|a1) u2(a1, a2, θ) ).   (2)

Player 2 updates her beliefs about θ and bases her choice of action a2 on the posterior distribution² μ(·|a1) over Θ. Bayesian equilibrium dictates that Player 1's action will depend on his type. Therefore, if σ1*(·|θ) denotes this strategy, then knowing σ1*(·|θ) and by observing a1, Player 2 can use Bayes' rule to update p(·) and μ(·|a1). Fudenberg and Tirole [19] state that the natural extension of the subgame-perfect equilibrium³ is the perfect Bayesian equilibrium, which requires Player 2 to maximize her payoff conditional on a1 for each a1.

Definition: A perfect Bayesian equilibrium (PBE) of a signaling game is a strategy profile σ* and posterior beliefs μ(·|a1) such that

    (P1) ∀θ, σ1*(·|θ) ∈ arg max_{α1} u1(α1, σ2*, θ),   (3)

    (P2) ∀a1, σ2*(·|a1) ∈ arg max_{α2} Σ_θ μ(θ|a1) u2(a1, α2, θ),   (4)

and

    (B) μ(θ|a1) = p(θ) σ1*(a1|θ) / Σ_{θ′∈Θ} p(θ′) σ1*(a1|θ′)   (5)

if Σ_{θ′∈Θ} p(θ′) σ1*(a1|θ′) > 0, and μ(·|a1) is any probability distribution on Θ if Σ_{θ′∈Θ} p(θ′) σ1*(a1|θ′) = 0, where P1 and P2 are the perfection conditions and B corresponds to the application of Bayes' rule. P1 says that Player 1 takes into account the effect of a1 on Player 2's action. P2 states that Player 2 reacts optimally to Player 1's action given her posterior beliefs about θ. In other words, a perfect Bayesian equilibrium must satisfy the subgame perfection criterion, and in addition the model must satisfy the following Bayesian postulates:

• For each information set, the players must have beliefs about the stage the game has reached.

• Whenever it is a player's turn to move, his/her actions must be optimal from that point onwards, given his/her beliefs.

• The players' beliefs about neighboring nodes must be determined using Bayes' rule.

Thus, a perfect Bayesian equilibrium can be thought of as a set of strategies and beliefs such that at any stage of the game, strategies are optimal given the beliefs.

² In Bayesian inference, when we have performed an experiment we use Bayes' theorem to find a new distribution which reflects the result of the experiment. This new distribution is called the posterior distribution.

³ In extensive-form games with complete information, many strategy profiles that form best responses to one another imply incredible threats or promises that a player does not actually want to carry out once he must face an (unexpected) off-equilibrium move by an opponent. If the profile of strategies is such that no player wants to amend his strategy at whatever decision node can be reached during the play of the game, the equilibrium profile of strategies is called subgame perfect. In this sense, a subgame-perfect strategy profile is "time consistent" in that it remains an equilibrium in whatever truncation of the original game (subgame) the players may find themselves.
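To make the Bayes-rule belief update in condition (B) concrete, the following Python sketch computes the posterior μ(·|a1) for a two-type sender. This is our own illustration rather than part of the paper's model: the type labels, the prior p(θ), and the sender strategy σ1(a1|θ) are arbitrary example values, and the uniform distribution returned for a message of zero prior probability is just one admissible off-equilibrium-path belief.

```python
# Minimal numerical sketch of condition (B): the receiver's posterior
# belief about the sender's type after observing action a1.
# All numbers below are made-up illustrative values (NOT from the paper).

THETA = ["theta_good", "theta_bad"]        # type set Θ (hypothetical labels)
ACTIONS = ["a", "b"]                       # sender action space A1

p = {"theta_good": 0.7, "theta_bad": 0.3}  # receiver's prior p(θ)

# Sender strategy σ1(a1|θ): probability of each action for each type.
sigma1 = {
    "theta_good": {"a": 0.9, "b": 0.1},
    "theta_bad":  {"a": 0.2, "b": 0.8},
}

def posterior(a1):
    """Return μ(·|a1) via Bayes' rule, as in condition (B).

    If a1 is sent with zero prior probability, any distribution over Θ
    is admissible; here we arbitrarily pick the uniform one.
    """
    denom = sum(p[t] * sigma1[t][a1] for t in THETA)
    if denom == 0:
        return {t: 1.0 / len(THETA) for t in THETA}
    return {t: p[t] * sigma1[t][a1] / denom for t in THETA}

# After observing "a":
# μ(theta_good|a) = 0.7*0.9 / (0.7*0.9 + 0.3*0.2) = 0.63/0.69 ≈ 0.913
mu = posterior("a")
```

Because Bayes' rule composes, the posterior from one observation can serve as the prior for the next; this is the recursive updating that the intrusion detection model of Section 4 relies on.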
These beliefs are obtained from the equilibrium strategies and the observed actions using Bayes' rule.

We believe that intrusion detection in MANETs can be modeled as a basic signaling game for a number of reasons. First and foremost, in a MANET environment it is very hard to distinguish a friend from a foe in the absence of security mechanisms such as a public key infrastructure (PKI), digital certificates, etc. Therefore, the type of a particular node is not easily verifiable by other nodes in the system. Secondly, an IDS responds to an intrusion after the intrusion has occurred. Therefore, we believe that modeling intrusion detection in a game theoretic framework based on dynamic non-cooperative games is the right direction to take.

4 A Game Theoretic Model of Intrusion Detection

The very nature of MANETs dictates that any IDS designed for such a network has to be distributed in nature. Centralized solutions that have a single point of failure cannot be used. Assuming a host-based IDS, we model an intrusion detection game played between a host and an intruder. In this section, we present our game theoretic framework to analyze and model the response of an IDS. Examples of IDS response actions include setting off an alarm, watching suspicious activity before setting off an alarm, and a total system reconfiguration.

We model the interaction between an attacker and a host-based IDS as a two-player signaling game, which falls under the gamut of multi-stage dynamic non-cooperative games with incomplete information. In the intrusion detection game, the objective of the attacker is to send a malicious message from some attack node with the intention of attacking the target node. The intrusion is deemed successful when the malicious message reaches the target machine without being detected by the host IDS. We assume that an intrusion is detected and the intruding node is blocked when a message sent by a probable intruder is intercepted and the host IDS can say with certainty that the message is
malicious in nature.

For an IDS, the basic performance criterion is the rate of false alarms in the system. There exists a tradeoff between the reduction of false alarms and the reduction of undetected intrusions: decreasing the system sensitivity to reduce the number of false alarms results in an increase in undetected intrusions. Either extreme is undesirable, as the IDS becomes totally ineffective in such circumstances. In our system model, we consider the cost associated with an undetected intrusion to be much more severe than the cost associated with false alarms. To simplify our analysis, we assume that a malicious node attacks only one node at a time and that collusion between malicious nodes does not occur. In addition, we do not consider selfish node⁴ activity. The IDS does one of two things: it either sets off an alarm on detection of an intrusion or does nothing.

⁴ Selfish nodes are nodes that use the network resources but act selfishly in order to save system resources, like battery life, for their own needs. They do not intend to directly damage other nodes.

4.1 System Model

In our model of the signaling game, a node is the sender, and the host-based IDS is the receiver to which the message is directed. The sender's private information is his nature. In other words, the sender node could be of two types: he could be a regular node or he could be a malicious node/attacker. The type space of a given sender is, therefore, given by Θ = {Attacker, RegularNode}. The IDS's prior belief concerning the probability that any other node in the system is either an attacker or a regular node can be described by a single number q ∈ [0, 1].

Figure 1: An attacker-IDS basic signaling game

The malicious node's (attacker's) decision is a choice between exhibiting malicious behavior and exhibiting normal behavior. Let the probability of a particular malicious node exhibiting malicious activity be s, and the probability of the same node exhibiting normal behavior be 1 − s. The particular choice that the attacker makes is
his "message". The IDS "detects" this decision with probability t and misses it with probability 1 − t, depending on its beliefs.

Consider the attacker-IDS game shown in Figure 1. Note that the sender has two information sets, corresponding to his two types (viz. Attacker and Regular Node). The receiver also has two information sets, but these correspond to the sender's two possible messages (viz. defend and miss) rather than to the sender's possible types. The IDS has a gain of −γ_defend for detecting an attack, whereas there is a cost involved whenever the IDS misses an attack (γ_miss) or raises a false alarm (γ_falarm). On the other hand, the intruder has a gain of −δ_intrude on a successful undetected intrusion and a cost of δ_caught on being detected and blocked. False alarms have a zero cost value to the attacker. In this paper, we assume that the payoffs for the IDS and the node in the case of an active attack differ from the payoffs awarded in the case of a passive attack. To illustrate this point, the payoffs awarded in the latter case are shown by γ and δ in Figure 1.

For the attacker, in all possible cases, the expected payoff is

    s[t δ_caught − (1 − t) δ_intrude].   (6)

Similarly, for the IDS, in all possible cases, the expected payoff is

    s γ_miss + t γ_falarm − s t (γ_defend + γ_falarm + γ_miss).   (7)

A rational IDS will always try to optimize (7). If the cost of a false alarm (γ_falarm) is relatively low, then the IDS will always choose to sound an alarm. The Nash equilibrium⁵ for such a signaling game is described by the following conditions.

⁵ A profile of strategies such that, given that the other players conform to the (hypothesized) equilibrium strategies, no player has an incentive to unilaterally deviate from his (hypothesized) equilibrium strategy. The self-reference in this definition can be made more explicit by saying that a Nash equilibrium is a profile of strategies that form "best responses" to one another, or a profile of strategies which are "optimal reactions" to "optimal reactions". Nash equilibrium is the pure form of the basic concept of strategic equilibrium; as such, it is useful mainly in normal form games with complete information. When allowing for randomized strategies, at least one Nash equilibrium exists in any game (unless the players' payoff functions are irregular); for an example, see the game of matching pennies in the entry on game theory. Typically, a game possesses several Nash equilibria, and the number of these is odd.

For the attacker/regular node: Given the strategy of the IDS, each type θ of a node evaluates the utility from sending a message a1 as

    Σ_{a2} σ2(a2|a1) u1(a1, a2, θ)

and puts positive weight on a1 only if it is amongst the maximizing messages for this expected utility.

For the IDS: The IDS proceeds in two steps. First, for every message a1 that is sent with positive probability by some type θ, the IDS uses Bayes' rule to compute the posterior probability assessment that a1 comes from each type θ. According to the Nash equilibrium condition, for all a1 that are sent by some type θ with positive probability, every response a2 in the support of the IDS's response should be a best response to a1, given the beliefs computed using Bayes' rule. Therefore, we can say that the IDS strategy will be a best response to the sending node's behavior strategy if and only if it maximizes its expected utility over all possible pure strategies. The strategy of the IDS will, therefore, be to pick the optimal strategy

    ∀a1, σ2*(·|a1) ∈ arg max_{α2} Σ_θ μ(θ|a1) u2(a1, α2, θ),   (8)

out of its available set in response to a message a1 from the sending node. The choice of strategy must be based on the receiver's prior beliefs, such that it is able to maximize the effective payoff by minimizing the cost due to false alarms and missed attacks. Bayes' theorem, being recursive in nature, allows each node to periodically update its posterior beliefs about other nodes from its previous posterior distribution, based on independent observations. Intuitively, we can see that with time the false alarm rates will decrease. When applied in tandem with other
approaches like likelihood evaluation and active intruder profiling, the false alarm rates can be further reduced. A full explanation of these methods is beyond the scope of this paper.

The game theoretic investigation presented in this paper gives us valuable insight into the behavior of the attacker and the IDS. We believe that most of the simplifying assumptions made in this paper can be modified to incorporate more realistic scenarios.

5 Conclusions and Research Issues

Ad hoc network security has come into the limelight of network security research over the past couple of years. However, little has been done in terms of defining the security requirements specific to MANETs. Such security requirements must include countermeasures against node misbehavior and denial of service attacks. In this paper, we used the concept of a multi-stage dynamic non-cooperative game with incomplete information to model intrusion detection in a network that uses a host-based IDS. As long as the beliefs are consistent with the information obtained and the actions are optimal given the beliefs, the model is theoretically consistent. We believe that this game-theoretic modeling technique models intrusion detection in a more realistic way compared to previous approaches. As part of our future work, we intend to extend our game theoretic approach to take into account selfish nodes and groups of colluding attackers.

References

[1] J. P. Anderson, "Computer security threat monitoring and surveillance," technical report, James P. Anderson Co., Fort Washington, PA, 1980.

[2] D. E. Denning, "An intrusion detection model," IEEE Transactions on Software Engineering, vol. 13, pp. 222–232, IEEE Press, Piscataway, NJ, USA, 1986.