Graduate Course: Game Theory, English Slides 3
Graduate Course: Game Theory, English Slides 4
Figure: Maximin strategies of player 1. Horizontal axis: p; vertical axis: u1. The lines u1(p, T) = 1 − 2p and u1(p, H) = 2p − 1 intersect at p*, and m1(p) = min{u1(p, T), u1(p, H)}.
The thick (kinked) line represents m1 (p), i.e. the worst outcome for player 1 for each value of p ∈ [0, 1]. In order to maximize m1 (p), we have to compute the intersection of u1 (p, H) and u1 (p, T ):
u1(H, q) and u1(T, q)
\[ u_1(H, q) = q\,u_1(H, H) + (1 - q)\,u_1(H, T) = 2q - 1 \]
\[ u_1(T, q) = q\,u_1(T, H) + (1 - q)\,u_1(T, T) = -2q + 1 \]
u1(p, T) and u1(p, H)
\[ u_1(p, H) = p\,u_1(H, H) + (1 - p)\,u_1(T, H) = p + (1 - p)(-1) = 2p - 1 \]
\[ u_1(p, T) = p\,u_1(H, T) + (1 - p)\,u_1(T, T) = p(-1) + (1 - p) = -2p + 1 \]
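The intersection argument can be checked numerically. The sketch below is only an illustration: it assumes the two payoff lines shown in the figure, u1(p, H) = 2p − 1 and u1(p, T) = 1 − 2p, and the function names are hypothetical rather than taken from the slides.

```python
from fractions import Fraction

def u1_vs_H(p):
    """Player 1's expected payoff when mixing (p on H) against pure H."""
    return 2 * p - 1

def u1_vs_T(p):
    """Player 1's expected payoff when mixing (p on H) against pure T."""
    return 1 - 2 * p

def m1(p):
    """Worst-case payoff of player 1 for a given p (the kinked line)."""
    return min(u1_vs_H(p), u1_vs_T(p))

# The lines intersect where 2p - 1 = 1 - 2p, i.e. 4p = 2.
p_star = Fraction(2, 4)

# Cross-check on a grid over [0, 1]: no grid point beats p_star.
grid = [Fraction(k, 100) for k in range(101)]
best_on_grid = max(grid, key=m1)

print(p_star, m1(p_star))
```

Using exact fractions avoids rounding noise at the kink, where the two lines meet.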
Complete Game Theory Slides [Zhejiang University] GAME_Cha(3)
Chapter 2 Dynamic Games of Complete and Perfect Information
In this chapter we introduce dynamic games. We again restrict attention to games with complete information (i.e., games in which the players' payoff functions are common knowledge). We analyze dynamic games that have not only complete but also perfect information, by which we mean that at each move in the game the player with the move knows the full history of the play of the game thus far.
now analyze dynamic games by representing
such games in extensive form. This expositional
approach may make it seem that static games
must be represented in normal form and
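adopt strategies that embody credible threats.
Suppose an oligopolistic firm (the incumbent) enjoys handsome profits in a market, and another firm (the entrant) wants to enter and share them. To enter the industry, the entrant must pay a (sunk) cost of 40 million yuan to build a plant. The incumbent naturally hopes the entrant will stay out. If the entrant stays out, the incumbent can keep charging a high price and enjoy monopoly profits of 100 million yuan. If the entrant enters, the incumbent can "accommodate", maintaining a high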
Game Theory: The Most Complete Explanatory PPT Slides
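Constant-sum games are also the games with the highest degree of conflict of interest. Non-constant-sum (variable-sum) games allow for win-win or multi-win outcomes.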
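Perfect information describes whether, in the course of a dynamic game, a player facing a decision or move knows the history of play so far.
If at every point in the game the player facing a decision or move knows exactly all decisions or moves previously taken by all players up to that point, the game is called a game of perfect information; otherwise it is a game of imperfect information.
Zero-sum and non-zero-sum games
understand the limits and constraints on one's own actions, then choose one's behavior in a carefully planned way, doing the best one can by one's own standards.
• Game theory gives rational behavior a new meaning from a new angle: interaction with other decision makers who are equally rational.
• Game theory is the science of rational behavior in situations of interaction.
How can one win a game?
... Can one really (always) win a game?
Your opponent is as smart as you are! Many games are quite complex, and game theory does not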
Game Theory: Complete PPT Slides
\[ \frac{a - c}{3} \]
The Nash equilibrium profits are:
\[ \Pi_1^{NE} = \Pi_2^{NE} = \frac{(a - c)^2}{9} \]
[Figure: q2 axis with marked values a − c, (a − c)/2 and (a − c)/3]
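The quantity (a − c)/3 and the profit (a − c)²/9 can be reproduced with a short check. This is a sketch under an assumption the fragment does not state: linear inverse demand P = a − q1 − q2 and a common constant marginal cost c. The numbers a = 10, c = 1 and the function names are illustrative only.

```python
from fractions import Fraction

def best_response(q_other, a, c):
    """Profit-maximizing output against q_other, assuming P = a - q1 - q2."""
    return max(Fraction(0), (a - c - q_other) / 2)

def profit(q_own, q_other, a, c):
    """Profit of a firm with constant marginal cost c."""
    return (a - q_own - q_other - c) * q_own

a, c = Fraction(10), Fraction(1)   # illustrative parameter values
q_star = (a - c) / 3

# At a Nash equilibrium each quantity is a best response to the other.
is_fixed_point = best_response(q_star, a, c) == q_star
print(q_star, profit(q_star, q_star, a, c), is_fixed_point)
```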
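Common knowledge of rationality
Order-0 common knowledge of rationality: everyone is rational but does not know whether the others are rational;
Order-1 common knowledge of rationality: everyone is rational and knows that the others are rational, but does not know whether the others know that he himself is rational;
Order-2 common knowledge of rationality: everyone is rational, knows that the others are rational, and knows that the others know that he himself is rational; but does not know whether the others know that he knows that they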
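Economics textbooks abroad have been rewritten to include a large amount of game theory.
The entry of game theory into mainstream economics reflects the following:
Economic research increasingly focuses on the individual and has abandoned some assumptions that lack micro-foundations
Economic research increasingly focuses on the mutual influence and interaction of people's behavior
Economics pays more and more attention to the study of information
The tools of traditional microeconomics are mathematical (calculus, linear algebra, statistics), while
game theory is a new kind of mathematics. Before there was only an army; now there is also an air force, and the difference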
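                         | Static                               | Dynamic
Complete information     | Nash equilibrium (Nash)              | Subgame perfect Nash equilibrium (Selten)
Incomplete information   | Bayesian Nash equilibrium (Harsanyi) | Perfect Bayesian Nash equilibrium (Selten et al.)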
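Classification of games
By whether the players cooperate: cooperative games, non-cooperative games
By the number of players: two-player games, multi-player games
By the outcome of the game: zero-sum games, constant-sum games, variable-sum games
By the order of moves: static games, dynamic games
By the players' ... of the other players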
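Order-4 rationality: C believes that R believes that C believes that R believes that C is rational; C will eliminate R1 from R's strategy space, and C will not choose C3;
Order-5 rationality: R believes that C believes that R believes that C believes that R believes that C is rational; R will eliminate C3 from C's strategy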
Game Theory, Lecture 3
[Figure: tree with nodes x1, x2, ..., x8; an edge connects nodes x1 and x5]
Game tree
A path is a sequence of distinct nodes y1, y2, y3, ..., yn-1, yn such that yi and yi+1 are adjacent, for i = 1, 2, ..., n-1. We say that this path is from y1 to yn. We can also use the sequence of edges induced by these nodes to denote the path.
Entry game
Challenger's strategies
  In
  Out
Incumbent's strategies
  Accommodate (if challenger plays In)
  Fight (if challenger plays In)
Payoffs
Normal-form representation
Dynamic Games of Complete Information
Entry game
An incumbent monopolist faces the possibility of entry by a challenger. The challenger may choose to enter or stay out. If the challenger enters, the incumbent can choose either to accommodate or to fight. The payoffs are common knowledge.
game-theory1: Game Theory, English PPT Slides
• Utility maximization - major component of a certain way of thinking, pulls together most of economic theory. More attractive and realistic alternatives failed because they did not have any interesting consequences
Players know actions taken by other players; actions known
Preview
Perfect information: static games (Nash equilibrium); dynamic games (backward induction)
Imperfect information: dynamic games (subgame perfect NE)
Incomplete information: static games (auctions); dynamic games (signaling games)
Summary
Economic models: a good enough approximation of the real world for many useful purposes
Game theory models: economic models of situations where decision makers interact
Summary II
A strategic game consists of players, a set for each player, and for each player preferences over action profiles, represented by a payoff function
Solving games: iterative elimination of strictly dominated strategies (next lecture), Nash equilibrium (next lecture), other methods (later in the course)
lecture23 (Game Theory lecture notes, Carnegie Mellon University)
June 20, 2003
73-347 Game Theory--Lecture 23
6
Cournot duopoly model of incomplete information (version one) cont'd
Firm 2's marginal cost depends on some factor (e.g. technology) that only firm 2 knows. Its marginal cost can be
Firm 1 chooses $q_1^*$, which is its best response to firm 2's $(q_2^*(c_H), q_2^*(c_L))$ (and the probability).
If firm 2's marginal cost is HIGH, then firm 2 chooses $q_2^*(c_H)$, which is its best response to firm 1's $q_1^*$.
If firm 2's marginal cost is LOW, then firm 2 chooses $q_2^*(c_L)$, which is its best response to firm 1's $q_1^*$.
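The three mutual best-response conditions can be solved numerically. The following is a minimal sketch, not the lecture's own derivation: it assumes linear inverse demand P = a − q1 − q2, a known marginal cost c1 for firm 1, an interior solution, and that theta denotes the probability of HIGH cost. All parameter values and names are hypothetical.

```python
def solve_bayesian_cournot(a, c1, c_high, c_low, theta, rounds=60):
    """Iterate best responses until they settle (assumes P = a - q1 - q2)."""
    q1 = 0.0
    for _ in range(rounds):
        # Each type of firm 2 best-responds to firm 1's quantity.
        q2_high = (a - q1 - c_high) / 2
        q2_low = (a - q1 - c_low) / 2
        # Firm 1 best-responds to firm 2's expected quantity.
        expected_q2 = theta * q2_high + (1 - theta) * q2_low
        q1 = (a - c1 - expected_q2) / 2
    # Recompute firm 2's quantities against the final q1.
    q2_high = (a - q1 - c_high) / 2
    q2_low = (a - q1 - c_low) / 2
    return q1, q2_high, q2_low

q1, q2_high, q2_low = solve_bayesian_cournot(a=10, c1=2, c_high=3, c_low=1, theta=0.5)
print(round(q1, 4), round(q2_high, 4), round(q2_low, 4))
```

The iteration converges because each round shrinks the error in q1 by a factor of four; the high-cost type produces less than the low-cost type.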
Static (or simultaneous-move) games of complete information
• A set of players (at least two players)
• For each player, a set of strategies/actions
• Payoffs received by each player for the combinations of the strategies, or for each player, preferences over the combinations of the strategies
• All these are common knowledge among all the players.
game theory3: Game Theory (English)
Strict & Weak
Best Response
Mixed Strategy NE
Summary
Example:
1 \ 2    L       R
T        2,2     1,1
B        2,2     2,3
(T, L) is a nonstrict NE; (B, R) is a strict NE.
If we eliminate T, which is weakly dominated by B, and then eliminate L, which is dominated by R, we lose the nonstrict NE {T, L}.
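The loss of the nonstrict equilibrium under elimination can be verified mechanically. The sketch below assumes the example's payoff matrix reads (T, L) = 2,2; (T, R) = 1,1; (B, L) = 2,2; (B, R) = 2,3. The helper name `pure_nash` is hypothetical.

```python
# Payoffs (u1, u2); player 1 picks the row, player 2 the column.
payoffs = {
    ("T", "L"): (2, 2), ("T", "R"): (1, 1),
    ("B", "L"): (2, 2), ("B", "R"): (2, 3),
}

def pure_nash(rows, cols):
    """Return the pure NE of the game restricted to rows x cols,
    labeled 'strict' or 'nonstrict'."""
    result = {}
    for r in rows:
        for c in cols:
            u1, u2 = payoffs[(r, c)]
            dev1 = [payoffs[(r2, c)][0] for r2 in rows if r2 != r]
            dev2 = [payoffs[(r, c2)][1] for c2 in cols if c2 != c]
            if all(u1 >= d for d in dev1) and all(u2 >= d for d in dev2):
                strict = all(u1 > d for d in dev1) and all(u2 > d for d in dev2)
                result[(r, c)] = "strict" if strict else "nonstrict"
    return result

full = pure_nash(["T", "B"], ["L", "R"])
# Remove T (weakly dominated by B), then L (dominated by R once T is gone).
reduced = pure_nash(["B"], ["R"])
print(full)
print(reduced)
```

Only (B, R) survives in the reduced game, which is why eliminating weakly dominated actions can discard equilibria.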
• Definition: player i's action ai weakly dominates her action bi if ui(ai, a−i) ≥ ui(bi, a−i) for every list a−i of the other players' actions, with strict inequality for at least one such list, where ui is a payoff function that represents player i's preferences
a*i is in Bi(a*−i) for every player i. This is why the "method of circles", i.e. looking for best responses, leads to NE.
• The elimination method is sometimes imprecise; NE (circle method, best responses) is stronger.
Game Theory, Chapter 3 II
[Figure: game trees in which players 1 and 2 choose among the actions L, R and a, b]
• In the case of perfect information, the number of subgames is equal to that of non-terminal nodes.
• Next, let's consider the case of games with imperfect information.
• Note however, that each of these SPE yields the same equilibrium outcome in which the left terminal node is reached.
• PROPOSITION : The set of subgame perfect equilibria of a finite horizon extensive game with perfect information is equal to the set of strategy profiles isolated by the procedure of backward induction.
the other is the subgame which is played after player 1 has entered the market:
Player 2: Fight ⇒ (−1, 1); Acquiesce ⇒ (1, 1)
Example 2
Find the subgames of this tree.
1
• PROPOSITION (Existence of subgame perfect equilibrium): Every finite extensive game with perfect information has a subgame perfect equilibrium.
gametheory6: Game Theory, English PPT Slides
Review Dynamic Games Centipede Game Ultimatum Game Summary
• action is a decision in one particular node (confess, remain silent, head, tail, ...)
• strategy is a plan of actions for every possible situation that might occur, for every possible node (AF: Accept if Albert goes In, Fight if Albert plays Out)
• strategy: deciding about the action in each decision node prior to the game
• it is as if you wanted your friend to play the game instead of you: you have to tell him in advance what to do in each situation
[Figure: game tree. OUT leads to payoffs (0, 2); after IN, F leads to (−3, −1) and A leads to (2, 1)]
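Dynamic game (tree):
[Figure: game tree. OUT leads to payoffs (0, 2); after IN, F leads to (−3, −1) and A leads to (2, 1)]
Static game (table):
        IN        OUT
F      -3,-1      0,2
A       2,1       0,2
NE: (IN, A) and (OUT, F)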
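The step from the tree to the table can be sketched in a few lines. The code assumes the tree residue reads OUT ⇒ (0, 2), IN then F ⇒ (−3, −1), IN then A ⇒ (2, 1), with the first mover's payoff listed first; the function names are hypothetical.

```python
def outcome(first, second):
    """Payoffs (first mover, second mover) for a pair of strategies."""
    if first == "OUT":
        return (0, 2)          # the second mover's plan is never used
    return (-3, -1) if second == "F" else (2, 1)

# Static game: every strategy pair mapped to its payoffs.
table = {(f, s): outcome(f, s) for f in ("IN", "OUT") for s in ("F", "A")}

def is_nash(f, s):
    """True if neither player gains by a unilateral deviation."""
    u1, u2 = table[(f, s)]
    return (all(u1 >= table[(f2, s)][0] for f2 in ("IN", "OUT"))
            and all(u2 >= table[(f, s2)][1] for s2 in ("F", "A")))

nash = [profile for profile in table if is_nash(*profile)]

# Backward induction: the second mover decides after IN, then the first mover.
second_choice = max(("F", "A"), key=lambda s: outcome("IN", s)[1])
first_choice = max(("IN", "OUT"), key=lambda f: outcome(f, second_choice)[0])

print(nash, (first_choice, second_choice))
```

Under these payoffs the table has two Nash equilibria, but backward induction selects only (IN, A); (OUT, F) relies on a threat to fight that would not be carried out after entry.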
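Game Theory: Complete Slide Set
III. Classic Game Models
1. The Prisoner's Dilemma
The best-known story in game theory is the one called the "Prisoner's Dilemma". This game was proposed by Tucker in 1950. It triggered a large body of related research, and many versions of the "Prisoner's Dilemma" exist. The "Prisoner's Dilemma" gave a great impetus to the development of game theory; virtually every treatment of game theory mentions this classic model.
Over the past two or three decades, game theory has become an important method of social science research. Some say that if the social sciences have any pure theory in the future, it will be game theory. Both cooperative and non-cooperative game theory offer a systematic method of analysis that lets people work out appropriate strategies when their fate depends on the behavior of others. Game theory is especially valuable when many interdependent factors coexist and no decision can be made independently of many other decisions.
In the last ten-odd years, game theory has been widely applied in economics, especially microeconomics, and has in many respects rewritten the foundations of microeconomics. Economists now treat game theory, which studies strategic interaction, as the most suitable tool for analyzing all kinds of economic problems, such as public economics, international trade, natural resources, and business management. In modern economics, game theory has become a thoroughly standard analytical tool. Beyond economics, game theory is now widely applied in biology, management, international relations, computer science, political science, military strategy, and many other disciplines. More and more people are paying attention to, getting to know, and studying game theory.
Game theory is a theory about games: a mathematically based discipline that studies optimal solutions in situations of confrontation and conflict. In fact, game theory did derive from ancient games such as chess, Go, and poker.
As a discipline, game theory developed in the 1950s and 1960s; it became a reality only once non-zero-sum game theory, and especially the theory of games of incomplete information, had been fully developed. By the 1970s, game theory had formally become one of the main methods of mainstream economic research. The 1994 Nobel Prize in Economics was awarded jointly to three game theorists: Nash, Selten, and Harsanyi. The 2005 Nobel Prize in Economics was awarded to the American economist Thomas Schelling and the Israeli economist Robert Aumann for enhancing the understanding of conflict and cooperation through game-theoretic analysis.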
lecture3 (Game Theory lecture notes, Carnegie Mellon University)
\[ u_i(s_1^*, \ldots, s_{i-1}^*, s_i, s_{i+1}^*, \ldots, s_n^*) \]
for all $s_i \in S_i$. That is, $s_i^*$ solves
\[ \text{Maximize } u_i(s_1^*, \ldots, s_{i-1}^*, s_i, s_{i+1}^*, \ldots, s_n^*) \quad \text{subject to } s_i \in S_i \]
May 21, 2003 73-347 Game Theory--Lecture 3 2
Today's Agenda
Review of previous classes
Nash equilibrium
Best response function
Use best response function to find Nash equilibria
Examples
(Confess, Confess) is a Nash equilibrium.
                          Prisoner 2
                          Mum         Confess
Prisoner 1   Mum         -1 , -1     -9 ,  0
             Confess      0 , -9     -6 , -6
Review
The normal-form (or strategic-form) representation of a game G specifies: a finite set of players {1, 2, ..., n}, the players' strategy spaces S1, S2, ..., Sn, and their payoff functions u1, u2, ..., un, where ui : S1 × S2 × ... × Sn → R.
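A normal-form game in this sense can be written down directly as data, and the best-response functions then identify the Nash equilibrium. The sketch below assumes the prisoners' dilemma payoffs read (Mum, Mum) = −1, −1; (Mum, Confess) = −9, 0; (Confess, Mum) = 0, −9; (Confess, Confess) = −6, −6. The function names are hypothetical.

```python
from itertools import product

strategies = ("Mum", "Confess")
# (u1, u2), indexed by (prisoner 1's strategy, prisoner 2's strategy)
u = {
    ("Mum", "Mum"): (-1, -1), ("Mum", "Confess"): (-9, 0),
    ("Confess", "Mum"): (0, -9), ("Confess", "Confess"): (-6, -6),
}

def best_responses_1(s2):
    """Prisoner 1's best responses to prisoner 2's strategy s2."""
    best = max(u[(s1, s2)][0] for s1 in strategies)
    return {s1 for s1 in strategies if u[(s1, s2)][0] == best}

def best_responses_2(s1):
    """Prisoner 2's best responses to prisoner 1's strategy s1."""
    best = max(u[(s1, s2)][1] for s2 in strategies)
    return {s2 for s2 in strategies if u[(s1, s2)][1] == best}

# A profile is a Nash equilibrium when each strategy is a best response
# to the other.
nash = [(s1, s2) for s1, s2 in product(strategies, repeat=2)
        if s1 in best_responses_1(s2) and s2 in best_responses_2(s1)]
print(nash)
```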
Game Theory English Slides (3)
• If Player 2 chooses Head: r − (1 − r) = 2r − 1
• If Player 2 chooses Tail: −r + (1 − r) = 1 − 2r
Solving matching pennies
[Table: Player 1 (rows, starting with Head) against Player 2 (columns Head, Tail), with a column of expected payoffs]
Static (or Simultaneous-Move) Games of Complete Information
Matching pennies
                     Player 2
                     Head        Tail
Player 1   Head     -1 ,  1      1 , -1
           Tail      1 , -1     -1 ,  1
• Head is Player 1's best response to Player 2's strategy Tail
• Tail is Player 2's best response to Player 1's strategy Tail
(σ1(H) = 0.5, σ1(T) = 0.5) is a mixed strategy. That is, player 1 plays H and T with probabilities 0.5 and 0.5, respectively.
(σ1(H) = 0.3, σ1(T) = 0.7) is another mixed strategy. That is, player 1 plays H and T with probabilities 0.3 and 0.7, respectively.
• Player 2's expected payoff of playing s22:
EU2(s22, (r, 1−r)) = r × u2(s11, s22) + (1−r) × u2(s12, s22)
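The expected-payoff formula can be evaluated for the two mixed strategies mentioned above. This sketch assumes player 2's payoffs from the matching pennies table (1 when the coins match, −1 otherwise) and that r is the probability with which player 1 plays H; the name `eu2` is hypothetical.

```python
from fractions import Fraction

# Player 2's payoff, indexed by (player 1's action, player 2's action).
u2 = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}

def eu2(s2, r):
    """Player 2's expected payoff from pure s2 when player 1 plays H
    with probability r and T with probability 1 - r."""
    return r * u2[("H", s2)] + (1 - r) * u2[("T", s2)]

for r in (Fraction(3, 10), Fraction(1, 2)):
    print(r, eu2("H", r), eu2("T", r))
```

Against r = 0.3 player 2 strictly prefers Tail; only at r = 0.5 is player 2 indifferent, which is the condition a mixed-strategy equilibrium requires.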
Chap.5 Extensive Form of Games: Game Theory, English Teaching Slides
[Figure: game tree. Player 1 chooses L or R at x1; player 2 chooses L or R at x2 and x3; player 1 chooses L or R at x4, x5, x6 and x7]
5.2 The Strategies of Extensive-Form Games
Definition 5.3 A strategy for a player is a complete plan of action: it specifies a feasible action for the player in every contingency in which the player might be called on to act.
In extensive-form games, a contingency in which the player should act is just an information set. So a strategy of an extensive-form game specifies a feasible action for the player in every information set.
Consider the dynamic Investment in New Product game of complete information, in which firm 1 chooses first and firm 2 chooses after it observes firm 1's choice.
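Because a strategy assigns one action to every information set, a player's strategies can be enumerated as a Cartesian product. The sketch below is illustrative only: the action labels "a" and "b" and the information-set names are assumptions, chosen for a second mover who observes one of two choices by firm 1.

```python
from itertools import product

# Hypothetical labels: firm 2 has one information set after each
# possible choice of firm 1, with two feasible actions in each.
info_sets_firm2 = {
    "after firm 1 plays a": ("a", "b"),
    "after firm 1 plays b": ("a", "b"),
}

def all_strategies(info_sets):
    """Enumerate complete plans: one feasible action per information set."""
    names = list(info_sets)
    return [dict(zip(names, choice))
            for choice in product(*(info_sets[n] for n in names))]

strategies = all_strategies(info_sets_firm2)
for s in strategies:
    print(s)
```

With two information sets and two actions in each, firm 2 has 2 × 2 = 4 strategies, even though it takes only one action in any single play of the game.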
[Figure: game tree. Player 1 chooses Mum or Fink at x1; player 2 chooses Mum or Fink at x2 and at x3]
Chap.6 Subgame Perfect Nash Equilibrium: Game Theory, English Teaching Slides
There exist two Nash equilibria, but only the strategy combination (a, (a, a)) is a subgame perfect Nash equilibrium.
                      2
           (a, a)     (a, b)     (b, a)     (b, b)
1    a    300,300    300,300    800, 0     200, 0
     b    0, 200     0, 0       0, 200     0, 0
Definition 6.1 A subgame in an extensive-form game
1) begins at a decision node n that is a singleton information set;
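[Figure: game tree. Player 1 chooses A, B or C; player 2 then answers Y or N. Payoffs: after A, Y ⇒ (2, 0) and N ⇒ (0, 0); after B, Y ⇒ (1, 1) and N ⇒ (0, 0); after C, Y ⇒ (0, 2) and N ⇒ (0, 0)]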
The strategy combinations (A, (Y, Y, Y)) and (B, (N, Y, Y)) are subgame perfect Nash equilibria.
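Both equilibria can be recovered by backward induction that keeps track of ties. The sketch assumes the tree's payoffs read A: Y ⇒ (2, 0), N ⇒ (0, 0); B: Y ⇒ (1, 1), N ⇒ (0, 0); C: Y ⇒ (0, 2), N ⇒ (0, 0), with player 1's payoff first. The function names are hypothetical.

```python
from itertools import product

# Terminal payoffs (u1, u2) after player 1's move and player 2's reply.
payoffs = {
    "A": {"Y": (2, 0), "N": (0, 0)},
    "B": {"Y": (1, 1), "N": (0, 0)},
    "C": {"Y": (0, 2), "N": (0, 0)},
}

def optimal_replies(move):
    """All replies of player 2 that maximize u2 in the subgame after move."""
    best = max(u[1] for u in payoffs[move].values())
    return [r for r, u in payoffs[move].items() if u[1] == best]

def subgame_perfect_equilibria():
    moves = list(payoffs)
    equilibria = []
    # Player 2's plan must be optimal in every subgame; ties yield several plans.
    for plan in product(*(optimal_replies(m) for m in moves)):
        reply = dict(zip(moves, plan))
        best = max(payoffs[m][reply[m]][0] for m in moves)
        for m in moves:
            if payoffs[m][reply[m]][0] == best:
                equilibria.append((m, plan))
    return equilibria

spe = subgame_perfect_equilibria()
print(spe)
```

The multiplicity comes from player 2's indifference after A: answering Y there supports A, while answering N makes B player 1's best move.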
Kuhn Theorem: Every finite extensive-form game with perfect information has a subgame perfect Nash equilibrium.
(5)If the case goes to trial, the plaintiff wins amount x with probability r and otherwise wins nothing.
magnus.hoffmann@ovgu.de

Chapter 2.2 - Two-Player Zero-Sum Games

Definition 2.1 (Two-player zero-sum games)
Two-player zero-sum games are a special class of normal-form games satisfying
\[ u_2(a_1, a_2) = -u_1(a_1, a_2) \quad \forall a_1 \in A_1,\ \forall a_2 \in A_2. \tag{2.1} \]
An example of this kind of game is the following:

1\2          a_2^1      a_2^2      a_2^3
a_1^1        0, 0       1, −1      7, −7
a_1^2        4, −4      2, −2      3, −3
a_1^3        9, −9      0, 0       1, −1
M_1(a_2)     9          2          7
m_2(a_2)     −9         −2         −7

First argument of player 1: Player 2 moves second
1. Since player 2 is intelligent, he will predict any action a1 ∈ A1 that player 1 may choose.
2. Since player 2 is rational, he will choose the action a2 that maximizes his payoff (or, equivalently, minimizes the payoff of player 1).
   P1: $a_1^1$ ⇒ P2: $a_2^1$ ⇒ $u_1^{min} = 0$
   P1: $a_1^2$ ⇒ P2: $a_2^2$ ⇒ $u_1^{min} = 2$
   P1: $a_1^3$ ⇒ P2: $a_2^2$ ⇒ $u_1^{min} = 0$
3. Since player 1 is intelligent, he will predict the reaction of player 2 and figure out the expected payoff m1(a1) from playing a1:
\[ m_1(a_1) = \min_{a_2 \in A_2} u_1(a_1, a_2) \quad \forall a_1 \in A_1. \]
4. Since player 1 is rational, he will choose the action a1 that maximizes the payoff m1(a1).
5. Therefore, the expected payoff of player 1 will be
\[ m_1 = \max_{a_1 \in A_1} \min_{a_2 \in A_2} u_1(a_1, a_2) = u_1(a_1^2, a_2^2) = 2. \]

First argument of player 2: Player 1 moves second
1. Since player 1 is intelligent, he will predict any action a2 ∈ A2 that player 2 may choose.
2. Since player 1 is rational, he will choose the action a1 that maximizes his payoff (or, equivalently, minimizes the payoff of player 2).
   P2: $a_2^1$ ⇒ P1: $a_1^3$ ⇒ $u_2^{min} = -9$
   P2: $a_2^2$ ⇒ P1: $a_1^2$ ⇒ $u_2^{min} = -2$
   P2: $a_2^3$ ⇒ P1: $a_1^1$ ⇒ $u_2^{min} = -7$
3. Since player 2 is intelligent, he will predict the reaction of player 1 and figure out the expected payoff m2(a2) from playing a2:
\[ m_2(a_2) = \min_{a_1 \in A_1} u_2(a_1, a_2) \quad \forall a_2 \in A_2. \]
4. Since player 2 is rational, he will choose the action a2 that maximizes the payoff m2(a2).
5. Therefore, the expected payoff of player 2 will be
\[ m_2 = \max_{a_2 \in A_2} \min_{a_1 \in A_1} u_2(a_1, a_2) = \max\{u_2(a_1^3, a_2^1),\ u_2(a_1^2, a_2^2),\ u_2(a_1^1, a_2^3)\} = u_2(a_1^2, a_2^2) = -2. \]

Note
The logic of this approach is to consider the worst case. Each player considers the worst that could result from each of his strategies and then simply chooses the strategy that yields the least bad outcome. This procedure guarantees each player a minimal payoff level. This guaranteed payoff level is called the security level or maximin value.

Notation 2.6
We will denote the worst outcome for player 1 and player 2, respectively, for a given strategy choice of their own, as
\[ m_1(a_1) = \min_{a_2 \in A_2} u_1(a_1, a_2), \qquad m_2(a_2) = \min_{a_1 \in A_1} u_2(a_1, a_2). \]
A maximin action is a maximizing action
\[ \bar{a}_i \in \arg\max_{a_i \in A_i} m_i(a_i). \]
Now, we will turn to a different kind of reasoning.

Second argument of player 1: Player 2 decides first
1. Since player 1 is intelligent, he will predict any action a2 ∈ A2 that player 2 may choose.
2. Since player 1 is rational, he will choose the action a1 that maximizes his payoff.
   P2: $a_2^1$ ⇒ P1: $a_1^3$ ⇒ $u_1^{max} = 9$
   P2: $a_2^2$ ⇒ P1: $a_1^2$ ⇒ $u_1^{max} = 2$
   P2: $a_2^3$ ⇒ P1: $a_1^1$ ⇒ $u_1^{max} = 7$
3. Since player 2 is intelligent, he will predict the reaction of player 1 and figure out the expected payoff M1(a2) from playing a2:
\[ M_1(a_2) = \max_{a_1 \in A_1} u_1(a_1, a_2) \quad \forall a_2 \in A_2. \]
4. Since player 2 is rational, he will choose the action a2 that minimizes the payoff M1(a2).
5. Therefore, the expected payoff of player 1 will be
\[ M_1 = \min_{a_2 \in A_2} \max_{a_1 \in A_1} u_1(a_1, a_2) = \min\{u_1(a_1^1, a_2^3),\ u_1(a_1^2, a_2^2),\ u_1(a_1^3, a_2^1)\} = u_1(a_1^2, a_2^2) = 2. \]

Second argument of player 2: Player 1 decides first
1. Since player 2 is intelligent, he will predict any action a1 ∈ A1 that player 1 may choose.
2. Since player 2 is rational, he will choose the action a2 that maximizes his payoff.
   P1: $a_1^1$ ⇒ P2: $a_2^1$ ⇒ $u_2^{max} = 0$
   P1: $a_1^2$ ⇒ P2: $a_2^2$ ⇒ $u_2^{max} = -2$
   P1: $a_1^3$ ⇒ P2: $a_2^2$ ⇒ $u_2^{max} = 0$
3. Since player 1 is intelligent, he will predict the reaction of player 2 and figure out the expected payoff M2(a1) from playing a1:
\[ M_2(a_1) = \max_{a_2 \in A_2} u_2(a_1, a_2) \quad \forall a_1 \in A_1. \]
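The maximin and minimax values of the example can be computed directly from player 1's payoffs. The sketch below assumes the 3×3 matrix pieced together from the payoff table fragments (rows are player 1's actions, columns are player 2's actions) and uses u2 = −u1; the variable names are hypothetical.

```python
# u1[i][j]: player 1's payoff for row action i+1 and column action j+1.
u1 = [
    [0, 1, 7],
    [4, 2, 3],
    [9, 0, 1],
]

# m1(a1): worst payoff for player 1 in each row.
row_minima = [min(row) for row in u1]
# M1(a2): best payoff player 1 can reach in each column.
col_maxima = [max(col) for col in zip(*u1)]
# m2(a2): worst payoff for player 2 in each column, using u2 = -u1.
m2 = [-value for value in col_maxima]

maximin = max(row_minima)   # player 1's security level
minimax = min(col_maxima)   # lowest payoff player 2 can hold player 1 to

print(row_minima, col_maxima, m2, maximin, minimax)
```

In this example the two values coincide, so the action pair in the second row and second column is a saddle point of the matrix.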