Game Theory (Harvard Course Materials)


Game Theory — Drew Fudenberg


Drew Fudenberg is a professor of economics at Harvard University whose main research areas are game theory and dynamic economics.

Together with Jean Tirole he co-authored the book Game Theory, widely regarded as the best textbook for graduate students and advanced undergraduates studying the subject, and an essential reference for other readers interested in game theory.

Fudenberg's research in game theory covers many classes of games, including cooperative games, non-cooperative games, repeated games, and mechanism design.

His results have found wide application in economics, computer science, political science, and other fields.

Beyond game theory, Fudenberg has also worked extensively in dynamic economics, on topics including market failure, economic growth, and industrial organization.

This work provides an important theoretical foundation for understanding modern economic phenomena.

In sum, Fudenberg's research in game theory and dynamic economics offers important theoretical guidance and decision support to both academia and practitioners.

Print-ready English transcript: Game Theory open course (Yale ECON-159), Lectures 1-5


ECON-159: GAME THEORY
Lecture 1 - Introduction: Five First Lessons [September 5, 2007]

Chapter 1. What Is Strategy? [00:00:00]

Professor Ben Polak: So this is Game Theory Economics 159. If you're here for art history, you're either in the wrong room or stay anyway, maybe this is the right room; but this is Game Theory, okay. You should have four handouts; everyone should have four handouts. There is a legal release form--we'll talk about it in a minute--about the videoing. There is a syllabus, which is a preliminary syllabus: it's also online. And there are two games labeled Game 1 and Game 2. Can I get you all to look at Game 1 and start thinking about it. And while you're thinking about it, I am hoping you can multitask a bit. I'll describe a bit about the class and we'll get a bit of admin under our belts. But please try and look at--somebody's not looking at it, because they're using it as a fan here--so look at Game 1 and fill out that form for me, okay?

So while you're filling that out, let me tell you a little bit about what we're going to be doing here. So what is Game Theory? Game Theory is a method of studying strategic situations. So what's a strategic situation? Well let's start off with what's not a strategic situation. In your Economics - in your Intro Economics class in 115 or 110, you saw some pretty good examples of situations that were not strategic. You saw firms working in perfect competition. Firms in perfect competition are price takers: they don't particularly have to worry about the actions of their competitors. You also saw firms that were monopolists and monopolists don't have any competitors to worry about, so that's not a particularly strategic situation. They're not price takers but they take the demand curve. Is this looking familiar for some of you who can remember doing 115 last year or maybe two years ago for some of you? Everything in between is strategic. So everything that constitutes imperfect competition is a strategic setting. Think about the motor industry, the motor car industry. Ford has to worry about what GM is doing and what Toyota is doing, and for the moment at least what Chrysler is doing but perhaps not for long. So there's a small number of firms and their actions affect each other.

So for a literal definition of what strategic means: it's a setting where the outcomes that affect you depend on actions, not just on your own actions, but on actions of others. All right, that's as much as I'm going to say for preview right now; we're going to come back and see plenty of this over the course of the next semester.

Chapter 2. Strategy: Where Does It Apply? [00:02:16]

So what I want to do is get on to where this applies. It obviously applies in Economics, but it also applies in politics, and in fact, this class will count as a Political Science class if you're a Political Science major. You should go check with the DUS in Political Science. It counts - Game Theory is very important in law these days. So for those of you--for the half of you--that are going to end up in law school, this is pretty good training. Game Theory is also used in biology, and towards the middle of the semester we're actually going to see some examples of Game Theory as applied to evolution. And not surprisingly, Game Theory applies to sport.

Chapter 3. (Administrative Issues) [00:02:54]

So let's talk about a bit of admin. How are you doing on filling out those games? Everyone managing to multitask: filling in Game 1? Keep writing. I want to get some admin out of the way and I want to start by getting out of the way what is obviously the elephant in the room. Some of you will have noticed that there's a camera crew here, okay. So as some of you probably know, Yale is undergoing an open education project and they're videoing several classes, and the idea of this is to make educational materials available beyond the walls of Yale.
In fact, on the web, internationally, so people in places, maybe places in the U.S. or places miles away, maybe in Timbuktu or whatever, who find it difficult to get educational materials from the local university or whatever, can watch certain lectures from Yale on the web.

Some of you would have been in classes that do that before. What's going to be different about this class is that you're going to be participating in it. The way we teach this class is we're going to play games, we're going to have discussions, we're going to talk among the class, and you're going to be learning from each other, and I want you to help people watching at home to be able to learn too. And that means you're going to be on film, at the very least on mike.

So how's that going to work? Around the room are three T.A.s holding mikes. Let me show you where they are: one here, one here, and one here. When I ask for classroom discussions, I'm going to have one of the T.A.s go to you with a microphone much like in "Donahue" or something, okay. At certain times, you're going to be seen on film, so the camera is actually going to come around and point in your direction.

Now I really want this to happen. I had to argue for this to happen, 'cause I really feel that this class isn't about me. I'm part of the class obviously, but it's about you teaching each other and participating. But there's a catch; the catch is, that means you have to sign that legal release form.

So you'll see that you have in front of you a legal release form, you have to be able to sign it, and what that says is that we can use you being shown in class. Think of this as a bad hair day release form. All right, you can't sue Yale later if you had a bad hair day. For those of you who are on the run from the FBI, your visa has run out, or you're sitting next to your ex-girlfriend, now would be a good time to put a paper bag over your head.

All right, now just to get you used to the idea, in every class we're going to have I think the same two people, so Jude is the cameraman; why don't you all wave to Jude: this is Jude, okay. And Wes is our audio guy: this is Wes. And I will try and remember not to include Jude and Wes in the classroom discussions, but you should be aware that they're there. Now, if this is making you nervous, if it's any consolation, it's making me very nervous.

So, all right, we'll try and make this class work as smoothly as we can, allowing for this extra thing. Let me just say, no one's making any money off this--at least I'm hoping these guys are being paid--but me and the T.A.s are not being paid. The aim of this, that I think is a good aim, it's an educational project, and I'm hoping you'll help us with it. The one difference it is going to mean, is that at times I might hold some of the discussions for the class, coming down into this part of the room, here, to make it a little easier for Jude.

All right, how are we doing now on filling out those forms? Has everyone filled in their strategy for the first game? Not yet. Okay, let's go on doing a bit more admin. The thing you mostly care about, I'm guessing, is the grades. All right, so how is the grade going to work for this class? 30% of the grade will be on problem sets; 30% on the mid-term, and 40% on the final; so 30/30/40.

The mid-term will be held in class on October 17th; that is also in your syllabus. Please don't anybody tell me later - any time after today - that you didn't know when the mid-term was and therefore it clashes with 17 different things. The mid-term is on October 17th, which is a Wednesday, in class.
All right, the problem sets: there will be roughly ten problem sets and I'll talk about them more later on when I hand them out. The first one will go out on Monday but it will be due ten days later. Roughly speaking they'll be every week.

The grade distribution: all right, so this is the rough grade distribution. Roughly speaking, a sixth of the class are going to end up with A's, a sixth are going to end up with A-, a sixth are going to end up with B+, a sixth are going to end up with B, a sixth are going to end up with B-, and the remaining sixth, if I added that up right, are going to end up with what I guess we're now calling the presidential grade, is that right?

That's not literally true. I'm going to squeeze it a bit, I'm going to curve it a bit, so actually slightly fewer than a sixth will get straight A's, and fewer than a sixth will get C's and below. We'll squeeze the middle to make there be more B's. One thing I can guarantee from past experience in this class is that the median grade will be a B+. The median will fall somewhere in the B+'s. Just as forewarning for people who have forgotten what a median is, that means half of you--not approximately half, it means exactly half of you--will be getting something like B+ and below and half will get something like B+ and above.

Now, how are you doing in filling in the forms? Everyone filled them in yet? Surely must be pretty close to getting everyone filled in. All right, so last things to talk about before I actually collect them in - textbooks. There are textbooks for this class. The main textbook is this one, Dutta's book Strategies and Games. If you want a slightly tougher, more rigorous book, try Joel Watson's book, Strategy. Both of those books are available at the bookstore.

But I want to warn everybody ahead of time, I will not be following the textbook. I regard these books as safety nets. If you don't understand something that happened in class, or you want to reinforce an idea that came up in class, then you should read the relevant chapters in the book, and the syllabus will tell you which chapters to read for each class, or for each week of class, all right. But I will not be following these books religiously at all. In fact, they're just there as back up.

In addition, I strongly recommend people read Thinking Strategically. This is good bedtime reading. Do any of you suffer from insomnia? It's very good bedtime reading if you suffer from insomnia. It's a good book and what's more, there's going to be a new edition of this book this year and Norton have allowed us to get advance copies of it. So if you don't buy this book this week, I may be able to make the advance copy of the new edition available for some of you next week. I'm not taking a cut on that either, all right, there's no money changing hands.

All right, sections are on the syllabus sign up - sorry, on the website; sign up as usual. Put yourself down on the wait list if you don't get into the section you want. You probably will get into the section you want once we're done.

Chapter 4. Elements of a Game: Strategies, Actions, Outcomes and Payoffs [00:09:40]

All right, now we must be done with the forms. Are we done with the forms? All right, so why don't we send the T.A.s, with or without mikes, up and down the aisles and collect in your Game #1; not Game #2, just Game #1.

Just while we're doing that, I think the reputation of this class--I think--if you look at the course evaluations online or whatever, is that this class is reasonably hard but reasonably fun. So I'm hoping that's what the reputation of the class is. If you think this class is going to be easy, I think it isn't actually an easy class. It's actually quite a hard class, but I think I can guarantee it's going to be a fun class.
Now one reason it's a fun class is the nice thing about teaching Game Theory - quieten down folks--one thing about teaching Game Theory is, you get to play games, and that's exactly what we've just been doing now. This is our first game and we're going to play games throughout the course, sometimes several times a week, sometimes just once a week.

We got all these things in? Everyone handed them in? So I need to get those counted. Has anyone taken the Yale Accounting class? No one wants to - has aspirations to be - one person has. I'll have a T.A. do it, it's all right, we'll have a T.A. do it. So Kaj, can you count those for me? Is that right? Let me read out the game you've just played.

"Game 1, a simple grade scheme for the class. Read the following carefully. Without showing your neighbor what you are doing, put it in the box below either the letter Alpha or the letter Beta. Think of this as a grade bid. I will randomly pair your form with another form and neither you nor your pair will ever know with whom you were paired. Here's how the grades may be assigned for the class. [Well they won't be, but we can pretend.] If you put Alpha and you're paired with Beta, then you will get an A and your pair a C. If you and your pair both put Alpha, you'll both get B-. If you put Beta and you're paired with Alpha, you'll get a C and your pair an A. If you and your pair both put Beta, then you'll both get B+."

So that's the thing you just filled in.

Now before we talk about this, let's just collect this information in a more useful way. So I'm going to remove this for now. We'll discuss this in a second, but why don't we actually record what the game is, that we're playing, first. So this is our grade game, and what I'm going to do, since it's kind of hard to absorb all the information just by reading a paragraph of text, I'm going to make a table to record the information. So what I'm going to do is I'm going to put me here, and my pair, the person I'm randomly paired with, here; and Alpha and Beta, which are the choices I'm going to make, here; and on the columns Alpha and Beta, the choices my pair is making.

In this table, I'm going to put my grades. So my grade if we both put Alpha is B-; if we both put Beta, it was B+. If I put Alpha and she put a Beta, I got an A, and if I put Beta and she put an Alpha, I got a C. Is that correct? That's more or less right? Yeah, okay, while we're here, why don't we do the same for my pair? So this is my grades on the left hand table, but now let's look at what my pair will do, what my pair will get.

So I should warn the people sitting at the back that my handwriting is pretty bad; that's one reason for moving forward. The other thing I should apologize for at this stage of the class is my accent. I will try and improve the handwriting; there's not much I can do about the accent at this stage.

So once again, if we both put Alpha then my pair gets a B-. If we both put Beta, then we both get a B+; in particular, my pair gets a B+. If I put Alpha and my pair puts Beta, then she gets a C. And if I put Beta and she puts Alpha, then she gets an A. So I now have all the information that was on the sheet of paper that you just handed in.

Now there's another way of organizing this that's standard in Game Theory, so we may as well get used to it now on the first day. Rather than drawing two different tables like this, what I'm going to do is I'm going to take the second table and super-impose it on top of the first table. Okay, so let me do that and you'll see what I mean. What I'm going to do is draw a larger table, the same basic structure: I'm choosing Alpha and Beta on the rows, my pair is choosing Alpha and Beta on the columns, but now I'm going to put both grades in. So the easy ones are on the diagonal: we both get B- if we both choose Alpha; we both get B+ if we both choose Beta. But if I choose Alpha and my pair chooses Beta, I get an A and she gets a C. And if I choose Beta and she chooses Alpha, then it's me who gets the C and it's her who gets the A.

So notice what I did here. The first grade corresponds to the row player, me in this case, and the second grade in each box corresponds to the column player, my pair in this case. So this is a nice succinct way of recording what was in the previous two tables. This is an outcome matrix; this tells us everything that was in the game.

Okay, so now seems a good time to start talking about what people did. So let's just have a show of hands. How many of you chose Alpha? Leave your hands up so that Jude can catch that, so people can see at home, okay. All right, and how many of you chose Beta? There's far more Alphas - wave your hands, the Betas, okay. All right, there's a Beta here, okay. So it looks like a lot of - well, we're going to find out, we're going to count--but a lot more Alphas than Betas. Let me try and find out some reasons why people chose.

So let me have the Alphas up again. So, the woman who's in red here, can we get a mike to the - yeah, is it okay if we ask you? You're not on the run from the FBI? We can ask you why? Okay, so you chose Alpha, right? So why did you choose Alpha?

Student: [inaudible] realized that my partner chose Alpha, therefore I chose [inaudible].

Professor Ben Polak: All right, so you wrote out these squares, you realized what your partner was going to do, and responded to that. Any other reasons for choosing Alpha around the room? Can we get the woman here? Try not to be intimidated by these microphones; they're just mikes. It's okay.

Student: The reason I chose Alpha, regardless of what my partner chose, I think there would be better outcomes than choosing Beta.

Professor Ben Polak: All right, so let me ask your names for a second - so your name was?

Student: Courtney.

Professor Ben Polak: Courtney, and your name was?

Student: Clara Elise.

Professor Ben Polak: Clara Elise.
So slightly different reasons, same choice Alpha. Clara Elise's reason - what did Clara Elise say? She said, no matter what the other person does, she reckons she'd get a better grade if she chose Alpha. So hold that thought a second; we'll come back to - is it Clara Elise, is that right? We'll come back to Clara Elise in a second.

Let's talk to the Betas a second; let me just emphasize at this stage there are no wrong answers. Later on in the class there'll be some questions that have wrong answers. Right now there's no wrong answers. There may be bad reasons but there's no wrong answers. So let's have the Betas up again. Let's see the Betas. Oh come on! There was a Beta right here. You were a Beta, right? You backed off the Beta, okay. So how can I get a mike into a Beta? Let's stick in this aisle a bit. Is that a Beta right there? Are you a Beta right there? Can I get the Beta in here? Who was the Beta in here? Can we get the mike in there? Is that possible? In here - you can leave your hand so that - there we go. Just point towards - that's fine, just speak into it, that's fine.

Student: So the reason, right?

Professor Ben Polak: Yeah, go ahead.

Student: I personally don't like swings that much and it's the B-/B+ range, so I'd much rather prefer that to a swing from A to C, and that's my reason.

Professor Ben Polak: All right, so you're saying it compresses the range. I'm not sure it does compress the range. I mean, if you chose Alpha, you're swinging from A to B-; and from Beta, swinging from B+ to C. I mean those are similar kinds of ranges, but it certainly is a reason. Other reasons for choosing? Yeah, the guy in blue here, yep, good. That's all right. Don't hold the mike; just let it point at you, that's fine.

Student: Well I guess I thought we could be more collusive and kind of work together, but I guess not. So I -

Professor Ben Polak: There's a siren in the background so I missed the answer. Stand up a second, so we can just hear you.

Student: Sure.

Professor Ben Polak: Sorry, say again.

Student: Sure. My name is Travis. I thought we could work together, but I guess not.

Professor Ben Polak: All right, good. That's a pretty good reason.

Student: If you had chosen Beta we would have all gotten B+'s, but I guess not.

Professor Ben Polak: Good, so Travis is giving us a different reason, right? He's saying that maybe some of you in the room might actually care about each other's grades, right? I mean, you all know each other in class. You all go to the same college. For example, if we played this game up in the business school--are there any MBA students here today? One or two. If we play this game up in the business school, I think it's quite likely we're going to get a lot of Alphas chosen, right? But if we played this game up in, let's say, the Divinity School - and I'm guessing that Travis' answer is reflecting what you guys are reasoning here - you might think that people in the Divinity School might care about other people's grades, right? There might be ethical reasons--perfectly good, sensible, ethical reasons--for choosing Beta in this game. There might be other reasons as well, but that's perhaps the reason to focus on.

And perhaps the lesson I want to draw out of this is that right now this is not a game. Right now we have actions, strategies for people to take, and we know what the outcomes are, but we're missing something that will make this a game. What are we missing here?

Student: Objectives.

Professor Ben Polak: We're missing objectives. We're missing payoffs. We're missing what people care about, all right. So we can't really start analyzing a game until we know what people care about, and until we know what the payoffs are.
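The outcome matrix Polak has just built on the board can be encoded directly. A minimal sketch in Python (the dictionary layout and function name are my own, not from the lecture):

```python
# Outcome matrix for the grade game: outcomes[(my_choice, pair_choice)]
# gives (my grade, pair's grade), exactly as read out from the handout.
outcomes = {
    ("Alpha", "Alpha"): ("B-", "B-"),
    ("Alpha", "Beta"):  ("A",  "C"),
    ("Beta",  "Alpha"): ("C",  "A"),
    ("Beta",  "Beta"):  ("B+", "B+"),
}

def grades(me, pair):
    """First entry is the row player's grade, second the column player's."""
    return outcomes[(me, pair)]

print(grades("Alpha", "Beta"))  # ('A', 'C')
```

As the lecture stresses, this is only an outcome matrix: it becomes a game once we attach payoffs saying how the players feel about these grades.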
Now let's just say something now, which I'll probably forget to say in any other moment of the class, but today it's relevant. Game Theory, me, professors at Yale, cannot tell you what your payoff should be. I can't tell you in a useful way what it is that your goals in life should be or whatever. That's not what Game Theory is about. However, once we know what your payoffs are, once we know what your goals are, perhaps Game Theory can help you get there.

So we've had two different kinds of payoffs mentioned here. We had the kind of payoff where we care about our own grade, and Travis has mentioned the kind of payoff where you might care about other people's grades. And what we're going to do today is analyze this game under both those possible payoffs. To start that off, let's put up some possible payoffs for the game. And I promise we'll come back and look at some other payoffs later. We'll revisit the Divinity School later.

Chapter 5. Strictly Dominant versus Strictly Dominated Strategies [00:21:38]

All right, so here once again is our same matrix with me and my pair choosing actions Alpha and Beta, but this time I'm going to put numbers in here. And some of you will perhaps recognize these numbers, but that's not really relevant for now. All right, so what's the idea here? Well the first idea is that these numbers represent utiles or utilities. They represent what these people are trying to maximize, what they're trying to achieve, their goals.

The idea is - just to compare this to the outcome matrix - for the person who's me here, (A,C) yields a payoff of--(A,C) is this box--so (A,C) yields a payoff of three, whereas (B-,B-) yields a payoff of 0, and so on. So what's the interpretation? It's the first interpretation: the natural interpretation that a lot of you jumped to straight away. These are people--people with these payoffs are people--who only care about their own grades. They prefer an A to a B+, they prefer a B+ to a B-, and they prefer a B- to a C. Right, I'm hoping I got the grades in order, otherwise it's going to ruin my curve at the end of the year. So these people only care about their own grades. They only care about their own grades.

What do we call people who only care about their own grades? What's a good technical term for them? In England, I think we refer to these guys - whether it's technical or not - as "evil gits." These are not perhaps the most moral people in the universe. So now we can ask a different question. Whether these are actually your payoffs or not, pretend they are for now. Suppose these are your payoffs. Now we can ask, not what did you do, but what should you do? Now we have payoffs that can really switch the question to a normative question: what should you do? Let's come back to - was it Clara Elise--where was Clara Elise before? Let's get the mike on you again. So just explain what you did and why again.

Student: Why I chose Alpha?

Professor Ben Polak: Yeah, stand up a second, if that's okay.

Student: Okay.

Professor Ben Polak: You chose Alpha; I'm assuming these were roughly your payoffs, more or less, you were caring about your grades.

Student: Yeah, I was thinking-

Professor Ben Polak: Why did you choose Alpha?

Student: I'm sorry?

Professor Ben Polak: Why did you choose Alpha? Just repeat what you said before.

Student: Because I thought the payoffs - the two different payoffs that I could have gotten--were highest if I chose Alpha.

Professor Ben Polak: Good; so what Clara Elise is saying--it's an important idea--is this (and tell me if I'm paraphrasing you incorrectly, but I think this is more or less what you're saying): no matter what the other person does, no matter what the pair does, she obtains a higher payoff by choosing Alpha. Let's just see that. If the pair chooses Alpha and she chooses Alpha, then she gets 0. If the pair chooses Alpha and she chose Beta, she gets -1. 0 is bigger than -1.
If the pair chooses Beta, then if she chooses Alpha she gets 3, Beta she gets 1, and 3 is bigger than 1. So in both cases, no matter what the other person does, she receives a higher payoff from choosing Alpha, so she should choose Alpha. Does everyone follow that line of reasoning? That's a stronger line of reasoning than the reasoning we had earlier. So the woman in the red shirt, whose name I have immediately forgotten, whose name was-

Student: Courtney.

Professor Ben Polak: Courtney, so Courtney also gave a reason for choosing Alpha, and it was a perfectly good reason for choosing Alpha, nothing wrong with it, but notice that this reason's a stronger reason. It kind of implies your reason.

So let's get some definitions down here. I think I can fit it in here. Let's try and fit it in here.

Definition: We say that my strategy Alpha strictly dominates my strategy Beta, if my payoff from Alpha is strictly greater than that from Beta, [and this is the key part of the definition], regardless of what others do.

Shall we just read that back? "We say that my strategy Alpha strictly dominates my strategy Beta, if my payoff from Alpha is strictly greater than that from Beta, regardless of what others do." Now it's by no means my main aim in this class to teach you jargon. But a few bits of jargon are going to be helpful in allowing the conversation to move forward and this is certainly one. "Evil gits" is maybe one too, but this is certainly one.

Let's draw out some lessons from this. Actually, so you can still read that, let me bring down and clean this board. So the first lesson of the class, and there are going to be lots of lessons, is a lesson that emerges immediately from the definition of a dominated strategy and it's this. So Lesson One of the course is: do not play a strictly dominated strategy. So with apologies to Strunk and White, this is in the passive form, that's dominated, passive voice. Do not play a strictly dominated strategy. Why? Somebody want to tell me why? Do you want to get this guy? Stand up - yeah.

Student: Because everyone's going to pick the dominant outcome and then everyone's going to get the worst result - the collectively worst result.

Professor Ben Polak: Yeah, that's a possible answer. I'm looking for something more direct here. So we look at the definition of a strictly dominated strategy. I'm saying never play one. What's a possible reason for that? Let's - can we get the woman there?

Student: [inaudible]

Professor Ben Polak: "You'll always lose." Well, I don't know: it's not about winning and losing. What else could we have? Could we get this guy in the pink down here?

Student: Well, the payoffs are lower.

Professor Ben Polak: The payoffs are lower, okay. So here's an abbreviated version of that, I mean it's perhaps a little bit longer. The reason I don't want to play a strictly dominated strategy is, if instead I play the strategy that dominates it, I do better in every case. The reason I never want to play a strictly dominated strategy is, if instead I play the strategy that dominates it, whatever anyone else does I'm doing better than I would have done. Now that's a pretty convincing argument. That sounds like a convincing argument. It sounds like too obvious even to be worth stating in class, so let me now try and shake your faith a little bit in this answer.

Chapter 6. Contracts and Collusion [00:29:33]

You're somebody who's wanted by the FBI, right? Okay, so how about the following argument? Look at the payoff matrix again and suppose I reason as follows. Suppose I reason and say if we, me and my pair, both reason this way and choose Alpha, then we'll both get 0. But if we both reasoned a different way and chose Beta, then we'll both get 1. So I should choose Beta: 1 is bigger than 0, I should choose Beta. What's wrong with that argument? My argument must be wrong because it goes against the lesson of the class, and the lessons of the class are gospel, right, they're not wrong ever, so what's wrong with that argument? Yes, Ale - yeah, good.
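Clara Elise's case-by-case comparison is exactly the strict-dominance check from the definition, and it can be verified mechanically. A minimal sketch in Python using the "evil git" payoffs from the board (A=3, B+=1, B-=0, C=-1); the function name is my own:

```python
# My payoffs under the "evil git" preferences: u[(me, pair)] is my utility.
u = {
    ("Alpha", "Alpha"): 0,  ("Alpha", "Beta"): 3,
    ("Beta",  "Alpha"): -1, ("Beta",  "Beta"): 1,
}

def strictly_dominates(s, t, payoffs, opponent=("Alpha", "Beta")):
    """True if strategy s gives a strictly higher payoff than t
    regardless of what the opponent does (the lecture's definition)."""
    return all(payoffs[(s, o)] > payoffs[(t, o)] for o in opponent)

print(strictly_dominates("Alpha", "Beta", u))  # True: Beta is strictly dominated
```

Note the tension the last argument in the transcript exploits: Alpha strictly dominates Beta for each player, yet (Beta, Beta) gives both players more than (Alpha, Alpha) - a prisoners'-dilemma-style structure.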


Lecture XVII: Dynamic Games with Incomplete Information

Markus M. Möbius
May 6, 2004

Readings:
• Gibbons, sections 4.1 and 4.2
• Osborne, chapter 10

1 Introduction

In the last two lectures I introduced the idea of incomplete information. We analyzed some important simultaneous-move games such as sealed-bid auctions and public goods.

In practice, almost all of the interesting models with incomplete information are dynamic games also. Before we talk about these games we'll need a new solution concept called Perfect Bayesian Equilibrium. Intuitively, PBE is to extensive form games with incomplete information what SPE is to extensive form games with complete information. The concept we did last time, BNE, is simply the familiar Nash equilibrium under the Harsanyi representation of incomplete information. In principle, we could use the Harsanyi representation and SPE in dynamic games of incomplete information. However, dynamic games with incomplete information typically don't have enough subgames to do SPE. Therefore, many 'non-credible' threats are possible again and we get too many unreasonable SPEs. PBE allows subgame reasoning at information sets which are not single nodes, whereas SPE only applies at single-node information sets of players (because only those can be part of a proper subgame). The following example illustrates some problems with SPE.

1.1 Example I - SPE

[game tree figures omitted in this transcription]

Its unique SPE is (R, B). The next game looks formally the same - however, SPE is the same as NE. The old SPE survives - (pR + (1−p)RR, B) for all p is an SPE. But there are suddenly strange SPE such as (L, qA + (1−q)B) for q ≥ 1/2. Player 2's strategy looks like a non-credible threat again - but our notion of SPE can't rule it out! Remember: SPE can fail to rule out actions which are not optimal given any 'beliefs' about uncertainty.

Remark 1. This problem becomes severe with incomplete information: moves of Nature are not observed by one or both players. Hence the resulting extensive form game will have no or few subgames. This and the above example illustrate the need to replace the concept of a 'subgame' with the concept of a 'continuation game'.

1.2 Example II: Spence's Job-Market Signalling

The most famous example of a dynamic game with incomplete information is Spence's signalling game. There are two players - a firm and a worker. The worker has some private information about his ability and has the option of acquiring some education. Education is always costly, but less so for more able workers. However, education does not improve the worker's productivity at all!

In Spence's model education merely serves as a signal to firms. His model allows equilibria where able workers will acquire education and less able workers won't. Hence firms will pay high wages only to those who acquired education - however, they do this because education has revealed the type of the player rather than improved his productivity. Clearly this is an extreme assumption - in reality education presumably has dual roles: there is some signalling and some productivity enhancement. But it is an intriguing insight that education might be nothing more than a costly signal which allows more able workers to differentiate themselves from less able ones.

Let's look at the formal set-up of the game:

• Stage 0: Nature chooses the ability θ of a worker. Suppose Θ = {2, 3} and that Prob(θ = 2) = p and Prob(θ = 3) = 1 − p.
• Stage I: Player 1 (the worker) observes his type and chooses education level e ∈ {0, 1}. Education has cost ce/θ. Note that higher-ability workers have lower cost and that getting no education is costless.
• Stage II: Player 2 (the competitive labor market) chooses the wage rate w(e) of workers after observing the education level.

Suppose that u1(e, w; θ) = w − ce/θ and that u2(e, w; θ) = −(w − θ)^2. Note that education does not enter the firm's utility function. Also note that the BR of the firm is to set wages equal to the expected ability of the worker under this utility function. This is exactly what the competitive labor market would do if θ is equal to the productivity of a worker (the dollar amount of output he produces). If a firm pays above the expected productivity it will run a loss, and if it pays below, some other firm would come in and offer more to the worker. So the market should offer exactly the expected productivity. The particular (rather strange-looking) utility function we have chosen implements the market outcome with a single firm - it's a simple shortcut.

Spence's game is a signalling game. Each signalling game has the same three-part structure: nature chooses types; the sender (worker) observes his type and takes an action; the receiver (firm) sees that action but not the worker's type. Hence the firm tries to deduce the worker's type using his action. His action therefore serves as a signal. Spence's game is extreme because the signal (education) has no value to the firm except for its signalling function. This is not the case for all signalling models: think of a car manufacturer who can be of low or high quality and wants to send a signal to the consumer that he is a high-quality producer. He can offer a short or long warranty for the car. The extended warranty will not only signal his type but also benefit the consumer.

The (Harsanyi) extensive form representation of Spence's game (and any other signalling game) [figure omitted in this transcription]

1.3 Why does the SPE concept together with the Harsanyi representation not work?

We could find the set of SPE in the Harsanyi representation of the game. The problem is that the game has no proper subgame in the second round when the firm makes its decision. Therefore, the firm can make unreasonable threats such as the following: both workers buy education and the firm pays the educated worker w = 3 − p (his expected productivity), and the uneducated worker gets w = −235.11 (or something else). Clearly, every worker would get education, and the firm plays a BR to a worker getting education (check for yourself using the Harsanyi representation). However, the threat of paying a negative wage is unreasonable. Once the firm sees a worker who has no education it should realize that the worker has at least ability level 2, and the worker should therefore at least get a wage of 2.
w =2.2Perfect Bayesian EquilibriumLet G be a multistage game with incomplete information and observed ac-tions in the Harsanyi representation.Write Θi for the set of possible types for player i and H i to be the set of possible information sets of player i .For each information set h i ∈H i denote the set of nodes in the information set with X i (h i )and X H i X i (h i ).A strategy in G is a function s i :H i →∆(A i ).Beliefs are a function µi :H i →∆(X i )such that the support of belief µi (h i )is within X i (h i ).Definition 1A PBE is a strategy profile s ∗together with a belief system µsuch that1.At every information set strategies are optimal given beliefs and oppo-nents’strategies (sequential rationality ).σ∗i (h )maximizes E µi (x |h i )u i σi ,σ∗−i |h,θi ,θ−i2.Beliefs are always updated according to Bayes rule when applicable .The first requirement replaces subgame perfection.The second requirement makes sure that beliefs are derived in a rational manner -assuming that you know the other players’strategies you try to derive as many beliefs as5possible.Branches of the game tree which are reached with zero probability cannot be derived using Bayes rule:here you can choose arbitrary beliefs. 
However,the precise specification will typically matter for deciding whether an equilibrium is PBE or not.Remark2In the case of complete information and observed actions PBE reduces to SPE because beliefs are trivial:each information set is a singleton and the belief you attach to being there(given that you are in the correspond-ing information set)is simply1.2.1What’s Bayes Rule?There is a close connection between agent’s actions and their beliefs.Think of job signalling game.We have to specify the beliefs of thefirm in the second stage when it does not know for sure the current node,but only the information set.Let’s go through various strategies of the worker:•The high ability worker gets education and the low ability worker does not:e(θ=2)=0and e(θ=3)=1.In this case my beliefs at the information set e=1should be P rob(High|e=1)=1and similarly, P rob(High|e=0)=0.•Both workers get education.In this case,we should have:P rob(High|e=1)=1−p(1) The beliefs after observing e=0cannot be determined by Bayes rule because it’s a probability zero event-we should never see it if players follow their actions.This means that we can choose beliefs freely at this information set.•The high ability worker gets education and the low ability worker gets education with probability q.This case is less trivial.What’s the prob-ability of seeing worker get education-it’s1−p+pq.What’s the probability of a worker being high ability and getting education?It’s 1−p.Hence the probability that the worker is high ability after wehave observed him getting education is1−p1−p+pq .This is the non-trivialpart of Bayes rule.6Formally,we can derive the beliefs at some information set h i of player i as follows.There is a probability p(θj)that the other player is of typeθj. 
These probabilities are determined by nature.Player j(i.e.the worker)has taken some action a j such that the information set h i was reached.Eachtype of player j takes action a j with some probabilityσ∗j (a j|θj)according tohis equilibrium strategy.Applying Bayes rule we can then derive the belief of player i that player j has typeθj at information set h i:µi(θj|a j)=p(θj)σ∗j(a j|θj)˜θj∈Θjp˜θjσ∗ja j|˜θj(2)1.In the job signalling game with separating beliefs Bayes rule gives usexactly what we expect-we belief that a worker who gets education is high type.2.In the pooling case Bayes rule gives us P rob(High|e=1)=1−pp×1+(1−p)×1=1−p.Note,that Bayes rule does NOT apply forfinding the beliefs after observing e=0because the denominator is zero.3.In the semi-pooling case we get P rob(High|e=1)=(1−p)×1p×q+(1−p)×1.Sim-ilarly,P rob(High|e=0)=(1−p)×0p×(1−q)+(1−p)×0=0.3Signalling Games and PBEIt turns out that signalling games are a very important class of dynamic games with incomplete information in applications.Because the PBE con-cept is much easier to state for the signalling game environment we define it once again in this section for signalling games.You should convince yourself that the more general definition from the previous section reduces to the definition below in the case of signalling games.3.1Definitions and ExamplesEvery signalling game has a sender,a receiver and two periods.The sender has private information about his type and can take an action in thefirst action.The receiver observes the action(signal)but not the type of the sender,and takes his action in return.7•Stage0:Nature chooses the typeθ1∈Θ1of player1from probability distribution p.•Stage1:Player1observesθ1and chooses a1∈A1.•Stage2:Player2observes a1and chooses a2∈A2.The players utilities are:u1=u1(a1,a2;θ1)(3)u2=u2(a1,a2;θ1)(4) 3.1.1Example1:Spence’s Job Signalling Game•worker is sender;firm is receiver•θis the ability of the worker(private information to him)•A1={educ,no educ}•A2={wage 
rate}3.1.2Example2:Initial Public Offering•player1-owner of privatefirm•player2-potential investor•Θ-future profitability•A1-fraction of company retained•A2-price paid by investor for stock3.1.3Example3:Monetary Policy•player1=FED•player2-firms•Θ-Fed’s preference for inflation/unemployment•A1-first period inflation•A2-expectation of second period inflation83.1.4Example4:Pretrial Negotiation•player1-defendant•player2-plaintiff•Θ-extent of defendant’s negligence•A1-settlement offer•A2-accept/reject3.2PBE in Signalling GamesA PBE in the signalling game is a strategy profile(s∗1(θ1),s∗2(a1))togetherwith beliefsµ2(θ1|a1)for player2such that1.Players strategies are optimal given their beliefs and the opponents’strategies,i.e.s∗1(θ1)maximizes u1(a1,s∗2(a1);θ1)for allθ1∈Θ1(5)s∗2(a1)maximizesθ1∈Θ1u2(a1,a2;θ1)µ2(θ1|a1)for all a1∈A1(6)2.Player2’s beliefs are compatible with Bayes’rule.If any type of player1plays a1with positive probability thenµ2(θ1|a1)=p(θ1)P rob(s∗1(θ1)=a1)θ1∈Θ1p(θ1)P rob(s∗1(θ1)=a1)for allθ1∈Θ13.3Types of PBE in Signalling GamesTo help solve for PBE’s it helps to think of all PBE’s as taking one of the following three forms”1.Separating-different types take different actions and player2learnstype from observing the action2.Pooling-all types of player1take same action;no info revealed3.Semi-Separating-one or more types mixes;partial learning(oftenonly type of equilibrium9Remark3In the second stage of the education game the”market”must have an expectation that player1is typeθ=2and attach probabilityµ(2|a1) to the player being type2.The wage in the second period must be between 2and3.This rules out the unreasonable threat of the NE I gave you in the education game(with negative wages).1Remark4In the education game suppose the equilibrium strategies ares∗1(θ=2)=0and s∗1(θ=3)=1,i.e.only high types get education.Thenfor any prior(p,1−p)at the start of the game beliefs must be:µ2(θ=2|e=0)=1µ2(θ=3|e=0)=0µ2(θ=2|e=1)=0µ2(θ=3|e=1)=1If player1’s strategy is 
s∗1(θ=2)=12×0+12×1and s∗1(θ=3)=1:µ2(θ=2|e=0)=1µ2(θ=3|e=0)=0µ2(θ=2|e=1)=p2p+1−p=p2−pµ2(θ=3|e=1)=2−2p 2−pAlso note,that Bayes rule does NOT apply after an actions which should notoccur in equilibrium.Suppose s∗1(θ=2)=s∗1(θ=3)=1then it’s OK toassumeµ2(θ=2|e=0)=57 64µ2(θ=3|e=0)=7 64µ2(θ=2|e=1)=pµ2(θ=3|e=1)=1−pThefirst pair is arbitrary.1It also rules out unreasonable SPE in the example SPE I which I have initially.Under any beliefs player2should strictly prefer B.10Remark5Examples SPE II and SPE III from the introduction now make sense-if players update according to Bayes rule we get the’reasonable’beliefs of players of being with equal probability in one of the two nodes.4Solving the Job Signalling GameFinally,after11tough pages we can solve our signalling game.The solution depends mainly on the cost parameter c.4.1Intermediate Costs2≤c≤3A separating equilibrium of the model is when only the able worker buys education and thefirm pays wage2to the worker without education and wage3to the worker with education.Thefirm beliefs that the worker is able iffhe gets educated.•The beliefs are consistent with the equilibrium strategy profile.•Now look at optimality.Player2sets the wage to the expected wage so he is maximizing.•Player1of typeθ=2gets3−c2≤2for a1=1and2for a1=0.Hence he should not buy education.•Player1of typeθ=3gets3−c3≥2for a1=1and2for a1=0.Hence he should get educated.1.Note that for too small or too big costs there is no separating PBE.2.There is no separating PBE where theθ=2type gets an educationand theθ=3type does not.4.2Small Costs c≤1A pooling equilibrium of the model is that both workers buy education and that thefirm pays wage w=3−p if it observes education,and wage2 otherwise.Thefirm believes that the worker is able with probability1−p if it observes education,and that the worker is of low ability if it observes no education.11•The beliefs are consistent with Bayes rule for e=1.If e=0has been observed Bayes rule does not apply because e=0should never 
occur -hence any belief isfine.The belief that the worker is low type if he does not get education makes sure that the worker gets punished for not getting educated.•Thefirm pays expected wage-hence it’s optimal response.The lowability guy won’t deviate as long2.5−c2≥2and the high ability typewon’t deviate as long as2.5−c3≥2.For c≤1both conditions aretrue.1.While this pooling equilibrium only works for small c there is alwaysanother pooling equilibrium where no worker gets education and the firms thinks that any worker who gets education is of the low type. 4.31<c<2Assume that p=12for this section.In the parameter range1<c<2there isa semiseparating PBE of the model.The high ability worker buys education and the low ability worker buys education with positive probability q.Thewage is w=2if thefirm observes no education and set to w=2+11+q ifit observes education.The beliefs that the worker is high type is zero if he gets no education and11+qif he does.•Beliefs are consistent(check!).•Firm plays BR.•Player1of low type won’t deviate as long as2+11+q −c2≤2.•Player1of high type won’t deviate as long as2+11+q −c3≥2.Set1+q=2c .It’s easy to check that thefirst condition is binding andthe second condition is strictly true.So we are done if we choose q=2c −1.Note,that as c→2we get back the separating equilibrium and as c→1we get the pooling one.12。

哈佛博弈课

Reading notes
Content summary

This is a business book based on the Harvard game-theory course. It breaks game theory down layer by layer, illustrates it with examples from everyday life and current social issues, and explains the games we are likely to encounter so that readers can put them to use. Reading it helps you "know yourself and your opponent": understand what the other side is thinking, what they know, what motivates them, and even how they see you. It helps you "anticipate": choose strategies that reflect not only your own intentions but also how the other participants will act. And it helps you "influence the other side": reshape their thinking, stay ready to attack or retreat, decide quickly, guide and harness others' emotions, and steer the outcome toward your own victory.
Table of contents (as captured; the order was scrambled in the original):

- Lesson 1: A Different Kind of Game - strategic interactive decisions; the elements of a game; what counts as rational; games are everywhere
- Lesson 2: This Is a Game - tying your own hands; the strategies of the ant and the lion; contests between the whole and a coalition; the high-salary problem plaguing the NBA; non-cooperative choices
- Lesson 3: What Games Bring Us - reaching equilibrium; zero-sum, non-zero-sum, negative-sum and positive-sum games; repeated versus one-shot games
- Lesson 5: The Centipede Game - the evolution of toughness and mildness; pairing threats with commitments; will win-win last?; finding your own position and direction; free riding and why it is everywhere
- Lesson 11: You Are Not You
- Lesson 12: Optimal Payoffs
- Lesson 13: Symmetry of Information
- Lesson 14: Don't Be This Kind of Fool
- Lesson 15: What Do You Really Want?
- Lesson 16: The Art of Communication

section1 (Game Theory lecture notes, Harvard University)
The decision to help in a dispute depends on one’s ability to influence the outcome, one’s level of motivation, and the costliness of getting involved
Simplifies to: (P_BA + P_BC − 1)(U_BA − U_BC) = K_BA − K_BC, which simplifies further to
[b/(a + b + c)](U_BA − U_BC) = K_BA − K_BC
Translation: Validity
• What is the point at which B is indifferent? [b/(a + b + c)](U_BA − U_BC) = K_BA − K_BC
• b/(a + b + c): resources B can contribute
• U_BA − U_BC: B's motivation for A vs. C
• K_BA − K_BC: B's costs for A vs. C
Analysis: Whither Alliances?
• Adopt the perspective of a player: B
• B's utility from an alliance with A = P_BA · U_BA + (1 − P_BA) · U_BC − K_BA   (1)
• B's utility from an alliance with C = P_BC · U_BC + (1 − P_BC) · U_BA − K_BC   (2)
• What if equation (1) > equation (2)?
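Equations (1) and (2) are simple expected-utility comparisons, so they are easy to evaluate directly. The sketch below is illustrative only: the probabilities, utilities and costs are hypothetical numbers chosen to show the comparison, not values from the slides.

```python
def utility_of_alliance(p_win, u_win, u_lose, cost):
    # p_win: probability B's side prevails given B joins; cost: B's cost of helping
    return p_win * u_win + (1 - p_win) * u_lose - cost

# Hypothetical parameters for player B (assumed, not from the slides)
P_BA, P_BC = 0.7, 0.5      # winning probabilities when allied with A vs. with C
U_BA, U_BC = 10.0, 4.0     # B's utility if A (resp. C) prevails
K_BA, K_BC = 2.0, 1.0      # B's costs of helping A (resp. C)

u_with_A = utility_of_alliance(P_BA, U_BA, U_BC, K_BA)  # equation (1)
u_with_C = utility_of_alliance(P_BC, U_BC, U_BA, K_BC)  # equation (2)

# B prefers the alliance with A iff equation (1) exceeds equation (2)
print(u_with_A, u_with_C, u_with_A > u_with_C)
```

Setting equation (1) equal to equation (2) and rearranging reproduces the indifference condition on the slide.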

section10 (Game Theory lecture notes, Harvard University)

Principal-Agent Models & Incentive Compatibility / Participation Constraints

Suppose Pat manages a large computer software company and Allen is a talented program designer. Pat would like to hire Allen to develop a new software package. If Allen works for Pat, Allen must choose whether to expend high effort or low effort on the job. At the end of the work cycle, Pat will learn whether the project is successful or not. A successful project yields revenue of 6 for the firm, whereas the revenue is 2 if the project is unsuccessful. Success depends on Allen's high effort as well as on a random factor. Specifically, if Allen expends high effort, then success will be achieved with probability 1/2; if Allen expends low effort, then the project is unsuccessful for sure. Assume that, to expend high effort, Allen must endure a personal cost of 1. The parties can write a contract that specifies compensation for Allen conditional on whether the project is successful (but it cannot be conditioned directly on Allen's effort). Assume that Allen values money according to the utility function v_A(x) = x^a, with 0 < a < 1. Finally, suppose Pat's utility function is v_P(x) = x.

Imagine that the parties interact as depicted above. At the beginning of the game, Pat offers Allen a wage and bonus package. The wage w is to be paid regardless of the project outcome, whereas the bonus b would be paid only if the project is successful. Then Allen decides whether or not to accept the contract. If he declines (N) the game ends, with Pat getting 0 and Allen obtaining utility of 1 (corresponding to his outside opportunities). If Allen accepts the contract (Y), then he decides whether to exert high (H) or low (L) effort. Low effort leads to an unsuccessful project, whereby Pat gets revenue of 2 minus the wage w and Allen gets his utility of the wage, w^a. High effort leads to a chance node, where nature picks whether the project is successful (with probability 1/2) or not (probability 1/2). An unsuccessful project implies the same payoffs as with low effort, except that, in this case, Allen also pays his effort cost of 1. A successful project raises Pat's revenue to 6 and triggers the bonus b paid to Allen in addition to the wage. Calculating the expected payoffs from the chance node, we can rewrite the game as shown above.

To solve the game and learn about the interaction of risk and incentives, we use backward induction. Start by observing that Pat would certainly like Allen to exert high effort. In fact, it is inefficient for Allen to expend low effort, because Pat can compensate Allen for exerting high effort by increasing his pay by 1 (offsetting Allen's effort cost). By doing so, Pat's expected revenue increases by 2. Another way of looking at this is that if Allen's effort were verifiable, the parties would write a contract that induces high effort. Given that high effort is desired, we must ask whether there is a contract that induces it. Can Pat find a wage and bonus package that motivates Allen to exert high effort and gives Pat a large payoff?

What can be accomplished without a bonus? b = 0 implies Allen has no incentive to exert high effort; he gets w^a when he chooses L and less, (w − 1)^a, when he chooses H. The best that Pat can do without a bonus is to offer w = 1, which Allen is willing to accept (he will accept no less, given his outside option). Thus, the best no-bonus contract (w = 1 and b = 0) yields the payoff vector (1, 1).

Next, consider a bonus contract designed to induce high effort. In order for him to be motivated to exert high effort, Allen's expected payoff from H must be at least as great as his payoff from L:

(1/2)(w + b − 1)^a + (1/2)(w − 1)^a ≥ w^a.

In principal-agent models, this kind of inequality is usually called the effort constraint or incentive compatibility constraint. In addition to the effort constraint, the contract must give Allen an expected payoff at least as great as the value of his outside opportunity:

(1/2)(w + b − 1)^a + (1/2)(w − 1)^a ≥ 1.

Theorists call this the participation constraint. Assuming she wants to motivate high effort, Pat will offer Allen the bonus contract satisfying the two inequalities above.
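The two constraints can be checked numerically for any candidate contract. The sketch below is not from the original notes; it fixes the risk-aversion parameter at a = 1/2 as an illustrative assumption and tests two contracts, the best no-bonus contract (w = 1, b = 0) and a bonus contract (w = 1, b = 6) chosen only to show that a large enough bonus satisfies both inequalities.

```python
def v_A(x, a=0.5):
    return x ** a                # Allen's utility of money, v_A(x) = x^a

def allen_high(w, b, a=0.5):
    # expected payoff from accepting the contract and choosing high effort
    return 0.5 * v_A(w + b - 1, a) + 0.5 * v_A(w - 1, a)

def allen_low(w, a=0.5):
    # payoff from accepting and choosing low effort (no effort cost)
    return v_A(w, a)

def contract_ok(w, b, a=0.5):
    ic = allen_high(w, b, a) >= allen_low(w, a)   # incentive compatibility
    pc = allen_high(w, b, a) >= 1                 # participation constraint
    return ic and pc

print(contract_ok(1, 0))   # the no-bonus contract cannot induce high effort
print(contract_ok(1, 6))   # a sufficiently large bonus satisfies both constraints
```

This mirrors the backward-induction logic: Pat searches over (w, b) pairs for which `contract_ok` holds and then picks the cheapest one.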

section9 (Game Theory lecture notes, Harvard University)

• Beliefs mean something new here!
– Probability distribution over location in the information set.
• Beliefs at information sets off the equilibrium path may be paired with any probability distribution, because Bayes' rule does not pin them down (these sets are reached with probability zero). • Two major types of equilibria:
• In Nash equilibrium, the credibility of best responses judged along equilibrium path; off the path, NE may include incredible strategies • In subgame perfect NE, best replies are judged in every subgame • In perfect Bayesian equilibrium, best replies judged at each information set
– Here, all strategies are credible
• Think of the pirate game: (99, 0, 1) is the SPE outcome, but there are many non-credible NE of the form (100 − x, 0, x)
• Grim trigger strategies are supported as subgame perfect equilibria

博弈论 (Game Theory, original Harvard course materials)

Lecture XIII: Repeated Games
Markus M. Möbius
April 19, 2004

• Gibbons, chapters 2.3.B, 2.3.C
• Osborne, chapter 14
• Osborne and Rubinstein, sections 8.3-8.5

1 Introduction

So far one might get a somewhat misleading impression about SPE. When we first introduced dynamic games we noted that they often have a large number of (unreasonable) Nash equilibria. In the models we've looked at so far SPE has 'solved' this problem and given us a unique NE. In fact, this is not really the norm. We'll see today that many dynamic games still have a very large number of SPEs.

2 Credible Threats

We introduced SPE to rule out non-credible threats. In many finite-horizon games, though, credible threats are common and cause a multiplicity of SPEs. Consider the following game: [the two-period stage game and its three Nash equilibria are shown in a figure omitted in this extraction; players observe the first-period] actions before choosing the second-period actions. Now one way to get an SPE is to play any of the three profiles above followed by another of them (or the same one). We can also, however, use credible threats to get other actions played in period 1, such as: play (B, R) in period 1; if player 1 plays B in period 1, play (T, L) in period 2 - otherwise play (M, C) in period 2. It is easy to see that no single-period deviation helps here. In period 2 a NE is played, so deviating obviously doesn't help. In period 1 player 1 gets 4 + 3 if he follows the strategy and at most 5 + 1 if he doesn't. Player 2 gets 4 + 1 if he follows and at most 2 + 1 if he doesn't. Therefore switching to the (M, C) equilibrium in period 2 is a credible threat.

3 Repeated Prisoner's Dilemma

Note that the PD doesn't have multiple NE, so in a finite horizon we don't have the same easy threats to use. Therefore, the finitely repeated PD has a unique SPE in which every player defects in each period. In infinite-horizon games other types of threats are credible.

Proposition 1 In the infinitely repeated PD with δ ≥ 1/2 there exists an SPE in which the outcome is that both players cooperate in every period.

Proof: Consider the following symmetric profile:

s_i(h_t) = C if both players have played C in every period along the path leading to h_t; D if either player has ever played D.

To see that there is no profitable single deviation, note that at any h_t such that s_i(h_t) = D player i gets 0 + δ·0 + δ²·0 + ... if he follows his strategy and −1 + δ·0 + δ²·0 + ... if he plays C instead and then follows s_i. At any h_t such that s_i(h_t) = C player i gets 1 + δ + δ² + ... = 1/(1 − δ) if he follows his strategy and 2 + δ·0 + δ²·0 + ... = 2 if he plays D instead and then follows s_i. Neither of these deviations is worthwhile if δ ≥ 1/2. QED

Remark 1 While people sometimes tend to think of this as showing that people will cooperate if they interact repeatedly, it does not show this. All it shows is that there is one SPE in which they do. The correct moral to draw is that there are many possible outcomes.

3.1 Other SPEs of the repeated PD

1. For any δ it is an SPE to play D every period.
2. For δ ≥ 1/2 there is an SPE where the players play D in the first period and then C in all future periods.
3. For δ > 1/√2 there is an SPE where the players play D in every even period and C in every odd period.
4. For δ ≥ 1/2 there is an SPE where the players play (C, D) in every even period and (D, C) in every odd period.

3.2 Recipe for Checking for SPE

Whenever you are supposed to check that a strategy profile is an SPE, you should first try to classify all histories (i.e. all information sets) on and off the equilibrium path. Then you have to apply the single-period deviation principle (SPDP) to each class separately.

Assume you want to check that cooperation with grim-trigger punishment is an SPE. There are two types of histories you have to check. Along the equilibrium path there is just one history: everybody has cooperated so far. Off the equilibrium path there is again only one class: some player has defected.

Assume you want to check that cooperating in even periods and defecting in odd periods, plus grim-trigger punishment in case of deviation by any player from the above pattern, is an SPE. There are three types of histories: even and odd periods along the equilibrium path, and off-the-equilibrium-path histories.

Assume you want to check that TFT ('Tit for Tat') is an SPE (which it isn't - see next lecture). Then you have to check four histories: only the play of both players in the last period matters for future play, so there are four relevant histories, such as: players 1 and 2 both cooperated in the last period, player 1 defected and player 2 cooperated, etc.

Sometimes the following result comes in handy.

Lemma 1 If players play Nash equilibria of the stage game in each period in such a way that the particular equilibrium being played in a period is a function of time only and does not depend on previous play, then this strategy profile is a Nash equilibrium.

The proof is immediate: we check the SPDP. Assume that there is a profitable deviation. Such a deviation will not affect future play by assumption: if the stage game has two NE, for example, and NE1 is played in even periods and NE2 in odd periods, then a deviation will not affect future play.¹ Therefore, the deviation has to be profitable in the current stage game - but since a NE is being played no such profitable deviation can exist.

Corollary 1 A strategy which has players play the same NE in each period is always an SPE. In particular, the grim-trigger strategy is an SPE if the punishment strategy in each stage game is a NE (as is the case in the PD).

¹If a deviation triggered a switch to the other NE this statement would no longer be true.

4 Folk Theorem

The examples in 3.1 suggest that the repeated PD has a tremendous number of equilibria when δ is large. Essentially, this means that game theory tells us we can't really tell what is going to happen. This turns out to be an accurate description of most infinitely repeated games.

Let G be a simultaneous-move game with action sets A_1, A_2, ..., A_I, mixed strategy spaces Σ_1, Σ_2, ..., Σ_I and payoff functions ũ_i.

Definition 1 A payoff vector v = (v_1, v_2, ..., v_I) ∈ ℝ^I is feasible if there exist action profiles a^1, a^2, ..., a^k ∈ A and non-negative weights ω_1, ..., ω_k which sum to 1 such that

v_i = ω_1 ũ_i(a^1) + ω_2 ũ_i(a^2) + ... + ω_k ũ_i(a^k).

Definition 2 A payoff vector v is strictly individually rational if

v_i > v̲_i = min_{σ_{−i}∈Σ_{−i}} max_{σ_i∈Σ_i} u_i(σ_i, σ_{−i})   (1)

We can think of v̲_i as the lowest payoff a rational player could ever get in equilibrium if he anticipates his opponents' (possibly non-rational) play. Intuitively, the minmax payoff v̲_i is the payoff player i can guarantee to herself even if the other players try to punish her as badly as they can. The minmax payoff is a measure of the punishment the other players can inflict.

Theorem 1 (Folk Theorem) Suppose that the set of feasible payoffs of G is I-dimensional. Then for any feasible and strictly individually rational payoff vector v there exists δ̲ < 1 such that for all δ > δ̲ there exists an SPE s* of G^∞ whose average payoff is v, i.e. u_i(s*) = v_i/(1 − δ).

The normalized (or average) payoff is defined as P = (1 − δ)u_i(s*). It is the payoff which a stage game would have to generate in each period such that we are indifferent between that payoff stream and the one generated by s*: P + δP + δ²P + ... = u_i(s*).

4.1 Example: Prisoner's Dilemma

The feasible payoff set is the diamond bounded by (0,0), (2,−1), (−1,2) and (1,1). Every point inside can be generated as a convex combination of these payoff vectors. The minmax payoff for each player is 0, as you can easily check: the other player can at most punish his rival by defecting, and each player can secure 0 by defecting in return. Note that the equilibria I showed before generate payoffs inside this area.

4.2 Example: BoS

Consider the Battle of the Sexes [stage game figure omitted in this extraction]. Here each player can guarantee herself at least payoff 2/3, which is the payoff from playing the mixed-strategy Nash equilibrium. You can check that whenever player 2 mixes with different probabilities, player 1 can guarantee herself more than this payoff by playing either F or O all the time.

4.3 Idea behind the Proof

1. Have players on the equilibrium path play an action with payoff v (or alternate if necessary to generate this payoff).²
2. If some player deviates, punish him by having the other players choose σ_{−i} for T periods such that player i gets v̲_i.
3. After the end of the punishment phase, reward all players other than i for having carried out the punishment by switching to an action profile where each punisher j gets a small premium ε (in the notation of the notes, v^P_{ij} + ε).

²For example, in the BoS it is not possible to generate (3/2, 3/2) in the stage game even with mixing. However, if players alternate and play (O, F) and then (F, O) the players can get arbitrarily close for large δ.

博弈论-哈佛大学-第五节 (Game Theory, Harvard, Lecture 5)

[Slide figures: extensive-form entry games. The challenger moves at the empty history ø, choosing In or Out; after In, the incumbent chooses Acquiesce or Fight. Variants add a Spend/Save move for the incumbent and a Hit back/Back out move for the challenger. Payoff vectors such as (1, 2) label the terminal histories.]

Extensive Games w/ Perfect Info
• In a strategic game, NE rationale is that in steady state, each player’s experience playing the game leads her belief about the other players’ actions to be correct.
– The strategic setting – Two-round, 3 candidate closed rule game – Two-round, n-candidate closed rule game – Infinite-round, 3 candidate closed rule game – Open-rule games – Endogenizing the rules
• Terminal histories: (In, Acquiesce), (In, Fight), and (Out)
• Player function: P(ø) = challenger, P(In) = incumbent
• Preferences for the Players:
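The entry game defined by these terminal histories and player function can be solved by backward induction. Since the payoff numbers sit on the figure and are only partially recoverable here, the sketch below uses standard illustrative payoffs - (In, Acquiesce) → (2, 1), (In, Fight) → (0, 0), Out → (1, 2), challenger's payoff first - which are an assumption, not a transcription of the slide.

```python
# Assumed payoffs for the entry game (challenger, incumbent)
payoffs = {
    ("In", "Acquiesce"): (2, 1),
    ("In", "Fight"): (0, 0),
    ("Out",): (1, 2),
}

# Backward induction: the incumbent moves last, so after In it picks the
# reply that maximizes its own (second) payoff.
best_reply = max(["Acquiesce", "Fight"], key=lambda r: payoffs[("In", r)][1])

# The challenger anticipates that reply and compares In against Out.
in_value = payoffs[("In", best_reply)][0]
out_value = payoffs[("Out",)][0]
challenger_choice = "In" if in_value > out_value else "Out"

print(challenger_choice, best_reply)
```

With these payoffs the threat to Fight is not credible, so the backward-induction outcome is (In, Acquiesce), even though (Out, Fight) is also a Nash equilibrium of the strategic form.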

section3 (Game Theory lecture notes, Harvard University)

Modus Operandi
Are there systems which (1) encourage honest information-sharing, and (2) use that information in a reasonable way?
• First: Define the setting
Voter preferences (first, second, third choice):

Rank | X | Y | Z
1    | B | B | C
2    | A | C | B
3    | C | A | A
Is social order in Condorcet world going to match the social order in the Alternative world?
B vs. C? A vs. B? A vs. C?
Arrow’s Theorem: Step II
• Positive responsiveness
– Departures from a tie must break a tie
So, what is still admissible?
May’s Theorem: Step III
May's Theorem: A social welfare functional F(p1, p2, ..., pN) corresponds to majority voting iff it satisfies: (1) symmetry among agents; (2) neutrality among alternatives; and (3) positive responsiveness
– Only my A vs. B is relevant to society's A vs. B – For its A vs. B, society only needs my A vs. B
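The pairwise comparisons asked on the earlier slide (B vs. C? A vs. B? A vs. C?) can be computed mechanically from the three-voter profile. The sketch below is illustrative, using that profile (X: B > A > C, Y: B > C > A, Z: C > B > A) and simple majority rule.

```python
from itertools import combinations

# Each voter's ranking, best first (profile from the slide)
profile = {"X": ["B", "A", "C"], "Y": ["B", "C", "A"], "Z": ["C", "B", "A"]}

def majority_prefers(x, y):
    # x beats y if a strict majority of voters rank x above y
    votes_x = sum(1 for r in profile.values() if r.index(x) < r.index(y))
    return votes_x > len(profile) / 2

for x, y in combinations("ABC", 2):
    winner = x if majority_prefers(x, y) else y
    print(f"{x} vs {y}: majority prefers {winner}")
```

For this particular profile the majority relation is transitive (B beats both A and C, and C beats A), so B is a Condorcet winner; Arrow-style cycles arise only for other profiles.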

博弈论 (Game Theory, original Harvard course materials)
The probability that player i wins the auction when bidding b is

Prob(f_j(v_j) ≤ b) = Prob(v_j ≤ f_j^{-1}(b)) = F(f_j^{-1}(b)) = f_j^{-1}(b)   (1)

The last equality follows because F is the uniform distribution. The expected utility from bidding b is therefore:
In all our auctions there are n participants; each participant has a valuation v_i and submits a bid b_i (his action).
The rules of the auction determine the probability q_i(b_1, ..., b_n) that agent i wins the auction and the expected price p_i(b_1, ..., b_n) which he pays. His utility is simply u_i = q_i v_i − p_i.
Using (1), bidding as if one's type were v̂ gives expected utility (v_i − f(v̂)) v̂. The first-order condition, evaluated at the truthful report v̂ = v_i, is

(v_i − f(v_i)) − v_i f′(v_i) = 0   (4)

This is a differential equation and can be rewritten as follows:

v_i = v_i f′(v_i) + f(v_i)   (5)

The right-hand side is the derivative of v_i f(v_i), so we can integrate both sides and get:

(1/2) v_i² = v_i f(v_i),

hence f(v_i) = v_i/2: in the symmetric equilibrium each bidder bids half his valuation.
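The equilibrium bid f(v) = v/2 can be sanity-checked by simulation. The sketch below (not part of the original notes) assumes the two-bidder case with valuations uniform on [0, 1]: the opponent bids half his valuation, and we compare a grid of own bids for a bidder with valuation v = 0.8.

```python
import random

def expected_utility(v, b, n_draws=200_000, rng=random.Random(0)):
    # opponent draws v_j ~ U[0,1] and bids f(v_j) = v_j / 2; we win if b exceeds it
    wins = sum(1 for _ in range(n_draws) if rng.random() / 2 < b)
    return (v - b) * wins / n_draws

v = 0.8
candidates = [0.2, 0.3, 0.4, 0.5, 0.6]
utils = {b: expected_utility(v, b) for b in candidates}
best = max(utils, key=utils.get)
print(best)   # the equilibrium bid v/2 = 0.4 should come out on top
```

Analytically, expected utility is (v − b) · 2b for b ≤ 1/2, which is maximized at b = v/2, matching the closed-form solution above.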

博弈论 (Game Theory, original Harvard course materials)

Lecture XII: Analysis of Infinite Games
Markus M. Möbius, April 7, 2004

• Gibbons, chapters 2.1.A, 2.1.B, 2.2.A
• Osborne, sections 14.1-14.4, 16
• Osborne and Rubinstein, sections 6.5, 8.1 and 8.2

1 Introduction - Critique of SPE

The SPE concept eliminates non-credible threats, but it is worth stepping back for a minute and asking whether we think SPE is reasonable, or whether in throwing out threats we have been overzealous. Practically, for this course the answer will be that the SPE restrictions are OK, and we will always use them in extensive form games. However, it is worth looking at situations where the concept has been criticized. Some of the worst anomalies disappear in the infinite horizon games which we study next.

1.1 Rationality off the Equilibrium Path

Is it reasonable to play NE off the equilibrium path? After all, if a player does not follow the equilibrium he is probably as stupid as a broomstick. Why should we trust him to play NE in the subgame? Consider the following game to illustrate that concern (the game tree did not survive transcription):

Here (L, A, x) is the unique SPE. However, player 2 has to put a lot of trust in player 1's rationality in order to play x. He must believe that player 1 is smart enough to figure out that A is a dominant strategy in the subgame following R. However, player 2 might have serious doubts about player 1's marbles after the guy has just foregone 5 utils by not playing L. (Footnote 1: After all, any strategy in which L is played strictly dominates any strategy in which R is played in the normal form.)

1.2 Multi-Stage Games

Lemma 1: The unique SPE of the finitely repeated Prisoner's Dilemma game, in which players get the sum of their payoffs from each stage game, has every player defecting at each information set.

The proof proceeds by analyzing the last stage game, where we would see defection for sure. But then we would see defection in the second-to-last stage game, and so on. In the same way we can show that the finitely repeated Bertrand game results in pricing at marginal cost all the time.

Remark 1: The main reason for the breakdown of cooperation in the finitely repeated Prisoner's Dilemma is not so much SPE by itself but the fact that there is a final period in which agents would certainly defect.

This raises the question whether an infinitely repeated PD game would allow us to cooperate. Essentially, we could cooperate as long as the other player does, and if there is a defection, we defect from then on. This still looks like an SPE: in any subgame in which I have defected before, I might as well defect forever. If I haven't defected yet, I would jeopardize cooperation by defecting, and therefore should not do it as long as I care about the future sufficiently.

These results should make us suspicious. Axelrod's experiments (see a future lecture) showed that in the finitely repeated Prisoner's Dilemma people tend to cooperate until the last few periods, when the 'endgame effect' kicks in. Similarly, there are indications that rival firms can learn to collude if they interact repeatedly and set prices above marginal cost.

This criticism of SPE is reminiscent of our criticism of IDSDS. In both cases we use an iterative procedure to find the equilibrium, and we might doubt whether real-world subjects are able (and inclined) to do this calculation. A related anomaly arises in games whose unique SPE is to drop out immediately (play L) at each stage; in experiments, however, people typically continue until almost the last period before they drop out.

2 Infinite Horizon Games

Of the criticisms of SPE, the one we will take most seriously is that long finite horizon models do not give reasonable answers. Recall that the problem was that the backward induction procedure tended to unravel 'reasonably' looking strategies from the end. It turns out that many of the anomalies go away once we model these games as infinite games, because there is no endgame to be played.

The prototypical model is what Fudenberg and Tirole call an infinite horizon multistage game with observed actions:

• At times t = 0, 1, 2, ... some subset of the set of players simultaneously chooses actions.
• All players observe the period-t actions before choosing period-(t+1) actions.
• Players' payoffs may be any function of the infinite sequence of actions (play no longer necessarily ends in terminal nodes).

2.1 Infinite Games with Discounting

Often we assume that player i's payoffs are of the form

    u_i(s_i, s_-i) = u_i0(s_i, s_-i) + δ u_i1(s_i, s_-i) + δ² u_i2(s_i, s_-i) + ...    (1)

where u_it(s_i, s_-i) is the payoff received at time t when the strategies are followed.

2.1.1 Interpretation of δ

1. Interest rate: δ = 1/(1+r). Having two dollars today or two dollars tomorrow makes a difference to you: your two dollars today are worth more, because you can take them to the bank and get 2(1+r) dollars tomorrow, where r is the interest rate. By discounting future payoffs with δ = 1/(1+r) we correct for the fact that future payoffs are worth less to us than present payoffs.

2. Probabilistic end of game: suppose the game is really finite, but the end of the game is not deterministic. Instead, given that stage t is reached, there is probability δ that the game continues and probability 1−δ that the game ends after this stage. Note that the expected number of periods is then 1/(1−δ), which is finite. However, we cannot apply backward induction directly, because we can never be sure that any round is the last one. The probabilistic interpretation is particularly attractive for interpreting bargaining games with many rounds.

2.2 Example I: Repeated Games

Let G be a simultaneous move game with finite action spaces A_1, ..., A_I. The infinitely repeated game G∞ is the game where in periods t = 0, 1, 2, ... the players simultaneously choose actions (a_1^t, ..., a_I^t) after observing all previous actions. We define payoffs in this game by

    u_i(s_i, s_-i) = Σ_{t=0}^∞ δ^t ũ_i(a_1^t, ..., a_I^t)    (2)

where (a_1^t, ..., a_I^t) is the action profile taken in period t when players follow strategies s_1, ..., s_I, and ũ_i are the utilities of players in each stage game.
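To make the discounted payoff formula (2) concrete, here is a small numerical check, with hypothetical stage payoffs of my own (not from the notes): mutual cooperation pays 2 per period, defecting against a cooperator pays 3 once, and mutual defection pays 1 per period. Against a grim-trigger opponent, cooperation is sustained exactly when δ ≥ 1/2:

```python
# Discounted repeated-game payoffs as in equation (2).
# Stage payoffs are hypothetical illustration values, not taken from the notes:
# both cooperate -> 2 each; defect on a cooperator -> 3 once; both defect -> 1.

def cooperate_forever(delta):
    # sum_{t>=0} delta^t * 2 = 2 / (1 - delta)
    return 2 / (1 - delta)

def defect_against_grim(delta):
    # one-shot gain of 3 today, then mutual defection (1 per period) forever
    return 3 + delta * 1 / (1 - delta)

# Grim trigger deters deviation iff 2/(1-d) >= 3 + d/(1-d), i.e. d >= 1/2.
for delta in (0.4, 0.5, 0.6):
    coop, dev = cooperate_forever(delta), defect_against_grim(delta)
    print(delta, coop >= dev)
```

For δ = 0.4 deviating pays, at δ = 0.5 the two streams are equal, and for δ = 0.6 cooperation is strictly better, matching the usual threshold computation.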
For example, in the infinitely repeated Prisoner's Dilemma game the ũ_i are simply the payoffs in the 'boxes' of the normal form representation.

2.3 Example II: Bargaining

Suppose there is one dollar to be divided up between two players. The following alternating-offer procedure is used:

I. In periods 0, 2, 4, ... player 1 offers the division (x_1, 1−x_1). Player 2 then either accepts, and the game ends, or rejects, and play continues.
II. In periods 1, 3, 5, ... player 2 offers the division (1−x_2, x_2). Player 1 then accepts or rejects.

Assume that if the division (y, 1−y) is agreed to in period t, then the payoffs are δ^t y and δ^t (1−y). (Footnote 2: Note that this is not a repeated game. First of all, the stage games are not identical, since alternate players make offers. Second, there is no per-period payoff; players only get payoffs when one of them has agreed to an offer. Waiting to divide the pie is costly.)

Remark 2: There are finite versions of this game in which play ends after period T. One has to make some assumption about what happens if there is no agreement at time T; typically one assumes that the pie simply disappears. If T = 1 then we get the simple ultimatum game.

3 Continuity at Infinity

None of the tools we have discussed so far are easy to apply to infinite games.
First, backward induction isn't feasible, because there is no end to work backward from. Second, using the definition of SPE alone isn't very easy: there are infinitely many subgames and uncountably many strategies that might do better. We will discuss a theorem which makes the analysis quite tractable in most infinite horizon games. To do so, we must first discuss what continuity at infinity means.

Definition 1: An infinite extensive form game G is continuous at ∞ if

    lim_{T→∞}  sup_{i, σ, σ' such that σ(h) = σ'(h) for all h in periods t ≤ T}  |u_i(σ) − u_i(σ')|  =  0.

In words: compare the payoffs of two strategy profiles which are identical at all information sets up to time T and might differ thereafter. As T becomes large, the maximal payoff difference between any two such profiles becomes arbitrarily small. Essentially, this means that distant future events have a very small payoff effect.

3.1 Example I: Repeated Games

If σ and σ' agree in the first T periods, then

    |u_i(σ) − u_i(σ')| = | Σ_{t=T}^∞ δ^t ( ũ_i(a^t) − ũ_i(a'^t) ) | ≤ Σ_{t=T}^∞ δ^t max_{i,a,a'} |ũ_i(a) − ũ_i(a')|.

For finite stage games we know that M = max_{i,a,a'} |ũ_i(a) − ũ_i(a')| is finite.
This implies that

    lim_{T→∞} |u_i(σ) − u_i(σ')| ≤ lim_{T→∞} Σ_{t=T}^∞ δ^t M = lim_{T→∞} δ^T M / (1−δ) = 0.

3.2 Example II: Bargaining

It is easy to check that the bargaining game is also continuous.

3.3 Example III: Non-Discounted War of Attrition

This is an example of an infinite game which is NOT continuous at infinity. Players 1 and 2 choose a_i^t ∈ {Out, Fight} at times t = 0, 1, 2, ... The game ends whenever one player quits, the other being the 'winner'. Assume the payoffs are

    u_i(s_i, s_-i) = 1 − ct   if player i 'wins' in period t
                     −ct      if player i quits in period t

Note that players in this game can win a prize of 1 by staying in the game longer than the other player. However, staying in is costly for both players. Each player wants the game to finish as quickly as possible, but also wants the other player to drop out first.

This game is not continuous at ∞. Let

    σ^T  = both fight for T periods and then 1 quits,
    σ'^T = both fight for T periods and then 2 quits.

Then we have |u_i(σ^T) − u_i(σ'^T)| = 1. This expression does not go to zero as T → ∞.

4 The Single-Period Deviation Principle

The next theorem makes possible the analysis of infinite games which are continuous at ∞.

Theorem 1: Let G be an infinite horizon multistage game with observed actions which is continuous at ∞. A strategy profile (σ_1, ..., σ_I) is an SPE if and only if there is no player i and strategy σ̂_i that agrees with σ_i except at a single information set h_i^t and which gives a higher payoff to player i conditional on h_i^t being reached.

We write u_i(σ_i, σ_-i | x) for the payoff conditional on x being reached. For example, in the entry game from before we have u_2(Accommodate, Out | node 1 is reached) = 1. Recall that we can condition on nodes which are not on the equilibrium path, because the strategy of each player defines play at every node.

4.1 Proof of the SPDP for Finite Games

I start by proving the result for finite-horizon games with observed actions.

Step I: By the definition of SPE there cannot be a profitable deviation for any player at any information set in games with observed actions. (Footnote 3: We have to be very careful at this point. We defined SPE as NE in every subgame, and subgames can only originate at nodes, not at information sets. However, in games with observed actions all players play simultaneous move games in each round t. Therefore any deviation by a player at a non-singleton information set at round t is on the equilibrium path of some subgame at round t.)

Step II: The reverse direction is a bit harder to show. We want to show that u_i(σ̂_i, σ_-i | x_t) ≤ u_i(σ_i, σ_-i | x_t) for all initial nodes x_t of a subgame (a subgame at some round t). We prove this by induction on T, the number of periods in which σ_i and σ̂_i differ.

T = 1: In this case the result is clear. Suppose σ_i and σ̂_i differ only at an information set in period t'. If t > t', it is clear that u_i(σ̂_i, σ_-i | x_t) = u_i(σ_i, σ_-i | x_t), because the two strategies are identical at all the relevant information sets. If t ≤ t', then

    u_i(σ̂_i, σ_-i | x_t) = Σ_{h_i^{t'}} u_i(σ̂_i, σ_-i | h_i^{t'}) · Prob{h_i^{t'} | σ̂_i, σ_-i, x_t}
                          ≤ Σ_{h_i^{t'}} u_i(σ_i, σ_-i | h_i^{t'}) · Prob{h_i^{t'} | σ_i, σ_-i, x_t}
                          = u_i(σ_i, σ_-i | x_t),

where the inequality follows from the one-stage deviation criterion, and the probabilities coincide because σ_i and σ̂_i prescribe the same play between t and t'.

T → T+1: Assuming that the result holds for T, let σ̂_i be any strategy differing from σ_i in T+1 periods. Let t' be the last period at which they differ, and define σ̃_i by

    σ̃_i(h_i^t) = σ̂_i(h_i^t) if t < t'
                  σ_i(h_i^t)  if t ≥ t'

In other words, σ̃_i differs from σ_i in only T periods. Therefore, by the inductive hypothesis, for any x_t we have u_i(σ̃_i, σ_-i | x_t) ≤ u_i(σ_i, σ_-i | x_t). However, we also know that σ̃_i and σ̂_i differ only in a single deviation at round t'. Therefore we can use exactly the same argument as in the previous step to show that u_i(σ̂_i, σ_-i | x_t) ≤ u_i(σ̃_i, σ_-i | x_t) for any x_t. Combining both inequalities, we get the desired result:

    u_i(σ̂_i, σ_-i | x_t) ≤ u_i(σ_i, σ_-i | x_t).

This proves the result for finite games with observed actions. Footnote 4: It is important that we defined σ̃_i as differing from σ̂_i only in the last-period deviation.
Therefore, after time t', σ̃_i follows σ_i; this is what allows us to apply the single-period deviation criterion.

4.2 Proof for Infinite Horizon Games

Note that the proof for finite-horizon games also establishes that, for a profile σ satisfying the SPDP, player i cannot improve on σ_i in any subgame x_t by a new strategy σ̂_i with finitely many deviations from σ_i. However, it is still possible that deviations at infinitely many periods might be an improvement for player i.

Assume this were the case for some σ̂_i, and denote the gain from using this strategy by

    ε := u_i(σ̂_i, σ_-i | x_t) − u_i(σ_i, σ_-i | x_t) > 0.

Because the game is continuous at ∞, if we choose T large enough we can define a strategy σ̃_i which agrees with σ̂_i up to period T and then follows σ_i, such that

    |u_i(σ̂_i, σ_-i | x_t) − u_i(σ̃_i, σ_-i | x_t)| < ε/2.

This implies that the new strategy σ̃_i gives player i strictly more utility than σ_i. But σ̃_i differs from σ_i in at most finitely many periods, which is a contradiction, as shown above. QED

Remark 3: Games which are continuous at ∞ are in some sense 'the next best thing' to finite games.

5 Analysis of Rubinstein's Bargaining Game

To illustrate the use of the single-period deviation principle, and to show the power of SPE in one interesting model, we now return to the Rubinstein bargaining game introduced before.

First, note that the game has many Nash equilibria. For example, player 2 can implement any division by adopting a strategy in which he accepts only the share x_2, proposes x_2 for himself, and rejects anything else.

Proposition 1: The bargaining game has a unique SPE. In each period of the SPE the proposing player i picks x_i = 1/(1+δ), and the other player accepts any division giving him at least δ/(1+δ) and rejects any offer giving him less.

Several observations are in order:

1. We get a roughly even split for δ close to 1 (little discounting). The proposer can increase his share the more impatient the other player is.

2. Agreement is immediate, so bargaining is efficient. Players can perfectly predict play in the next period and will therefore choose a division which makes the other player just indifferent between accepting and making her own proposal. There is no reason to delay agreement, because delay just shrinks the pie. Immediate agreement is in fact not observed in most experiments: last year, in a two-stage bargaining game played in class, only 2 out of 14 bargains ended in agreement after the first period. There are extensions of the Rubinstein model which do not give immediate agreement. (Footnote 5: For example, when there is uncertainty about a player's discount factor, the proposer might start with a low offer in order to weed out player-2 types with a low discount factor. Players with a high δ will reject low offers, so agreement is not immediate. To analyze these extensions we first have to develop the notion of incomplete-information games.)

3. The division becomes less equal in finite bargaining games. Essentially, the last proposer, at period T, can take everything for himself. Therefore he will tend to get the greatest share of the pie in period 1 as well; otherwise he would keep rejecting and take everything in the last period. In our two-period experiments we have indeed observed greater payoffs to the last proposer (a ratio of about 2 to 1). However, half of all bargains resulted in disagreement after the second period, and hence zero payoffs for everyone. Apparently people care about fairness as well as payoffs, which makes one wonder whether monetary payoffs are the right way to describe the utility of players in this game.

5.1 Useful Shortcuts

The hard part is showing that the Rubinstein game has a unique SPE. Once we know that, it is much easier to calculate the actual strategies.

5.1.1 Backward Induction in the Infinite Game

You can solve the game by assuming some arbitrary payoff after rejection of the offer at period T. The game then becomes a finite game of perfect information which can be solved through backward induction. It turns out that as T → ∞ the backward-induction solution converges to the Rubinstein solution. This technique also provides an alternative proof of uniqueness.

5.1.2 Using the Recursive Structure of the Game

The game has a recursive structure: each player faces essentially the same game, just with interchanged roles. Therefore, in the unique SPE player 1 proposes some division (x, 1−x) and player 2 proposes (1−x, x). Player 1's proposal should make player 2 indifferent between accepting immediately and waiting to make her own offer (otherwise player 1 would bid higher or lower). This implies

    1 − x = δx     (2's payoff at period 1 equals 2's discounted payoff from proposing at period 2)
    x = 1/(1+δ).

5.2 Proof of Rubinstein's Solution

5.2.1 Existence

We show that there is no profitable single-history deviation.

Proposer: If he conforms at period t, the continuation payoff is δ^t · 1/(1+δ). If he deviates and asks for more, the offer is rejected and he gets δ^{t+1} · δ/(1+δ). If he deviates and asks for less, he simply gets less. Either way he loses.

Recipient: Consider payoffs conditional on y being proposed to him in period t.
- If he rejects, he gets δ^{t+1} · 1/(1+δ).
- If he accepts, he gets δ^t y.
- If y ≥ δ/(1+δ), the strategy says accept, and this is better than rejecting.
- If y < δ/(1+δ), the strategy says reject, and accepting is not a profitable deviation.

5.2.2 Uniqueness

This proof illustrates a number of techniques which enable us to prove properties of the equilibrium without actually constructing it. Let v̄ and v be the highest and lowest payoffs received by a proposer in any SPE. We first observe:

    1 − v̄ ≥ δv     (3)

If this inequality did not hold, then no proposer could propose v̄, because the recipient could always get more in some subgame. We also find:

    v ≥ 1 − δv̄     (4)

If not, then no proposer would propose v; she would rather wait for the other player to make her proposal, because she would get a higher payoff that way. We can use both inequalities to derive bounds on v̄ and v:

    v ≥ 1 − δv̄ ≥ 1 − δ(1 − δv)     (5)
    (1 − δ²) v ≥ 1 − δ              (6)
    v ≥ 1/(1+δ)                     (7)

Similarly, we find:

    1 − v̄ ≥ δv ≥ δ(1 − δv̄)        (8)
    v̄ ≤ 1/(1+δ)                    (9)

Hence v̄ = v = 1/(1+δ). Clearly, no other strategies can generate this payoff in every subgame. (Footnote 6: While this is a nice result, it does not necessarily survive changes to the game. For example, if both players make simultaneous proposals, then any division is an SPE. Also, it no longer holds when there are several players.)
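The backward-induction shortcut of Section 5.1.1 is easy to check numerically: iterating the proposer-value recursion v ← 1 − δv from the terminal assumption that the last proposer takes everything converges to the Rubinstein share 1/(1+δ). A sketch of my own, not from the notes:

```python
# Finite-horizon backward induction for alternating-offer bargaining.
# v is the proposer's equilibrium share; in each step the proposer offers the
# responder her discounted continuation value delta * v and keeps the rest.

def proposer_share(delta, periods, terminal=1.0):
    v = terminal                 # last proposer takes the whole pie (ultimatum)
    for _ in range(periods):
        v = 1 - delta * v        # contraction with factor delta < 1
    return v

delta = 0.9
print(proposer_share(delta, 500), 1 / (1 + delta))
```

Both printed values agree to high precision (about 0.5263 for δ = 0.9), illustrating that the finite-horizon solution converges to the Rubinstein split as T → ∞.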

Game Theory (Harvard) - Lecture 2


• Players' payoffs depend on each other's actions
• Cooperation and competition
• Pareto efficiency
Partnership Game: Setting
• 2 Partners in business • Complete information about the situation • Simultaneous moves (no verifiable contract) • Two players: i, j
» But individual firms produce until marginal revenue = $100
• Therefore firms overproduce, outcome is not efficient
Cournot Oligopoly: Analysis
• Profit functions for firms 1 and 2
π1 = (1000 - q1 - q2)q1 - 100q1
π2 = (1000 - q1 - q2)q2 - 100q2
• d (π1)/dq1 = 1000 - 2q1 - q2 - 100
• Set equal to zero, so... 2q1 = 1000 - q2 - 100
• q1* = 450 - (q2)/2
q2* = 450 - (q1)/2
• q1* = 450 - (450 - (q1)/2)/2 = 225 + q1/4
• (3/4)q1* = 225
• q1* = 225 × (4/3) = 300 = q2* by symmetry
• 2 companies in competition: Firm 1 and Firm 2 • Each sets a quantity (q1, q2), which in turn affects the price
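The first-order conditions on the slides can be verified by iterating the two best-response functions. The sketch below is my own, using the slide's demand p = 1000 − q1 − q2 and marginal cost $100; it converges to the Cournot outcome q1* = q2* = 300:

```python
# Best-response iteration for the Cournot duopoly from the slides:
# p = 1000 - q1 - q2, marginal cost 100, so BR_i(q_j) = 450 - q_j / 2.

def best_response(q_other):
    return 450 - q_other / 2

q1 = q2 = 0.0
for _ in range(200):                     # contraction: converges geometrically
    q1, q2 = best_response(q2), best_response(q1)

profit1 = (1000 - q1 - q2) * q1 - 100 * q1
print(round(q1), round(q2), round(profit1))
```

This prints quantities of 300 each and a per-firm profit of (1000 − 600 − 100) × 300 = 90,000, confirming the algebra on the slide.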

Game Theory (Original Harvard Course Materials)


Lecture VII: Common Knowledge
Markus M. Möbius, March 4, 2004

This is one of the two advanced topics (the other is learning) which is not discussed in the two main texts. I have tried to make the lecture notes self-contained.

• Osborne and Rubinstein, sections 5.1, 5.2, 5.4

Today we formally introduce the notion of common knowledge and discuss the assumptions underlying players' knowledge in the two solution concepts we have discussed so far: IDSDS and Nash equilibrium.

1 A Model of Knowledge

There is a set of states of nature Ω = {ω1, ω2, ..., ωn} which represents the uncertainty an agent faces when making a decision.

Example 1: Agents 1 and 2 have a prior over the states of nature Ω = {ω1 = It will rain today, ω2 = It will be cloudy today, ω3 = It will be sunny today}, where each of the three events is equally likely ex ante.

The knowledge of every agent i is represented by an information partition H_i of the set Ω.

Definition 1: An information partition H_i is a collection {h_i(ω) | ω ∈ Ω} of disjoint subsets of Ω such that

• (P1) ω ∈ h_i(ω),
• (P2) if ω' ∈ h_i(ω) then h_i(ω') = h_i(ω).

Note that the subsets h_i(ω) span Ω. We can think of h_i(ω) as the knowledge of agent i if the state of nature is in fact ω. Property P1 ensures that the true state of nature ω is an element of the agent's information set (or knowledge); this is called the axiom of knowledge. Property P2 is a consistency criterion.
Assume, for example, that ω' ∈ h_i(ω) and that there is a state ω'' ∈ h_i(ω') but ω'' ∉ h_i(ω). Then, if the state of nature is ω, the decision-maker could argue that because ω'' is inconsistent with his information, the true state cannot be ω'.

Example 1 (cont.): Agent 1 has the information partition

    H_1 = {{ω1, ω2}, {ω3}}.

So the agent has good information if the weather is going to be sunny, but cannot distinguish between the two bad-weather states.

We next define a knowledge function K.

Definition 2: For any event E (a subset of Ω) we have

    K_i(E) = {ω ∈ Ω | h_i(ω) ⊆ E}.

So the set K_i(E) is the collection of all states in which the decision-maker knows E. We are now ready to define common knowledge (for simplicity we only consider two players).

Definition 3: Let K_1 and K_2 be the knowledge functions of both players. An event E ⊆ Ω is common knowledge between 1 and 2 in the state ω ∈ Ω if ω is a member of every set in the infinite sequence K_1(E), K_2(E), K_1(K_2(E)), K_2(K_1(E)), and so on.

This definition says that players 1 and 2 both know E, they know that the other player knows it, and so on. There is an equivalent definition of common knowledge which is frequently easier to work with.

Definition 4: An event F ⊆ Ω is self-evident between both players if for all ω ∈ F we have h_i(ω) ⊆ F for i = 1, 2. An event E ⊆ Ω is common knowledge between both players in the state ω ∈ Ω if there is a self-evident event F for which ω ∈ F ⊆ E.

Example 1 (cont.): Agent 2 has the information partition

    H_2 = {{ω1}, {ω2}, {ω3}}.

In this case the event E = {ω1, ω2} is common knowledge if the state of nature is ω1 or ω2. Both definitions can be applied: E survives the iterated application of the knowledge operators, and it is also self-evident.

We finally show that both definitions of common knowledge are equivalent. We need the next proposition first.

Proposition 1: The following are equivalent:
1. K_i(E) = E for i = 1, 2.
2. E is self-evident between 1 and 2.
3. E is a union of members of the partition induced by H_i, for i = 1, 2.
Proof: Assume (1). Then for every ω ∈ E we have h_i(ω) ⊆ E, and (2) follows. (3) then follows immediately, and (3) implies (1).

We can now prove the following theorem.

Theorem 1: Definitions 3 and 4 are equivalent.

Proof: Assume that the event E is common knowledge in state ω according to Definition 3. First note that

    E ⊇ K_i(E) ⊇ K_j(K_i(E)) ⊇ ...

Because Ω is finite and ω is a member of all of these subsets, the infinite regression must eventually produce a set F such that K_i(F) = F. Therefore F is self-evident and we are done.

Next assume that the event E is common knowledge in state ω according to Definition 4. Then F ⊆ E, K_i(F) = F ⊆ K_i(E), and so on, so F is contained in every one of the regressive subsets K_i(K_j(...E...)), and so is ω. This proves the theorem.

2 Dirty Faces

We have played this game in class. Now we are going to analyze it using the mathematical language we have just developed. Recall that each agent can only see the faces of the n−1 other agents, not her own. Furthermore, at least one face is dirty.

First of all we define the states of nature Ω. If there are n players, there are 2^n − 1 possible states (since "all faces clean" cannot be a state of nature). It is convenient to denote the states by n-tuples such as ω = (C, C, D, ..., C). We also denote the number of dirty faces by |ω| and note that |ω| ≥ 1 by assumption (there is at least one dirty face).

The initial information set of each agent in some state of nature ω has at most two elements. The agent knows the faces of all other agents, ω(−i), but does not know whether she has a clean or dirty face, i.e.

    h_i(ω) = {(C, ω(−i)), (D, ω(−i))}.
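The knowledge operator and the self-evidence test from the definitions above can be encoded directly for finite state spaces. The sketch below reproduces Example 1, with states ω1, ω2, ω3 encoded as 1, 2, 3:

```python
# Knowledge operator K_i(E) on a finite state space, following the
# definitions above. Partitions are lists of frozensets (the cells h_i).
OMEGA = {1, 2, 3}                                       # states ω1, ω2, ω3
H1 = [frozenset({1, 2}), frozenset({3})]                # agent 1's partition
H2 = [frozenset({1}), frozenset({2}), frozenset({3})]   # agent 2's partition

def h(partition, w):
    """The information set (cell) containing state w."""
    return next(cell for cell in partition if w in cell)

def K(partition, E):
    """States in which the agent knows the event E: h(w) is a subset of E."""
    return {w for w in OMEGA if h(partition, w) <= E}

def self_evident(E, partitions):
    """E is self-evident iff each agent's cell at every w in E lies inside E."""
    return all(h(P, w) <= E for w in E for P in partitions)

E = {1, 2}
print(K(H1, E), K(H2, E), self_evident(E, [H1, H2]))
```

Both knowledge sets come out as {1, 2} and the event is self-evident, so by Definition 4 the event E = {ω1, ω2} is common knowledge at ω1 and ω2, matching the worked example in the notes.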
Initially, every agent's information set has two elements, except in the case |ω| = 1 when one agent sees only clean faces; she then knows for sure that she has a dirty face, because "all clean" is excluded. You can easily show that the event "there is at least one dirty face" is common knowledge, as you would expect.

The game ends when at least one player knows the state of the world for sure, i.e. her knowledge partition h_i(ω) consists of a single element. In the first round this will only be the case if the state of the world is such that only a single player has a dirty face.

What happens if no agent raises her hand in the first period? All agents update their information partitions and exclude all states of nature with just one dirty face. All agents who see exactly one dirty face now know the state of nature for sure (they have a dirty face!). They raise their hands and the game is over. Otherwise, all agents can exclude states of nature with at most two dirty faces; agents who see two dirty faces now know the state of nature for sure (they have a dirty face!), and so on.

The state of nature with k dirty faces therefore gets revealed at stage k of the game. At that point, all agents with dirty faces know the state of nature for sure.

Question: What happens if it is common knowledge that neither all faces are clean, nor all faces are dirty?

Remark 1: The game crucially depends on the fact that it is common knowledge that at least one agent has a dirty face. Assume no such information were known, so that the state of the world where all faces are clean would be a possible outcome. Then no agent in the first round would ever know for sure whether she had a dirty face. Hence the information partitions would never get refined, after any number of rounds.

3 Coordinated Attack

This story shows that 'almost common knowledge' can be very different from common knowledge.

Two divisions of an army are camped on two hilltops overlooking a common valley. In the valley waits the enemy. If both divisions attack simultaneously they will win the battle, whereas if only one division attacks it will be defeated. Neither general will attack unless he is sure that the other will attack with him.

Commander A is in peace negotiations with the enemy. The generals have agreed that if the negotiations fail, commander A will send a message to commander B with an attack plan. However, there is a small probability that the messenger gets intercepted and the message does not arrive. The messenger normally takes one hour. How long will it take to coordinate the attack?

The answer is: never! Once commander B receives the message he has to confirm it; otherwise A is not sure that he received it and will not attack. But B cannot be sure that A receives his confirmation, and will not attack until he receives another confirmation from A, and so on. The messenger can run back and forth countless times before he is intercepted, but the generals can never coordinate with certainty.

Let the state of nature be (n, m) if commander A has sent n messages and received n−1 confirmations from commander B, and commander B has sent m messages and received m−1 confirmations. We also introduce the state of nature (0, 0); in that state the peace negotiations have succeeded and no attack is scheduled. (Footnote 1: As was pointed out by an alert student in class, this state is necessary to make the exercise interesting. Otherwise the generals could agree on an attack plan in advance and no communication would be necessary at all; the attack would be common knowledge already.)

The information partition of commander A is

    H_A = {{(0,0)}, {(1,0), (1,1)}, {(2,1), (2,2)}, ...}.    (1)

The information partition of commander B is

    H_B = {{(0,0), (1,0)}, {(1,1), (2,1)}, ...}.    (2)

Both commanders only attack in some state of the world ω if it is common knowledge that commander A has sent a message, i.e. n ≥ 1 (the negotiations have failed and an attack should occur). However, this event E can never be common knowledge in any state of nature (i.e. after any sequence of messages), because there is no self-evident set F contained in the event E.
This is easy to verify: take the union of any collection of information sets of commander A (only those can be candidates for a self-evident F). Then ask yourself whether such a set can also be written as a union of information sets of commander B. The answer is no; there will always be some information set of B which 'sticks out' at one 'end' of the candidate set F.
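The unraveling argument for the dirty-faces game in Section 2 can be simulated directly: represent each round of public silence as the elimination of every state in which some agent would already have announced. The encoding below is my own, not from the notes:

```python
# Epistemic simulation of the dirty-faces game. A state is a tuple of 'C'/'D'
# faces; "at least one face is dirty" is common knowledge from the start.
from itertools import product

def knows(i, world, candidates):
    """Agent i pins down the world iff exactly one candidate world agrees
    with the n-1 faces she actually observes."""
    n = len(world)
    consistent = [w for w in candidates
                  if all(w[j] == world[j] for j in range(n) if j != i)]
    return len(consistent) == 1

def rounds_until_announcement(state):
    n = len(state)
    # candidate worlds: every face profile with at least one dirty face
    worlds = {w for w in product('CD', repeat=n) if 'D' in w}
    rounds = 1
    while not any(knows(i, state, worlds) for i in range(n)):
        # public silence: drop every world in which someone would have announced
        worlds = {w for w in worlds
                  if not any(knows(i, w, worlds) for i in range(n))}
        rounds += 1
    return rounds

# As derived above, k dirty faces are announced in round k:
for s in (('D', 'C', 'C'), ('D', 'D', 'C'), ('D', 'D', 'D')):
    print(s, rounds_until_announcement(s))
```

With three players the simulation returns rounds 1, 2 and 3 for one, two and three dirty faces, reproducing the "revealed at stage k" result in the notes.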

Section 7 (Game Theory Lecture Notes, Harvard University)


Payoffs if 1’s building costs are high State of the World SH
Payoffs if 1’s building costs are low State of the World SL
What is Player 1’s optimal strategy?
Let p1 denote the prior probability player 2 assigns to p(SH); this implies p(SL) = 1 − p1.
Let x = player 1's probability of building when her cost is low.
Let y = player 2's probability of entry.
Already shown: the best response for low-cost Player 1 is to Build (x = 1) if 1.5y + 3.5(1−y) > 2y + 3(1−y), i.e. when y < 1/2.
Modifying the Example
Payoffs if 1's building costs are high (state of the world SH):

                        PLAYER 2
  PLAYER 1          Enter      Don't Enter
  Build             0, -1      2, 0
  Don't Build       2, 1       3, 0

Payoffs if 1's building costs are low (state of the world SL):

                        PLAYER 2
  PLAYER 1          Enter      Don't Enter
  Build             1.5, -1    3.5, 0
  Don't Build       2, 1       3, 0
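The best-response condition for the low-cost type stated on the slide can be checked numerically. A small sketch using the payoffs from the SL matrix:

```python
# Low-cost Player 1's expected payoffs against entry probability y,
# read off the SL payoff matrix above.
def build(y):
    return 1.5 * y + 3.5 * (1 - y)      # simplifies to 3.5 - 2y

def dont_build(y):
    return 2.0 * y + 3.0 * (1 - y)      # simplifies to 3.0 - y

# Build is the best response iff 3.5 - 2y > 3 - y, i.e. y < 1/2.
print(build(0.4) > dont_build(0.4))     # entry prob below 1/2: build
print(build(0.6) > dont_build(0.6))     # entry prob above 1/2: don't build
print(build(0.5) == dont_build(0.5))    # indifferent exactly at the threshold
```

This confirms the slide's cutoff: the low-cost type builds exactly when player 2's entry probability y is below 1/2.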

Some Basic Principles of Game Theory - Harvard Game Theory Course


TL;DR:

Zero-sum game: you lose, I win; one side's loss is exactly the other side's gain.

Non-zero-sum games:
1. Negative-sum game: both sides lose; every participant's interests are harmed.
2. Positive-sum game: win-win; both sides' interests are protected.

(The author's view is that competition should be turned into positive-sum cooperation.) The full text is about 1,284 characters, roughly a four-minute read.

Main text

1. Nash equilibrium: each party to the game chooses the strategy most favorable to itself, without regard for the interests of the others. A Nash equilibrium is a combination of all participants' best strategies. Not every such combination maximizes each individual's own payoff, but it is an equilibrium state that the players' payoffs jointly sustain.

(The example of McDonald's and KFC choosing store locations: when similar businesses cluster in the same spot, fiercer competition is unavoidable. As a result, each business has to raise its competitiveness: define its market position clearly, study consumer demand in depth, and improve its products, service, promotions and so on, building a brand image that sets it apart from the other stores. If every competitor does this, they can play to complementary strengths and create a magnet effect, both keeping the existing customer base and attracting new consumers.)

Nash equilibrium requires that every participant be rational.

Games are divided into cooperative and non-cooperative games according to whether a binding agreement can be reached. A cooperative game is a positive-sum game; non-cooperative games divide into zero-sum and negative-sum games.

Cooperative game: the two sides cooperate, or mutually compromise, so that both sides' interests increase, or at the very least one side's interests increase while the other side's are not harmed.

Zero-sum game: in a zero-sum game, the participants' profits and losses sum to exactly zero; the winners' gains come from the losers' losses, and there is no possibility of cooperation. (The easiest example is gambling: the money the winners at the table take home, which in this book's reading includes the house, is the money the losers lose.) In a phrase: you die so that I may live.

A non-zero-sum game may be either a positive-sum or a negative-sum game. (These are the same three concepts as before, just sorted by a different classification criterion; don't let that confuse you.) A positive-sum game is a win-win; a negative-sum game leaves both sides worse off.

Negative-sum game: game theory grants that everyone has self-interested motives and that all human behavior aims at maximizing one's own interests. At the same time, however, the essence of a game lies in the mutual influence and interdependence of the participants' strategies. Helping others can sometimes in fact help oneself, and maximize one's own payoff.


Lecture X: Extensive Form Games
Markus M. Möbius, March 17, 2004

• Gibbons, chapter 2
• Osborne, sections 5.1, 5.2 and chapter 6

1 Introduction

While the models presented so far are fairly general in some ways, they have one main limitation as far as accuracy of modeling goes: in each game, each player moves once, and all players move simultaneously. This misses common features both of many classic games (bridge, chess) and of many economic models. (Footnote 1: Gibbons is a good reference for economic applications of extensive form games.) A few examples are:

• auctions (sealed bid versus oral)
• executive compensation (contract signed; then the executive works)
• patent races (firms continually update resources based on where opponents are)
• price competition: firms repeatedly charge prices
• monetary authority and firms (continually observe and learn actions)

The topic today is how to represent and analyze such games.

2 Extensive Form Games

The extensive form of a game is a complete description of:

1. The set of players.
2. Who moves when, and what their choices are.
3. The players' payoffs as a function of the choices that are made.
4. What players know when they move.

2.1 Example I: Model of Entry

Currently firm 1 is an incumbent monopolist. A second firm, firm 2, has the opportunity to enter. After firm 2 makes the decision to enter, firm 1 will have the chance to choose a pricing strategy: it can choose either to fight the entrant or to accommodate it.

2.2 Example II: Stackelberg Competition

Suppose firm 1 develops a new technology before firm 2 and as a result has the opportunity to build a factory and commit to an output level q1 before firm 2 starts. Firm 2 then observes firm 1 before picking its output level q2. For concreteness, suppose q_i ∈ {0, 1, 2} and market demand is p(Q) = 3 − Q.
The marginal cost of production is 0.

2.3 Example III: Matching Pennies

So far we assumed that players can observe all previous moves. In order to model the standard matching pennies game in extensive form, we have to assume that the second player cannot observe the first player's move. To indicate this, we depict the game with an information set connecting the two nodes at which player 2 moves.

The extensive form representation allows players to 'forget' information. For example, we can assume that in a game with 4 rounds player 2 can observe player 1's move in round 1, but in round 4 he has forgotten that move. In most cases we assume perfect recall, which rules out such 'bad memory'. (Footnote 2: It becomes difficult to think of a solution concept for a game where players are forgetful. Forgetfulness and rational behavior don't go well together, and concepts like Nash equilibrium assume that players are rational.)

3 Definition of an Extensive Form Game

Formally, a finite extensive form game consists of:

1. A finite set of players.
2. A finite set T of nodes which form a tree, along with functions giving, for each non-terminal node t ∉ Z (Z is the set of terminal nodes):
   • the player i(t) who moves,
   • the set of possible actions A(t),
   • the successor node N(t, a) resulting from each possible action.
3. Payoff functions u_i : Z → R giving the players' payoffs as a function of the terminal node reached (the terminal nodes are the outcomes of the game).
4. An information partition: for each node t, h(t) is the set of nodes which are possible given what player i(t) knows. This partition must satisfy

       t' ∈ h(t)  ⇒  i(t') = i(t), A(t') = A(t), and h(t') = h(t).

We sometimes write i(h) and A(h), since the mover and the action set are the same for every node in the same information set.

It is useful to go over the definition in detail in the matching pennies game where player 2 can't observe player 1's move. Number the non-terminal nodes t1, t2 and t3 (top to bottom).

1. There are two players.
2. S1 = S2 = {H, T} at each node.
3. The tree clearly defines the terminal nodes, and shows that t2 and t3 are successors to t1.
4. h(t1) = {t1} and h(t2) = h(t3) = {t2, t3}.

4 Normal Form Analysis

In an extensive form game, write H_i for the set of information sets at which player i moves,

       H_i = {S ⊆ T | S = h(t) for some t ∈ T with i(t) = i},

and write A_i for the set of actions available to player i at any of his information sets.

Definition 1: A pure strategy for player i in an extensive form game is a function s_i : H_i → A_i such that s_i(h) ∈ A(h) for all h ∈ H_i.

Note that a strategy is a complete contingent plan explaining what a player will do in any situation that arises. At first, a strategy looks overspecified: an earlier action might make it impossible to reach certain sections of the tree. Why do we have to specify how players would play at nodes which can never be reached if a player follows his own strategy early on? The reason is that play off the equilibrium path is crucial for determining whether a set of strategies forms a Nash equilibrium. Off-equilibrium threats are crucial. This will become clearer shortly.

Definition 2: A mixed behavior strategy for player i is a function σ_i : H_i → Δ(A_i) such that supp(σ_i(h)) ⊆ A(h) for all h ∈ H_i. Note that we specify an independent randomization at each information set! (Footnote 3: It can be shown that for games with perfect recall this definition is equivalent to the usual notion of a mixed strategy, i.e. a mixed strategy is a mixed behavior strategy and vice versa. In games without perfect recall this is no longer true; it is instructive to convince yourself that in such games each mixed behavior strategy is a mixed strategy, but not vice versa.)

4.2 Example II: Stackelberg Competition

Firm 1 chooses q1 and firm 2 chooses a quantity q2(q1). With three possible output levels, firm 1 has three strategies, while firm 2 has 3^3 = 27 different strategies, because it can choose one of three quantities at each of its three information sets.

4.3 Example III: Sequential Matching Pennies

We have S1 = {H, T}. Player 2 has four strategies, as he can choose between two actions at each of two information sets. The strategy HH means that player 2 chooses H at both nodes.

4.4 (The heading and game tree for this example did not survive transcription.) One might be tempted to say that player 1 has three strategies, because only three terminal nodes can be reached. However, there are 4, because La and Lb are two distinct strategies. After player 1 plays L, it is irrelevant for the final outcome what he would play at the bottom node.

5 Nash Equilibrium in Extensive Form Games

We can apply NE in extensive form games simply by looking at the normal form representation. It turns out that this is not an appealing solution concept, because it allows too many profiles to be equilibria.

Look at the entry game. There are two pure strategy equilibria, (A, In) and (F, Out), as well as the mixed equilibria (αF + (1−α)A, Out) for α ≥ 1/2. Why is (F, Out) a Nash equilibrium? Firm 2 stays out because he thinks that firm 1 will fight entry. In other words, the threat to fight entry is sufficient to keep firm 2 out. Note that in equilibrium this threat is never executed, since firm 2 stays out in the first place.

The problem with this equilibrium is that firm 2 could call firm 1's bluff and enter. Once firm 2 has entered, it is in the interest of firm 1 to accommodate. Therefore firm 1's threat is not credible. This suggests that only (A, In) is a reasonable equilibrium for the game, since it does not rely on non-credible threats. The concept of subgame perfection, which we will introduce in the next lecture, rules out non-credible threats.

5.1 Example II: Stackelberg

We next look at a Stackelberg game where each firm can choose q_i ∈ [0, 1], with p = 1 − q1 − q2 and c = 0.

Claim: For any q̄1 ∈ [0, 1] the game has a NE in which firm 1 produces q̄1. Consider the following strategies:

       s1 = q̄1
       s2(q1) = (1 − q̄1)/2  if q1 = q̄1
                1 − q1       if q1 ≠ q̄1

In words: firm 2 floods the market, driving the price to zero, if firm 1 does not choose q̄1. It is easy to see that these strategies form a NE. Firm 1 can only do worse by deviating, since profits are zero if firm 2 floods the market. Firm 2 plays a BR to q̄1 and therefore won't deviate either.

Note that in this game things are even worse. Unlike the Cournot game, where we got a unique equilibrium, we now have a continuum of equilibria. Second, we have even more disturbing non-credible threats. For instance, in the equilibrium where q̄1 = 1, firm 2's threat is "if you don't flood the market and destroy it, I will". Not only won't the threat be carried out; it's also hard to see why it would be made in the first place.
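Backward induction in the discrete Stackelberg game of Example II (q_i ∈ {0, 1, 2}, p = 3 − Q, zero marginal cost) can be sketched as follows. One assumption of mine that the notes do not pin down: ties in firm 2's best response are broken toward the lower quantity.

```python
# Backward induction for the discrete Stackelberg game: firm 1 commits to q1,
# firm 2 observes q1 and best-responds. Payoff of a firm is q * (3 - q1 - q2).
Q = (0, 1, 2)

def u(q_own, q_other):
    return q_own * (3 - q_own - q_other)

def br2(q1):
    # Firm 2's best response to the observed q1; max() keeps the first
    # (i.e. lowest) maximizer, a tie-breaking assumption not in the notes.
    return max(Q, key=lambda q2: u(q2, q1))

# Firm 1 anticipates firm 2's reaction and picks its commitment level.
q1_star = max(Q, key=lambda q1: u(q1, br2(q1)))
q2_star = br2(q1_star)
print(q1_star, q2_star, u(q1_star, q2_star))
```

Under this tie-breaking rule the leader commits to q1 = 2, the follower stays at q2 = 0, and the leader earns 2, illustrating the first-mover advantage from commitment; with the opposite tie-break the backward-induction outcome changes, which is why the assumption matters.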
