PrisonersDilemma


You and a friend have both been captured, accused of hideous crimes against the state.
You know that he is being interrogated, perhaps tortured in a cell - but you have no idea exactly what is happening.

Now, your interrogator walks in.
"You're sunk," he says.
"Your companion has ratted on you - but you can make it easier for yourself if you rat on him too."

You know that:

The interrogator might be lying. He's good at his job - you have no clue whether what he says is true or not.
They have no evidence. If both you and your companion stay quiet then you both go free.
If you both rat on each other, they will have the information they need to put you both to death.

: NoNoNoNo, if you both rat on each other, neither of you dies, and you both get a lesser sentence for being helpful. Otherwise, the dilemma doesn't work... M-A

If you rat on your companion, but he doesn't rat on you - then he gets put to death - but you go free with the thanks of the state and a nice monetary reward. For this simulation pretend that your guilt won't detract from your reward, and that this is still a better result than both of you going free. Or at least as good a result.

So - DoYouRatOnYourFriend.

Does it make a difference how large the reward is?
Does it make a difference if you have to go through the experience multiple times?
Does it make a difference whether you have a limited number of lives?

I'll not start discussing this too hard - it's a well explored problem. But it's still fun.

: In order: if the reward for being ratted on plus the reward for ratting is greater than the combined reward the two of you get when neither rats, you don't have a prisoner's dilemma, so yes. Playing multiple times sort of has an effect, but only if the number of rounds is unknown. Limited lives is an odd situation, since it's either one or infinite, but yes, it makes a difference. And the solution for a single dilemma is defect, plain and simple. The iterated dilemma is more interesting. - TheInquisitor
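
For anyone who wants to poke at the numbers: the usual textbook statement of the conditions is T > R > P > S, plus 2R > T + S so that taking turns ratting cannot beat mutual silence. Here is a minimal sketch of that check, and of why defecting wins the single game. The payoff values and the function names are assumptions made up for illustration; they are not from the story above.

```python
# Assumed illustrative payoffs (higher is better), not taken from the page:
# T = temptation (you rat, he stays quiet), R = reward (both stay quiet),
# P = punishment (both rat),               S = sucker (he rats, you stay quiet).
T, R, P, S = 5, 3, 1, 0


def is_prisoners_dilemma(t, r, p, s):
    """Usual textbook conditions: T > R > P > S, and 2R > T + S."""
    return t > r > p > s and 2 * r > t + s


def best_reply(opponent_move, t, r, p, s):
    """Return the move ('C' = stay quiet, 'D' = rat) that pays most
    against a fixed opponent move in a single game."""
    if opponent_move == "C":
        return "D" if t > r else "C"
    return "D" if p > s else "C"


print(is_prisoners_dilemma(T, R, P, S))
print(best_reply("C", T, R, P, S), best_reply("D", T, R, P, S))
```

Whatever the other prisoner does, the best reply comes out the same, which is the "defect, plain and simple" answer for the one-shot game.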

Oh nice. It seems that the MetaGame trumps again. TeamPlay with spoiling tactics seems to be the way to go. In short, the winning entry tries to identify a teammate, and if it does so, decides which teammate should co-operate and which should defect - sacrificing one for the highest possible score for the other. If it detects an 'enemy' program, it defects, so as to penalise the score of the opponent. Evilness - but I don't think that 'team-dilemma' was the intent, and expect it to become a separate competition. --Vitenka
http://www.wired.com/news/culture/0,1284,65317,00.html?tw=rss.TOP
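
A rough sketch of the team strategy described above, for the curious. Everything specific here is an assumption for illustration: the handshake string, the fixed master/slave roles, the payoff table and the function names are made up, and the real winning entry may well have worked differently.

```python
# Assumed payoff table: (my move, their move) -> (my score, their score).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

HANDSHAKE = "CDDCD"  # hypothetical recognition sequence agreed before the contest


def make_team_agent(role):
    """Build a team agent; role is 'master' (gets fed) or 'slave' (sacrifices)."""
    def agent(my_history, their_history):
        turn = len(my_history)
        if turn < len(HANDSHAKE):
            return HANDSHAKE[turn]          # still sending the handshake
        recognised = "".join(their_history[:len(HANDSHAKE)]) == HANDSHAKE
        if not recognised:
            return "D"                      # 'enemy': spoil its score
        return "D" if role == "master" else "C"
    return agent


def tit_for_tat(my_history, their_history):
    """Co-operate first, then copy the opponent's previous move."""
    return their_history[-1] if their_history else "C"


def play(agent_a, agent_b, rounds):
    """Play an iterated game and return the two total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = agent_a(hist_a, hist_b)
        move_b = agent_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


master, slave = make_team_agent("master"), make_team_agent("slave")
print("master vs slave:", play(master, slave, 20))
print("master vs TitForTat:", play(master, tit_for_tat, 20))
```

Once the handshake is over, the master collects the top payoff every round from its teammate and plays straight hawk against everyone else.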

Uh - but if you *know* that certain agents are going to behave in certain ways, the problems you are solving are friend-or-foe identification and [covert communication], not PrisonersDilemma. Check out the linked PDF - the game being played is Connect 4, but the strategy is virtually identical (agents collude to work out what to sacrifice so the team wins the contest); the actual game itself is irrelevant to gaming the tournament. - MoonShadow

Sort of what I said - just interesting that a team actually used this at a contest successfully. But identification has always been a part of the extended dilemma. If you can recognise agents which often defect, or which usually co-operate, then you will do better than if you cannot. This is one reason why TitForTat is so successful - it quickly identifies and punishes the hawks. (I think the other major one is that it is so simple that it is hard for any strategy to do better) --Vitenka
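
To make the "identifies and punishes the hawks" bit concrete, here is a small sketch of TitForTat replying to a hawk that always defects. The payoff numbers and function name are assumed for illustration, and the hawk is modelled as a fixed list of moves since it never reacts to anything.

```python
# Assumed payoff table: (my move, their move) -> (my score, their score).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat_replies(opponent_moves):
    """Moves TitForTat plays against a given sequence of opponent moves:
    co-operate on the first round, then echo the opponent's previous move."""
    replies = ["C"]
    replies.extend(opponent_moves[:-1])
    return replies[:len(opponent_moves)]


hawk_moves = ["D"] * 10
tft_moves = tit_for_tat_replies(hawk_moves)

tft_score = sum(PAYOFF[(mine, theirs)][0] for mine, theirs in zip(tft_moves, hawk_moves))
hawk_score = sum(PAYOFF[(mine, theirs)][1] for mine, theirs in zip(tft_moves, hawk_moves))

print("".join(tft_moves))
print(tft_score, hawk_score)
```

TitForTat is exploited exactly once, on the opening round, and after that the hawk only ever gets the mutual-defection payoff.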

"But identification has always been a part of extended dilemma. If you can recognise agents which often defect".. is it really the same sort of thing, though? ISTM there's a significant difference between the classical problem of recognising an agent as being the same one you've had past experience with and making predictions based solely on that experience, and the gaming-the-tournament problem of recognising someone else you've colluded with out-of-band prior to the game in order to arrange to act in a certain well-defined way - am I wrong? Am I even being clear? Bleh.. :/ - MoonShadow

There is certainly a difference of scale. But I'm not certain that the 'out of band' thing is really an issue - after all, the 'real life' situation can assume that the prisoners can share a language. Then you use the first few trials to pass a message ('I am your leader') - and if not, assume it's a hostile agent and act appropriately (In this case, defect every time - which seems a harsh default.) Is it really useful research at this point to insist that agents invent their bidding-code on the fly? --Vitenka

: ..but surely there's a distinction between a game where the player attempts to build a mental model of another agent's strategy through observing their moves and possibly communicating with them, and one where the player already has a precise model in mind that they *know* the agent will conform to once a recognition sequence has been exchanged? The former is a matter of psychology, game theory, solving AI problems; the latter is just cryptography and stego.. The problem you're trying to solve in PrisonersDilemma is "is my friend going to rat on me?", whereas in the stego game it's "I already know what my friend's gonna do. Is this person my friend?"- MoonShadow

Interesting use of the word 'just', but never mind :) There certainly is a difference - but expressing it cleanly seems difficult. After all, you have to know that your recognition sequences are both short (to avoid losing too many cycles to them) and unique - which leads to the possibility of other agents faking (or randomly generating) their early replies in order to try to trick you into 'play dead' mode. It is true that they only won because they knew there were enough collaborators in the population - but they later say that they are pretty sure it's a stable strategy. That is, that intelligent agents (which attempt to clone the winning strategies that they see) will tend to also become collaborators. As a side comment, as they noted, it seems impossible to stop labs from co-operating to enter teams of bots - so why not encourage it and see what emerges? --Vitenka
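
The short-versus-unique trade-off can be put in numbers with a back-of-envelope sketch. This assumes the impostor picks co-operate or defect by a fair coin flip each round, independently, which is an assumption for illustration only; a cleverer faker would do better than this.

```python
def random_match_probability(handshake_length):
    """Chance that an agent flipping a fair coin each round happens to
    reproduce a specific handshake of the given length."""
    return 0.5 ** handshake_length


# Each extra handshake move costs one more round of play,
# but halves the chance of a coin-flipper being mistaken for a teammate.
for length in (3, 5, 8, 10):
    print(length, random_match_probability(length))
```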

Also note that the MetaGame is unavoidable. You already know that your friend has, say, 80% chance of playing TitForTat, 15% chance of playing hawk and 5% of playing something that you can't recognise in time. Internally you're always going to try and identify strategies. It's just that they have managed to get a "Dove if I like you" strategy as a high enough percentage of the pool to make "Dove if I like you, Hawk if I don't" a viable strategy. The identification part, yes, they have simplified - but it doesn't seem to have lost them anything. --Vitenka

One interesting thing - does co-op play still beat TitForTat if you take the average of your team's scores? I don't think that it does, but I'm not sure. --Vitenka
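
A quick sanity check on the averaging question, as a sketch only. It looks just at the rounds two teammates play against each other after recognising one another, ignores the handshake and every game against outsiders, and uses assumed payoff values rather than the contest's real ones.

```python
# Assumed illustrative payoffs: T = defect against a co-operator,
# R = mutual co-operation, P = mutual defection, S = co-operate against a defector.
T, R, P, S = 5, 3, 1, 0

# Per round, the 'master' defects and the 'slave' co-operates.
team_average_per_round = (T + S) / 2

# Per round, two TitForTat players co-operate with each other.
tit_for_tat_average_per_round = R

print(team_average_per_round, tit_for_tat_average_per_round)
print(team_average_per_round < tit_for_tat_average_per_round)
```

Under the usual condition 2R > T + S, the team pair always averages less per round than two co-operators, so within the pair the answer looks like "no". Whether the spoiling games against outsiders change the overall picture is not something this sketch settles.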

SeeAlso: WhatIsCheese. Because I can't help but feel that this strategy is in some way cheaty. --Vitenka