Turtle Tournaments customize the BAM2 framework. The active turtle interacts with a selected candidate by playing a mutual dilemma game:
to interact [candidate]
  play-game-with candidate
  ; etc.
end
In a mutual dilemma game two players, A and B, are presented with a mutual dilemma. Each must choose one of two options: fight or flee, cooperate or defect, hold 'em or fold 'em, etc. The choice may be based on strategy and history, but is made without knowing the opponent's choice.
After the choices are made, each player is awarded points based on the game's payoff matrix:
Payoffs   | B = FALSE | B = TRUE
----------+-----------+---------
A = FALSE |   a1/b1   |  a2/b2
A = TRUE  |   a3/b3   |  a4/b4
The a3/b3 entry indicates that if A chooses TRUE and B chooses FALSE, then A receives a3 points and B receives b3 points.
Here's the basic game playing procedure:
to play-game-with [candidate]
  let my-choice choice? candidate
  let candidate-choice [choice? myself] of candidate
  update-attributes candidate my-choice candidate-choice
end
The choice can be based on knowledge of the candidate's history and the strategy used by the active turtle. The details must be filled in. For now the choice is random:
to-report choice? [candidate]
  report (random 2) < 1 ; for now: true with probability 1/2
end
Note the use of myself when asking the candidate to make a choice:
let candidate-choice [choice? myself] of candidate
The of reporter is similar to the ask command. The ask command allows turtle T1 to ask turtle T2 to execute a block of commands, while the of reporter allows T1 to ask T2 to evaluate a block of expressions and report the value of the last one in the block. In both cases the active turtle is T2: T2 sits on top of T1 on the turtle stack. Inside the block, self refers to T2 and myself refers to T1.
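A minimal sketch of the difference (the turtle ids here are hypothetical):

```netlogo
; suppose turtle 1 is running this code
ask turtle 2 [ set energy energy + 1 ]  ; ask: turtle 2 executes a command block
let d [distance myself] of turtle 2     ; of: turtle 2 evaluates an expression
                                        ; and reports its value back to turtle 1
; inside both blocks, self = turtle 2 and myself = turtle 1
```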
After the choice is made, energy, history, and number of games played must be updated:
to update-attributes [opponent my-choice opponent-choice]
  set num-games-played num-games-played + 1
  ask opponent [set num-games-played num-games-played + 1]
  ; update my history & opponent's history if necessary
  if my-choice and opponent-choice
  [
    set energy energy + payoff-a4
    ask opponent [set energy energy + payoff-b4]
    stop
  ]
  if my-choice and not opponent-choice
  [
    set energy energy + payoff-a3
    ask opponent [set energy energy + payoff-b3]
    stop
  ]
  if not my-choice and opponent-choice
  [
    set energy energy + payoff-a2
    ask opponent [set energy energy + payoff-b2]
    stop
  ]
  if not my-choice and not opponent-choice
  [
    set energy energy + payoff-a1
    ask opponent [set energy energy + payoff-b1]
    stop
  ]
end
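As an aside, the four-way branch can be collapsed by storing the payoffs in two lists and indexing them with a number computed from the two choices. This is only a sketch, assuming hypothetical globals payoff-list-a = [a1 a2 a3 a4] and payoff-list-b = [b1 b2 b3 b4]:

```netlogo
to update-energies [opponent my-choice opponent-choice]
  ; index 0 = FALSE/FALSE, 1 = FALSE/TRUE, 2 = TRUE/FALSE, 3 = TRUE/TRUE
  let i (2 * bool-to-int my-choice) + (bool-to-int opponent-choice)
  set energy energy + item i payoff-list-a
  ask opponent [set energy energy + item i payoff-list-b]
end

to-report bool-to-int [b]
  report ifelse-value b [1] [0]
end
```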
The payoff matrix is stored in eight global variables:
globals [
  halt?
  world-diam
  ; payoffs:
  payoff-a1 ; payoff for A if A & B choose FALSE
  payoff-a2 ; payoff for A if A chooses FALSE and B TRUE
  payoff-a3 ; payoff for A if A chooses TRUE and B FALSE
  payoff-a4 ; payoff for A if A & B choose TRUE
  payoff-b1 ; payoff for B if A & B choose FALSE
  payoff-b2 ; payoff for B if A chooses FALSE and B TRUE
  payoff-b3 ; payoff for B if A chooses TRUE and B FALSE
  payoff-b4 ; payoff for B if A & B choose TRUE
]
In Rebel Without a Cause, James Dean's character must prove his courage by playing a dangerous game of Chicken. He and his opponent race stolen cars toward each other. Each driver must choose: to swerve or not to swerve. The first to swerve is labeled "chicken".
Here's a typical payoff matrix for Chicken:
Payoffs   | B = FALSE | B = TRUE
----------+-----------+---------
A = FALSE |    0/0    |   5/0
A = TRUE  |    0/5    |   1/1
The 0/5 entry in the matrix says that if A chooses to swerve (swerve = TRUE) and B chooses not to swerve (swerve = FALSE), then A is awarded 0 points and B is awarded 5 points. In other words, A is the chicken and B is the hero.
We see the game of Chicken being played out in the real world all the time. Price wars and arms races are examples.
Coordination is similar to Chicken: Two cars are speeding toward each other on a narrow road. Instead of choosing to swerve or not to swerve, the cars must choose between swerving left or swerving right. If they make the same choice, then a terrible crash is averted.
Here's a typical payoff matrix:
Payoffs   | B = FALSE | B = TRUE
----------+-----------+---------
A = FALSE |    3/3    |   0/0
A = TRUE  |    0/0    |   3/3
The 3/3 entry in the upper-left corner says that A and B both chose not to swerve left (SL? = FALSE). In other words, they both swerved right and therefore a crash was averted.
Coordination is the game companies play when they want to develop a product that adheres to the same standards as similar products their competitors are developing, but don't want to reveal information about the product.
It's George and Martha's anniversary. They want to be together and agreed to meet after work at the movies. The trouble is that no decision was made about which movie to see. The phones are out, so they can't communicate. George really wants to see Planet of the Apes, but Martha wants to see something else.
Here's a typical payoff matrix:
Payoffs   | B = FALSE | B = TRUE
----------+-----------+---------
A = FALSE |    3/2    |   1/1
A = TRUE  |    0/0    |   2/3
The 3/2 entry says that if George (= A) and Martha (= B) both decide not to go to Martha's movie (that is, both see George's pick), then George receives 3 points and Martha receives 2. Of course the 2/3 entry is the symmetric case: both see Martha's movie, and she gets the 3.
Prisoner's Dilemma (PD) is the most famous dilemma game. Two men are accused of a crime. They are separated, and each is asked to testify against the other. If both refuse, then both receive light sentences on a lesser charge. If both agree, then both receive moderate sentences, reduced as a reward for their testimony. However, if one agrees to testify and the other refuses, then the former goes free while the latter receives a stiff sentence.
Here's a typical payoff matrix (the entries are sentence reductions, so mutual refusal must pay more than mutual testimony):
Payoffs   | B = FALSE | B = TRUE
----------+-----------+---------
A = FALSE |    3/3    |   0/5
A = TRUE  |    5/0    |   1/1
The 5/0 entry says that if A chooses to testify (testify? = TRUE) and B refuses (testify? = FALSE), then A receives a 5-year sentence reduction while B receives none.
Prisoner's Dilemma is played each time a business deal is made. Shall A cheat B for a big payoff or be honest for a moderate payoff? If both cheat, then the payoff is minimal.
In an Iterated Prisoner's Dilemma (IPD) tournament turtles play the Prisoner's Dilemma game. Each turtle has a strategy, a history, and keeps track of the number of games played:
turtles-own [energy vision mobility strategy history num-games-played]
Four commonly used PD strategies are:
to-report choice?
  if strategy = "never-cheat"    [report false] ; I won't cheat
  if strategy = "always-cheat"   [report true]  ; I will cheat
  if strategy = "randomly-cheat" [report random 2 < 1]
  if strategy = "tit-for-tat"    [???] ; left as an exercise
  report false ; default
end
Out of 100 turtles, assume each strategy is used by 25 turtles. Which one is best in the long run?
The Tit-for-tat strategy says to choose what your opponent chose the last time you played him. This means each turtle must maintain a history list of past opponent choices, one entry per turtle:
history = [true true false ... true]
A turtle's choice is simply the item in the history list at position p where p is the id (who) of the candidate.
Of course a turtle must replace this item with the candidate's choice after each game.
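A hedged sketch of this mechanism, as one possible reading of the exercise (assuming choice? is given the candidate, as in play-game-with, and that candidate-choice is available after the game):

```netlogo
; inside choice?: the tit-for-tat branch echoes the candidate's
; last recorded choice, looked up by the candidate's who number
if strategy = "tit-for-tat"
[ report item ([who] of candidate) history ]

; after the game: record the candidate's newest choice in its slot
set history replace-item ([who] of candidate) history candidate-choice
```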
Complete ipd1.nlogo by adding the tit-for-tat strategy.
In a society of cheaters and saints, the cheaters will always win. So why are there any saints at all? The answer: vindictiveness. Vindictiveness is the tendency of an agent who has been cheated to punish the agent who cheated him.
There are a couple of caveats. First, punishing a cheater isn't free. There is a fixed energy cost. For example, the cheater might not like getting punished and might punch the punisher in the nose!
Second, cheaters don't punish other cheaters. This is the thieves' honor code.
Add vindictiveness to the ipd1.nlogo model.
In this extension every turtle has a vindictiveness attribute set to a random number less than the vindictiveness of the society:
set vindictiveness random max-vindictiveness
where max-vindictiveness is a global set by a slider.
If a turtle has been cheated by his opponent, but didn't himself cheat, then in addition to energy points awarded, he does this:
if random 100 < vindictiveness
[
  set energy energy - punishment-cost
  ask opponent [set energy energy - punishment]
]
where punishment and punishment-cost are also globals set by sliders.
Of course if the opponent cooperated but the active turtle cheated, then the opponent might punish the active turtle.
In the genetic version of IPD turtles mate every 100 ticks:
to interact [candidate]
ifelse ticks mod 100 = 0
[
mate-with candidate
]
[
play-game-with candidate
]
end
All turtles use the same type of strategy. Here's how it works. A turtle remembers the last three choices made by every other turtle:
history = [[true true false] [false true true] ...]
Note that for any opponent, there are eight possible histories.
A turtle's strategy maps the opponent's history onto a Boolean value. This is the turtle's choice. For example, suppose the active turtle's opponent is the turtle with id = 1, and this opponent's history is [false true true]. This means that the last time the active turtle played this opponent, the opponent chose false, and the previous two times he chose true. The active turtle's strategy might map this history to false:
strategy: [false true true] -> false
This means the active turtle will try to cheat turtle #1 in the next game.
One way to implement this is to define strategy to be a list of eight random Booleans:
strategy = [false false true false false true true false]
We can use the opponent's history to compute an index into this list. We do this by translating false = 0 and true = 1 and computing the corresponding binary number. For example:
[false true true] = 2 * (2 * 0 + 1) + 1 = 3
Our choice is then:
item 3 strategy = false
Of course we must update the opponent's history list after each game.
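The history-to-index translation just described might be sketched as a reporter (history-index is a hypothetical name, not part of the original model):

```netlogo
to-report history-index [hist]
  ; interpret a list of Booleans as a binary number,
  ; e.g. [false true true] -> 2 * (2 * 0 + 1) + 1 = 3
  let idx 0
  foreach hist [ b -> set idx (2 * idx) + ifelse-value b [1] [0] ]
  report idx
end
```

The choice would then be item (history-index opponent-history) strategy, where opponent-history is the three-entry sublist for the current opponent.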
When mating, the active turtle invokes hatch, then dies. This is a form of population control that keeps the number of turtles fixed.
hatch 1
[
  set strategy hatchling-strategy
  set vision random world-diam
  set mobility random world-diam
  set energy random 100
  set history []
  repeat count turtles
  [
    set history fput [true true true] history
  ]
  set id hatchling-id
]
die
Each turtle has a programmer-defined id number. This is different from the system-defined id number called who. The id number is used to select the opponent history from the history list. The hatchling inherits the id number of the dying parent.
The hatchling's strategy is computed by appending the last 8 - N entries of the candidate's strategy to the first N entries of the active turtle's strategy, where N is a random number below 8. This is genetic splicing. Next, a random item in the hatchling's strategy is changed with probability p, where p is very small.
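A sketch of the splicing and mutation, assuming a hypothetical mutation-rate slider global and that the parent's partner is candidate (the mutated item is replaced by a fresh random Boolean rather than toggled):

```netlogo
let n random 8   ; crossover point: 0 <= n < 8
let hatchling-strategy sentence (sublist strategy 0 n)
                               (sublist ([strategy] of candidate) n 8)
; mutate one random item with small probability
if random-float 1.0 < mutation-rate
[
  let i random 8
  set hatchling-strategy replace-item i hatchling-strategy (random 2 < 1)
]
```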
Modify the ipd1.nlogo model by replacing all strategies with genetic strategies and by adding mating.