Prisoner's Dilemma (PD) Strategies

Four commonly used PD strategies are:

to-report choice?
  if strategy = "never-cheat" [report false] ; I won't cheat
  if strategy = "always-cheat" [report true] ; I will cheat
  if strategy = "randomly-cheat" [report random 2 < 1] ; cheat half the time
  if strategy = "tit-for-tat" [???] ; explained below
  report false ; default
end

Out of 100 turtles, assume each strategy is used by 25 turtles. Which one is best in the long run?
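One way to get intuition for the long-run question is a quick round-robin tournament outside NetLogo. Here is a Python sketch; the payoff values (3 for mutual cooperation, 1 for mutual cheating, 5 for a lone cheater, 0 for the victim) are the standard PD payoffs and are an assumption, since the text does not give them.

```python
import random

def payoff(my_cheat, their_cheat):
    # Assumed standard PD payoffs (not specified in the text).
    if my_cheat and their_cheat:
        return 1   # mutual cheating
    if my_cheat and not their_cheat:
        return 5   # lone cheater
    if not my_cheat and their_cheat:
        return 0   # victim
    return 3       # mutual cooperation

def choice(strategy, last_opponent_choice):
    """Return True to cheat; mirrors the NetLogo choice? reporter."""
    if strategy == "never-cheat":
        return False
    if strategy == "always-cheat":
        return True
    if strategy == "randomly-cheat":
        return random.random() < 0.5
    if strategy == "tit-for-tat":
        return last_opponent_choice
    return False

def tournament(strategies, rounds=200, seed=42):
    """Play every pair of strategies for `rounds` games each."""
    random.seed(seed)
    scores = {s: 0 for s in strategies}
    for i, a in enumerate(strategies):
        for b in strategies[i + 1:]:
            last_a, last_b = False, False  # tit-for-tat starts nice
            for _ in range(rounds):
                ca = choice(a, last_b)
                cb = choice(b, last_a)
                scores[a] += payoff(ca, cb)
                scores[b] += payoff(cb, ca)
                last_a, last_b = ca, cb
    return scores

if __name__ == "__main__":
    print(tournament(["never-cheat", "always-cheat",
                      "randomly-cheat", "tit-for-tat"]))
```

Running this repeatedly with different seeds is a cheap way to see, for example, that always-cheat exploits never-cheat badly in head-to-head play, while tit-for-tat limits its losses to one game.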

Tit-for-Tat

The Tit-for-Tat strategy says: choose what your opponent chose the last time you played them. This means each turtle must maintain a history list of N Booleans, where N is the number of turtles:

history = [true true false ... true]

A turtle's choice is simply the item in the history list at position p, where p is the id (who) of the opponent.

Of course, a turtle must replace this item with the opponent's choice after each game.
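The lookup-and-replace bookkeeping above can be sketched in Python; the `Turtle` class and its method names are illustrative, not part of the NetLogo model.

```python
class Turtle:
    def __init__(self, who, n_turtles):
        self.who = who
        # One Boolean per turtle: what that turtle chose against me
        # the last time we played.  Start "nice": nobody has cheated yet.
        self.history = [False] * n_turtles

    def tit_for_tat_choice(self, opponent):
        # Choose whatever the opponent chose last time we played them.
        return self.history[opponent.who]

    def record(self, opponent, opponent_choice):
        # After each game, replace the stored item with the
        # opponent's latest choice.
        self.history[opponent.who] = opponent_choice
```

For example, a turtle initially cooperates with everyone; once an opponent's cheat is recorded, the next `tit_for_tat_choice` against that opponent returns `True` (cheat back).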

Vindictiveness

In a society of cheaters and saints, the cheaters will always win. So why are there any saints at all? The answer: vindictiveness. Vindictiveness is the tendency for an agent who has been cheated to punish his cheater.

There are a couple of caveats. First, punishing a cheater isn't free. There is a fixed fitness cost. For example, the cheater might not like getting punished and might punch the punisher in the nose!

Second, cheaters don't punish other cheaters. This is the thieves' honor code.

Add vindictiveness to the model.

In this extension every turtle has a vindictiveness attribute set to a random number less than the society's maximum vindictiveness:

set vindictiveness random max-vindictiveness

Where max-vindictiveness is a global set by a slider.

If a turtle has been cheated by his opponent, but didn't himself cheat, then in addition to fitness points awarded, he does this:

if random 100 < vindictiveness
[
   set fitness fitness - punishment-cost
   ask opponent [set fitness fitness - punishment]
]

Where punishment and punishment-cost are also globals set by sliders.
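The punishment step can be sketched in Python, assuming (as the NetLogo snippet suggests) that vindictiveness is on a 0 to 100 scale. The function name and the dict representation of a turtle are illustrative.

```python
import random

def maybe_punish(victim, cheater, punishment, punishment_cost):
    """Sketch: a cheated (non-cheating) turtle punishes its cheater
    with probability vindictiveness / 100.  `victim` and `cheater`
    are dicts with 'fitness' and 'vindictiveness' keys."""
    if random.randrange(100) < victim["vindictiveness"]:
        victim["fitness"] -= punishment_cost   # punishing isn't free
        cheater["fitness"] -= punishment       # the cheater pays more
```

A maximally vindictive turtle (vindictiveness = 100) always punishes; a turtle with vindictiveness 0 never does.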

Of course, if the opponent cooperated but the active turtle cheated, then the opponent might punish the active turtle.

Genetically grown strategies

In the genetic version of PD turtles mate every 100 ticks.

All turtles use the same kind of strategy. Here's how it works. A turtle remembers the last three choices made by every other turtle:

history = [[true true false] [false true true] ...]

Note that for any opponent, there are eight possible histories.

A turtle's strategy maps the opponent's history to a Boolean value. This is the turtle's choice. For example, suppose the active turtle's opponent is the turtle with id = 1, and this opponent's history is [false true true]. This means the last time the active turtle played this opponent, the opponent chose false, and the previous two times he chose true. The active turtle's strategy might map this history to false:

strategy: [false true true] -> false

This means the active turtle will not cheat turtle #1 in the next game.

One way to implement this is to define strategy to be a list of eight random Booleans:

strategy = [false false true false false true true false]

We can use the opponent's history to compute an index into this list: translate false = 0 and true = 1, then read the history as a binary number. For example:

[false true true] = 2 * (2 * 0 + 1) + 1 = 3

Our choice is then:

item 3 strategy = false

Of course we must update the opponent's history list after each game.
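The index computation and the history update can be sketched in Python; the function names are illustrative. The most recent choice comes first in the history and is the most significant bit.

```python
def history_index(history):
    """Map a 3-item Boolean history (most recent choice first) to an
    index in 0..7 by reading it as a binary number: false = 0, true = 1."""
    index = 0
    for b in history:
        index = 2 * index + (1 if b else 0)
    return index

def updated_history(history, latest_choice):
    """After a game, push the opponent's latest choice onto the front
    of the history and drop the oldest entry."""
    return [latest_choice] + history[:-1]
```

With the example strategy list from above, `history_index([False, True, True])` is 3, and item 3 of the strategy is false.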

Mating

When mating, the active turtle invokes hatch, then dies. This is a form of population control that keeps the number of turtles fixed.

hatch 1
  [
    set strategy hatchling-strategy ; spliced from the parents beforehand
    set fitness random 100
    set history []
    repeat count turtles
    [
      set history fput [true true true] history ; fresh, all-true histories
    ]
    set id hatchling-id ; inherit the dying parent's id
  ]
  die

Each turtle has a programmer-defined id number. This is different from the system-defined id number called who. The id number is used to select the opponent history from the history list. The hatchling inherits the id number of the dying parent.

The hatchling's strategy is computed by appending the last 8 - N entries of the candidate's strategy to the first N entries of the active turtle's strategy, where N is a random number below 8. This is genetic splicing. Next, a random item in the hatchling's strategy is flipped with a small probability p. This is mutation.
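The splice-and-mutate step can be sketched in Python; the function and parameter names are illustrative, and the mutation rate shown is an arbitrary small value.

```python
import random

def hatchling_strategy(parent, mate, mutation_rate=0.01):
    """Sketch of genetic splicing: take the first N genes from the
    parent and the remaining 8 - N from the mate, with N drawn below 8;
    then flip one random gene with a small probability."""
    n = random.randrange(8)          # N in 0..7
    child = parent[:n] + mate[n:]    # genetic splicing
    if random.random() < mutation_rate:
        i = random.randrange(8)
        child[i] = not child[i]      # mutation: flip one random gene
    return child
```

With mutation turned off, the child is always a prefix of the parent's strategy followed by a suffix of the mate's.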