Velocity Reviews

Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   Ruby (http://www.velocityreviews.com/forums/f66-ruby.html)
-   -   [SUMMARY] Paper Rock Scissors (#16) (http://www.velocityreviews.com/forums/t819314-summary-paper-rock-scissors-16-a.html)

Ruby Quiz 01-27-2005 01:59 PM

[SUMMARY] Paper Rock Scissors (#16)
 
This week's quiz is a classic computer science problem in disguise. It's
generally done with the Prisoner's Dilemma:

http://plato.stanford.edu/entries/prisoner-dilemma/

The game chosen doesn't much matter, but the idea is that there really shouldn't
be much strategy involved. In the Prisoner's Dilemma, it's generally agreed
that it's hard to beat a player that confesses every time. For the game of
Paper Rock Scissors, the winning strategy is to be purely random, as Benedikt
Huber explained on Ruby Talk:

You can't give any predictions on the next move of a random player.
Therefore you have a 1/3 prop. to choose a winning, losing
or drawing move.

To be fair, Paper Rock Scissors does have quite a bit of strategy theory these
days, but the conditions of that theory (mostly body language) are unavailable
to computer players. Entire books have been written on the subject, believe it
or not:

http://www.worldrps.com/

So random is the best we can do? Is that hard to build? Uh, no. Here's a
sample by Avi Bryant:

class AJBRandomPlayer < Player
def choose
[:paper, :scissors, :rock][rand(3)]
end
end

If we test that, we get the expected 50/50 results:

AJBRandomPlayer vs. JEGPaperPlayer
AJBRandomPlayer: 511.0
JEGPaperPlayer: 489.0
AJBRandomPlayer Wins
AJBRandomPlayer vs. JEGQueuePlayer
AJBRandomPlayer: 499.5
JEGQueuePlayer: 500.5
JEGQueuePlayer Wins

Of course, that's so uninteresting, we're probably beginning to wonder if
James's quiz selecting skills are on the fritz. Possibly, but interesting
solutions make me look good none-the-less. This week, Christian Neukirchen sent
in more than one of those:

CNBiasInverter: Choose so that your bias will be the inverted
opponent's bias.

CNIrrflug: Pick a random choice. If you win, use it again; else,
use a random choice.

CNStepAhead: Try to think a step ahead. If you win, use the choice
where you'd have lost. If you lose, you the choice where you'd
have won. Use the same on draw.

CNBiasFlipper: Always use the choice that hits what the opponent
said most or second-to-most often (if the most often choice is not
absolutely prefered).

CNBiasBreaker: Always use the choice that hits what the opponent
said most often.

CNMeanPlayer: Pick a random choice. If you win, use it again; else,
use the opponent's choice.

I really should show all of those here, but that would make for a ridiculously
large summary. Let's go with Christian's favorite:

class CNBiasInverter < Player
def initialize(opponent)
super
@biases = {:rock => 0, :scissors => 0, :paper => 0}
@hit = {:rock => :paper, :paper => :scissors, :scissors => :rock}
end

def choose
n = ::Kernel.rand(@biases[:rock] + @biases[:scissors] +
@biases[:paper]).to_i
case n
when 0..@biases[:rock]
:paper
when @biases[:rock]..@biases[:rock]+@biases[:scissors]
:rock
when @biases[:rock]+@biases[:scissors]..@biases[:rock]+
@biases[:scissors]+@biases[:paper]
:scissors
else
p @biases[:rock]+@biases[:scissors]..@biases[:paper]
abort n.to_s
end
end

def result( you, them, win_lose_or_draw )
@biases[them] += 1
end
end

initialize() sets up the a Hash for tracking the biases. (I don't believe @hit
is needed.) result() is the compliment to that. It adjusts the proper bias
count each time the opponent makes a selection.

choose() does all the interesting work. A random number is chosen between 0 and
the total of all the bias counts. That number is then associated with the
indicated bias by some clever use of ranges and the opposite of that bias is
returned as CNBiasInverter's choice.

In other words, as the opponent chooses more and more of a particular item, the
bias count for that item climbs. This will cause the semi-random choice to
drift towards the opposite of that favored move.

Let's compare with our baseline:

CNBiasInverter vs. JEGPaperPlayer
CNBiasInverter: 995.0
JEGPaperPlayer: 5.0
CNBiasInverter Wins
CNBiasInverter vs. JEGQueuePlayer
CNBiasInverter: 653.5
JEGQueuePlayer: 346.5
CNBiasInverter Wins

The results are getting better. But, of course, random is still trump:

AJBRandomPlayer vs. CNBiasInverter
AJBRandomPlayer: 509.5
CNBiasInverter: 490.5
AJBRandomPlayer Wins

There were many, many interesting strategies, like the one above. But random
remained the great equalizer. Which leads us to the critical question: What
exactly is the point of this exercise?

Cheating, of course!

With the Prisoner's Dilemma and this quiz, it's common the engineer the
environment to be ripe for cheating. Since there's no winning strategy
available, we'll need to bend the rules a little bit. That's because
programmers have enormous egos and can't stand to lose at anything!

(Note: Technically, no one even cheated. The definition of cheat that applies
here is, "to violate rules dishonestly." Go back and reread the quiz, if you
need to...)

What's the ultimate cheat? Well, here's the first by Bill Atkins:

class BACheater < Player
def initialize opponent
Object.const_get(opponent).send :define_method, :choose do
:paper
end
end

def choose
:scissors
end
end

It doesn't get much simpler than that! Bill's initialize() method uses the
passed in name of the opponent to locate the correct Class object and redefine
the choose() method of that Class to something super easy to deal with. The
opponent is modified to always throw :paper and BACheater always throws
:scissors.

That's 100% successful against anything we've seen thus far. Worse, you're
player is permanently modified when it goes up against BACheater, leaving you
vulnerable to clever strategies like CNBiasInverter above:

AJBRandomPlayer vs. BACheater
AJBRandomPlayer: 0
BACheater: 1000
BACheater Wins
AJBRandomPlayer vs. CNBiasInverter
AJBRandomPlayer: 4.5
CNBiasInverter: 995.5
CNBiasInverter Wins
BACheater vs. CNBiasInverter
BACheater: 1000
CNBiasInverter: 0
BACheater Wins

Ouch!

Another cheat used by more than one player was to try and predict an opponent's
move, then respond with a counter. Here is Benedikt Huber's version:

KILLER = { :rock => :paper,
:paper => :scissors,
:scissors => :rock }

class BHCheatPlayer < Player

def initialize( opponent )
super
@opp = Object.const_get(opponent).new(self)
end

def choose
KILLER[@opp.choose]
end

def result(you,them,result)
@opp.result(them,you,result)
end

end

Again initialize() retrieves the Class object, but instead of modifying the
Class, it simply creates an internal copy of the opponent. result() forwards
each pick to the copied opponent, to keep it synchronized with the real
opponent. From there, choose() is obvious: See what the opponent is about to
do and counter.

It was pointed out on Ruby Talk that this doesn't demolish random players;
however, against any random strategy, this becomes a random player. Countering
a random choice is a still a random move, even if the choice isn't what the
opponent is about to do.

There were other great cheats. Jannis Harder's self-repairing program and
FGSpyPlayer (well commented) are both worth studying. Some cheating approaches
were also overlooked. For example, no one tried to modify the score, but it can
be done. There were also a lot of excellent non-cheating solutions. Which
leaves me with no choice but to say the lame but utterly true, "All the
submissions are worth a look!"

My thanks to all my fellow cheaters and the rule abiding players that tolerate
our antics.

Tomorrow's quiz was an actual job project of mine, years ago...



martinus 01-27-2005 04:39 PM

Re: [SUMMARY] Paper Rock Scissors (#16)
 
> BTW, I'd be interested if anyone found a nicer looking solution to
> do a weighted rand.


What you are doing is called Roulett wheel selection, which is often
used as a selection algorithm when using genetic algorithms. I think
this is a bit nicer:

def choose
# sum up all values
sum = @biases.values.inject(0) {|sum, val| sum + val }
# choose something
search_sum = rand(sum)
# count up until found
current_sum = 0
found = @biases.detect do |key, value|
current_sum += value
current_sum > search_sum
end
found[0]
end

martinus



All times are GMT. The time now is 05:09 PM.

Powered by vBulletin®. Copyright ©2000 - 2014, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.