How to Survive on a Desert Island
March 7, 2012
For the last few months, a team and I have been aggressively competing* in the 2nd Social Learning Strategies Tournament. Here’s what it’s all about:
Suppose you find yourself in an unfamiliar environment where you don’t know how to get food, avoid predators, or travel from A to B. Would you invest time working out what to do on your own, or observe other individuals and copy them? If you copy, who would you copy? The first individual you see? The most successful individual? The most common behaviour? Do you always copy, or do so selectively? If you could refine behaviours, would you invest time in that or let others do it for you? What if you then migrated – would you rely on your existing knowledge, or copy the locals?
The team consisted of a rocket scientist, a mathematician, a genetic engineer, and me. Fortunately, the other three had enough brainpower to help us put together something interesting to submit.
The deadline for submission was Feb 28, 2012. Our team ended up using Bayesian economics to put together a competitor. If you’re interested, the abstract overview is below.
Bayes_Bots makes decisions based on the expected payoff of the moves in her arsenal: Observe, Innovate, Exploit, and, in the appropriate extension, Refine. To decide which move to use, Bayes_Bots looks at the distributions of the payoffs learned from Innovate and Observe. She uses Bayesian inference to learn these distributions: she assumes that the payoffs learned from Innovate and from Observe can each be modeled by an exponential distribution and that, given a distribution on the payoffs associated with each arm, the means of the Observed distributions follow a Beta distribution. Bayes_Bots discounts older information as less reliable, using Pc as the probability that a given strategy’s payoff changes.
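To make the discounting idea concrete, here is a minimal Python sketch. It is not the code from the entry. It assumes that a payoff changes independently each round with probability Pc, so a payoff learned `age` rounds ago is still valid with probability (1 - Pc)^age, and that a payoff which has changed is worth some fallback mean. The function names, the `repertoire` layout, and the fallback rule are all hypothetical and only illustrate the idea.

```python
from typing import Dict, Tuple


def still_valid_prob(pc: float, age: int) -> float:
    """Chance that a payoff learned `age` rounds ago is unchanged,
    assuming it changes independently each round with probability pc."""
    return (1.0 - pc) ** age


def discounted_payoff(payoff: float, age: int, pc: float,
                      fallback_mean: float) -> float:
    """Expected current payoff of a remembered act.

    Assumption: if the payoff has changed, we expect `fallback_mean`
    (for example, the estimated mean of the payoff distribution).
    """
    w = still_valid_prob(pc, age)
    return w * payoff + (1.0 - w) * fallback_mean


def best_known_act(repertoire: Dict[str, Tuple[float, int]], pc: float,
                   fallback_mean: float) -> Tuple[str, float]:
    """Pick the act with the highest discounted expected payoff.

    `repertoire` maps an act name to (last seen payoff, age in rounds).
    """
    scored = {
        act: discounted_payoff(payoff, age, pc, fallback_mean)
        for act, (payoff, age) in repertoire.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]


# Act "a" once paid more, but the information is five rounds old.
repertoire = {"a": (10.0, 5), "b": (8.0, 0)}
best, value = best_known_act(repertoire, pc=0.2, fallback_mean=2.0)
```

With these made-up numbers, the fresher act "b" beats the stale act "a", even though "a" had the higher payoff when it was last seen.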
Bayes_Bots will Innovate rarely. However, she will always Innovate on her first turn; this will help provide new raw information to the collective population of agents.
Observe_who. In the observe_who extension, Bayes_Bots will not change her strategy. The assumption is that information from all other agents in the field is equally valuable, regardless of their age, the number of times they’ve been observed, etc.
Refine. Bayes_Bots will Refine one of her high-payoff moves at least once, in order to understand what benefit that might have to her overall expected payoffs. Otherwise, Bayes_Bots will not change her strategy; if other agents refine their strategies, Bayes_Bots will learn the refined payoff.
Localization/Demes. When Bayes_Bots changes to a new deme, she will discard information about the distribution of payoffs from observed strategies. She will retain information regarding the distribution of payoffs from innovated strategies, as well as the distribution of the means of the observed strategies, as these pieces of information are assumed to be useful across all demes.
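The deme rule above amounts to a selective reset of what the agent remembers. Here is a rough Python sketch of that bookkeeping, under the assumption that the three kinds of information are stored separately; the `Beliefs` class and its field names are hypothetical and are not taken from the entry.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Beliefs:
    """Hypothetical container for what the agent has learned so far."""
    innovate_payoffs: List[float] = field(default_factory=list)  # kept across demes
    observe_payoffs: List[float] = field(default_factory=list)   # deme-specific
    observe_means: List[float] = field(default_factory=list)     # kept across demes

    def on_deme_change(self) -> None:
        """Discard only the payoffs learned by observing the old deme."""
        self.observe_payoffs = []


beliefs = Beliefs(
    innovate_payoffs=[3.0, 7.5],
    observe_payoffs=[12.0, 9.0],
    observe_means=[10.5],
)
beliefs.on_deme_change()
```

After the call, `beliefs.observe_payoffs` is empty, while the innovated payoffs and the observed means are unchanged.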
If you want to read the full entry, let me know – I’m happy to share out the doc. It also has our very complex math and equally complex Python code.



