## Introduction

I listened to the radio this morning and the host was talking about the Grand National, which is today. Perhaps because I've been reading about a model of the stock market in The Social Atom, I started to think about how to create a simulation of a horse race and how bookies would determine the odds they offer. This is what I came up with using Python.

## The horses

I modelled horses as simply a list of numbers. The race will be modelled as a turn-based simulation (or game), in which each horse moves a distance randomly selected from its list. For example, horse A may have a list [1, 2, 3] and move 3 units, while horse B may have a list [2, 4, 6] and move 2 units. On average, we would expect horse B to move twice as fast as horse A.
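
As a quick check of the claim about averages, here is a minimal sketch that moves both example horses for many turns and compares the total distance covered. The names `horse_a`, `horse_b` and `TURNS`, and the fixed seed, are only for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

horse_a = [1, 2, 3]
horse_b = [2, 4, 6]
TURNS = 10000  # illustrative number of turns

# Each turn, a horse moves a distance picked at random from its list.
dist_a = sum(random.choice(horse_a) for _ in range(TURNS))
dist_b = sum(random.choice(horse_b) for _ in range(TURNS))

# Over many turns the ratio should settle close to 2.
print("distance ratio B/A: %.2f" % (dist_b * 1.0 / dist_a))
```

Over a handful of turns the ratio can be far from 2, which is exactly the element of chance the race relies on.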

```
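import random

NUM_HORSES = 8    # number of horses in the race
HORSE_SPEEDS = 8  # number of possible speeds per horse
# Map each horse's name to its list of possible speeds per turn.
horses = dict(('horse %s' % h, [random.randint(8, 16) for n in range(HORSE_SPEEDS)])
              for h in range(NUM_HORSES))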
```

The above code creates a dictionary of 8 horses called "horse 0" to "horse 7" (I used a dictionary so that I can give them silly horse names, just like in real life, if I want). Each horse is a list of 8 numbers from 8 to 16, which should give a reasonable spread of speeds around a mean of 12. We can see the mean speed of each horse like so:

```
# Print the horses in name order with their mean speed.
for name, horse in sorted(horses.items()):
    print("%s: %.3f" % (name, sum(horse) * 1.0 / HORSE_SPEEDS))
```

This will output something like:

```
horse 0: 11.875
horse 1: 12.500
horse 2: 12.750
horse 3: 12.375
horse 4: 12.375
horse 5: 12.125
horse 6: 11.000
horse 7: 12.125
```

## The race

A race is simulated by defining the position of each horse along a track with a certain length. The position starts as a random number less than one to avoid draws. Then the position of each horse is increased by a random number picked from their list of speeds plus a random number less than one. This is repeated until one of the horses crosses the finish line. If more than one horse crosses the finish line in a given turn, then the winner will be the one that crosses by the largest margin.

```
def race(horses, distance):
    # Start each horse at a random position below 1 to avoid draws.
    positions = dict((name, random.random()) for name in horses)
    running = True
    while running:
        # Every horse moves each turn, including the turn in which one finishes.
        for name, horse in horses.items():
            positions[name] += random.choice(horse) + random.random()
            if positions[name] > distance:
                running = False
    return positions
```

To get the order in which the horses come in (though we'll only be interested in the winner), we can sort the horses using the positions:

```
positions = race(horses, 2000)
# Sort the names by final position, furthest along the track first.
horse_order = sorted(horses, key=lambda name: positions[name], reverse=True)
for name in horse_order:
    print("%s: %.3f" % (name, sum(horses[name]) * 1.0 / HORSE_SPEEDS))
```

The output is now the list of horses and their average speed in order of their position in the race. For example:

```
horse 2: 12.750
horse 3: 12.375
horse 7: 12.125
horse 1: 12.500
horse 4: 12.375
horse 5: 12.125
horse 0: 11.875
horse 6: 11.000
```

As you can see, the horses with the higher average speeds tend to do better, but it isn't guaranteed, which is just how we want it. Increasing the length of the race will reduce the effect chance has on the outcome; shorter races will make the outcome more random.
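
To see the effect of race length, here is a rough sketch that runs many races over different distances and counts how often each horse wins. The two horses in `demo_horses`, the helper `win_rate`, the distances and the number of trials are all made up for illustration; `race` repeats the logic of the function above so the block runs on its own.

```python
import random

def race(horses, distance):
    # Same turn-based race as above, repeated here so the sketch is self-contained.
    positions = dict((name, random.random()) for name in horses)
    running = True
    while running:
        for name, speeds in horses.items():
            positions[name] += random.choice(speeds) + random.random()
            if positions[name] > distance:
                running = False
    return positions

def win_rate(horses, distance, trials):
    """Fraction of races won by each horse over a number of simulated races."""
    wins = dict((name, 0) for name in horses)
    for _ in range(trials):
        positions = race(horses, distance)
        winner = max(positions, key=positions.get)
        wins[winner] += 1
    return dict((name, wins[name] * 1.0 / trials) for name in horses)

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical horses: 'fast' averages 13 per turn, 'slow' averages 12.
demo_horses = {'fast': [10, 12, 14, 16], 'slow': [9, 11, 13, 15]}

results = {}
for distance in (50, 500, 5000):
    results[distance] = win_rate(demo_horses, distance, 200)
    print("distance %4d: fast wins %.2f of races" % (distance, results[distance]['fast']))
```

The faster horse should win a clearly larger share of the long races than of the short ones, although the exact figures depend on the seed and the number of trials.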

The race could easily be given a graphical representation showing the horses' positions changing each turn, thus making the simulation more of a game.
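
As a very rough idea of what that might look like, here is a text-only sketch that draws each horse on a fixed-width track. The function `render_track`, its `width` parameter and the demo positions are my own inventions for illustration. To animate a race, it would have to be called once per turn from inside the race loop.

```python
def render_track(positions, distance, width=40):
    """Return one text line per horse, with '>' marking its place on the track."""
    lines = []
    for name in sorted(positions):
        # Clamp so horses past the finish line are drawn at the end of the track.
        progress = min(max(positions[name] / float(distance), 0.0), 1.0)
        column = int(progress * (width - 1))
        track = '.' * column + '>' + '.' * (width - 1 - column)
        lines.append('%-8s |%s|' % (name, track))
    return lines

# Hypothetical positions part-way through a race of length 100.
demo_positions = {'horse 0': 0.0, 'horse 1': 50.0, 'horse 2': 120.0}
demo_lines = render_track(demo_positions, 100, width=11)
for line in demo_lines:
    print(line)
```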

## The punters

The punters (people who bet on horses) are modelled as objects in which each horse is assigned a value. How well this value reflects the true speed of the horse depends on a punter's skill level. Punters with a skill of 0 give each horse a random value. Punters with a skill of *n*, for *n*>0, assign a value to each horse equal to the sum of *n* randomly picked speeds for that horse. So a punter with a skill of 3 will get to see three random speeds of a horse. A random value of less than 1 is added to avoid draws. For now, punters pick a horse simply by finding the one to which they have assigned the highest value.

Punters have a certain amount of money to bet and I decided to make this dependent on their skill as I figured that people with less skill at picking horses are less likely to bet large amounts of money. This may or may not be true in real life, but I suspect that people who pick horses randomly are less likely to gamble a large sum of money.

```
NUM_PUNTERS = 256

class Punter:
    max_skill = 4

    def __init__(self, horses):
        self.skill = random.randint(0, self.max_skill)
        # More skilful punters may have more money to bet.
        self.money = random.randint(10, 20 + self.skill * 10)
        # Value each horse as the sum of `skill` sampled speeds, plus a
        # random fraction to avoid draws.
        self.assessments = dict(
            (name, random.random() + sum(random.sample(horse, self.skill)))
            for name, horse in horses.items())
        self.pick = self.pickHorse()

    def pickHorse(self):
        # Back the horse with the highest assessed value.
        return max(self.assessments, key=self.assessments.get)

punters = [Punter(horses) for n in range(NUM_PUNTERS)]
```

We can now count how many bets each horse gets with:

```
# Count how many punters picked each horse.
picks = dict((name, [punter.pick for punter in punters].count(name)) for name in horses)
for name in horse_order:
    print("%s: %.3f (%d)" % (name, sum(horses[name]) * 1.0 / HORSE_SPEEDS, picks[name]))
```

The output now puts the number of people who bet on each horse in brackets. For example:

```
horse 5: 13.625 (72)
horse 4: 13.125 (44)
horse 1: 12.875 (44)
horse 6: 12.750 (24)
horse 2: 12.375 (43)
horse 3: 12.000 (8)
horse 0: 11.875 (11)
horse 7: 11.500 (10)
```

In this case, the winning horse attracted the most bets, but there was a fairly even spread for the next four horses. We can go further by looking at the skill level of each punter betting on each horse:

```
skills = dict((name, [punter.skill for punter in punters if punter.pick == name]) for name in horses)
for name in horse_order:
    # Guard against dividing by zero when nobody backed a horse.
    mean_skill = sum(skills[name]) * 1.0 / max(len(skills[name]), 1)
    print("%s: %.3f (%d, %.2f)" % (name, sum(horses[name]) * 1.0 / HORSE_SPEEDS, picks[name], mean_skill))
```

Now the output includes the average skill level of punters who bet on each horse. For example:

```
horse 1: 13.000 (48, 2.60)
horse 6: 13.125 (79, 2.49)
horse 4: 12.250 (35, 1.80)
horse 3: 12.250 (23, 1.91)
horse 7: 11.625 (26, 1.31)
horse 0: 11.625 (27, 1.52)
horse 2: 10.250 (6, 0.33)
horse 5: 10.125 (12, 0.25)
```

It is clear that skill has a big effect on how likely punters are to pick the correct horse. In this case, the second fastest horse won, but the punters that backed it were the most skillful group.
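
One way to put a number on this is to simulate many punters at each skill level and count how often they back the horse with the highest mean speed. The sketch below makes several assumptions of its own: it uses four fixed, made-up horses rather than the random ones above, it treats "the correct horse" as the one with the highest mean speed rather than the race winner, and the names `pick`, `hit_rate` and `TRIALS` are only for illustration.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

MAX_SKILL = 4
TRIALS = 2000  # illustrative number of simulated punters per skill level

# Hypothetical horses; 'horse 1' has the highest mean speed (12.5).
demo_horses = {
    'horse 0': [8, 9, 10, 11, 12, 13, 14, 15],
    'horse 1': [9, 10, 11, 12, 13, 14, 15, 16],
    'horse 2': [8, 8, 10, 12, 12, 14, 16, 16],
    'horse 3': [8, 10, 11, 12, 12, 13, 14, 16],
}

def mean(values):
    return sum(values) * 1.0 / len(values)

def pick(horses, skill):
    """Return the horse a punter of the given skill would back."""
    assessments = dict(
        (name, random.random() + sum(random.sample(speeds, skill)))
        for name, speeds in horses.items())
    return max(assessments, key=assessments.get)

fastest = max(demo_horses, key=lambda name: mean(demo_horses[name]))

# Fraction of punters at each skill level who back the fastest horse.
hit_rate = {}
for skill in range(MAX_SKILL + 1):
    hits = sum(1 for _ in range(TRIALS) if pick(demo_horses, skill) == fastest)
    hit_rate[skill] = hits * 1.0 / TRIALS
    print("skill %d: %.3f" % (skill, hit_rate[skill]))
```

With four horses, punters of skill 0 should land on the fastest horse about a quarter of the time, and the rate should rise with skill.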

## The bookies

I intend to include one or two bookies. To start with, I think they should have no prior knowledge of the horses, and set the odds for each horse at 7 to 1. They will then update their odds after each bet to take into account the information for that bet. Ideally they would change the odds to minimise their losses no matter what the outcome but I'm not quite sure how best to do this.
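
I haven't settled on an approach, but one possible starting point is to price each horse from the money staked so far, roughly in the style of a pool (parimutuel) system. The sketch below is only an assumption about how a bookie could work, not a finished design: the class `Bookie`, its `seed_stake` and `margin` parameters and the bet amounts are all hypothetical. Equal notional stakes on 8 horses with no margin give the starting odds of 7 to 1.

```python
class Bookie:
    """Hypothetical bookie that re-prices horses from the money staked so far."""

    def __init__(self, names, seed_stake=10.0, margin=0.0):
        # Equal notional stakes on every horse give even starting odds.
        self.stakes = dict((name, seed_stake) for name in names)
        self.margin = margin  # share of the pool the bookie keeps

    def odds(self, name):
        """Odds 'to 1' (profit per unit staked) currently offered on a horse."""
        pool = sum(self.stakes.values()) * (1.0 - self.margin)
        return pool / self.stakes[name] - 1.0

    def place_bet(self, name, amount):
        """Accept a bet at the current odds, then update the book."""
        offered = self.odds(name)
        self.stakes[name] += amount
        return offered

names = ['horse %s' % h for h in range(8)]
bookie = Bookie(names)

start = bookie.odds('horse 0')
taken = bookie.place_bet('horse 0', 20.0)
print("horse 0 before the bet: %.2f to 1" % start)
print("horse 0 after the bet:  %.2f to 1" % bookie.odds('horse 0'))
print("horse 1 after the bet:  %.2f to 1" % bookie.odds('horse 1'))
```

A bet on one horse shortens its odds and lengthens everyone else's. Because earlier bets keep the odds they were offered, this does not by itself guarantee the bookie against a loss.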

Punters will pick a horse based on which one they think they can make the most money from (i.e. the one most undervalued by a bookie). I suspect the punters that place their bets earliest will do best. Or maybe they will only bet on the horse they think most likely to win if its odds seem favourable to them. Bookies might have to change their odds to attract more punters and ensure that at least some of them place bets.

Hopefully once all the bets are taken, the odds should be a reasonable reflection of the true odds.
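
To compare a set of odds with the true chances, it helps to convert between the two. Odds of *x* to 1 imply a win probability of 1/(*x*+1), and the implied probabilities of a fair book sum to 1. The helper names below are my own, and the book of 8 horses at 7 to 1 is just the starting position described above.

```python
def implied_probability(odds_to_one):
    """Win probability implied by odds of `odds_to_one` to 1."""
    return 1.0 / (odds_to_one + 1.0)

def fair_odds(probability):
    """Odds 'to 1' at which a bet breaks even for a given win probability."""
    return 1.0 / probability - 1.0

# Starting book: 8 horses, all at 7 to 1.
book = dict(('horse %s' % h, 7.0) for h in range(8))

# A total above 1 would mean the book is tilted in the bookie's favour.
total = sum(implied_probability(odds) for odds in book.values())
print("sum of implied probabilities: %.3f" % total)
print("fair odds for a 25%% chance: %.1f to 1" % fair_odds(0.25))
```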

To be continued when I get some time to try things out...

## Comments (1)

## Anonymous on Feb. 13, 2013, 5:59 p.m.

Hello,

When do we get to see it? I've been working on various similar projects with real horse racing data.

Keep up the good work.

Cheers.