Equation to get the probability of n dice roll's value
Mark Oates

With a recent review of some old code, I came to find a function that was not working as expected. The function is simple; it rolls a set of n-sided dice and returns the total value.

Since my recent obsession with testing, I sought out to write a test that ensures the returned values match the expected probabilities, but found writing the actual test to be a bit of a challenge.

The test itself creates a DiceRoller and rolls it 10000 times and then asserts the expected number of collected rolls matches the expected probability of those rolls (within a margin of error).

As it stands, it looks like this:

1TEST(DiceRollerTest, roll__returns_a_total_count_of_all_rolled_dice_with_the_expected_probability) 2{ 3 const int SIDES_OF_DIE = 4; 4 const int NUMBER_OF_DIE = 3; 5 int buckets[SIDES_OF_DIE * NUMBER_OF_DIE] = { 0 }; 6 int num_rolls = 10000; 7 8 DiceRoller dice_roller(SIDES_OF_DIE, NUMBER_OF_DIE, 123); // 3 4-sided die 9 10 for (unsigned i=0; i<num_rolls; i++) 11 { 12 int rolled_number = dice_roller.roll(); 13 buckets[rolled_number-1]++; 14 } 15 16 float expected_probabilities[12] = { 0.0, 0.0, 0.015, 0.046, 0.093, 0.156, 0.187, 0.187, 0.156, 0.093, 0.046, 0.015 }; 17 float allowed_testing_margin_of_error = 0.01; 18 19 for (unsigned i=0; i < (NUMBER_OF_DIE * SIDES_OF_DIE); i++) 20 { 21 float probability_met_for_roll = (float)buckets[i] / num_rolls; 22 std::cout << (i+1) << ": " << buckets[i]/(float)num_rolls << std::endl; 23 ASSERT_TRUE(abs(probability_met_for_roll - expected_probabilities[i]) <= allowed_testing_margin_of_error); 24 } 25}

To get buy, I decided to hard-code the expected_probabilities on line 16, but, don't really like that. I'd like to have a proper probability function calculate the expected values for any n-sided dice with x number of dice, and test against that. That way I could arbitrarily test any number combination of n-sided dice.

But, I haven't been able to traverse the rift between the mathematics (the notation of which I still don't quite understand), and an actual program function to represent it.

Anybody have any ideas?

Edgar Reynaldo

Sum the probability manually, and then compare. N dX = X^N possibilities. The problem is summing equivalent values

Elias

Ha, that's the kind of problem they'd have you solve at a Google tech interview!

Instead of figuring out very complex math, you can just enumerate over all possibilities. This would solve your specific example with 3 dice with 4 sides:

```count[13] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
for (dice1 = 1; dice1 <= 4; dice1++) {
for (dice2 = 1; dice2 <= 4; dice2++) {
for (dice3 = 1; dice3 <= 4; dice3++) {
sum = dice1 + dice2 + dice3;
count[sum]++;
}
}
}
```

To code a loop of an arbitrary number of loops, you can use a trick where you have an array of counters:

```int dice[NUMBER_OF_DIE];
while (true) {
int sum = 0;
for (int j = 0; j < NUMBER_OF_DIE; j++) {
sum += dice[j];
}
count[sum]++;

for (int j = 0; j < NUMBER_OF_DIE; j++) {
dice[j]++;
if (dice[j] <= SIDES_OF_DIE) break;
dice[j] = 1;
}
}
```

Basically instead of dice1, dice2, dice3 I use dice[0], dice[1], dice[2] but it works with an arbitrary number.

Putting it all together:

1int dice[NUMBER_OF_DIE]; 2for (int j = 0; j < NUMBER_OF_DIE; j++) dice[j] = 1; 3 4int index = 0; 5int possibilities = pow(SIDES_OF_DIE, NUMBER_OF_DIE); 6int maximum = SIDES_OF_DIE * NUMBER_OF_DIE; 7 8int count[maximum + 1]; 9for (int i = 0; i < maximum + 1; i++) count[i] = 0; 10 11for (int i = 0; i < possibilities; i++) { 12 13 int sum = 0; 14 for (int j = 0; j < NUMBER_OF_DIE; j++) { 15 sum += dice[j]; 16 } 17 count[sum]++; 18 19 for (int j = 0; j < NUMBER_OF_DIE; j++) { 20 dice[j]++; 21 if (dice[j] <= SIDES_OF_DIE) break; 22 dice[j] = 1; 23 } 24} 25 26for (int i = 0; i < maximum + 1; i++) { 27 printf("probability of %d is %f\n", i, count[i] / (double)possibilities); 28}

SiegeLord

The function that returns a probability of an event (an example of an event would be "die fell with 6 dots on top") is called the probability mass function (PMF). When you have two PMFs whose events are integers, the PMF of the sum of those two integers is the convolution of the two PMFs. You can read about convolutions all over the place, it's a very useful operation and happens to be easy to implement.

```std::vector<double> convolve_discrete(const std::vector<double>& pmf1, const std::vector<double>& pmf2)
{
std::vector<double> ret(pmf1.size() + pmf2.size() - 1, 0.0);

for (int i = 0; i < pmf1.size(); i++)
for (int j = 0; j < pmf2.size(); j++)
ret[i + j] += pmf1[i] * pmf2[j];

return ret;
}
```

Note that that function assumes that the integers for each PMF go from 0 to its length - 1. So, for your dice what you do is you pick two dice (whose PMFs are presumably uniform), and use this function to return the PMF of their sum. If you have more dice, then you apply this function recursively.

The advantage of this method over enumeration methods is that it supports biased dice easily, variable number of sides per dice etc. It is also a lot faster than enumeration (this method is O(NK^2 log N), I think, and enumeration is O(K^N) where K is number of sides, and N is the number of dice.