In our brief introduction to probability distributions we talked about rolling dice, so let's stick with that example. Imagine I roll a die three times and each time you try to guess what the outcome will be. What's the probability of you guessing exactly $k$ rolls right, where $k$ is 0, 1, 2 or 3?
More generally, imagine you perform an experiment (eg roll a die) $n$ times, and each time the result can be success or failure. What's the probability you get exactly $k$ successes, where $k$ can be any integer from $0$ to $n$?
In our example, as long as the die is fair, you have a probability of $1/6$ of guessing right. Since the probability of three independent events all happening (ie guessing correctly three times) is the product of the individual probabilities, your probability of three correct guesses is
$$\left(\frac{1}{6}\right)^3 = \frac{1}{216} \approx 0.005.$$
Your probability of guessing wrong is $5/6$. By the same reasoning as above, the probability of getting no guess right is
$$\left(\frac{5}{6}\right)^3 = \frac{125}{216} \approx 0.58.$$
What about the probability of guessing one roll right, so $k=1$? There are three ways in which this could happen:
- You get the 1st roll right and the other two wrong
- You get the 2nd roll right and the other two wrong
- You get the 3rd roll right and the other two wrong
Since each involves one correct and two incorrect guesses, the probability of each of the three scenarios is
$$\frac{1}{6} \times \left(\frac{5}{6}\right)^2 = \frac{25}{216}.$$
And since the probability of any one of three mutually exclusive events occurring is the sum of the individual probabilities, we have that the probability of getting exactly one guess right is
$$3 \times \frac{25}{216} = \frac{75}{216} \approx 0.35.$$
Finally, we look at the probability of two correct guesses, so $k=2$. Again this can happen in three ways (we leave it up to you to work these out). Each individual way has a probability of
$$\left(\frac{1}{6}\right)^2 \times \frac{5}{6} = \frac{5}{216},$$
so the overall probability is
$$3 \times \frac{5}{216} = \frac{15}{216} \approx 0.07.$$
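To double-check these numbers, here is a small Python sketch (an illustration, not part of the original article) that enumerates all eight possible patterns of right and wrong guesses over the three rolls and adds up the probabilities for each value of $k$, assuming a fair die so that a single guess is right with probability $1/6$:

```python
from itertools import product
from fractions import Fraction

p_right = Fraction(1, 6)   # probability of guessing a single roll correctly
p_wrong = Fraction(5, 6)   # probability of getting it wrong

# Enumerate every pattern of correct/incorrect guesses over three rolls and
# add up the probabilities of the patterns with exactly k correct guesses.
probs = {k: Fraction(0) for k in range(4)}
for pattern in product([True, False], repeat=3):
    k = sum(pattern)
    prob = Fraction(1)
    for correct in pattern:
        prob *= p_right if correct else p_wrong
    probs[k] += prob

print(probs)                # 125/216, 25/72 (=75/216), 5/72 (=15/216), 1/216
print(sum(probs.values()))  # 1, as it should be
```

The fractions agree with the ones worked out above (Python prints them in lowest terms, so $75/216$ appears as $25/72$), and they sum to 1 as they should.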
Here's the histogram displaying the distribution.
Now let’s look at the general set-up. You’re doing $n$ experiments that can each end in success or failure, and you’re asking for the probability that there are exactly $k$ successes among the $n$ experiments. Write $p$ for the probability of success, so $1-p$ is the probability of failure. By the same reasoning as above, a particular sequence of $k$ successes and $n-k$ failures has probability
$$p^k(1-p)^{n-k}.$$
But, also as above, such a sequence can occur in several ways, each way defined by how the $k$ successes are sprinkled in among the $n-k$ failures. It turns out that the number of ways you can sprinkle $k$ objects in among a sequence of $n$ objects, denoted by $\binom{n}{k}$, is given by
$$\binom{n}{k} = \frac{n!}{k!(n-k)!}.$$
Here the notation $m!$, where $m$ is a positive integer, stands for
$$m! = m \times (m-1) \times (m-2) \times \dots \times 2 \times 1$$
(and $0!$ is defined to equal 1). We now have a neat way of writing the probability of $k$ successes:
$$P(k \text{ successes}) = \binom{n}{k} p^k (1-p)^{n-k}.$$
That’s the binomial distribution.
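If you'd like to experiment with the formula, here is a minimal Python sketch (again an illustration, not something from the article) that uses the standard library function `math.comb` for the "n choose k" count:

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials, each with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The dice game: n = 3 guesses, success probability p = 1/6.
for k in range(4):
    print(k, binomial_pmf(k, 3, 1/6))
# Prints roughly 0.579, 0.347, 0.069 and 0.005 -- matching the fractions
# 125/216, 75/216, 15/216 and 1/216 from the worked example above.
```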
The mean of this distribution, also known as the expectation, is
$$np.$$
So in our example above, where $n=3$ and $p=1/6$, the mean is
$$3 \times \frac{1}{6} = \frac{1}{2}.$$
Loosely speaking, this means that if we played our game of guessing three rolls lots and lots of times, then on average you could expect to get half a roll per game right. Or, to phrase it in a way that uses whole numbers, on average you could expect to get one roll right for every two games played.
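As a quick numerical check (just a sketch), summing $k \times P(k)$ over all possible values of $k$ for the dice game gives the same answer as the formula $np$:

```python
from math import comb

n, p = 3, 1/6
# Mean computed directly from the distribution: sum of k * P(k).
mean = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
print(mean)   # approximately 0.5, which is n * p = 3 * (1/6)
```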
The variance of the binomial distribution, which measures how spread out the probabilities are, is
$$np(1-p).$$
So in our example above it is
$$3 \times \frac{1}{6} \times \frac{5}{6} = \frac{15}{36} = \frac{5}{12} \approx 0.42.$$
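The same kind of check works for the variance, summing $(k - \text{mean})^2 \times P(k)$ over all values of $k$ (again an illustrative sketch rather than anything from the article):

```python
from math import comb

n, p = 3, 1/6
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
mean = sum(k * pk for k, pk in enumerate(pmf))
# Variance computed directly from the distribution: sum of (k - mean)^2 * P(k).
variance = sum((k - mean)**2 * pk for k, pk in enumerate(pmf))
print(variance)   # approximately 0.417, which is n * p * (1 - p) = 5/12
```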
The shape of the binomial distribution depends on the value of the mean and the number of experiments. Here are some more examples:
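If you want to explore how the shape changes yourself, the following sketch computes the distribution for a few combinations of $n$ and $p$ (the combinations are chosen here purely for illustration, not taken from the figures):

```python
from math import comb

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example (n, p) combinations chosen for illustration.
for n, p in [(10, 0.5), (10, 0.1), (50, 0.5)]:
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    peak = max(range(n + 1), key=lambda k: probs[k])
    print(f"n={n}, p={p}: mean={n * p}, most likely number of successes={peak}")
```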