*[Continued from Grasping the normal distribution (Part 1)]*
**What are the odds?**
A good definition for the word "probability" is hard to find. The ones I've found use synonyms like likelihood or chance, so the definition is circular. Fortunately, most of us have an innate understanding of the concept. The probability that the Sun will come up tomorrow is pretty high. The probability that I'll hit the Powerball jackpot, be asked for a date by Kate Upton, and be hit on the head by a meteorite—all in the same day—is vanishingly small.
If you really want to learn about probability, you don't need to go to Yale or Harvard. You only need to study under a professional gambler, like those fictitious, Runyonesque characters in *Guys and Dolls*. Nobody can compute probabilities in his head like a gambler. Only he calls them "the odds."
Ask one of these guys what the odds are for a tossed coin coming up heads, and he'll immediately say "50/50." That's his way of saying that there is no preference for one result over the other. Tossed many times, the coin will come up heads, on average, half the time. We'd say the probability is 1/2. That's 50% to the gambler.
Faced with the same question, a mathematician might define a *probability function*:
$$P(\text{heads}) = P(\text{tails}) = \frac{1}{2} \tag{2}$$
Since the two probabilities are equal, we say that the distribution is *uniform*. The gambler would say that the coin is *fair*.
The roll of a single die is also fair. Except for the number of dots, all the faces are made just alike, so there's no reason to suppose that any one of them will come up more often than any other. The gambler would say that the odds of a tossed die showing, say, six, are 1 in 6. The mathematician would write
$$P(n) = \frac{1}{6}, \quad n = 1, 2, \ldots, 6 \tag{3}$$
The sets of probabilities in **Equations 2 and 3** are called *probability distribution functions*. For these two cases, the functions are discrete, having values only at the integer mesh points. Try as you might, you're never going to roll some dice and get a value of 3.185295. As with every non-integer result, the probability of that result is 0.
The probabilities of my sun vs. Kate examples are:
$$P(\text{sunrise}) \approx 1, \qquad P(\text{jackpot, date, and meteorite}) \approx 0 \tag{4}$$
From these few and sometimes silly examples, we can get an idea as to what a probability really is. It must be a single scalar number that represents the likelihood that some event will happen. What's more, the value must be constrained to the range
$$0 \le P \le 1 \tag{5}$$
because no event can happen less frequently than never, or more frequently than always.
Note carefully that the probabilities in **Equations 2 and 3** add up to 1. When you flip a coin, you must get *some* result, and the result can only be heads or tails. Landing on its edge is not allowed. Similarly, when you roll a single die, getting a value from 1 through 6 is certain.
On the basis of this sometimes arm-waving argument, we can now give a rigorous definition of a probability. It's:
(6)
**On a roll**
Let's perform a little thought experiment. I'm going to roll a single die six times, and count how many times each face shows up. The result is shown in **Figure 1**, in the form of a *histogram*.
What's that you say? You were expecting to see each face appear once and only once? Well, that's what you'd get if the results were predictable, but they're not. It's a *random* process, remember? The chances of getting one and only one occurrence of each face are about:
$$\frac{6!}{6^6} = \frac{720}{46656} \approx 0.015 \tag{7}$$
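If you'd rather count than take my word for it, here is a minimal sketch of the arithmetic. It assumes a fair die and independent rolls; the function name `prob_each_face_once` is mine, made up for illustration. There are 6! orderings in which every face appears exactly once, out of 6^6 equally likely sequences of six rolls.

```python
from fractions import Fraction
from math import factorial


def prob_each_face_once(sides=6):
    """Chance that `sides` rolls of a fair die show every face exactly once."""
    favorable = factorial(sides)  # orderings in which no face repeats
    total = sides ** sides        # all equally likely sequences of rolls
    return Fraction(favorable, total)


p = prob_each_face_once()
print(p, float(p))  # exact fraction, then its decimal value
```

The exact fraction reduces to 5/324, or roughly one chance in 65.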
If we roll the die a lot more times, we should get a histogram more like what we expect. **Figure 2** shows the result of 6,000 rolls.
Even with so many trials, we still see slight differences between the ideal and actual results. But at this point it probably won't take much to convince you that, on average, the numbers of occurrences are equal. The die is indeed fair, and the probability of rolling any given value is 1/6.
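You can repeat the thought experiment on your own machine. The sketch below is not the program that produced the figures; it's a stand-in that assumes Python's pseudo-random generator is a good enough die. The helper names `roll_histogram` and `print_histogram` are hypothetical, and the seed is fixed only so that repeated runs match.

```python
import random
from collections import Counter


def roll_histogram(num_rolls, seed=None):
    """Roll one simulated fair die `num_rolls` times; return {face: count}."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(num_rolls))
    # Include faces that never came up, so every histogram has six bars.
    return {face: counts.get(face, 0) for face in range(1, 7)}


def print_histogram(hist, width=40):
    """Print a text histogram, scaled so the tallest bar is `width` characters."""
    peak = max(hist.values()) or 1
    for face, count in hist.items():
        bar = "#" * round(width * count / peak)
        print(f"{face}: {bar} {count}")


for n in (6, 6000):
    print(f"{n} rolls")
    print_histogram(roll_histogram(n, seed=1))
```

With six rolls the bars are ragged, and some faces may not show up at all. With 6,000 rolls each count should land near 1,000, though rarely right on it.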
**More dice, please**
So far, the graphs I've shown are rather dull. Six numbers, all equal, are not exactly likely to get your blood pumping. But things get a lot more interesting if you add more dice. **Figure 3** shows the histogram for two dice.
Now we're getting somewhere! At last, the histogram has some character.
Why are there more occurrences of a 7 than of a 2 or a 12? The answer goes right to the heart of probability theory. The unstated rule for a roll with two dice is that we *add* the values of the two dice. When we add them, there can only be one way to get a result of 2: each die has to show a value of 1. Ditto for a sum of 12. But there are six possible ways to get a sum of 7. You can have:
$$1 + 6, \quad 2 + 5, \quad 3 + 4, \quad 4 + 3, \quad 5 + 2, \quad 6 + 1 \tag{8}$$
All six ways must be counted, and the order matters. For example, 1 + 6 and 6 + 1 count as two different ways, not just one. As we can see from the histogram, a result of 7 is six times as likely as a result of 2, or of 12.
If you add the heights of all the bars, you'll get a total of 36. That makes sense; you can arrange the first die in six possible ways. For each of those ways, you can arrange the second in six ways. The total number of ways must be:
$$6 \times 6 = 36 \tag{9}$$
Our gambler friends would say the odds of a 7 are 6 out of 36, or 1 out of 6.
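The counting argument is easy to check by brute force. Here's a minimal sketch that lists all 36 ordered pairs and tallies the sums; the name `two_dice_counts` is just for illustration.

```python
from collections import Counter
from itertools import product


def two_dice_counts():
    """Count the ordered (first die, second die) pairs that give each sum."""
    return Counter(a + b for a, b in product(range(1, 7), repeat=2))


counts = two_dice_counts()
for total in range(2, 13):
    print(f"{total:2d}: {'#' * counts[total]} {counts[total]}/36")
```

Because `product` yields (3, 4) and (4, 3) as separate pairs, order is counted the way the gambler counts it: six ways to make 7, and only one each for 2 and 12.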
Well, if two dice make for a more interesting histogram, maybe we should try three or more. **Figures 4 through 6** show the results for three, four, and five dice, respectively.
What are we looking at here? Can we say "bell curve"?
Let's review the bidding. We started this thought experiment with the simple statistics for a single die—statistics which happen to describe a *uniform distribution*, in which all outcomes are equally likely.
From that simplest of beginnings, we added more dice, always following the rule that the result is the numerical sum of the faces shown on each die. We watched the shapes of the histograms morph from the uniform distribution through the triangular shape of **Figure 3** into the familiar bell-curve shape. It's really quite remarkable that we not only got a histogram of this shape from such primitive beginnings, but the bell-curve shape begins to appear with so few (three to five) dice.
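To watch the morphing for yourself, you can build the ideal counts one die at a time. This is a sketch under the same assumptions as before (fair, independent dice); `sum_distribution` is a hypothetical helper, not anything from a library. Adding a die spreads every existing total across six new totals, which is a discrete convolution.

```python
def sum_distribution(num_dice, sides=6):
    """Return {total: number of ways} for the sum of `num_dice` fair dice."""
    ways = {0: 1}  # with no dice, there is exactly one way to total 0
    for _ in range(num_dice):
        new_ways = {}
        for total, count in ways.items():
            for face in range(1, sides + 1):
                new_ways[total + face] = new_ways.get(total + face, 0) + count
        ways = new_ways
    return ways


for n in (1, 2, 3, 4, 5):
    dist = sum_distribution(n)
    peak = max(dist.values())
    print(f"{n} dice ({sum(dist.values())} outcomes)")
    for total in sorted(dist):
        print(f"{total:2d} {'#' * round(40 * dist[total] / peak)}")
```

Turn the printout on its side and you'll see the flat line, then the triangle, then the beginnings of the bell.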
But if you think that's remarkable, wait till you hear this: We would have gotten the same shape for just about *any* starting distribution! (Strictly speaking, the distribution needs a finite variance, and the fit improves as more results are added.) All we need is some device that produces at least two numbers at random, and the rule that we get the score by adding the individual results.
This truly remarkable result follows from the *central limit theorem*.
No doubt you've already figured out that the shape that these histograms seem to be trending to is the shape of the normal distribution. Now you can see why the normal distribution is so ubiquitous in nature. It's because you almost never see (except in board games) a single source of the noise. Usually the noise is generated by many random processes, all running independently of each other. As long as the outputs of the many sources are added together (as they would be in, say, an electronic circuit), the normal distribution is the inevitable result.
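Here's a rough way to test that claim with something other than a fair die. The sketch below assumes a badly loaded die whose weights I invented for illustration (`VALUES` and `WEIGHTS` are hypothetical), adds up the results from several such sources, and measures what fraction of the sums fall within one standard deviation of the mean. For a true normal distribution that fraction is about 0.68.

```python
import random
import statistics

# Hypothetical loaded die: low faces are far more likely than high ones.
VALUES = [1, 2, 3, 4, 5, 6]
WEIGHTS = [0.40, 0.25, 0.15, 0.10, 0.06, 0.04]


def summed_scores(num_sources, num_trials, seed=None):
    """Each trial adds up `num_sources` independent draws from the loaded die."""
    rng = random.Random(seed)
    return [
        sum(rng.choices(VALUES, weights=WEIGHTS, k=num_sources))
        for _ in range(num_trials)
    ]


def fraction_within_one_sigma(scores):
    """Fraction of scores within one standard deviation of their mean."""
    mean = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    inside = sum(1 for s in scores if abs(s - mean) <= sigma)
    return inside / len(scores)


for k in (1, 2, 5, 30):
    scores = summed_scores(k, 10000, seed=1)
    print(f"{k:2d} sources: {fraction_within_one_sigma(scores):.3f}")
```

This is a crude check, not a proof. The sums are whole numbers, so the fraction won't hit 0.68 exactly, but with 30 sources it should sit close to that value even though a single loaded die looks nothing like a bell.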
*[To be continued at Grasping the normal distribution (Part 3)]*