CHAPTER 6
PROBABILITY

Chapter 5 introduced the concepts of z scores and the normal curve. This chapter takes these two concepts a step further and explains their relationship to another statistical concept, probability. Probability is a highly complex subject, providing sufficient material for entire textbooks and college courses. Despite this complexity, the discussion of probability in this text is limited to the most fundamental concepts needed to provide a sufficient background for the conduct of quantitative political research.

Outcomes from actions and decisions are often uncertain. There are relatively few instances in everyday life when an individual can be absolutely sure that a particular action can result in only one set of consequences. Most choices are made in the context of uncertainty, where decision makers must acknowledge that the consequences of their choice will fall within a range of possible outcomes. The concept of probability involves an effort to evaluate these possible outcomes and determine the likelihood of each within a given context.

Probability is defined for the purposes of this class as the number of times out of 100 that a given outcome can be expected to occur. This represents a proportion that describes the chance that a specific outcome will occur. By definition, all probabilities fall within a range from 0, meaning that there is no chance an outcome can occur, to 1, which indicates complete certainty of a given outcome. In most cases, a realistic assessment of probability will lie somewhere in between.

For example, what is the probability of rolling a die once and getting a six? A die has six sides, each equally likely to land face up. The number six represents one of the six possible outcomes. Therefore, the probability of rolling a die once and getting a
six is 1 in 6 (P = .17, or 17 out of 100). The probability of an interior decorator choosing one particular wallpaper design at random from one hundred designs in a paint store is 1 in 100 (P = .01, or 1 out of 100). If only a Democratic and a Republican candidate are running, and each is assumed equally likely to win, the probability of a Republican being elected President is 1 in 2 (P = .50, or 50 out of 100). What is the probability that the first letter on your license plate will be your middle initial?

Several elementary rules of probability should also be introduced to clarify the process of working with probability. These are the addition rule, the mutually exclusive rule, and the multiplication rule.

The addition rule states that the probability of obtaining any one of several different outcomes equals the sum of their separate probabilities. For example, a card is selected from a deck of playing cards. What is the probability of getting either the ace, king, queen, or jack of hearts? [1] The probability is 1/52 + 1/52 + 1/52 + 1/52 = 4/52 (P = .08) [2], or 8 chances in 100 that the card will be either the ace, king, queen, or jack of hearts. The probability of drawing either a black or a red card is 26/52 + 26/52 = 52/52 = 1 (P = 1.00). Because every card is either black or red, the separate probabilities sum to 1, and one can be sure the card will be one or the other.

When expressed as a proportion, ordinary probability always ranges from 0 to 1. Exact probability is expressed by as many decimals as one's calculator displays. A proportional result of .076923, for example, would be expressed as P = .076923.

The mutually exclusive rule of probability means that no two outcomes can occur at the same time.

[1] The example assumes replacement of the card after each draw.
[2] The result of the division is .076923 but is rounded to two decimal places.
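The addition rule can be checked with a few lines of Python. The sketch below is only an illustration: the function name `addition_rule` is hypothetical (it does not come from this text), and it assumes that the outcomes being combined are mutually exclusive, as in the card examples.

```python
from fractions import Fraction


def addition_rule(*probabilities):
    """Add the probabilities of mutually exclusive outcomes (hypothetical helper)."""
    total = sum(probabilities, Fraction(0))
    if total > 1:
        # A total above 1 signals that the outcomes overlap.
        raise ValueError("mutually exclusive probabilities cannot sum to more than 1")
    return total


one_card = Fraction(1, 52)

# Ace, king, queen, or jack of hearts: four separate cards out of 52.
p_heart_honor = addition_rule(one_card, one_card, one_card, one_card)

# Black or red: 26 cards of each color.
p_black_or_red = addition_rule(Fraction(26, 52), Fraction(26, 52))

print(p_heart_honor, round(float(p_heart_honor), 2))  # 1/13 0.08
print(p_black_or_red)                                 # 1
```

Exact fractions are used so that the rounding to two decimal places happens only once, when the result is reported.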
The probability rule for mutually exclusive outcomes, specifically stated, is that the probability of one or the other of two mutually exclusive outcomes is the sum of their individual probabilities. When a coin is flipped, it must be either a head or a tail. It cannot be both a head and a tail. On a single flip, the probability is 1 out of 2 (P = .50, or 50 of 100) that it will be a head, and the probability is 1 of 2 that it will be a tail. The two events are mutually exclusive because the occurrence of one outcome makes the other outcome impossible. If outcome X and outcome Z are mutually exclusive, then P(X and Z) = 0. Both X and Z cannot happen at the same time. When a card is drawn from a deck, it is either black or red, not black and red. The colors are mutually exclusive.

The third important rule of probability is the multiplication rule. The multiplication rule states that the probability of obtaining a combination of independent outcomes is equal to the product of their individual probabilities. Suppose a penny, nickel, and dime were flipped and landed flat. What is the probability they will all be heads? The probability is .5 x .5 x .5 = .125, or P = .13, so the chances are 13 out of 100 that all three will be heads. If we shuffled a deck of playing cards and turned up the top four cards, the probability of these cards all being aces would be 4/52 x 3/51 x 2/50 x 1/49 = 24/6,497,400 = .0000037, or about 1 chance in 270,725, far less than 1 of 100. Because the cards are not replaced, each fraction reflects the aces and cards remaining after the earlier draws.

One of the most important statistical considerations related to probability is the normal curve. The normal curve is a frequency distribution with 100% of the values in a distribution under the curve. Most of the values occur at the peak of the curve around the mean, and as the curve moves toward either end, it flattens out and includes fewer and fewer values. Consequently, as one moves from the center along the base line, the probability of a value occurring decreases.
For example, the probability is about 68 of 100 (P = .68) that a value in a normal distribution will fall between plus 1 and minus 1 standard deviation units from the mean, about 95 of 100 (P = .95) that the value will fall in the interval between plus 2 and minus 2 standard deviations, and more than 99 of 100 (P = .99) that a value will fall in the area between plus 3 and minus 3 standard deviation units from the mean. These concepts are illustrated in Figure 6:1.

FIGURE 6:1 NORMAL FREQUENCY DISTRIBUTION CURVE

As previously suggested, the probability is P = .50, or 50 chances out of 100, of getting a head on one flip of a coin; P = .02, or 2 out of 100 (1 out of 52), of drawing a particular card from a deck of 52 cards; and P = .01, or 1 out of 100, of an interior decorator choosing one particular design of wallpaper out of 100 designs. Likewise, the probability is 68 out of 100 that any raw value will fall between plus 1 and minus 1 standard deviation units under the normal curve. The following example is a further elaboration of this concept.
If it were known that IQ is normally distributed among a group of college students with a mean of 130 and a standard deviation of 10, then the probability of obtaining an IQ between 130 and 150 by randomly selecting the IQ of a single student can be determined by calculating a z score:

z = (X - mean) / s = (150 - 130) / 10 = 2.00

A student with an IQ of 150 falls 2.00 standard deviation units above the mean. By referring to the Table in Appendix C, we find that 48 percent (47.72 percent, rounded) of the IQ scores for the entire group fall between the mean and 2.00 standard deviation units above the mean. Translated to probability, this means that if there were 100 IQ scores, 48 of them (P = .48) would be expected to fall between 130 and 150. In other words, the probability that a randomly selected IQ score falls between the mean and 2 standard deviations above the mean is 48 of 100. Figure 6:2 is an illustration of this concept.

FIGURE 6:2 PERCENT OF THE CURVE FOR +2.00 STANDARD UNITS
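The IQ example can be reproduced without a printed table. The sketch below is an illustration only: it assumes Python's standard `statistics.NormalDist` class stands in for the Table in Appendix C, and the variable names are chosen here for readability.

```python
from statistics import NormalDist

mean, sd = 130, 10          # values given in the IQ example
iq = NormalDist(mu=mean, sigma=sd)

# z score: subtract the mean from the raw value and divide by the standard deviation.
z = (150 - mean) / sd

# Area under the curve between the mean (130) and the raw value (150).
area_mean_to_150 = iq.cdf(150) - iq.cdf(mean)

print(z)                           # 2.0
print(round(area_mean_to_150, 4))  # 0.4772
```

The computed area of .4772 is the value that rounds to the 48 percent used in the text.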
There are several other research questions that could be raised that are related to these data. For example, what is the probability of an IQ being 150 or above? Since one knows that 50 percent of the curve is above the mean, and 48 percent of the values are between 130 and 150, by subtracting 48 percent from 50 percent, one finds that 2 percent (P = .02) have an IQ of 150 or higher and that the chances are 2 in 100 that a randomly selected IQ will be 150 or higher.

Similarly, what is the probability that an IQ score will fall between 110 and the mean? First, a z score is calculated as follows:

z = (X - mean) / s = (110 - 130) / 10 = -2.00

A student with an IQ of 110 falls 2.00 standard deviation units below the mean. The Table in Appendix C indicates that 48 percent of the values fall between 110 and 130. This is also shown in Figure 6:3.

FIGURE 6:3 PERCENT OF THE CURVE FOR -2.00 STANDARD UNITS
If one also wants to know the probability of obtaining an IQ below 110, one simply subtracts 48 percent from the 50 percent on the negative side of the curve to find that 2 percent, or 2 of 100 (P = .02), could have an IQ of 110 or less.

The addition rule of probability can be applied to this illustration. For example, what is the probability that a randomly selected IQ will be either 150 or above or 110 or below? The separate probabilities are added as follows: P = .02 + .02 = .04. There are 4 chances in 100 that an IQ falls either above 150 or below 110.

The multiplication rule is also applicable. For example, what is the probability that three randomly and independently selected students all have an IQ of less than 110? Using the more precise table value of .0228 for each student, the separate probabilities are multiplied as follows: P = .0228 x .0228 x .0228 = .0000119. There is far less than one chance in one hundred of this occurring. The exact probability is about 119 of 10,000,000.

In this chapter, z scores or standard scores have been used to find probability. The concept of probability as it is related to the normal curve has also been explained. The use of standard deviation and how it relates to z scores and the normal curve should now be apparent. All of these concepts are important building blocks for subsequent statistical analyses and will be further explored in later chapters.

An Illustrative Problem: I have 100 pieces of candy in a jar, labeled 1 through 100. I pull out a piece at random, note the number, and replace the candy in the jar. What is the probability my friend will pull out the same piece of candy? What are the chances she will pull out a higher or lower numbered piece of candy?
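The multiplication rule examples from this chapter (three coins, four aces, and three IQ scores below 110) can be checked with a short Python sketch. The helper name `multiplication_rule` is hypothetical, and the sketch assumes that each probability passed in already accounts for the earlier outcomes, as the shrinking fractions in the four-aces example do.

```python
from fractions import Fraction
from math import prod


def multiplication_rule(probabilities):
    """Multiply the probabilities of outcomes that must all occur (hypothetical helper)."""
    return prod(probabilities)


# Penny, nickel, and dime all landing heads.
p_three_heads = multiplication_rule([Fraction(1, 2)] * 3)

# Top four cards all aces, drawn without replacement: 4/52 x 3/51 x 2/50 x 1/49.
p_four_aces = multiplication_rule(Fraction(4 - i, 52 - i) for i in range(4))

# Three students each with an IQ below 110, using the table value .0228.
p_three_below_110 = multiplication_rule([0.0228] * 3)

print(p_three_heads)                # 1/8
print(p_four_aces)                  # 1/270725
print(f"{p_three_below_110:.7f}")   # 0.0000119
```

Working in exact fractions for the coin and card cases avoids the error that builds up when each factor is rounded to two decimals before multiplying.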
SEQUENTIAL STATISTICAL STEPS: PROBABILITY

Step 1 (s): What is the first step in applying probability to the normal curve? As continued from Chapter 5, the standard deviation must be obtained.

Step 2 (z): What is the z score? Subtract the mean from the raw value of the distribution and divide by the standard deviation. The z score will be a positive or negative value.

Step 3 (Consult Table): What percentage of the values of the distribution are between the raw value and the mean? Find the z score in the table to read this percentage. To find the percentage of values beyond the raw value, subtract the table value from 50%. Remember, the table applies to only 50% of the curve.

Step 4 (Probability): What is the probability that a randomly selected value will fall in this range of values? Since 100% of the values are under the curve, to find probability, simply state the percentage in terms of probability.

Step 5 (Repeat Process): What are the probabilities under the other rules of probability? Addition and multiplication rules may also apply.

Step 6 (Draw Conclusions): Once probability has been determined, decisions can be made about other values.
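Steps 2 through 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not a replacement for the Table in Appendix C: the function name `normal_curve_steps` is hypothetical, `statistics.NormalDist` is assumed to play the role of the table, and the mean of 130 and standard deviation of 10 come from the IQ example in this chapter.

```python
from statistics import NormalDist


def normal_curve_steps(raw_value, mean, sd):
    """Return (z, area between raw value and mean, area beyond raw value)."""
    # Step 2: z score.
    z = (raw_value - mean) / sd
    # Step 3: area between the raw value and the mean (what the table reports).
    between = abs(NormalDist().cdf(z) - 0.5)
    # Step 3, continued: subtract from 50% to get the area beyond the raw value.
    beyond = 0.5 - between
    # Step 4: the areas are already proportions, so they can be read as probabilities.
    return z, between, beyond


z_high, between_high, p_150_or_above = normal_curve_steps(150, 130, 10)
z_low, between_low, p_110_or_below = normal_curve_steps(110, 130, 10)

# Step 5: addition rule for the two mutually exclusive tails.
p_either_tail = p_150_or_above + p_110_or_below

print(z_high, z_low)              # 2.0 -2.0
print(round(p_150_or_above, 4))   # 0.0228
print(round(p_either_tail, 4))    # 0.0455
```

The sum of the two tails comes out slightly above the .04 obtained from rounded table percentages because the code carries the unrounded areas through the addition.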
EXERCISES - CHAPTER 6

(1) Given a normal distribution of hourly incomes (X) in which the mean = $10.00 and s = $2.50, express each of the following incomes as a z score:

(A) X = $ 8.40    (H) X = $14.00    (O) X = $ 6.50
(B) X = $11.00    (I) X = $12.00    (P) X = $ 8.00
(C) X = $13.20    (J) X = $ 9.50    (Q) X = $ 4.00
(D) X = $ 5.00    (K) X = $ 6.00    (R) X = $ 7.60
(E) X = $15.00    (L) X = $10.00    (S) X = $14.20
(F) X = $11.50    (M) X = $11.60    (T) X = $13.00
(G) X = $ 9.00    (N) X = $ 7.00    (U) X = $12.42

(2) For the income distribution in the problem above, with a mean of $10.00 and s = $2.50, determine:
(A) the percent of the individuals who earn an hourly income of $15.00 or more;
(B) the probability of locating an individual whose hourly income is $15.00 or more;
(C) the percent of the individuals who earn between $10.00 and $10.50;
(D) the probability of locating an individual whose income falls between $10.00 and $10.50;
(E) the percent of the individuals who earn $8.40 or less;
(F) the exact probability of an individual earning more than $5.00 an hour;
(G) the exact probability of an individual earning below $20.

(3) Assume the following are sample data and answer the questions given. Show graphically answers to questions J through U.

Final Grades    f
101-110         1
91-100          2
81-90           3
71-80           4
61-70           7
51-60           4
41-50           3
31-40           2
21-30           1
(A) mean
(B) s²
(C) s
(D) median
(E) percentile rank for 65
(F) z score for 35
(G) z score for 92
(H) z score for 75
(I) z score for 45
(J) percentage of grades above 92
(K) percentage of grades below 75
(L) probability of a grade being below 45
(M) probability of a grade being above 35
(N) percentage of the grades being above 95
(O) exact probability of a grade being above 95
(P) probability of a grade being either 75 or less or 92 and above
(Q) probability of three grades being 90 or above
(R) if 60 is passing, the percent who failed
(S) percentage of the values below 80
(T) percentage of the values above 50
(U) probability a value will be either above 80 or below 50
(4) Find the mean, median, mode, range, variance, and standard deviation for the following distributions.

X      f
565    3
560    2
555    1
550    6
545    4
540    3
535    2
530    1

X
32
30
28
26
18
14
16
12

(5) Find the mean, median, mode, variance, and standard deviation for the following distribution.

X            f
$190-200     2
$170-189     3
$150-169     4
$130-149     5
$110-129     3
$90-109      2

What is the z score for $155, $120, and $179? What is the skew?

"A man must get a thing before he can forget it."
Oliver Wendell Holmes