Avoid Bias with Random Statistical Samples

Statistics All-in-One For Dummies

How do you select a statistical sample in a way that avoids bias? The key word is random. A random sample is a sample selected by equal opportunity; that is, every possible sample of the same size as yours had an equal chance to be selected from the population. What random really means is that no subset of the population is favored in or excluded from the selection process.
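
To make the "equal opportunity" idea concrete, here is a minimal Python sketch. The five-person population and the function name simple_random_sample are made up for illustration. The sketch draws many samples of size 2 and counts how often each possible pair turns up; if the selection is truly random, every pair should appear about equally often.

```python
import random
from collections import Counter
from itertools import combinations

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals so that every possible group of
    size n has the same chance of being chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population, small enough to list every possible sample.
population = ["Ana", "Ben", "Cho", "Dev", "Eli"]
all_pairs = list(combinations(population, 2))  # 10 possible samples of size 2

# Draw many samples and count how often each pair is selected.
rng = random.Random(0)
counts = Counter()
for _ in range(20_000):
    pair = tuple(sorted(rng.sample(population, 2)))
    counts[pair] += 1

for pair in all_pairs:
    # Each of the 10 pairs should show up in roughly 10% of the draws.
    print(pair, counts[pair] / 20_000)
```

No pair is favored and no pair is shut out, which is exactly what the definition asks for.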

Non-random (in other words, bad) samples are samples that were selected in such a way that some type of favoritism and/or automatic exclusion of a part of the population was involved, whether intentional or not. A classic example of a non-random sample comes from polls for which a media outlet asks you to phone in your opinion on a certain issue (“call-in” polls). People who choose to participate in call-in polls do not represent the population at large because they had to be watching that program, and they had to feel strongly enough to call in. They technically don’t represent a sample at all, in the statistical sense of the word, because no one selected them beforehand; they selected themselves to participate, creating a volunteer or self-selected sample. The results will tend to be skewed toward people with strong opinions.

To take an authentic random sample, you need a randomizing mechanism to select the individuals. For example, the Gallup Organization starts with a computerized list of all telephone exchanges in America, along with estimates of the number of residential households that have those exchanges. The computer uses a procedure called random digit dialing (RDD) to randomly create phone numbers from those exchanges, and then selects samples of telephone numbers from those. In effect, the computer builds a list of possible household phone numbers in America and then selects a subset of numbers from that list for Gallup to call.
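
The general idea behind random digit dialing can be sketched in a few lines of Python. This is only an illustration, not Gallup's actual procedure: the exchanges and household estimates below are invented, the function name random_digit_dial is hypothetical, and using the household estimates as selection weights is an assumption made for the sketch.

```python
import random

# Hypothetical (area code, exchange) pairs with made-up estimates of
# residential households. These are not real data.
EXCHANGES = {
    ("614", "555"): 4200,
    ("212", "555"): 9100,
    ("303", "555"): 2700,
}

def random_digit_dial(exchanges, n, seed=None):
    """Create n distinct phone numbers by picking an exchange at random
    (weighted by its household estimate, an assumption of this sketch)
    and then generating the last four digits at random."""
    rng = random.Random(seed)
    keys = list(exchanges)
    weights = [exchanges[k] for k in keys]
    numbers = set()
    while len(numbers) < n:
        area, exchange = rng.choices(keys, weights=weights, k=1)[0]
        suffix = rng.randrange(10_000)  # 0000 through 9999
        numbers.add(f"{area}-{exchange}-{suffix:04d}")
    return sorted(numbers)

# Ten randomly generated numbers to call.
for number in random_digit_dial(EXCHANGES, 10, seed=1):
    print(number)
```

Because the digits come from a random number generator rather than from a directory or a volunteer list, unlisted households have the same chance of being reached as listed ones.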

No matter how large a sample is, if it’s based on non-random methods, the results will not represent the population that the researcher wants to draw conclusions about. Don’t be taken in by large samples — first check to see how they were selected. Look for the term random sample. If you see that term, dig further into the fine print to see how the sample was actually selected and use the preceding definition to verify that the sample was, in fact, selected randomly. A small random sample is better than a large non-random one.
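
A quick simulation shows why sample size can't rescue a non-random sample. Every number here is invented for illustration: a population of 100,000 in which 40% support an issue, and a call-in poll in which supporters are assumed to be six times as likely to phone in as opponents.

```python
import random

rng = random.Random(42)

# Hypothetical population: 1 = supports the issue, 0 = opposes it.
TRUE_SUPPORT = 0.40
population = [1] * 40_000 + [0] * 60_000

# Call-in poll: people select themselves. The calling probabilities
# are assumptions chosen to mimic one side feeling more strongly.
CALL_PROB = {1: 0.30, 0: 0.05}
call_in = [person for person in population if rng.random() < CALL_PROB[person]]

# Random sample: a randomizing mechanism does the selecting.
random_sample = rng.sample(population, 500)

def share_supporting(sample):
    """Fraction of the sample that supports the issue."""
    return sum(sample) / len(sample)

call_in_estimate = share_supporting(call_in)
random_estimate = share_supporting(random_sample)

print("true support:     ", TRUE_SUPPORT)
print("call-in estimate: ", round(call_in_estimate, 3), "from", len(call_in), "callers")
print("random estimate:  ", round(random_estimate, 3), "from", len(random_sample), "people")
```

Under these assumptions the call-in poll collects thousands of responses and still lands far from 40%, while the 500-person random sample lands close to it.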

About This Article

About the book author:

Deborah J. Rumsey, PhD, is an Auxiliary Professor and Statistics Education Specialist at The Ohio State University. She is the author of Statistics For Dummies, Statistics II For Dummies, Statistics Workbook For Dummies, and Probability For Dummies.