In social science research, sampling is virtually always done "without" replacement; that is, after a particular unit has been selected for observation, it is not put back into the pool so it can be selected again.
There are four primary types of probability sampling designs. (Type 1) Simple Random Sample -- Every unit in the population has an equal chance of being selected in the sample. Each unit in the population is assigned a number. A set of numbers is then randomly selected, and the units assigned those numbers are included in the sample.
With a simple random sample, every unit in the population has an equal probability of being selected. On any single draw, this probability is: p = 1/N, where N = the size of the study population. For a sample of n units drawn without replacement, each unit's overall probability of being included is n/N.
Simple random sampling can be difficult when the study population is very large, because assigning a number to every unit in the study population is cumbersome and time consuming, especially if it has to be done by hand.
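As a rough illustration of the numbering-and-drawing procedure described above, here is a minimal Python sketch. The function name `simple_random_sample`, the `seed` parameter, and the 100-unit population are assumptions made for the example, not part of the original notes.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement; each unit is equally likely to be chosen."""
    rng = random.Random(seed)
    # Assign each unit a number (its list index), then pick n distinct numbers at random.
    chosen_numbers = rng.sample(range(len(population)), n)
    return [population[i] for i in chosen_numbers]

# Hypothetical study population of N = 100 units.
population = [f"unit_{i}" for i in range(1, 101)]
sample = simple_random_sample(population, n=10, seed=42)
print(sample)
```

Because the draw is made without replacement, no unit can appear in the sample twice, which matches the usual practice in social science research.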
(Type 2) Systematic Sample -- The sampling fraction (k), more commonly called the sampling interval, is calculated by dividing the study population size by the desired sample size. A random number is selected between 1 and k. Beginning with the randomly selected number, every kth unit in the population is selected for inclusion in the sample.
A random number is then selected between 1 and 10 (e.g., a 3 is selected). The 3rd unit would then be selected for the sample. Using the sampling fraction of 10, the researcher would count ahead 10 units and include the 13th unit. Counting ahead 10 more units, the 23rd unit would be included. This procedure would continue until 10 units were selected for the sample and the desired sample size was reached.
Note that if a number between 1 and k were not randomly selected as the starting point (that is, if you simply started on k), a systematic sample would not meet the criteria for a probability sample. By randomly selecting the starting point, all units have a nonzero probability of being included.
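The systematic procedure can be sketched in Python as follows. This is only an illustrative sketch: the function name `systematic_sample`, the `seed` parameter, and the 100-unit sampling frame are assumptions, and the code presumes the population size divides evenly by the desired sample size.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every kth unit from the frame after a random start between 1 and k."""
    rng = random.Random(seed)
    k = len(frame) // n        # the "sampling fraction" k (population size / sample size)
    start = rng.randint(1, k)  # random start between 1 and k, inclusive
    # Positions are 1-based: start, start + k, start + 2k, ...
    positions = [start + i * k for i in range(n)]
    return [frame[p - 1] for p in positions]

# Hypothetical sampling frame of 100 units, numbered 1 to 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=1)
print(sample)
```

With 100 units and a desired sample of 10, k is 10, so a random start of 3 would yield units 3, 13, 23, and so on, as in the example above.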
The primary reason for using a systematic sample is that it may be more practical in terms of time and resources than a simple random sample, particularly when the study population is large and it would be too difficult to assign a number to each unit.
A primary danger of systematic sampling is that this design can produce a biased, or non-representative, sample if the sampling frame from which the sample is selected is ordered in a systematic fashion that influences the composition of the sample.
To help ensure that a systematic sample is representative, it is important to check that the sampling frame is not ordered in a systematic fashion that would produce a biased (or unrepresentative) sample in relation to the key variables that are to be measured.
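To make the danger of an ordered sampling frame concrete, the sketch below builds a hypothetical frame of 100 units in which every 10th unit (units 3, 13, 23, ...) has some trait, so the ordering lines up exactly with a sampling fraction of 10. The frame, the trait, and the helper name `share_with_trait` are all invented for illustration.

```python
# Hypothetical frame: True marks a unit with the trait. Units 3, 13, 23, ... have it.
frame = [(i % 10 == 3) for i in range(1, 101)]

def share_with_trait(start, k=10):
    """Share of sampled units with the trait, for a given 1-based random start."""
    picked = frame[start - 1::k]  # every kth unit, beginning at position `start`
    return sum(picked) / len(picked)

population_share = sum(frame) / len(frame)
shares = {start: share_with_trait(start) for start in range(1, 11)}

print("population share:", population_share)
for start, share in shares.items():
    print(f"start={start}: sample share={share}")
```

In this contrived frame, 10 percent of the population has the trait, yet every possible systematic sample contains either only units with the trait or none at all, so no starting point yields a representative sample.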
There are two primary reasons for using a stratified sampling design. Reason 1 -- To potentially reduce sampling error by gaining greater control over the composition of the sample, particularly concerning variables where it is important that the sample be representative.
This type of design does not work as well when the goal is to draw inferences about the population as a whole, because the sample is not representative of the population on the stratification variable. However, weighting can be used to attempt to correct this problem.
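One common way to apply such weighting is to weight each stratum's sample mean by that stratum's share of the population. The notes do not specify a weighting method, so the sketch below is an assumption for illustration only; the stratum names, population sizes, and sample values are invented.

```python
# Hypothetical strata: stratum_b is heavily over-represented in the sample
# (4 of 8 sampled units) relative to its share of the population (100 of 1,000).
strata = {
    "stratum_a": {"population_size": 900, "sample_values": [2, 4, 3, 3]},
    "stratum_b": {"population_size": 100, "sample_values": [8, 10, 9, 9]},
}

total_population = sum(s["population_size"] for s in strata.values())

# Unweighted mean: treats every sampled unit alike, ignoring how strata were sampled.
all_values = [v for s in strata.values() for v in s["sample_values"]]
unweighted_mean = sum(all_values) / len(all_values)

# Weighted mean: each stratum mean counts in proportion to its population share.
weighted_mean = sum(
    (s["population_size"] / total_population)
    * (sum(s["sample_values"]) / len(s["sample_values"]))
    for s in strata.values()
)

print("unweighted:", unweighted_mean)
print("weighted:", round(weighted_mean, 2))
```

Here the unweighted mean is pulled toward the over-sampled stratum, while the weighted mean restores each stratum to its population proportion.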
The major problem with using stratified sampling is that the researcher must have data on the characteristics of the population (i.e., population data on the stratification variable) in order to select the sample. In many situations, data on population characteristics may be unavailable or unknown.
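The need for population data on the stratification variable shows up directly in code: the units cannot be grouped into strata unless each unit's stratum is known. The following is a minimal sketch of one variant, a proportionate stratified sample, in which a simple random sample is drawn within each stratum in proportion to its size. The function name, the `stratum_of` parameter, and the urban/rural population are assumptions for the example.

```python
import random
from collections import defaultdict

def proportionate_stratified_sample(units, stratum_of, n, seed=None):
    """Draw a simple random sample within each stratum, proportional to stratum size."""
    rng = random.Random(seed)
    # Group units by stratum; this step requires knowing every unit's stratum.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Rounding means the total may differ slightly from n for uneven splits.
        n_h = round(n * len(members) / len(units))
        sample.extend(rng.sample(members, n_h))
    return sample

# Hypothetical population: 60 urban and 40 rural units.
units = [(i, "urban") for i in range(60)] + [(i, "rural") for i in range(60, 100)]
sample = proportionate_stratified_sample(units, stratum_of=lambda u: u[1], n=10, seed=3)
print(sample)
```

With this population and a sample size of 10, the sample contains 6 urban and 4 rural units, mirroring the population on the stratification variable.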
(Type 4) Cluster Sample -- The population is divided up into subgroups or "clusters" that represent aggregates of individual units. A sample of clusters is then selected. All individual units that are contained within a cluster that is selected are included in the sample.
A major advantage of cluster sampling is that it can be used on very large populations and it is not necessary to have data on important variables for the entire population. Rather, it is just necessary to be able to divide the population up into clusters of some type.
A major disadvantage of cluster sampling is that this method tends to produce less representative samples compared to other probability sampling designs, particularly when the clusters contain large numbers of units within them and only a few are needed to meet the desired sample size.
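A cluster sample as described above can be sketched in a few lines of Python. The function name `cluster_sample` and the example of 20 schools with 30 students each are hypothetical and used only for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every unit inside each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names, not units
    return [unit for name in chosen for unit in clusters[name]]

# Hypothetical population: 20 schools (clusters), each containing 30 students (units).
clusters = {
    f"school_{s}": [f"school_{s}_student_{i}" for i in range(1, 31)]
    for s in range(1, 21)
}
sample = cluster_sample(clusters, n_clusters=3, seed=7)
print(len(sample), "students from", len({u.rsplit("_student_", 1)[0] for u in sample}), "schools")
```

Note that only a list of clusters is needed to draw the sample, but the 90 sampled students come from just 3 of the 20 schools, which illustrates why large clusters can yield a less representative sample.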
Remember, with a nonprobability sample, not every unit in the study population has a chance (a nonzero probability) of being selected for inclusion in the sample. As a result, statistical procedures such as the calculation of confidence intervals cannot be validly used, because they assume that each unit in the population has a chance of being included in the sample.
Therefore, any studies using nonprobability sampling designs should be viewed with suspicion if the researchers are trying to use the data to draw empirical generalizations, or inferences, to a larger population.
Thus, with this design, not only is every unit in the population not eligible for inclusion in the sample, but the composition may be affected by the personal biases of the researcher as to who he/she believes should be interviewed.
(4) Snowball Sample -- A unit with a desired characteristic is identified. This unit is asked to identify other units with the desired characteristic. These additional units are also asked to identify other units with the desired characteristic. Through this process, the size of the sample "snowballs" or grows larger.
The problem with the snowball sample is the same; that is, not all units in the study population would have a chance of being included in the sample. Therefore, inferences cannot be validly drawn to the study population.
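The referral process, and the reason some units can never be reached, can be illustrated with a small sketch. The referral network, the unit labels, and the function name `snowball_sample` are hypothetical.

```python
from collections import deque

def snowball_sample(referrals, seed_unit, max_size):
    """Grow a sample by asking each sampled unit to name further units."""
    sample = [seed_unit]
    seen = {seed_unit}
    queue = deque([seed_unit])
    while queue and len(sample) < max_size:
        current = queue.popleft()
        for named in referrals.get(current, []):
            if named not in seen and len(sample) < max_size:
                seen.add(named)
                sample.append(named)
                queue.append(named)
    return sample

# Hypothetical referral network: who names whom.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "F": ["G"],  # F and G have the characteristic but are not connected to A's network
}
sample = snowball_sample(referrals, seed_unit="A", max_size=10)
print(sample)
```

Starting from unit A, units F and G can never enter the sample because nobody in A's network names them, so their probability of inclusion is zero.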
While findings obtained from nonprobability samples cannot be empirically generalized to a larger population, they could be viewed as "suggestive." That is, the findings could be viewed as the results that a researcher "might" obtain, if he/she conducted a study using a probability sample.
Sampling error (i.e., the difference between a sample estimate and its corresponding population parameter) is partly attributable to random fluctuations in which units happen to be selected for a sample. This is known as random error.
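Random error can be seen by drawing several samples from the same population and comparing each sample mean with the population mean. The simulated population below (10,000 values with a mean near 50) and the sample size of 100 are assumptions chosen only for illustration.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical study population: 10,000 simulated scores centered near 50.
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw five independent simple random samples and record each sampling error.
sampling_errors = []
for _ in range(5):
    sample = rng.sample(population, 100)
    sampling_errors.append(statistics.mean(sample) - population_mean)

for i, error in enumerate(sampling_errors, start=1):
    print(f"sample {i}: sampling error = {error:+.2f}")
```

Each sample is drawn the same way from the same population, yet the errors differ from one sample to the next purely because different units happen to be selected.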
b. Nonresponse Bias -- sampling error that results when a substantial number of units in a sample: (a) do not provide data (e.g., do not respond to a survey or participate in a study); and (b) have significantly different characteristics compared to those units that do provide data.
c. Selective Availability -- sampling error that results because units that are difficult to identify are left out of the study population and have no chance of being included in the sample (i.e., the study population doesn't match the theoretical population).
e. Measurement Error -- sampling error that results from using poor quality indicators to measure key study variables. As a result, the numbers assigned to units are incorrect and don't represent the true quantity of an action or orientation, or the true characteristics possessed by the units.
Author: Department of Sociology