Sampling





http://evolve.elsevier.com/Grove/practice/


Many of us have preconceived notions about samples and sampling, which we acquired from television commercials, polls of public opinion, market researchers, and newspaper reports of research findings. The advertiser boasts that four of five doctors recommend its product; the newscaster announces that John Jones is predicted to win the senate election by a margin of 3 to 1; the newspaper reports that scientists’ studies have found that taking a statin drug, such as atorvastatin (Lipitor), significantly reduces the risk of coronary artery disease.


All of these examples use sampling techniques. However, some of the outcomes are more valid than others, partly because of the sampling techniques used. In most instances, television, newspapers, and advertisements do not explain their sampling techniques. You may hold opinions about the adequacy of these techniques, but there is not enough information to make a judgment.


The sampling component is an important part of the research process that needs to be carefully thought out and clearly described. To achieve these goals, researchers need to understand the techniques of sampling and the reasoning behind them. With this knowledge, you can make intelligent judgments about sampling when you are critically appraising studies or developing a sampling plan for your own study. This chapter examines sampling theory and concepts; sampling plans; probability and nonprobability sampling methods for quantitative, qualitative, outcomes, and intervention research; sample size; and settings for conducting studies. The chapter concludes with a discussion of the process for recruiting and retaining subjects or participants for study samples in various settings.



Sampling Theory


Sampling involves selecting a group of people, events, behaviors, or other elements with which to conduct a study. A sampling plan defines the process of making the sample selections; sample denotes the selected group of people or elements included in a study. Sampling decisions have a major impact on the meaning and generalizability of the findings.


Sampling theory was developed to determine mathematically the most effective way to acquire a sample that would accurately reflect the population under study. The theoretical, mathematical rationale for decisions related to sampling emerged from survey research, although the techniques were first applied to experimental research by agricultural scientists. One of the most important surveys that stimulated improvements in sampling techniques was the U.S. census. Researchers have adopted the assumptions of sampling theory identified for the census surveys and incorporated them within the research process (Thompson, 2002).


Key concepts of sampling theory are (1) populations, (2) elements, (3) sampling criteria, (4) representativeness, (5) sampling errors, (6) randomization, (7) sampling frames, and (8) sampling plans. The following sections explain these concepts; later in the chapter, these concepts are used to explain various sampling methods.



Populations and Elements


The population is a particular group of people, such as people who have had a myocardial infarction, or type of element, such as nasogastric tubes, that is the focus of the research. The target population is the entire set of individuals or elements who meet the sampling criteria, such as women who have experienced a myocardial infarction in the past year. Figure 15-1 shows the relationships among the population, target population, and accessible population. An accessible population is the portion of the target population to which the researchers have reasonable access. The accessible population might be elements within a country, state, city, hospital, nursing unit, or clinic, such as the adults with diabetes in a primary care clinic in Fort Worth, Texas. The sample is obtained from the accessible population by a particular sampling method, such as simple random sampling. The individual units of the population and sample are called elements. An element can be a person, event, behavior, or any other single unit of study. When elements are persons, they are usually referred to as subjects, research participants, or informants (see Figure 15-1). The term used by researchers depends on the philosophical paradigm reflected in the study and the design. The term subject, and sometimes research participant, is used within the context of the postpositivist paradigm of quantitative research (see Chapter 2). The term study participant, research participant, or informant is used in the context of the naturalistic paradigm of qualitative research (Fawcett & Garity, 2009; Munhall, 2012). In quantitative, intervention, and outcomes research, the findings from a study are generalized first to the accessible population and then, if appropriate, more abstractly to the target population.



Generalizing means that the findings can be applied to more than just the sample under study because the sample is representative of the target population. Because of the importance of generalizing, there are risks to defining the accessible population too narrowly. For example, a narrow definition of the accessible population reduces the ability to generalize from the study sample to the target population and diminishes the meaningfulness of the findings. Biases may be introduced that make generalization to the broader target population difficult to defend. If the accessible population is defined as individuals in a white, upper-middle-class setting, one cannot generalize to nonwhite or lower income populations. These biases are similar to biases that may be encountered in a nonrandom sample (Thompson, 2002).


In some studies, the entire population is the target of the study. These studies are referred to as population studies (Barhyte, Redman, & Neill, 1990). Many of these studies use data available in large databases, such as the census data or other government-maintained databases. Epidemiologists sometimes use entire populations for their large database studies. In other studies, the entire population of interest in the study is small and well defined. For example, one could conduct a study in which the defined population was all living recipients of heart and lung transplants.


In some cases, a hypothetical population is defined for a study. A hypothetical population assumes the presence of a population that cannot be defined according to sampling theory rules, which require a list of all members of the population. For example, individuals who successfully lose weight would be a hypothetical population. The number of individuals in the population, who they are, how much weight they have lost, how long they have kept the weight off, and how they achieved the weight loss are unknown. Some populations are elusive and constantly changing. For example, identifying all women in active labor in the United States, all people grieving the loss of a loved one, or all people coming into an emergency department would be impossible.



Sampling or Eligibility Criteria


Sampling criteria, also referred to as eligibility criteria, include a list of characteristics essential for membership or eligibility in the target population. The criteria are developed from the research problem, the purpose, a review of literature, the conceptual and operational definitions of the study variables, and the design. The sampling criteria determine the target population, and the sample is selected from the accessible population within the target population (see Figure 15-1). When the study is complete, the findings are generalized from the sample to the accessible population and then to the target population if the study has a representative sample (see the next section).


You might identify broad sampling criteria for a study, such as all adults older than 18 years of age able to read and write English. These criteria ensure a large target population of heterogeneous or diverse potential subjects. A heterogeneous sample increases your ability to generalize the findings to a larger target population. In descriptive or correlational studies, the sampling criteria may be defined to ensure a heterogeneous population with a broad range of values for the variables being studied. However, in quasi-experimental or experimental studies, the primary purpose of sampling criteria is to limit the effect of extraneous variables on the relationship between the independent and dependent variables. In these types of studies, the sampling criteria need to be specific and designed to make the population as homogeneous or similar as possible to control for the extraneous variables. Subjects are selected to maximize the effects of the independent variable and minimize the effects of variation in other extraneous variables so that they have a limited impact on the dependent variable scores.


Sampling criteria may include characteristics such as the ability to read, to write responses on the data collection instruments or forms, and to comprehend and communicate using the English language. Age limitations are often specified, such as adults 18 years and older. Subjects may be limited to individuals who are not participating in any other study. Persons who are able to participate fully in the procedure for obtaining informed consent are often selected as subjects. If potential subjects have diminished autonomy or are unable to give informed consent, consent must be obtained from their legal representatives. Thus, persons who are legally or mentally incompetent, terminally ill, or confined to an institution are more difficult to access as subjects (see Chapter 9). However, sampling criteria should not become so restrictive that the researcher cannot find an adequate number of study participants.


A study might have inclusion or exclusion sampling criteria (or both). Inclusion sampling criteria are characteristics that a subject or element must possess to be part of the target population. Exclusion sampling criteria are characteristics that can cause a person or element to be excluded from the target population. Researchers need to provide logical reasons for their inclusion and exclusion sampling criteria, and certain groups should not be excluded without justification. In the past, some groups, such as women, ethnic minorities, elderly adults, and poor people, were unnecessarily excluded from studies (Larson, 1994). Today, federal funding for research is strongly linked to including these populations in studies. Exclusion criteria limit the generalization of the study findings and should be carefully considered before being used in a study.
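
As a rough illustration (not drawn from any study cited in this chapter), eligibility screening can be expressed as a simple filter over candidate records; the record fields and criteria below are hypothetical.

```python
# Hypothetical sketch: applying inclusion and exclusion sampling criteria
# to candidate records to identify members of the target population.

candidates = [
    {"id": 1, "age": 67, "reads_english": True, "in_other_study": False},
    {"id": 2, "age": 17, "reads_english": True, "in_other_study": False},
    {"id": 3, "age": 54, "reads_english": True, "in_other_study": True},
]

def meets_inclusion(person):
    # Inclusion criteria: characteristics a person must possess.
    return person["age"] >= 18 and person["reads_english"]

def meets_exclusion(person):
    # Exclusion criteria: characteristics that remove a person from eligibility.
    return person["in_other_study"]

eligible = [p for p in candidates if meets_inclusion(p) and not meets_exclusion(p)]
print([p["id"] for p in eligible])  # -> [1]
```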


Twiss et al. (2009) conducted a quasi-experimental study to examine the effects of strength and weight training (ST) exercises on muscle strength, balance, and falls of breast cancer survivors (BCSs) with bone loss (population). This study included clearly identified inclusion and exclusion sampling or eligibility criteria that are presented in the following excerpt.



Twiss et al. (2009) identified specific inclusion and exclusion sampling criteria to designate the subjects in the target population precisely. These sampling criteria probably were narrowly defined by the researchers to promote the selection of a homogeneous sample of postmenopausal BCSs with bone loss. These inclusion and exclusion sampling criteria were appropriate for the study to reduce the effect of possible extraneous variables that might have an impact on the treatment (ST exercises) and the measurement of the dependent variables (muscle strength, balance, and falls). Because this is a quasi-experimental study that examined the impact of the treatment on the dependent or outcome variables, the increased controls imposed by the sampling criteria strengthened the likelihood that the study outcomes were caused by the treatment and not by extraneous variables. Twiss et al. (2009) found significant improvement in muscle strength and balance for the treatment group but no significant difference in the number of falls between the treatment and comparison groups.



Sample Representativeness


For a sample to be representative, it must be similar to the target population in as many ways as possible. It is especially important that the sample be representative in relation to the variables you are studying and to other factors that may influence the study variables. For example, if your study examines attitudes toward acquired immunodeficiency syndrome (AIDS), the sample should represent the distribution of attitudes toward AIDS that exists in the specified population. In addition, a sample must represent the demographic characteristics, such as age, gender, ethnicity, income, and education, which often influence study variables.


The accessible population must be representative of the target population. If the accessible population is limited to a particular setting or type of setting, the individuals seeking care at that setting may be different from the individuals who would seek care for the same problem in other settings or from individuals who self-manage their problems. Studies conducted in private hospitals usually exclude poor patients, and other settings could exclude elderly or undereducated patients. People who do not have access to care are usually excluded from health-focused studies. Subjects and the care they receive in research centers are different from patients and the care they receive in community clinics, public hospitals, veterans’ hospitals, and rural health clinics. Obese individuals who choose to enter a program to lose weight may differ from obese individuals who do not enter a program. All of these factors limit representativeness and limit our understanding of the phenomena important in practice.


Representativeness is usually evaluated by comparing the numerical values of the sample (a statistic such as the mean) with the same values from the target population. A numerical value of a population is called a parameter. We can estimate the population parameter by identifying the values obtained in previous studies examining the same variables. The accuracy with which the population parameters have been estimated within a study is referred to as precision. Precision in estimating parameters requires well-developed methods of measurement that are used repeatedly in several studies. You can define parameters by conducting a series of descriptive and correlational studies, each of which examines a different segment of the target population; then perform a meta-analysis to estimate the population parameter (Thompson, 2002).



Sampling Error


The difference between a sample statistic and a population parameter is called the sampling error (Figure 15-2). A large sampling error means that the sample is not providing a precise picture of the population; it is not representative. Sampling error is usually larger with small samples and decreases as the sample size increases. Sampling error reduces the power of a study, or the ability of the statistical analyses conducted to detect differences between groups or to describe the relationships among variables (Aberson, 2010; Cohen, 1988). Sampling error occurs as a result of random variation and systematic variation.


Figure 15-2 Sampling error.


Random Variation


Random variation is the expected difference in values that occurs when one examines different subjects from the same sample. If the mean is used to describe the sample, the values of individuals in that sample will not all be exactly the same as the sample mean. Values of individual subjects vary from the value of the sample mean. The difference is random because the value of each subject is likely to vary in a different direction. Some values are higher and others are lower than the sample mean. The values are randomly scattered around the mean. As the sample size becomes larger, overall variation in sample values decreases, with more values being close to the sample mean. As the sample size increases, the sample mean is also more likely to have a value similar to that of the population mean.
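
A brief simulation (illustrative only; the population values are invented) shows how sampling error, the difference between the sample mean and the population mean, tends to shrink as the sample size increases.

```python
# Illustrative simulation: sampling error (|sample mean - population mean|)
# generally decreases as the sample size increases.
import random
import statistics

random.seed(42)
population = [random.gauss(100, 15) for _ in range(100_000)]  # hypothetical population values
population_mean = statistics.mean(population)                 # the population parameter

for n in (10, 100, 1_000, 10_000):
    sample = random.sample(population, n)      # simple random sampling without replacement
    sample_mean = statistics.mean(sample)      # the sample statistic
    sampling_error = abs(sample_mean - population_mean)
    print(f"n={n:>6}  sample mean={sample_mean:7.2f}  sampling error={sampling_error:5.2f}")
```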



Systematic Variation


Systematic variation, or systematic bias, is a consequence of selecting subjects whose measurement values are different, or vary, in some specific way from the population. Because the subjects have something in common, their values tend to be similar to the values of others in the sample but different in some way from the values of the population as a whole. These values do not vary randomly around the population mean. Most of the variation from the mean is in the same direction; it is systematic. All the values in the sample may tend to be higher or lower than the mean of the population (Thompson, 2002).


For example, if all the subjects in a study examining some type of healthcare knowledge have an intelligence quotient (IQ) higher than 120, many of their scores will likely be higher than the mean of a population that includes individuals with a wide variation in IQ, such as IQs that range from 90 to 130. The IQs of the subjects have introduced a systematic bias. This situation could occur, for example, if all the subjects were college students, which has been the case in the development of many measurement methods in psychology.
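
The IQ example can be sketched numerically; the values below are invented solely to show the direction of the systematic bias.

```python
# Illustrative sketch of systematic variation: restricting the sample to
# subjects with IQs above 120 shifts the sample mean above the population mean.
import random
import statistics

random.seed(7)
population_iq = [random.uniform(90, 130) for _ in range(50_000)]   # IQ range used in the text
biased_sample = [iq for iq in population_iq if iq > 120][:200]     # only high-IQ subjects enrolled

print(round(statistics.mean(population_iq), 1))  # near 110, the population mean
print(round(statistics.mean(biased_sample), 1))  # near 125, systematically higher
```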


Because of systematic variance, the sample mean is different from the population mean. The extent of the difference is the sampling error (see Figure 15-2). Exclusion criteria tend to increase the systematic bias in the sample and increase the sampling error. An extreme example of this problem is the highly restrictive sampling criteria used in some experimental studies that result in a large sampling error and greatly diminished representativeness.


If the method of selecting subjects produces a sample with a systematic bias, increasing the sample size would not decrease the sampling error. When a systematic bias occurs in an experimental study, it can lead the researcher to believe that a treatment has made a difference when, in actuality, the values would be different even without the treatment. This situation usually occurs because of an interaction of the systematic bias with the treatment.



Refusal and Acceptance Rates in Studies

Systematic variation or bias is most likely to occur when the sampling process is not random. However, even in a random sample, systematic variation can occur if potential subjects decline participation. Systematic bias increases as the subjects’ refusal rate increases. A refusal rate is the number and percentage of subjects who declined to participate in the study. High refusal rates to participate in a study have been linked to individuals with serious physical and emotional illnesses, low socioeconomic status, and weak social networks (Neumark, Stommel, Given, & Given, 2001). The higher the refusal rate, the less the sample is representative of the target population. The refusal rate is calculated by dividing the number of potential subjects refusing to participate by the number of potential subjects meeting sampling criteria and multiplying the results by 100%.


Refusal rate = (number of potential subjects refusing to participate ÷ number of potential subjects meeting sampling criteria) × 100%



For example, if 200 potential subjects met the sampling criteria, and 40 refused to participate in the study, the refusal rate would be 20%.


Refusal rate = 40 (number refusing) ÷ 200 (number meeting sampling criteria) = 0.20 × 100% = 20%



Sometimes researchers provide an acceptance rate, or the number and percentage of the subjects who agree to participate in a study, rather than a refusal rate. The acceptance rate is calculated by dividing the number of potential subjects who agree to participate in a study by the number of potential subjects who meet sampling criteria and multiplying the result by 100%.


Acceptance rate = (number of potential subjects agreeing to participate ÷ number of potential subjects meeting sampling criteria) × 100%



If you know the refusal rate, you can also subtract the refusal rate from 100% to obtain the acceptance rate. Usually researchers report either the acceptance rate or the refusal rate but not both. In the example mentioned earlier, 200 potential subjects met the sampling criteria; 160 agreed to participate in the study, and 40 refused.


Acceptance rate = 160 (number accepting) ÷ 200 (number meeting sampling criteria) = 0.80 × 100% = 80%



Acceptance rate = 100% − refusal rate, or 100% − 20% = 80%
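
The refusal and acceptance calculations can also be written as short functions; the values below reuse the chapter's example of 200 eligible potential subjects and 40 refusals.

```python
# Refusal and acceptance rates, using the chapter's example values.
def refusal_rate(n_refused, n_eligible):
    return n_refused / n_eligible * 100            # percent

def acceptance_rate(n_accepted, n_eligible):
    return n_accepted / n_eligible * 100           # percent

n_eligible = 200      # potential subjects meeting the sampling criteria
n_refused = 40
n_accepted = n_eligible - n_refused

print(refusal_rate(n_refused, n_eligible))         # 20.0
print(acceptance_rate(n_accepted, n_eligible))     # 80.0
print(100 - refusal_rate(n_refused, n_eligible))   # 80.0, the same acceptance rate
```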




Sample Attrition and Retention Rates in Studies

Systematic variation can also occur in studies with high sample attrition. Sample attrition is the withdrawal or loss of subjects from a study. Systematic variation is greatest when a high number of subjects withdraw from the study before the data have been collected or when a large number of subjects withdraw from one group but not the other in the study (Kerlinger & Lee, 2000; Thompson, 2002). In studies involving a treatment, subjects in the control group who do not receive the treatment may be more likely to withdraw from the study. Sample attrition should be reported in the published study to determine if the final sample represents the target population. Researchers also need to provide a rationale for subjects withdrawing from the study and to determine if they are different from the subjects who complete the study. The sample is most like the target population if the attrition rate is low (<10% to 20%) and the subjects withdrawing from the study are similar to the subjects completing the study. Sample attrition rate is calculated by dividing the number of subjects withdrawing from a study by the sample size and multiplying the results by 100%.


Sample attrition rate = (number of subjects withdrawing ÷ sample size) × 100%



For example, if a study had a sample size of 160, and 40 people withdrew from the study, the attrition rate would be 25%.


Attrition rate = 40 (number withdrawing) ÷ 160 (sample size) = 0.25 × 100% = 25%



The opposite of the attrition rate is the retention rate, or the number and percentage of subjects completing the study. The higher the retention rate, the more representative the sample is of the target population, and the more likely the study results are an accurate reflection of reality. Often researchers identify either the attrition rate or the retention rate but not both. It is better to provide a rate in addition to the number of subjects withdrawing or completing a study. In the example just presented with a sample size of 160, if 40 subjects withdrew from the study, then 120 subjects were retained or completed the study. The retention rate is calculated by dividing the number of subjects completing the study by the initial sample size and multiplying by 100%.


Sample retention rate = (number of subjects completing the study ÷ sample size) × 100%



Retention rate = 120 (number retained) ÷ 160 (sample size) = 0.75 × 100% = 75%
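
The attrition and retention calculations follow the same pattern; the values below mirror the chapter's example of a sample of 160 with 40 withdrawals.

```python
# Attrition and retention rates, using the chapter's example values.
def attrition_rate(n_withdrawn, sample_size):
    return n_withdrawn / sample_size * 100     # percent

def retention_rate(n_completed, sample_size):
    return n_completed / sample_size * 100     # percent

sample_size = 160
n_withdrawn = 40
n_completed = sample_size - n_withdrawn        # 120 subjects completed the study

print(attrition_rate(n_withdrawn, sample_size))  # 25.0
print(retention_rate(n_completed, sample_size))  # 75.0
```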



The study by Twiss et al. (2009) of the effects of ST exercises on muscle strength, balance, and falls of BCSs with bone loss was introduced earlier in this chapter with the discussion of sampling criteria; the following excerpt presents the acceptance rate and sample attrition for this study.



Twiss et al. (2009) identified that 249 participants met the sampling criteria and all 249 were enrolled in the study, indicating that the acceptance rate for the study was 100%. The sample retention was 223 women, for a retention rate of 90% (223 ÷ 249 × 100% = 89.6% ≈ 90%), and the sample attrition was 26 women, for an attrition rate of 10% (100% − 90% = 10%). The treatment group retained 110 women, for a retention rate of 89% (110 ÷ 124 × 100% = 88.7% ≈ 89%). The comparison group retained 113 women, for a retention rate of 90% (113 ÷ 125 × 100% = 90.4% ≈ 90%). This study has an excellent acceptance rate (100%) and a very strong sample retention rate of 90% for a 24-month study. The retention rates for the two groups were very strong and comparable (treatment group 89% and comparison group 90%). Twiss et al. (2009) also provided a rationale for the subjects' attrition, and the reasons were varied and seemed appropriate and typical for a study lasting 24 months. The acceptance rate, the sample and group retention rates, and the reasons for attrition indicate limited potential for systematic variation in the study sample, increasing the likelihood that the sample is representative of the target population and that the results are an accurate reflection of reality. The study would have been strengthened if the researchers had reported not only the numbers but also the sample and group retention rates.



Randomization


From a sampling theory point of view, randomization means that each individual in the population should have a greater than zero opportunity to be selected for the sample. The method of achieving this opportunity is referred to as random sampling. In experimental studies that use a control group, subjects are randomly selected and randomly assigned to either the control group or the experimental group. The use of the term control group—the group not receiving the treatment—is usually limited to studies using random sampling and random assignment to the treatment and control groups. The control group usually receives no care. If nonrandom sampling methods are used for sample selection, the group not receiving a treatment receives usual or standard care and is generally referred to as a comparison group. With a comparison group, there is an increase in the possibility of preexisting differences between that group and the experimental group receiving the treatment.
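
Random assignment of consenting subjects to groups can be sketched as follows; this is a minimal illustration, not the procedure of any particular study.

```python
# Minimal sketch of random assignment of consenting subjects
# to experimental and control groups.
import random

random.seed(2025)
subjects = [f"subject_{i:02d}" for i in range(1, 21)]   # hypothetical identifiers
random.shuffle(subjects)                                # leave group membership to chance

midpoint = len(subjects) // 2
experimental_group = subjects[:midpoint]
control_group = subjects[midpoint:]

print(len(experimental_group), len(control_group))     # 10 10
```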


Random sampling increases the extent to which the sample is representative of the target population. However, random sampling must take place in an accessible population that is representative of the target population. Exclusion criteria limit true randomness. Thus, a study that uses random sampling techniques may have such restrictive sampling criteria that the sample is not truly random. In any case, it is rarely possible to obtain a purely random sample for nursing studies because of informed consent requirements. Even if the original sample is random, persons who volunteer or consent to participate in a study may differ in important ways from persons who are unwilling to participate. To protect individuals' rights, all samples with human subjects must be volunteer samples, composed only of individuals willing to participate in the study (Fawcett & Garity, 2009). Methods of achieving random sampling are described later in the chapter.



Sampling Frame


For each person in the target or accessible population to have an opportunity to be selected for the sample, each person in the population must be identified. To accomplish this goal, the researcher must acquire a list of every member of the population through the use of the sampling criteria to define membership. This listing of members of the population is referred to as the sampling frame. The researcher selects subjects from the sampling frame using a sampling plan. Djukic, Kovner, Budin, and Norman (2010) studied the effect of nurses’ perceived physical work environment on their job satisfaction and described their sampling frame in the following excerpt.



The sampling frame in this study included the names of the 746 RNs who were asked to participate in the study.



Sampling Plan


A sampling plan describes the strategies that will be used to obtain a sample for a study. The plan is developed to enhance representativeness, reduce systematic bias, and decrease the sampling error. Sampling strategies have been devised to accomplish these three tasks and to optimize sample selection. The sampling plan may use probability (random) sampling methods or nonprobability (nonrandom) sampling methods.


A sampling method is the process of selecting a group of people, events, behaviors, or other elements that represent the population being studied. A sampling method is similar to a design; it is not specific to a study. The sampling plan provides detail about the application of a sampling method in a specific study. The sampling plan must be described in detail for purposes of critical appraisal, replication, and future meta-analyses. The sampling method implemented in a study varies with the type of research being conducted. Quantitative, outcomes, and intervention research apply a variety of probability and nonprobability sampling methods. Qualitative research usually includes nonprobability sampling methods. The sampling methods to be included in this text are identified in Table 15-1 and are linked to the types of research that most commonly incorporate them. The following sections describe the different types of probability and nonprobability sampling methods most commonly used in quantitative, qualitative, outcomes, and intervention research in nursing.




Probability (Random) Sampling Methods


Probability sampling methods have been developed to ensure some degree of precision in estimations of the population parameters. Probability samples reduce sampling error. The term probability sampling method refers to the fact that every member (element) of the population has a probability higher than zero of being selected for the sample. Inferential statistical analyses are based on the assumption that the sample from which data were derived has been obtained randomly. Thus, probability sampling methods are often referred to as random sampling methods. These samples are more likely to represent the population than samples obtained with nonprobability sampling methods. All subsets of the population, which may differ from one another but contribute to the parameters of the population, have a chance to be represented in the sample. Probability sampling methods are most commonly applied in quantitative, outcomes, and intervention research.


There is less opportunity for systematic bias if subjects are selected randomly, although it is possible for a systematic bias to occur by chance. With random sampling, the researcher cannot decide that person X would be a better subject for the study than person Y. In addition, a researcher cannot exclude a subset of people from selection as subjects because he or she does not agree with them, does not like them, or finds them hard to deal with. Potential subjects cannot be excluded just because they are too sick, not sick enough, coping too well, or not coping adequately. The researcher, who has a vested interest in the study, could otherwise (consciously or unconsciously) select subjects whose conditions or behaviors are consistent with the study hypothesis. It is also tempting to exclude uncooperative or assertive individuals. Random sampling leaves the selection to chance, which decreases sampling error and increases the validity of the study (Thompson, 2002).


Theoretically, to obtain a probability sample, the researcher must develop a sampling frame that includes every element in the population. The sample must be randomly selected from the sampling frame. According to sampling theory, it is impossible to select a sample randomly from a population that cannot be clearly defined. Four sampling designs have been developed to achieve probability sampling: simple random sampling, stratified random sampling, cluster sampling, and systematic sampling.



Simple Random Sampling


Simple random sampling is the most basic of the probability sampling methods. To achieve simple random sampling, elements are selected at random from the sampling frame. This goal can be accomplished in various ways, limited only by the imagination of the researcher. If the sampling frame is small, the researcher can write names on slips of paper, place the names in a container, mix well, and draw out one at a time until the desired sample size has been reached. Another technique is to assign a number to each name in the sampling frame. In large population sets, elements may already have assigned numbers. For example, numbers are assigned to medical records, organizational memberships, and professional licenses. The researcher can use a computer to select these numbers randomly to obtain a sample.


There can be some differences in the probability for the selection of each element, depending on whether the name or number of the selected element is replaced before the next name or number is selected. Selection with replacement, the most conservative random sampling approach, provides exactly equal opportunities for each element to be selected (Thompson, 2002). For example, if the researcher draws names out of a hat to obtain a sample, each name must be replaced before the next name is drawn to ensure equal opportunity for each subject.


Selection without replacement gives each element different levels of probability for selection. For example, if the researcher is selecting 10 subjects from a population of 50, the first name has a 1 in 5 chance (10 draws, 50 names), or a 0.2 probability, of being selected. If the first name is not replaced, the remaining 49 names have a 9 in 49 chance, or a 0.18 probability, of being selected. As further names are drawn, the probability of being selected decreases.


There are many ways to achieve random selection, such as with the use of a computer, a random numbers table, drawing names out of a hat, or a roulette wheel. The most common method of random selection is the computer, which can be programmed to select a sample randomly from the sampling frame with replacement. However, some researchers still use a table of random numbers to select a random sample. Table 15-2 shows a section from a random numbers table. To use a table of random numbers, the researcher places a pencil or a finger on the table with the eyes closed. The number touched is the starting place. Moving the pencil or finger up, down, right, or left, the researcher uses the numbers in order until the desired sample size is obtained. For example, the researcher places a pencil on 58 in Table 15-2, which is in the fourth column from the left and fourth row down. If five subjects are to be selected from a population of 100 and the researcher decides to go across the column to the right, the subject numbers chosen are 58, 25, 15, 55, and 38. Table 15-2 is useful only if the population number is less than 100. However, tables are available for larger populations, such as the random numbers table provided in the online resources for this textbook or the Thompson (2002, pp. 14-15) sampling text.
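
A computer-based version of simple random sampling from a numbered sampling frame might look like the following sketch, which shows selection both without and with replacement.

```python
# Simple random sampling from a sampling frame of 100 numbered elements.
import random

random.seed(15)
sampling_frame = list(range(1, 101))

# Selection without replacement: each element can be drawn only once,
# so the probability of selection changes slightly with each draw.
without_replacement = random.sample(sampling_frame, k=5)

# Selection with replacement: every draw offers exactly the same
# probability of selection for each element.
with_replacement = random.choices(sampling_frame, k=5)

print(without_replacement)
print(with_replacement)
```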



Degirmen, Ozerdogan, Sayiner, Kosgeroglu, and Ayranci (2010, p. 153) conducted a pretest-posttest randomized controlled experimental study to determine the effect of hand and foot massage and foot massage only interventions on the postoperative pain of women who had a cesarean operation. These researchers obtained their sample using a simple random sampling method that is described in the following excerpt from their study.



Degirmen et al. (2010) clearly identified their target population as women needing cesarean operations, and the 281 women with presenting orders provided the sampling frame for the study. The sample of 75 women was randomly selected, but the researchers did not indicate the process for the random selection. The use of a computer to select a sample randomly is usually the most efficient and unbiased process. The subjects were evenly divided with 25 in each group, but the researchers also did not indicate whether the assignment to groups was random or based on the convenience of the subjects or researchers. Application of simple random sampling and the attrition of only three (4%) subjects from the study seem to provide a sample representative of the target population. However, the study would have been strengthened by a discussion of the process for random sampling and a clarification of how the subjects were assigned to groups. The outcomes of the study were that foot and hand massage interventions significantly reduced postoperative pain experienced by the women and that foot and hand massage was significantly more effective than foot massage only.



Stratified Random Sampling


Stratified random sampling is used when the researcher knows some of the variables in the population that are critical to achieving representativeness. Variables commonly used for stratification are age, gender, ethnicity, socioeconomic status, diagnosis, geographical region, type of institution, type of care, care provider, and site of care. The variable or variables chosen for stratification need to be correlated with the dependent variables being examined in the study. Subjects within each stratum are expected to be more similar (homogeneous) to one another in relation to the study variables than they are to subjects in other strata or to the total sample. In stratified random sampling, the subjects are randomly selected on the basis of their classification into the selected strata.


For example, if in conducting your research you selected a stratified random sample of 100 adult subjects using age as the variable for stratification, the sample might include 25 subjects in the age range 18 to 39 years, 25 subjects in the age range 40 to 59 years, 25 subjects in the age range 60 to 79 years, and 25 subjects 80 years or older. Stratification ensures that all levels of the identified variable, in this example age, are adequately represented in the sample. With a stratified random sample, you could use a smaller sample size to achieve the same degree of representativeness as a large sample acquired through simple random sampling. Sampling error decreases, power increases, data collection time is reduced, and the cost of the study is lower if stratification is used (Fawcett & Garity, 2009; Thompson, 2002).


One question that arises in relation to stratification is whether each stratum should have equivalent numbers of subjects in the sample (termed disproportionate sampling) or whether the numbers of subjects should be selected in proportion to their occurrence in the population (termed proportionate sampling). For example, if stratification is being achieved by ethnicity and the population is 45% white non-Hispanic, 25% Hispanic nonwhite, 25% African American, and 5% Asian, your research team would have to decide whether to select equal numbers of each ethnic group or to calculate a proportion of the sample. Good arguments exist for both approaches. Stratification is not as useful if one stratum contains only a small number of subjects. In the aforementioned situation, if proportions are used and the sample size is 100, the study would include only five Asians, hardly enough to be representative. If equal numbers of each group are used, each group would contain at least 25 subjects; however, the white non-Hispanic group would be underrepresented. In this case, mathematically weighting the findings from each stratum can equalize the representation to ensure proportional contributions of each stratum to the total score of the sample. Most textbooks on sampling describe this procedure (Levy & Lemeshow, 1980; Thompson, 2002; Yates, 1981).
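
Proportionate versus disproportionate allocation can be illustrated with the ethnicity proportions given above and a total sample of 100; the sketch below simply computes the stratum sizes. With disproportionate (equal) allocation, the stratum results would then be weighted before being combined, as described in the sampling texts cited.

```python
# Proportionate vs. disproportionate allocation across strata,
# using the ethnicity proportions and n = 100 from the text.
strata_proportions = {
    "White non-Hispanic": 0.45,
    "Hispanic nonwhite": 0.25,
    "African American": 0.25,
    "Asian": 0.05,
}
n = 100

proportionate = {stratum: round(n * p) for stratum, p in strata_proportions.items()}
disproportionate = {stratum: n // len(strata_proportions) for stratum in strata_proportions}

print(proportionate)     # the Asian stratum gets only 5 subjects
print(disproportionate)  # every stratum gets 25 subjects
```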


Ulrich et al. (2006) used a stratified random sampling method to obtain their sample of nurse practitioners (NPs) and physician assistants (PAs) for the purpose of studying the ethical conflict of these healthcare providers associated with managed care. The following excerpt from this study describes the sampling method used to obtain the final sample of 1536 providers (833 NPs and 689 PAs).



“A self-administered questionnaire was mailed to an initial stratified random sample [sampling method] of 3,900 NPs and PAs practicing in the United States. The sample was selected from the national lists provided by Medical Marketing Services, an independently owned organization that manages medical industry lists (www.mmslists.com/main.asp). The list for PAs was derived from the American Academy of Physicians Assistants (AAPA), and a comprehensive list of NPs was derived from the medical and nursing boards of the 50 states and the District of Columbia [sampling frames for NPs and PAs].… After undeliverable (1.9%) and other disqualified respondents (13.2%, i.e., no longer practicing, non-primary-care practitioner) were removed, the overall adjusted response rate was 50.6%.” (Ulrich et al., 2006, p. 393)


The study sampling frames for the NPs and PAs are representative of all 50 states and the District of Columbia, and the lists for the sampling frames were from quality sources. The study has a strong response rate of 50.6% for a mailed questionnaire, and the researchers identified why certain respondents were disqualified. The final sample was large (1536 subjects) with strong representation for both NPs (833 subjects) and PAs (689 subjects). The study sample might have been stronger with a more equal number of NP and PA subjects. The 833 NPs and 689 PAs add up to 1522 subjects, and it is unclear why the sample size is identified as 1536 unless there are missing data from some subjects. However, the sample was a great strength of this study and appeared to represent the target population of NPs and PAs currently practicing in primary care in the United States.



Cluster Sampling


Cluster sampling is a probability sampling method applied when the population is heterogeneous; it is similar to stratified random sampling but takes advantage of the natural clusters or groups of population units that have similar characteristics (Fawcett & Garity, 2009). Cluster sampling is used in two situations. The first situation is when a simple random sample would be prohibitive in terms of travel time and cost. Imagine trying to arrange personal meetings with 100 people, each in a different part of the United States. The second situation is in cases in which the individual elements making up the population are unknown, preventing the development of a sampling frame. For example, there is no list of all the heart surgery patients who complete rehabilitation programs in the United States. In these cases, it is often possible to obtain lists of institutions or organizations with which the elements of interest are associated.


In cluster sampling, the researcher develops a sampling frame that includes a list of all the states, cities, institutions, or organizations with which elements of the identified population would be linked. States, cities, institutions, or organizations are selected randomly as units from which to obtain elements for the sample. In some cases, this random selection continues through several stages and is referred to as multistage cluster sampling. For example, the researcher might first randomly select states and next randomly select cities within the sampled states. Hospitals within the randomly selected cities might then be randomly selected. Within the hospitals, nursing units might be randomly selected. At this level, either all the patients on the nursing unit who fit the criteria for the study might be included, or patients could be randomly selected.
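
A highly simplified multistage sketch, with invented cluster labels, shows how random selection proceeds through successive stages (states, then cities, then hospitals); in an actual study, units or patients would then be selected within the sampled hospitals.

```python
# Simplified sketch of multistage cluster sampling with invented cluster labels.
import random

random.seed(3)
states = [f"state_{i}" for i in range(1, 51)]

selected_states = random.sample(states, 5)                    # stage 1: random states
selected_cities = {s: random.sample([f"{s}_city_{j}" for j in range(1, 21)], 3)
                   for s in selected_states}                  # stage 2: cities within each state
selected_hospitals = {c: random.sample([f"{c}_hosp_{k}" for k in range(1, 11)], 2)
                      for cities in selected_cities.values()
                      for c in cities}                        # stage 3: hospitals within each city

print(selected_states)
print(list(selected_hospitals.items())[:2])
```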


Cluster sampling provides a means for obtaining a larger sample at a lower cost. However, it has some disadvantages. Data from subjects associated with the same institution are likely to be correlated and not completely independent. This correlation can cause a decrease in precision and an increase in sampling error. However, such disadvantages can be offset to some extent by the use of a larger sample.


Fouladbakhsh and Stommel (2010, p. E8) used multistage cluster sampling in their study of the “complex relationships among gender, physical and psychological symptoms, and use of specific CAM [complementary and alternative medicine] health practices among individuals living in the United States who have been diagnosed with cancer.” These researchers described their sampling method in the following excerpt from their study.



“The NHIS [National Health Interview Survey] methodology employs a multistage probability cluster sampling design [sampling method] that is representative of the NHIS target universe, defined as ‘the civilian noninstitutionalized population’ (Botman, Moore, Moriarty, & Parsons, 2000, p. 14; National Center for Health Statistics). In the first stage, 339 primary sampling units were selected from about 1,900 area sampling units representing counties, groups of adjacent counties, or metropolitan areas covering the 50 states and the District of Columbia [1st stage cluster sampling]. The selection included all of the most populous primary sampling units in the United States and stratified probability samples (by state, area poverty level, and population size) of the less populous ones. In a second step, primary sampling units were partitioned into substrata (up to 21) based on concentrations of African American and Hispanic populations [2nd stage cluster sampling]. In a third step, clusters of dwelling units form the secondary sampling units selected from each substratum [3rd stage cluster sampling]. Finally, within each secondary sampling unit, all African American and Hispanic households were selected for interviews, whereas other households were sampled at differing rates within the substrata. Therefore, the sampling design of the NHIS includes oversampling of minorities.” (Fouladbakhsh & Stommel, 2010, pp. E8-E9)
