Using Statistics to Determine Differences


The statistical procedures in this chapter examine differences between or among groups. Statistical procedures are available for nominal, ordinal, and interval/ratio level data. The procedures vary considerably in their power to detect differences and in their complexity. How one interprets the results of these statistics depends on the design of the study. If the design is quasi-experimental or experimental and the study is well designed and has no major issues in regard to threats to internal and external validity, causality can be considered, and the results can be inferred to the associated population. If the design is comparative descriptive, differences identified are associated only with the sample under study. The parametric statistics used to determine differences that are discussed in this chapter are the independent samples t-test, paired or dependent samples t-test, and analysis of variance (ANOVA). If the assumptions for parametric analyses are not achieved or if study data are at the ordinal level, the nonparametric analyses of Mann-Whitney U, Wilcoxon signed-rank test, and Kruskal-Wallis H are appropriate techniques to use to test the researcher’s hypotheses. The chapter concludes with a discussion of the chi-square test of independence, which is a nonparametric analysis technique for analyzing nominal level data.

Choosing Parametric versus Nonparametric Statistics to Determine Differences

Parametric statistics are always associated with a certain set of assumptions that the data must meet; this is because the formulas of parametric statistics yield valid results only when the properties of the data are within the confines of these assumptions (Munro, 2005). If the data do not meet the parametric assumptions, there are nonparametric alternatives that do not require those assumptions to be met, usually because nonparametric statistical procedures convert the original data to rank-ordered or ordinal level data.

Many statistical tests can assist the researcher in determining whether his or her data meet the assumptions for a given parametric test. The most common assumption, which accompanies all parametric tests, is that the data are normally distributed. D’Agostino’s K² test and the Shapiro-Wilk test are formal tests of normality that assess whether the distribution of a variable is non-normal, that is, skewed or kurtotic (see Chapter 21) (D’Agostino, Belanger, & D’Agostino, 1990). The Shapiro-Wilk test is used with samples of fewer than 1000 subjects. When the sample is larger, the Kolmogorov-Smirnov D test is more appropriate. All of these statistics are found in mainstream statistical software packages and are accompanied by a p value. A significant normality test (p ≤ 0.05) indicates that the distribution being tested differs significantly from the normal curve, violating the normality assumption. For each parametric test illustrated in this chapter, the nonparametric alternative is listed in the corresponding section in case the data do not meet the assumptions of that test.
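As a rough illustration of what "skewed or kurtotic" means numerically, the following sketch computes moment-based sample skewness and excess kurtosis using only the Python standard library. It is not a formal normality test such as K² or Shapiro-Wilk and produces no p value; the function name and both data sets are hypothetical and chosen only for illustration.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments.

    Both are close to 0 for normally distributed data. This is a
    descriptive check only, not a formal test of normality.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Hypothetical scores (not data from the chapter)
symmetric_scores = [2, 4, 4, 5, 5, 5, 6, 6, 8]
right_skewed_scores = [1, 1, 1, 2, 2, 3, 4, 7, 15]

sym_skew, sym_kurt = skewness_and_excess_kurtosis(symmetric_scores)
skewed_skew, skewed_kurt = skewness_and_excess_kurtosis(right_skewed_scores)
print(f"symmetric: skew={sym_skew:.2f}, excess kurtosis={sym_kurt:.2f}")
print(f"right-skewed: skew={skewed_skew:.2f}, excess kurtosis={skewed_kurt:.2f}")
```

A skewness far from zero, as in the second data set, is the kind of departure that a formal normality test in a statistical package would be expected to flag.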


One of the most common parametric analyses used to test for significant differences between group means of two samples is the t-test. The independent t-test analysis technique was developed to examine differences between two independent groups; the paired or dependent t-test analysis technique was developed to examine differences between two matched or paired groups, or a comparison of pretest and posttest measurements. The details of the independent and paired t-tests are described in this section.

t-test for Independent Samples

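The most common parametric analysis technique used in nursing studies to test for significant differences between two independent samples is the independent samples t-test. The samples are independent if the study participants in one group are unrelated to or different from the participants in the second group. Use of the t-test for independent samples involves the following assumptions: the dependent variable is measured at the interval or ratio level, the scores are normally distributed, the two samples have equal variance, and the observations are independent both within and between the samples.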

The t-test is robust to moderate violation of its assumptions. Robustness means that the results of analysis can still be relied on to be accurate when an assumption has been violated. The t-test is not robust with respect to the between-samples or within-samples independence assumptions, and it is not robust with respect to an extreme violation of the normality assumption unless the sample sizes are extremely large. The sample groups do not have to be equal in size for this analysis; the concern is instead for equal variance. A variety of t-tests have been developed for various types of samples. The formula and calculation of the independent samples t-test are presented next.


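The formula for the t-test is:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}$$

where $\bar{X}_1$ and $\bar{X}_2$ are the means of the two groups and $s_{\bar{X}_1 - \bar{X}_2}$ is the standard error of the difference between the means.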

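To compute the t-test, one must compute the denominator in the formula, which is the standard error of the difference between the means. If the two groups have different sample sizes, one must use this formula:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where $n_1$ and $n_2$ are the sample sizes of the two groups and $s_1^2$ and $s_2^2$ are their sample variances.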

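If the two groups have the same number of subjects in each group, one can use this simplified formula:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2 + s_2^2}{n}}$$

where $s_1^2$ and $s_2^2$ are the sample variances of the two groups and $n$ is the number of subjects in each group.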
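The step from the general standard error to the equal-groups version can be made explicit. The derivation below is a sketch that assumes the usual pooled-variance form of the standard error of the difference, with $s_1^2$ and $s_2^2$ denoting the sample variances and $n_1 = n_2 = n$:

```latex
\begin{align*}
s_{\bar{X}_1 - \bar{X}_2}
  &= \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}
     \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \\
  &= \sqrt{\frac{(n - 1)\left(s_1^2 + s_2^2\right)}{2(n - 1)} \cdot \frac{2}{n}} \\
  &= \sqrt{\frac{s_1^2 + s_2^2}{n}}
\end{align*}
```

Under that assumption the two forms give the same value whenever the groups are the same size, so the simplified form is only a computational shortcut.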

The following example comes from a study of depression levels among 16 elderly long-term care residents, in which differences between residents with and without dementia were investigated (Cipher & Clifford, 2004). A subset of data for these patients was selected for this example so that the computation would be small and manageable (Table 25-1). In actuality, studies involving t-tests need to be adequately powered to identify significant differences between groups accurately (Aberson, 2010; Cohen & Cohen, 1983). All data presented in this chapter are actual, unmodified clinical data for a small number of study participants.

The independent variable in this example was level of dementia and included two levels—a “no dementia” group and a “severe dementia” group. The level of dementia was based on clinical ratings of neuropsychologists using the Functional Assessment Staging Tool (Reisberg, Ferris, Deleon, & Crook, 1982). The dependent variable was the score of the long-term care resident on the Geriatric Depression Scale (GDS) (Yesavage, Brink, & Rose, 1983). The GDS assesses the level of depression in elderly adults, with higher numbers indicative of more depressive symptoms. The null hypothesis is: There are no significant differences between elderly adults with dementia and elderly adults without dementia on depression scores.

The computations for the t-test are as follows:

Step 1: Compute means for both groups, which involves the sum of scores for each group divided by the number in the group.

Step 2: Compute the numerator of the t-test:



Step 3: Compute the standard error of the difference.

Step 4: Compute t value:



Step 5: Compute degrees of freedom (df):

$$df = n_1 + n_2 - 2 = 16 - 2 = 14$$

Step 6: Locate the critical t value in the t distribution table in Appendix B at the back of your textbook and compare the critical t value with the obtained t value.
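The six steps above can be sketched in Python using only the standard library. This is a minimal sketch that assumes the pooled-variance form of the standard error. The two score lists are hypothetical and are not the Table 25-1 data, and the function name is illustrative. The critical value of 2.145 is the standard two-tailed table value for df = 14 at alpha = 0.05, hard-coded here in place of a table lookup.

```python
import math
from statistics import mean, variance


def independent_t_test(group1, group2):
    """Return (t, df) for an independent samples t-test (pooled variance)."""
    n1, n2 = len(group1), len(group2)

    # Step 1: compute the mean of each group
    mean1, mean2 = mean(group1), mean(group2)

    # Step 2: numerator is the difference between the group means
    numerator = mean1 - mean2

    # Step 3: standard error of the difference (assumed pooled-variance form)
    pooled_variance = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

    # Step 4: t value
    t_value = numerator / standard_error

    # Step 5: degrees of freedom
    df = n1 + n2 - 2
    return t_value, df


# Hypothetical scores for two independent groups of 8 (not Table 25-1)
group_a = [10, 12, 9, 11, 13, 10, 12, 11]
group_b = [8, 9, 7, 10, 9, 8, 10, 7]

t_value, df = independent_t_test(group_a, group_b)

# Step 6: compare with the critical t value (two-tailed, alpha = 0.05, df = 14)
CRITICAL_T_DF14 = 2.145
significant = abs(t_value) > CRITICAL_T_DF14
print(f"t({df}) = {t_value:.2f}, significant: {significant}")
```

For other sample sizes the critical value has to be looked up again for the new df; the constant above applies only when df = 14.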

Interpretation of Results

Our obtained t is 2.55, which exceeds the critical value. The t-test is therefore statistically significant, indicating a difference between the two groups that is unlikely to be due to chance. We can reject the null hypothesis and state: An independent samples t-test computed on GDS scores revealed that long-term care residents with no dementia had significantly higher depression scores than long-term care residents who had severe dementia, t(14) = 2.55, p < 0.05; X̄ = 8.4 versus 5.6. Prior research suggests that elderly residents with dementia do not experience less depression, but rather they have difficulty communicating their distress (Ott & Fogel, 2004; Scherder et al., 2005). With additional research in this area, this knowledge might be used to facilitate improvements in methods used by healthcare professionals to assess emotional distress accurately among elderly adults with dementia (Cipher, Clifford, & Roper, 2006; Thakur & Blazer, 2008).

Nonparametric Alternative

If the data do not meet the assumptions involving normality or equal variances for an independent samples t-test, the nonparametric alternative is the Mann-Whitney U test. Mann-Whitney U calculations involve converting the data to ranks, which sets aside the variance and normality issues associated with the original values. In some studies, the data collected are ordinal level, and the Mann-Whitney U test is appropriate for analysis of the data. The Mann-Whitney U test is approximately 95% as powerful as the t-test in detecting differences between two groups when the assumptions of the t-test are met. For a more detailed description of the Mann-Whitney U test, see the statistical textbooks by Daniel (2000) and Munro (2005). The statistical workbook for healthcare research by Grove (2007) has exercises for expanding your understanding of t-tests and Mann-Whitney U results from published studies.
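The rank conversion described above can be sketched as follows. This is a minimal standard-library illustration of the usual U statistic, with tied scores given the average of their ranks. The function names and scores are hypothetical. The sketch stops at the U value, because judging significance requires a U table or a statistical package.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of tied values starting at position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return the smaller of the two U statistics for two independent groups."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    rank_sum_1 = sum(ranks[:n1])            # sum of ranks belonging to group 1
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return min(u1, u2)


# Hypothetical ordinal scores (not data from the chapter)
scores_a = [3, 5, 8, 9]
scores_b = [1, 2, 4, 5]
u_value = mann_whitney_u(scores_a, scores_b)
print(f"U = {u_value}")
```

Because only the ordering of the scores enters the calculation, an extreme value changes U no more than any other value of the same rank would.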

t-tests for Paired Samples

When samples are related, the formula used to calculate the t statistic is different from the formula previously described for independent groups. One type of paired samples refers to a research design that repeatedly assesses the same group of people, a design commonly referred to as a repeated measures design. Another research design for which a paired samples t-test is appropriate is the case-control research design. Case-control designs involve a matching procedure whereby a control subject is matched to each case, in which the cases and controls are different people but matched demographically (Gordis, 2008). Paired or dependent samples t-tests can also be applied to a crossover study design, in which subjects receive one kind of treatment and subsequently receive a comparison treatment (Gordis, 2008). Like the independent samples t-test, the paired samples t-test has distributional requirements: the differences between the paired scores must be independent of one another and normally or approximately normally distributed.
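Since the paired samples t-test operates on the differences between paired scores, it can be sketched as a one-sample calculation on those differences. The version below assumes the commonly used form, in which t is the mean difference divided by its standard error and df is the number of pairs minus 1. The function name and the pretest and posttest scores are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_test(first_scores, second_scores):
    """Return (t, df) for a paired samples t-test on matched scores."""
    if len(first_scores) != len(second_scores):
        raise ValueError("Paired samples must have the same number of scores")

    # Difference for each pair (second measurement minus first)
    differences = [b - a for a, b in zip(first_scores, second_scores)]
    n_pairs = len(differences)

    # Standard error of the mean difference
    standard_error = stdev(differences) / math.sqrt(n_pairs)

    t_value = mean(differences) / standard_error
    df = n_pairs - 1
    return t_value, df


# Hypothetical pretest and posttest scores for the same six participants
pretest = [12, 15, 11, 14, 13, 16]
posttest = [10, 13, 10, 11, 12, 13]

paired_t, paired_df = paired_t_test(pretest, posttest)
print(f"t({paired_df}) = {paired_t:.2f}")
```

The negative t here reflects only the direction of subtraction (posttest minus pretest); its absolute value is what is compared with the critical value for df = 5.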

Feb 17, 2017 | Posted in NURSING
