Data Analysis in Mixed Research: A Primer

Anthony J. Onwuegbuzie (Corresponding author) & Julie P. Combs

Department of Educational Leadership and Counseling

Sam Houston State University, USA

Received: March 23, 2011

Accepted: April 20, 2011

Abstract

The purpose of this methodological article is to provide a primer for conducting a mixed

analysis—the term used for analyzing data in mixed research. Broadly speaking, a mixed

analysis involves using quantitative and qualitative data analysis techniques within the same

study. In particular, a heuristic example using real data from a published study entitled

“Perceptions of Barriers to Reading Empirical Literature: A Mixed Analysis” (Benge,

Onwuegbuzie, Burgess, & Mallette, 2010) is used with the aid of screenshots to illustrate

how a researcher can conduct a quantitative dominant mixed analysis, wherein the

quantitative analysis component is given higher priority and qualitative data and analysis are

incorporated to increase understanding of the underlying phenomenon.

Keywords: Mixed research, Mixed methods research, Quantitative research, Qualitative

research, Mixed analysis, Analysis screenshots

1. Mixed Research Defined

Mixed research, also referred to as mixed methods research, involves “mix[ing] or

combin[ing] quantitative and qualitative research techniques, methods, approaches, concepts

or language into a single study” (Johnson & Onwuegbuzie, 2004, p. 17). As noted by Collins,

Onwuegbuzie, and Sutton (2006), mixed research studies contain 13 steps—each of which

occurs at one of the following three phases of the mixed research process: research

conceptualization (i.e., determining the mixed goal of the study, formulating the mixed

research objective[s], determining the rationale of the study and rationale[s] for mixing

quantitative and qualitative approaches, determining purpose of the study and the purpose[s]

for mixing quantitative and qualitative approaches, determining the mixed research

question[s]), research planning (i.e., selecting the mixed sampling design, selecting the mixed

research design), and research implementation (i.e., collecting quantitative and qualitative

data, analyzing the quantitative and qualitative data, legitimating the data sets and mixed

research findings, interpreting the mixed research findings, writing the mixed research report,

reformulating the mixed research question[s]). Of these 13 steps, analyzing data in a mixed

research study potentially is the most complex step because the researcher(s) involved must

be adept at analyzing both the quantitative and qualitative data that have been collected, as

well as integrating the results that stem from both the quantitative and qualitative analysis “in

a coherent and meaningful way that yields strong meta-inferences (i.e., inferences from

qualitative and quantitative findings being integrated into either a coherent whole or two

distinct sets of coherent wholes; Tashakkori & Teddlie, 1998)” (Onwuegbuzie & Combs,

2010, p. 398). As such, guidelines and exemplars are needed for conducting mixed analyses.

Thus, the purpose of this article is to describe and to illustrate data analysis in mixed research.

2. Mixed Analysis Defined

Mixed analysis is the term used for analyzing data in mixed research. Onwuegbuzie and

Combs (2010) recently provided an inclusive definition of mixed analysis that incorporates

the definition and typologies that have been presented in major methodological works. These

works included articles, book chapters, books, and paper presentations across numerous fields

and disciplines such as the social and behavioral sciences (including psychology and

education), nursing and allied health, business, and linguistics that spanned 21 years. Based

on their interpretations of the extant literature, Onwuegbuzie and Combs (2010) identified 13

criteria that represent decisions that mixed researchers make before, during, and/or after the

conduct of their mixed analyses:

1. rationale/purpose for conducting the mixed analysis

2. philosophy underpinning the mixed analysis

3. number of data types that will be analyzed

4. number of data analysis types that will be used

5. time sequence of the mixed analysis

6. level of interaction between quantitative and qualitative analyses

7. priority of analytical components

8. number of analytical phases

9. link to other design components

10. phase of the research process when all analysis decisions are made

11. type of generalization

12. analysis orientation

13. cross-over nature of analysis

Using these 13 criteria, Onwuegbuzie and Combs (2010) derived the following inclusive and

comprehensive definition of mixed analysis:

Mixed analysis involves the use of both quantitative and qualitative analytical

techniques within the same framework, which is guided either a priori, a posteriori, or

iteratively (representing analytical decisions that occur both prior to the study and

during the study). It might be based on one of the existing mixed methods research

paradigms (e.g., pragmatism, transformative-emancipatory) such that it meets one of

more of the following rationales/purposes: triangulation, complementarity,

development, initiation, and expansion. Mixed analyses involve the analysis of one or

both data types (i.e., quantitative data or qualitative data; or quantitative data and

qualitative data), which occur either concurrently (i.e., in no chronological order), or

sequentially in two phases (in which the qualitative analysis phase precedes the

quantitative analysis phase or vice versa, and findings from the initial analysis phase

inform the subsequent phase) or more than two phases (i.e., iteratively). The analysis

strands might not interact until the data interpretation stage yielding a basic parallel

mixed analysis, although more complex forms of parallel mixed analysis can be used,

in which interaction takes place in a limited way before the data interpretation phase.

The mixed analysis can be designed based, wherein it is directly linked to the mixed

methods design (e.g., sequential mixed analysis techniques used for sequential mixed

methods designs). Alternatively, the mixed analysis can be phase based, in which the

mixed analysis takes place in one or more phases (e.g., data transformation). In mixed

analyses, either the qualitative or quantitative analysis strands might be given priority

or approximately equal priority as a result of a priori decisions (i.e., determined at the

research conceptualization phase) or decisions that emerge during the course of the

study (i.e., a posteriori or iterative decisions). The mixed analysis could represent

case-oriented, variable-oriented, and process/experience oriented analyses. The mixed

analysis is guided by an attempt to analyze data in a way that yields at least one of

five types of generalizations (i.e., external statistical generalizations, internal

statistical generalizations, analytical generalizations, case-to-case transfer, naturalistic

generalization). At its most integrated form, the mixed analysis might involve some

form of cross-over analysis, wherein one or more analysis types associated with one

tradition (e.g., qualitative analysis) are used to analyze data associated with a different

tradition (e.g., quantitative data). (pp. 425-426)

Of these 13 decision criteria, the following five criteria appear to be most common: (a)

rationale/purpose for conducting the mixed analysis, (b) number of data types that will be

analyzed, (c) time sequence of the mixed analysis, (d) priority of analytical components, and

(e) number of analytical phases. Each of these criteria is described in the subsequent sections.

Rationale/purpose for conducting the mixed analysis

Greene, Caracelli, and Graham (1989) identified five purposes for mixing quantitative and

qualitative data: triangulation (i.e., quantitative findings are compared to the qualitative

results); complementarity (i.e., results from one analysis type [e.g., qualitative] are interpreted

to enhance, expand, illustrate, or clarify findings derived from the other strand [quantitative]);

development (i.e., data are collected sequentially and the findings from one analysis type are

used to inform data collected and analyzed using the other analysis type); initiation (i.e.,

contradictions or paradoxes that might reframe the research question are identified); and

expansion (i.e., quantitative and qualitative analyses are used to expand the study’s scope and

focus).

Number of data types that will be analyzed

Traditionally, as noted by Creswell and Plano Clark (2007), “Data analysis in mixed methods

research consists of analyzing the quantitative data using quantitative methods and the

qualitative data using qualitative methods” (p. 128). However, mixed analyses also can

involve the sequential analysis of one data type, referred to as a sequential mixed

analysis (Tashakkori & Teddlie, 1998), wherein data that are generated from the initial

analysis then are converted into the other data type. For example, a researcher could conduct

a qualitative analysis of qualitative data followed by a quantitative analysis of the qualitative

codes that emerge from the qualitative analysis and that are transformed to quantitative data

(e.g., exploratory factor analysis of themes that emerge from a constant comparison analysis

of qualitative data; cf. Onwuegbuzie, 2003). Such conversion of qualitative data into

numerical codes that can be analyzed quantitatively (i.e., statistically) is known as

quantitizing (Miles & Huberman, 1994; Tashakkori & Teddlie, 1998). Alternatively, a

researcher could conduct a quantitative analysis of quantitative data followed by a qualitative

analysis of the quantitative data that emerge from the quantitative analysis and that are

transformed to qualitative data (e.g., narrative profile formation of a set of test scores or

subscale scores representing the affective domain). Such conversion of quantitative data into

narrative data that can be analyzed qualitatively is known as qualitizing (Tashakkori &

Teddlie, 1998).
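As a minimal sketch of qualitizing, the following Python fragment converts numeric subscale scores into a short narrative profile. The subscale names, the cut points (percentile ranks of 33 and 67), and the wording of the labels are hypothetical and were chosen only for illustration; they are not taken from any of the works cited here.

```python
def qualitize(percentile_rank):
    """Map a percentile rank (0-100) to a narrative label.

    The cut points of 33 and 67 are hypothetical, not from the cited works.
    """
    if not 0 <= percentile_rank <= 100:
        raise ValueError("percentile rank must be between 0 and 100")
    if percentile_rank < 33:
        return "low"
    if percentile_rank < 67:
        return "moderate"
    return "high"


def narrative_profile(scores):
    """Build a one-sentence narrative profile from subscale percentile ranks."""
    parts = [f"{qualitize(rank)} {name}" for name, rank in scores.items()]
    return "This participant shows " + ", ".join(parts) + "."


# Hypothetical participant with two affective subscale percentile ranks.
profile = narrative_profile({"reading anxiety": 82, "reading interest": 25})
print(profile)
```

The resulting sentence is narrative data that could then be analyzed qualitatively, for example by comparing profiles across participants.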

Time sequence of the mixed analysis

Time sequence refers to whether the quantitative and qualitative analysis components occur

in a chronological order (Creswell & Plano Clark, 2007). Specifically, the qualitative and

quantitative analyses can be conducted in chronological order, or sequentially (i.e., sequential

mixed analysis), or they can be conducted in no chronological order, or concurrently (i.e.,

concurrent mixed analysis). When sequential mixed analyses are conducted, either (a) the

quantitative analysis component is conducted first, which then drives or informs the

subsequent qualitative analysis component (i.e., sequential quantitative-qualitative analysis;

Onwuegbuzie & Teddlie, 2003); (b) the qualitative analysis component is conducted first,

which then informs the subsequent quantitative analysis component (i.e., sequential

qualitative-quantitative analysis; Onwuegbuzie & Teddlie, 2003); or (c) the quantitative and

qualitative analyses are conducted sequentially in more than two phases (i.e., iterative

sequential mixed analysis; Teddlie & Tashakkori, 2009).

Priority of analytical components

Another important aspect of mixed analyses is the priority or emphasis given to the

quantitative analysis component(s) and the qualitative analysis component(s). Either the

qualitative and quantitative analysis components can be given approximately equal priority

(i.e., equal status) or one analysis component can be given significantly higher priority than

the other analysis component (i.e., dominant status). If the quantitative analysis component is

given significantly higher priority, then the analysis essentially is a quantitative-dominant

mixed analysis, wherein the analyst adopts a postpositivist stance, while believing

simultaneously that the inclusion of qualitative data and analysis is likely to increase

understanding of the underlying phenomenon (cf. Johnson, Onwuegbuzie, & Turner, 2007).

In contrast, if the qualitative analysis component is given significantly higher priority, then

the analysis essentially is a qualitative-dominant mixed analysis, whereby the analyst

assumes a constructivist-poststructuralist-critical stance with respect to the mixed analysis

process, while believing simultaneously that the inclusion of quantitative data and analysis is

likely to provide richer data and interpretations (cf. Johnson et al., 2007).

Number of analytical phases

Mixed analyses involve several phases. For example, Greene (2007, p. 155) identified the

following four phases of analysis: (a) data transformation, (b) data correlation and

comparison, (c) analysis for inquiry conclusions and inferences, and (d) using aspects of the

analytic framework of one methodological tradition within the analysis of data from another

tradition. Onwuegbuzie and Teddlie (2003) conceptualized a seven-step process for mixed

analyses: (a) data reduction (i.e., reducing the dimensionality of the quantitative data and

qualitative data), (b) data display (i.e., describing visually the quantitative data and qualitative

data), (c) data transformation (i.e., quantitizing and/or qualitizing data), (d) data correlation

(i.e., correlating quantitative data with quantitized data or correlating quantitative data with

qualitized data), (e) data consolidation (i.e., combining both quantitative and qualitative data

to create new or consolidated variables or data sets), (f) data comparison (i.e., comparing data

from the quantitative and qualitative data sources), and (g) data integration (i.e., integrating

both qualitative and quantitative data into a coherent whole).
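To make the data correlation step (d) concrete, the sketch below correlates a quantitized theme indicator (1 = theme present, 0 = theme absent) with a quantitative score. It uses the Pearson product-moment coefficient, which is commonly called the point-biserial correlation when one variable is dichotomous. The variable names and all data values are invented for illustration and do not come from any cited study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: whether each of six participants endorsed a theme,
# and each participant's score on a quantitative measure.
theme_endorsed = [1, 0, 1, 1, 0, 0]
vocabulary_score = [60, 72, 58, 65, 70, 75]

r = pearson_r(theme_endorsed, vocabulary_score)
print(f"r = {r:.2f}")
```

A negative coefficient in this invented example would mean that participants who endorsed the theme tended to have lower scores; with real data, the usual assumptions and sample size considerations for correlation would apply.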

Heuristic Example

The following mixed research study (Benge, Onwuegbuzie, Burgess, & Mallette, 2010)

provides an example of how one can conduct a mixed analysis. This study is relevant to any

field because it involves the study of reading ability within the context of doctoral-level

research methods courses.

Purpose of the Study

The purpose of Benge et al.’s (2010) study was fourfold: (a) to examine levels of reading

ability—as measured by reading comprehension and reading vocabulary—among doctoral

students; (b) to identify doctoral students’ perceptions of barriers that prevented them from

reading empirical articles; (c) to examine the relationship between these perceived barriers

and levels of reading vocabulary and reading comprehension; and (d) to determine which

perceived barriers predict the perceived difficulty that doctoral students experience in reading

empirical research articles.

Participants were 205 doctoral students enrolled in one of the doctoral-level research design

courses at a large research university in the United States. Because all participants

contributed to both the qualitative and quantitative phases of the study, and the qualitative

and quantitative data were collected concurrently, the mixed sampling design used was a

Concurrent Design using Identical Samples (Onwuegbuzie & Collins, 2007). Although in the

study the quantitative and qualitative approaches were given approximately equal weight, the

researchers placed a greater emphasis on the quantitative analysis phase, yielding a

quantitative-dominant mixed analysis. The rationale/purpose for mixing quantitative and

qualitative analysis was complementarity and expansion (Greene et al., 1989).

All participants were administered the Nelson-Denny Reading Test (NDRT; Brown, Fishco,

& Hanna, 1993) and the Reading Interest Survey (RIS). The NDRT was used to measure

levels of reading vocabulary (80 items; KR-20 = .85) and reading comprehension (38 items;

KR-20 = .69). The RIS contains 62 items that are either open-ended (e.g., “What barriers

prevent you from reading more empirical research articles?”) or closed-ended (e.g., “Please

indicate your perceptions about the levels of ease/difficulty you experience in reading

empirical research articles. Please check the option that best applies: 1 = EASY; 2 =

SOMEWHAT EASY; 3 = NEITHER EASY NOR DIFFICULT; 4 = SOMEWHAT

DIFFICULT; 5 = DIFFICULT”). Figure 1 displays part of these data.
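For readers unfamiliar with the KR-20 coefficients reported above, the following sketch shows how the Kuder-Richardson Formula 20 is typically computed for dichotomously scored items: KR-20 = [k / (k - 1)] x [1 - (sum of p_i q_i) / variance of total scores], where k is the number of items, p_i is the proportion of respondents answering item i correctly, and q_i = 1 - p_i. The response matrix is hypothetical, and the use of the population variance (dividing by n) is an assumption of this sketch; it is not a description of how the NDRT reliabilities were obtained.

```python
def kr20(item_scores):
    """Kuder-Richardson Formula 20 for a respondents x items matrix of 0/1 scores."""
    n = len(item_scores)
    k = len(item_scores[0]) if n else 0
    if n < 2 or k < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    # Proportion of respondents scoring 1 on each item.
    p = [sum(row[i] for row in item_scores) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Population variance of total scores (assumption: divide by n).
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: 5 respondents x 4 items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

reliability = kr20(responses)
print(f"KR-20 = {reliability:.2f}")
```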

Quantitative Dominant Mixed Analysis: Stage-by-Stage

A sequential mixed analysis (SMA; Onwuegbuzie & Teddlie, 2003; Tashakkori & Teddlie,

1998) was conducted to analyze doctoral students’ test score data and survey responses. This

analysis involved six stages.

Stage 1: Quantitative Analysis of Quantitative Data

The first stage involved the use of descriptive statistics (i.e., descriptive stage; data reduction)

to compute reading comprehension and reading vocabulary scores and compare them to the

normative data. The screenshots for obtaining the descriptive statistics and output are

displayed in Figures 2-4. A series of independent samples t tests (not shown) revealed that the

current sample of doctoral students had statistically significantly higher scores on the reading

comprehension (t = 6.84, p < .0001; effect size = 0.49) and reading vocabulary (t = 11.21, p
< .0001; effect size = 0.80) components of the NDRT than did Brown et al.’s (1993)
normative sample of 5,000 undergraduate students from 38 institutions. However,
disturbingly, approximately 10% of doctoral students attained reading comprehension and
reading vocabulary scores that represented the lower percentiles of this normative sample.
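The comparison with normative data can be sketched from summary statistics alone. The fragment below computes a pooled-variance independent samples t statistic and a standardized mean difference (Cohen's d). The source does not state which variance assumption or effect size index was used, so both choices are assumptions of this sketch, and the means, standard deviations, and sample sizes are hypothetical rather than the study's values.

```python
from math import sqrt


def pooled_t_and_d(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, d) for two independent groups described by summary statistics.

    Assumes equal population variances (pooled-variance t test).
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least 2 observations")
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    pooled_sd = sqrt(pooled_var)
    if pooled_sd == 0:
        raise ValueError("pooled standard deviation is zero")
    # t divides the mean difference by its standard error.
    t = (mean1 - mean2) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    # Cohen's d divides the mean difference by the pooled standard deviation.
    d = (mean1 - mean2) / pooled_sd
    return t, d


# Hypothetical summary statistics for a study sample and a normative sample.
t_stat, cohens_d = pooled_t_and_d(
    mean1=60.0, sd1=10.0, n1=50,
    mean2=55.0, sd2=10.0, n2=200,
)
print(f"t = {t_stat:.2f}, d = {cohens_d:.2f}")
```

Reporting the effect size alongside the t statistic matters here because a very large normative sample can make even small differences statistically significant.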
Stage 2: Qualitative Analysis of Qualitative Data

In the second stage, the doctoral students’ perceptions of barriers that prevented them from
reading empirical articles were subjected to a thematic analysis (i.e., exploratory stage; data
reduction) using constant comparison analysis (Glaser & Strauss, 1967). This analysis
revealed the following eight themes that represented students’ perceived barriers to reading
empirical literature: time, research/statistics knowledge, interest/relevance, text coherence,
vocabulary, prior knowledge, reader attributes, and volume of reading.

Stage 3: Quantitative Analysis of Qualitative Data

The themes then were quantitized (i.e., data transformation) such that if a doctoral student
listed a characteristic that was eventually unitized under a particular theme, then a score of
“1” was assigned to the theme for the student response; otherwise, a score of “0” was
assigned. This dichotomization led to the formation of what Onwuegbuzie (2003) called an
inter-respondent matrix of themes (i.e., participant x theme matrix) that consisted only of 0s
and 1s. This inter-respondent matrix of 0s and 1s was entered into the SPSS database,
alongside the other variables. Figure 5 displays part of these data.

The inter-respondent matrix was used to calculate the frequency (i.e., prevalence rate) of each
theme. The steps for conducting the frequency analysis are displayed in Figures 6-8, and the
effect sizes pertaining to three of the themes extracted from qualitative data are presented in
Figure 9.
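Outside of SPSS, the quantitizing and frequency steps described in this stage can be sketched in a few lines of Python. The participant identifiers and the coded responses below are hypothetical; only the theme names are taken from the study, and only three of the eight themes are used to keep the example short.

```python
# Three of the eight themes reported in the study.
THEMES = ["time", "vocabulary", "prior knowledge"]

# Hypothetical coding result: the themes assigned to each participant's
# open-ended response after the qualitative analysis.
coded_responses = {
    "P01": {"time", "vocabulary"},
    "P02": {"time"},
    "P03": set(),
    "P04": {"prior knowledge", "time"},
}


def inter_respondent_matrix(coded, themes):
    """Quantitize coded responses into a participant x theme matrix of 0s and 1s."""
    return {
        pid: [1 if theme in assigned else 0 for theme in themes]
        for pid, assigned in coded.items()
    }


def prevalence(matrix, themes):
    """Proportion of participants whose response was coded under each theme."""
    n = len(matrix)
    if n == 0:
        raise ValueError("matrix has no participants")
    return {
        theme: sum(row[j] for row in matrix.values()) / n
        for j, theme in enumerate(themes)
    }


matrix = inter_respondent_matrix(coded_responses, THEMES)
rates = prevalence(matrix, THEMES)
for theme in THEMES:
    print(f"{theme}: {rates[theme]:.0%}")
```

Each row of the matrix corresponds to one participant and each column to one theme, so the column means are the prevalence rates.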
Stage 4: Quantitative Analysis of Qualitative Data

The fourth stage of Benge et al.’s (2010) SMA involved a principal component analysis to
ascertain the underlying structure of seven of the eight emergent themes (i.e., ...
