Example of data analysis to create open codes
First, the figure illustrates how inductive content analysis can be applied to qualitative data to answer a research question (i.e. adolescents feel that having type 1 DM represents a threat to life). Second, the figure clearly outlines the abstraction process or, in other words, how researchers can move from raw collected data to theoretical concepts. Third, the illustrated abstraction process shows the structure of the main concept threat to life. Hence, it is clear that threat to life includes physical, mental and social factors.
2.2.2 Frequently Asked Questions
Many researchers, especially those who have not extensively applied content analysis, frequently ask questions about the analytical process. The most common questions are: Which kinds of open codes are satisfactory? How should I name open codes? What should I do with open codes that do not belong to any sub-concepts, -categories or -themes? What should I do when I find opposing or contradictory perspectives in the data? How should I handle confusing data?
Open codes are identified from raw data. They can be expressed in words that are identical to the raw data, or the wording can be changed slightly. Researchers should keep in mind two important points when phrasing open codes. First, there must be a clear connection between each open code and the raw data. Losing this connection to context during the analytical process can guide the analysis in the wrong direction, meaning that the researcher interprets the data subjectively and/or the open codes no longer represent the raw data. Second, the description of an open code should not be too long. It should be between 1 and 3 words, as a longer description, i.e. a phrase or a sentence, suggests that the researcher has not grasped the main content.
Content analysis scholars have emphasised that the process of generating open codes is highly sensitive, as researchers can easily interpret the data subjectively. This can result in codes that are not strongly connected to the original data [2, 8]. Researchers need to be familiar with the data to maintain a good connection between open codes and the raw data. A helpful technique is to add notes to the identified codes that will help the researchers when they return to the raw data. Examples from the study presented above are: “worries about future (boy X or participant X)” and “worries about health condition (boy F)”. Such notes help researchers who have identified many cases of a similar open code, as they will need to check the contents of each code before grouping them into the same sub-concept. This means that they will need to return to the raw data for each identified open code. To generalise, a good open code is short, its content is closely related to the raw data, and it has some identifier that denotes the source in the raw data.
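The practice of attaching a source identifier to each open code can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed tool: the class OpenCode, the function sources_by_label and the participant "girl A" are hypothetical names introduced here, while the two code labels and the identifiers "boy X" and "boy F" are taken from the examples above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenCode:
    """An open code together with a note that points back to the raw data."""
    label: str   # short wording of the open code
    source: str  # identifier of the participant or passage in the raw data


def sources_by_label(codes):
    """Collect, for each open code label, the sources to re-read before grouping."""
    grouped = defaultdict(list)
    for code in codes:
        grouped[code.label].append(code.source)
    return dict(grouped)


codes = [
    OpenCode("worries about future", "boy X"),
    OpenCode("worries about health condition", "boy F"),
    OpenCode("worries about future", "girl A"),  # hypothetical participant
]

for label, sources in sources_by_label(codes).items():
    print(f"{label}: check raw data for {', '.join(sources)}")
```

Listing the sources per label in this way shows which passages of raw data need to be checked before similar open codes are placed in the same sub-concept.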
Another frequently asked question concerns the names of sub-concepts, concepts and main concepts. First and foremost, researchers must keep in mind that the name should arise from the shared content of the group. For example, when thinking of a label for similar sub-concepts, the researcher should determine what content is included in each sub-concept. Researchers will rarely create totally new concepts; instead, descriptions of groups usually come from a researcher’s intellectual knowledge, theoretical understanding or expertise in the research field. As such, a researcher may think of a good label for a group based on previous research. However, it is important to note that any chosen label or description should reflect the shared content and the context that is under study.
Researchers are also commonly puzzled by what to do with open codes that do not belong to any of the generated sub-concepts (the same problem arises with sub-concepts that do not fit into any of the created concepts). Most often, the reason is that data collection did not reach saturation. The researcher should mention this when reporting results, list the open codes that did not fit into sub-concepts and explain the reasons for excluding these codes from the analytical process. The lack of saturation can also harm the data abstraction process, as researchers who did not reach data saturation may find that certain sub-concepts do not fit into any of the created concepts. Situations in which the participant group is highly heterogeneous may also cause researchers to generate open codes that do not fit into any sub-concepts. The fact that heterogeneous participant groups have such diverse perspectives and opinions may make it difficult for researchers to determine whether they achieved data saturation.
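Keeping track of open codes that were left outside every sub-concept makes it easier to list them when reporting results. The following Python sketch shows one possible way to do this. The function name unassigned_codes and the grouping used here are assumptions made for illustration only and do not reproduce the actual coding of the example study.

```python
def unassigned_codes(open_codes, sub_concepts):
    """Return the open codes that do not appear in any sub-concept."""
    assigned = {label for labels in sub_concepts.values() for label in labels}
    return [label for label in open_codes if label not in assigned]


# Illustrative grouping only (an assumption, not the study's real analysis).
open_codes = [
    "dependence on parents",
    "dependence on insulin",
    "worries about future",
]
sub_concepts = {
    "dependence": ["dependence on parents", "dependence on insulin"],
}

leftover = unassigned_codes(open_codes, sub_concepts)
print("Open codes to list and justify in the report:", leftover)
```

The resulting list is only a starting point; the researcher still has to return to the raw data and explain why each of these codes was excluded.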
Another common concern is the identification of opposing perspectives or experiences. In most cases, the researcher will report both opposing perspectives as main concepts, for example, satisfaction with care and dissatisfaction with care. However, the researcher is also tasked with deciding whether to report these two main concepts or describe the research on the concept-level, i.e. present the contents included in the main concepts satisfaction with care and dissatisfaction with care.
Researchers often report feeling that they have rich but confusing data. The first step to tackling confusing data is getting familiar with the raw data, which can be achieved by reading through the data several times. After this, the researcher will be able to define a unit of analysis and start the analytical and data reduction processes. Even if a researcher is familiar with their data, they may still face challenges answering the research questions while analysing the data. To avoid this, a researcher may consider performing a pilot data collection to make sure that the collected data are relevant to the research question. Furthermore, as discussed in Chap. 1, the researcher should analyse the data during the data collection process and, if necessary, reformulate the research question. In qualitative research, participants may not focus on the interview question but rather provide information that is not related to the study question. In these instances, the researcher should accept that data unrelated to the subject of study will not yield any open codes.
2.3 Reporting Results
Inductive content analysis results are sometimes challenging to report because the researcher can only describe part of the analytical process exactly and must rely on their past insight or intuition to explain other parts of the analysis. When reporting inductive content analysis findings, researchers should strive to describe the contents of the presented concepts through the identified sub-concepts and open codes. The researcher should also provide authentic citations that connect the results and raw data. This will improve the trustworthiness of the presented research. Scholars often ask about the proper number of authentic citations. There is no clear answer, but a good rule is that more authentic citations than text will make it hard for the reader to understand the results of the analysis. It is also important to select citations that reflect different parts of the analytical process, for example, a citation for each of the presented sub-concepts and concepts. In addition, researchers can gain trustworthiness by including citations from a wide array of participants. Figures and tables are another way that researchers can clarify how they conducted the analytical process. An effective figure presents an example of part of the analysis and helps readers draw their own conclusion about whether or not the study is trustworthy.
It is critical that the reporting of results handles the findings as a product of inductive content analysis. Qualitative researchers occasionally use constructions such as “participants described…” or “participants said…”. Neither should be used, because the researcher is then stating what the participants said during interviews rather than what the content analysis revealed. Everyday expressions are also sometimes used, and their presence often indicates that the researcher has trouble discussing their results on the theoretical level. This may also be reflected in the names of sub-concepts and concepts (i.e. they are almost identical to what was expressed in the raw data).
The performed analysis demonstrated that adolescents perceive type 1 DM as a threat to their physical, social and mental well-being. The threat to social well-being encompassed dependence, control, conflicts and differences. Dependence included various aspects, e.g., dependence on parents, insulin, nurses and physicians. This can be exemplified through one adolescent’s response: “of course it means that I am dependent on many things… My parents, insulin treatment, nurses and physicians… I am not free such like other adolescents are… I have a strict schedule”.