
Artificial intelligence – AI – is the buzzword of 2023. The term refers to computer systems that learn to perform tasks previously requiring human judgement.
AI conjures varied imagery: sentient machines from sci-fi that use artificial general intelligence to advance or destroy human society; machine learning algorithms that identify patterns and project biases in big data; self-driving cars and facial recognition that use computer vision; and large language models (LLMs) that power essay-writing bots and shape web searches.
The proliferation and popularization of AI has birthed a cottage industry of hype and criti-hype, including reflections on the future of qualitative research.
This blog post examines some potential uses (and misuses) of AI in qualitative research, focusing on the LLMs popularized by ChatGPT. I connect these to social theorist Max Weber’s warning about how “rational” tools constrain our lives in irrational and far-reaching ways, epitomized by his metaphor of the iron cage, and discuss implications for qualitative research.
Computers, Qualitative Research, and LLMs
Using computers alongside qualitative research is not new.
For decades, social scientists have used our PCs and Macs to streamline traditional qualitative tasks: compiling field notes, coding transcripts, building memos for papers and creating data-sets for books. Recent works examine how the growing intersections of qualitative research and computational text analysis can be applied to approaches such as grounded theory, the extended case method and ethnography.
Recently, there has been discussion about whether AI changes the game or just adds a new tool to the repertoire. ChatGPT is a big part of that.
For those unfamiliar, ChatGPT employs a simple web interface to make it feel like you are having a conversation with a robot that has read everything on the internet. Chatting with the generative pre-trained transformer (GPT) can be unnerving, both for its ability to generate content and for its tendency to make errors. This has become a cultural flashpoint, subject to heated debate over the threats and potential of AI.
According to the expert-bot itself, LLMs like GPT are “artificial intelligence systems that have been trained on vast amounts of text data to understand and generate human-like text.” They respond to human instructions, and “can answer questions, write essays, summarize text, translate languages and even generate creative content like stories or poems, based on the patterns they've learned.” Like, term papers.
While LLMs are far from sentient, they are seeing widespread uptake in industry, government, and scientific research alongside other forms of AI. This is creating a host of ethical and practical issues. Now, they are proliferating alongside qualitative inquiry.
Qualitative Applications of LLMs (and some ‘buts’)
As a sociologist who has written about using computational tools in qualitative research for much of my career, I find the expansion of LLMs both exciting and troublesome.
LLMs, like the GPT-4 underlying ChatGPT, have potential utility for qualitative applications. They can:
- Categorize text according to themes learned from researchers
- Help break data sets into manageable chunks
- Identify patterns that analysts may have missed
- Enhance other automated processes, like identifying or de-identifying speakers and organizations
- Aid in producing interactive visuals
Importantly, in scientific use LLMs can be fine-tuned to work better with specific genres of content (e.g., medical records or ethnographic interviews). Before the ChatGPT explosion, our team used an LLM (BERT) to code in-depth interview transcripts. The system, trained on the insights of knowledgeable Medical Cultures Lab fieldworkers, was remarkably accurate and efficient. LLMs can learn from analyst input, can work alongside iterative approaches, and require much less text than classical machine learning approaches. Outsourcing data processing to a well-trained LLM can reduce repetitive tasks and free up time for researchers to learn from participants in the field.
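To make the workflow concrete, here is a toy sketch of applying researcher-defined themes to interview excerpts. This is not our actual BERT pipeline: the keyword sets below are illustrative stand-ins for a classifier fine-tuned on analyst-coded examples, but the loop (model suggests codes, analyst inspects them) is the same.

```python
# Toy sketch of LLM-assisted coding. Here, simple keyword matching
# stands in for a fine-tuned classifier; the themes and keywords are
# invented for illustration.
import re

THEME_KEYWORDS = {  # illustrative themes an analyst might define
    "pandemic": {"covid", "pandemic", "vaccine", "lockdown"},
    "patriotism": {"nation", "america", "flag", "freedom"},
}

def suggest_codes(excerpt: str) -> list[str]:
    """Return themes whose keywords appear in the excerpt."""
    words = set(re.findall(r"[a-z]+", excerpt.lower()))
    return sorted(t for t, kws in THEME_KEYWORDS.items() if words & kws)

print(suggest_codes("The pandemic changed how we gather as a nation."))
# -> ['pandemic', 'patriotism']
```

In a real project, `suggest_codes` would call the fine-tuned model, and its output would feed into the same analyst-review step rather than being applied automatically.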
But… AI Can Miss the Mark
AI has been hyped as an efficient ‘fix’ for all sorts of problems. This is a real danger for methods like ethnography, which scholars often use to reveal what efficient methods miss and mismeasure. One of qualitative research’s core contributions has been showing where methodological quick-fixes miss the mark.
A core problem of qualitative research is how to make sense of complex and meaningful human interactions, even when (maybe especially when) doing so is inefficient. While technology may be generative in dealing with longstanding qualitative challenges, generations of scientific misuse of tools ranging from genetic analysis to algorithms to ethnography stand as a caution: sidestepping/outsourcing/technologizing deeper questions of how we study and relate to the social world is perilous, especially when the tools are powerful.
And, make no mistake, AI is a powerful and transformative tool that plays an ever more central role in the commercial, scientific, and state institutions that structure our lives. AI affects what content we see when we browse the web; it suggests edits in our papers and emails; it is baked into analytics systems that shape policy and pathways through universities and hospitals. AI helps cameras connect images of our faces to databases and can affect elections. AI is becoming impossible to avoid, like the expanding bureaucracies Weber wrote about a century ago.
Even when we are cognizant of AI systems as scientific tools with both peril and promise, they are easy to abuse. For instance, LLMs can make surprising errors, reflect the biases of their training data, and elevate the assumptions of their designers and birthplaces.
AI is proliferating in the qualitative data analysis space, too. ATLAS.ti recently partnered with OpenAI (the maker of ChatGPT) to use LLMs for coding. MAXQDA is using LLMs to generate summaries. NVIVO is using AI to transcribe audio. Outside of these major qualitative analysis platforms, new apps are springing up, and quantitative software has had access to AI tools for some time. Below, I use ATLAS.ti to provide an example of some pros and cons. [1] My concerns are far from platform-specific, and I note that the tool is in active beta development and does not replace or hinder existing features.
The Beta Implementation of AI Coding with ATLAS.ti
ATLAS.ti recently released a beta version of an ‘AI coding’ tool. The marketing (e.g., claims that coding with AI can be ‘10x faster’) has created some controversy in qualitative circles.
At the time of writing (June ‘23), the ‘AI coding’ function requires users to upload their data to an online cloud system. The program then generates a large number of ‘open’ codes (i.e., inductive themes that do not represent an explicit theory or hypothesis) for documents that users select. When I tested this with a short public interview with President Joe Biden, AI coding identified themes, suggested codes, and allowed human review before application. The ability to check the AI’s work is a welcome addition; my teams have argued that human review of automated text classification is important even for models specially trained and tuned for scientific use. Of course, it is also very easy to simply accept all suggestions and produce a coded data set very quickly.
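The review step described above can be sketched in a few lines. This is a hypothetical illustration, not ATLAS.ti’s implementation: the segment IDs and codes are invented, and the approval function stands in for an interactive prompt. The point is that machine-suggested codes become part of the data set only after analyst approval.

```python
# Hypothetical sketch of human review of machine-suggested codes:
# each suggestion is kept only if the analyst approves it, rather
# than being auto-applied. Segment IDs and codes are illustrative.

def review(suggested, approve):
    """Split suggested codes into approved and rejected lists."""
    kept, dropped = [], []
    for code in suggested:
        (kept if approve(code) else dropped).append(code)
    return kept, dropped

suggestions = {
    "seg-01": ["COVID-19 pandemic", "patriotic language", "weather"],
    "seg-02": ["economy", "campaign trivia"],
}

# Stand-in for an interactive prompt: approve only study-relevant codes.
relevant = {"COVID-19 pandemic", "patriotic language", "economy"}
for seg, codes in suggestions.items():
    kept, dropped = review(codes, lambda c: c in relevant)
    print(seg, "kept:", kept, "dropped:", dropped)
```

Keeping the rejected codes as an audit trail, rather than discarding them, also makes it possible to study where the model’s suggestions diverge from analyst judgment.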
Some of the codes were surprisingly good. GPT flagged discussions of the COVID-19 pandemic and patriotic language. Still, generating dozens of codes for a few pages of text, disconnected from any research question, is misaligned with many methodological approaches. Most themes were irrelevant to my questions, and there was no way to edit model parameters. Even for those who truly understand AI language models, the workings are immutable and hidden (a general issue with LLMs).
Also concerning from the perspective of research ethics and integrity: colleagues noted that users who clicked through the disclaimers about bias and information handling might have missed that their data were being uploaded. ATLAS.ti’s other language models can be used offline in the desktop versions, which is important for sensitive data. Other researchers noted that such tools can be used as a justification for avoiding qualitative training, because AI is often framed as better and faster. And, here at least, it is faster.
Yet with few parameters to tweak, no knowledge of computational models or qualitative inquiry required, and no way to teach the LLM about unique contexts and communities, AI coding might encourage researchers to avoid immersion or deep reading. This, in my estimation, risks amplifying errors rather than improving inquiry.
The broader point here is that the implementation of AI in qualitative software echoes important concerns about (mis)uses of AI more broadly.
AI and Max Weber
Critics of AI and LLMs note a core danger is the ability to generate content that looks credible, but is false, misleading or fabricated—deepfakes, spam websites, misinformation, and faulty analyses.
Qualitative researchers, under never-ending pressure to produce more, faster (a counting obsession embedded in the neoliberalization of science), are perennially tempted by rapidity. As AI comes to code, summarize, write reports, and disseminate findings, opting for a ‘cutting edge’ fix without weighing the dangers and promises is risky for qualitative research, in part because it can lead to research that misses the mark while remaining ‘instrumentally rational’ within increasingly quantified incentive systems.
And here, I think there is something to learn from Max Weber’s cautions.
Weber wagged his finger against celebrations of technological development as progress. In his vision, the most ostensibly rational tools (science, technology, bureaucracy) often produce irrational consequences irrespective of their architects’ intent. “Fast” qualitative work that misses the point of using qualitative methods would be an example.
A broader point of Weber’s was that we may lament, but we can’t just turn back the clock. Tools like bureaucracy (or algorithms) become embedded in our lives, not just because they are useful but because they are instrumental for those in power. Weber believed such forces become unavoidable once they take hold. Whether the cage is made of silicon/bytes or iron, the duality of being constrained by the irrationality of ‘rationality’ is eerily similar.
Weber also saw humans as continually searching for meaning. He emphasized the danger—susceptibility to demagogues, narcissism, nostalgia and the elevation of technical sophistication as progress even when it is empty. However, Weber also made errors. Perhaps he underestimated the potential for the creative repurposing of new technologies. The future will tell.
Qualitative Methods in the Era of AI
A few concluding thoughts:
First, misusing methods while framing the result as progress has been the story of every tool/technique in the social sciences—from bad survey analyses to colonial ethnography, eugenics and bell-curved psychometrics. We need to engage critically with both new tools and the “rational” systems of scholarly production that “validate” our work, lest we repeat this history.
Second, AI and LLMs do offer some very exciting prospects for qualitative inquiry when used well. LLMs can help free up time for data collection and deep analysis, help identify new insights and expand fruitful synergies between qualitative and computational social science.
Third, research is only as good as the data we analyze. The great burden and gift of qualitative research is being close to human life. It is the nature of our data— generated by spending time with people—that is at the core of methods like ethnography. How we code and aggregate is important, but if the data are a mismatch for our questions, we will get answers that are “precisely incorrect”— even with massive volumes of data.
Finally, we should always be cognizant of the risk of using our platform as researchers to elevate falsehoods. If our models are not trained to account for context, we risk errors. A challenge with AI is its ability to make falsehoods look credible with an efficiency “never before seen.”
Weber famously quipped, “no one knows who will live in [the iron cage] in the future.” I have at least a drop more optimism, in part, because I expect new generations of researchers will be there to thoughtfully study it as we thoughtfully navigate the spread of AI.
[1] Disclosure: I have taught workshops on qualitative data analysis, using ATLAS and related QDA platforms since grad school. I also use commercial quantitative software (STATA), R, and python in my research.