In the last post of this series, I introduced the issue of algorithmic bias. Inspired by the well-meaning but ultimately flawed efforts of Google’s chatbot, Gemini, towards inclusivity, I looked at the challenges of addressing bias, such as the conceptual ambiguity of terms like ‘fairness’ and the inherent subjectivity in correcting bias. This raised the fundamental question of how best to approach the issue: should we adopt a normative stance, based on how we think the world should be, or a descriptive one, reflecting how the world actually is? The former approach has subjectivity problems, risks distorting the truth, and, importantly, might deprive us of insights into inequality that we need to address. However, bias is damaging on so many levels that it needs to be dealt with somehow.
These thoughts inspired the following hypothesis: If machine learning models can produce biased outcomes—unintentional and often untraceable due to the complexity of deep neural networks—by detecting correlations and patterns in data that aren’t immediately obvious to people, then why can’t we use the same technology to address the bias itself? For example, by turning machine learning back on its own processes, we could potentially better identify where bias originates and how it is perpetuated, uncovering underlying causes that may not have previously been recognized. If applied repeatedly and at scale, this approach could give us a comprehensive view of the ecosystem of inequality and discrimination, offering valuable insights for driving meaningful societal change.
I chatted with data scientist Jesse McCrosky about this idea. And apparently… it’s a bit more complicated than that…
Jesse is a Canadian-born data scientist, enviably now living in Helsinki. He has worked in the private sector as well as in academia, for global technology giants like Google and for public-good non-profits like Mozilla. He was kind enough to give me some of his time to go over my thoughts. Jesse guided me through some of the more detailed aspects of data science, expanding my understanding of machine learning and the dynamic interplay of factors involved. As he did so, it became clear that I needed to move beyond my original line of thinking. However, all was far from lost, as Jesse also set me on the path to discovering the power of LLMs (large language models). Of course I was already familiar with LLMs, and I use ChatGPT increasingly as its abilities advance, but I wasn’t aware of some of the super interesting ways that LLMs can be used to effectively address bias.
Before I share what I learned about LLMs, why was the original hypothesis flawed?
It essentially comes down to the complexity of both the algorithms and the data involved. Many algorithms are driven by statistical processes that detect patterns and correlations, and in today’s age of data abundance the temptation is to “throw more data” at these processes to scale up insights (as I wanted to). However, as Jesse reminded me, “Data and patterns are not interchangeable. It often isn’t a ‘more is better’ situation.” Before adding data, you need to define the questions you are asking and then work out what data is and is not relevant to finding the answer. Jesse’s points boil down to the distinction between description (correlation) and explanation (causation).
Algorithms are limited to some extent by their design. They are structured and trained to solve specific problems. As noted in the last post, biased outputs may arise from various factors: bias in the data, the algorithm’s structure, or the assumptions embedded in its development. Simply adding more data without addressing these issues won’t solve the problem; it will likely just produce more of the same results.
It is theoretically possible to address those structural and data issues and dig deep into the causes of bias using all manner of algorithms, but in reality it is not practical. For example, when biased output was discovered in a healthcare algorithm, researchers compared predictions with actual health outcomes and then tested different predictive labels to overcome the bias (gaining beneficial insights into drivers of inequality along the way). And when bias was found in a hiring algorithm, researchers examined the representation of socio-economic groups in the training data, retrained the model on subsets to observe changes in the output, and analyzed which features the model was most sensitive to, tweaking variables to better understand their effect. So while it is possible to use machine learning models to understand and map out an ecosystem of factors driving inequality, this would essentially become an enormous, labor-intensive, exorbitantly expensive social science study rather than a way to leverage the powerful efficiencies of AI, which is what I had in mind.
On to LLMs, and back to the distinction between correlation and causation. Because of the way large language models are designed, they offer little in terms of explanation, regardless of the amount of data they have access to. They are built to recognize patterns, predict words and sequences of text, and generate human-like text by leveraging those learned patterns. However, even though they don’t explain bias in the sense of causation, their descriptive capabilities across vast amounts of data still give them a powerful edge in bias mitigation.
Jesse told me about some work being done in simulated representation, for example, where an LLM is asked to simulate the beliefs or attitudes of a group (e.g., a marginalized community) to understand how different communities might react to various policies or issues. It turns out LLMs show real promise here (although it’s important to remember that the risk of perpetuating stereotypes, which can drown out meaningful insights into a community, is always present and needs to be kept in check).
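To make the idea a little more concrete, here is a minimal sketch of what simulated representation might look like in code. It assumes the OpenAI Python SDK with an API key in the environment; the model name, personas, and survey question are purely illustrative placeholders of my own (not taken from Jesse’s or anyone else’s actual research), and in real work you would want much more careful prompt design and validation against real survey data, precisely because of the stereotype risk mentioned above.

```python
# Minimal sketch: ask an LLM to answer the same policy question from the
# perspective of different (hypothetical) community members, then compare.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Placeholder personas and question, purely for illustration.
PERSONAS = [
    "a long-term resident of a low-income urban neighbourhood",
    "a recent immigrant working in the gig economy",
    "a retired homeowner in a rural area",
]
QUESTION = "How would a new congestion charge in the city centre affect you?"

def simulate_response(persona: str, question: str) -> str:
    """Ask the model to answer from the point of view of one community member."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable chat model would do
        messages=[
            {
                "role": "system",
                "content": (
                    f"You are answering as {persona}. Describe their likely "
                    "concerns and reactions, in their own voice, in 3-4 sentences."
                ),
            },
            {"role": "user", "content": question},
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    for persona in PERSONAS:
        print(f"--- {persona} ---")
        print(simulate_response(persona, QUESTION))
        print()
```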
LLMs are incredibly adept at processing large amounts of data and identifying patterns. This, combined with their ability to generate new ideas and text and a deep learning architecture that helps them retain context and focus on what’s relevant, allows them not only to detect bias but also to simulate its effects across different scenarios. That makes them an invaluable tool for exploring ‘what-if’ scenarios and revealing how bias can emerge in ways we might not have anticipated.
There are numerous ways that LLMs can be utilized toward this end. And although models would still need to be custom-built, fine-tuned, or retrained on specialized datasets for some of these examples, the feasibility and practicality of doing so are much greater than for traditional predictive or other causation-focused machine learning models, as LLMs can adapt more flexibly to new data and tasks without requiring a complete redesign.
Here are some of the most interesting ways I found that LLMs can be used to help address bias:
- By analyzing historical or cultural texts, LLMs could identify bias over time, track changes in language use, and examine underrepresented perspectives.
For those familiar with it, I’m immediately struck here by thoughts of Charles W. Mills’ paper ‘White Ignorance’ and how historical texts have shaped social narratives. This approach could be a very effective way of digging further into his ideas.
- By analyzing media content, LLMs could measure how often and in what context different social groups are depicted, helping to quantify representation. They can also identify stereotypes by spotting patterns where certain groups are linked to specific traits or behaviors. Additionally, LLMs can evaluate the tone and framing of media coverage, detecting how different demographic groups are discussed and whether the manner of framing contributes to bias or reinforces negative stereotypes.
- By mapping bias in workforce data, LLMs could identify language in job postings, performance reviews, or internal communications that suggests biased hiring or promotion practices. They could also analyze salary data and demographic information to understand patterns in pay gaps, and do so in more sophisticated ways than many current approaches, which often treat the issue rather simplistically, comparing a given demographic group against a simple salary measure without looking into factors such as level of seniority or job function, which would add a further dimension. (A short sketch of this kind of controlled comparison follows the list below.)
- By crowdsourcing data, LLMs could aggregate perceptions of bias from diverse communities, bringing to light specific areas where bias is felt, or gather community-sourced solutions and ideas for reducing bias in policies, in the workplace, or across societal norms.
- By constructing bias simulations, LLMs could model how biased information spreads through a simulated society, showing the impact of biased messaging or information. They could also generate content that highlights cognitive biases in decision-making and social interaction (e.g., confirmation bias, stereotyping, etc.).
Related to this approach, some very interesting research was conducted recently on using LLMs to counter conspiracy theory mindsets. The approach seems promising, with researchers finding that “people who strongly believe in seemingly fact-resistant conspiratorial beliefs can change their mind when presented with compelling evidence.” This was achieved through multiple conversational rounds in which a large language model presented participants with evidence countering their beliefs. The intervention reduced belief in a participant’s chosen conspiracy by 20% on average, an effect that persisted for at least two months.
This seems like it could have significant implications for addressing other important epistemological concerns, such as echo chambers and the spread of disinformation online, as well as conspiracy theory mindsets.
- By generating case studies or training scenarios, LLMs could create realistic simulations of workplace, legal, or social situations where bias might arise, allowing people to explore different solutions.
- By creating bias-awareness campaigns, LLMs could help people launch education and awareness initiatives around bias. They could generate persuasive and empathetic narratives on the effects of discrimination that could be used in articles, speeches, scripts, or other content.
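Coming back to the workforce-data point above, here is the controlled-comparison sketch referred to there. It deliberately uses plain regression on entirely synthetic data rather than an LLM, just to illustrate why comparing a demographic group against a simple salary average can mislead: once seniority and job function are accounted for, the apparent pay gap in this toy example largely disappears, and the inequality shows up as under-representation at senior levels instead, which is a different (and arguably more actionable) finding. All column names and numbers are made up for illustration.

```python
# Sketch: a naive pay-gap comparison vs. one that controls for seniority
# and job function, using synthetic data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000

# Entirely synthetic workforce data.
df = pd.DataFrame({
    "group": rng.choice(["A", "B"], size=n),
    "seniority": rng.integers(1, 11, size=n),                    # years
    "job_function": rng.choice(["eng", "sales", "ops"], size=n),
})

# Build in a structural imbalance: group B is under-represented at senior
# levels, but is paid the same for the same seniority and job function.
is_b = df["group"] == "B"
df.loc[is_b, "seniority"] = rng.integers(1, 7, size=is_b.sum())

base = {"eng": 70_000, "sales": 55_000, "ops": 50_000}
df["salary"] = (
    df["job_function"].map(base)
    + 4_000 * df["seniority"]
    + rng.normal(0, 5_000, size=n)
)

# Naive comparison: raw difference in mean salary between groups.
print(df.groupby("group")["salary"].mean())

# Controlled comparison: regress salary on group while holding seniority and
# job function fixed. The C(group)[T.B] coefficient is the adjusted gap, which
# here shrinks towards zero; the inequality shows up in representation instead.
model = smf.ols("salary ~ C(group) + seniority + C(job_function)", data=df).fit()
print(model.params)
```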
Ethical responsibility and fairness are increasingly being prioritized by businesses and data scientists as awareness of algorithmic bias and discriminatory AI practices grows. Expect to see significant changes in this area. I, for one, will be closely monitoring innovations in technologies like large language models. There is still much work to be done, but with so many stakeholders involved in addressing the harms of AI, and as we continue to move toward planning for responsible AI from the outset, I believe we are definitely on the right track.