Expose AI bias in 60 seconds | Page Carbajal

Hello team, did you know I think we humans are awesome? I am not kidding we are brilliant, adaptable, resourceful and ingineous. Unfortunately, we are also biased and flawed. It is part of the human paradox.

LLMs are programmed and trained by humans. LLMs inherit parts of our brilliance and parts of our flaws. Thinking otherwise is not only naive; as IBM puts it, “AI systems can reflect the biases of the humans who design and train them” according to IBM.

Pretending AI models are perfectly objective is a misunderstanding of how these systems work at their core.

Biases are, in many cases, unreasoned tendencies or opinions.

If you were a kid learning that all thumbs are fingers, but not all fingers are thumbs, you learned categories and subsets. A bias would be mislabeling all fingers as thumbs. Silly? Yes. But that’s how human cognition sometimes behaves: we overgeneralize.

A modern, humorous version: imagine assuming all dogs are named “Buddy” because four of your neighbors made the same choice. This is how your brain takes a few samples and turns them into a rule.

In the world of AI, something similar happens.

Imagine the AI security system at a secret lair misclassifying a poorly printed image of your face as your real face. Congratulations, the villain just walked in with your Kinkos photocopy.

But we do not need to imagine too much. There is a long list of documented biases in computer vision that show how subtle, surprising, and sometimes dangerous these misclassifications can be.

Gender bias: AI generators frequently default to white men when asked to depict “CEOs,” “engineers,” or “leaders.” Studies have shown this pattern across multiple model families.
Racial bias: A well-known incident documented by multiple researchers showed Google Cloud Vision labeling an image of a dark-skinned person holding a thermometer as a “gun”, while a similar image of a light-skinned person was classified correctly as an “electronic device”.
Age bias: Training datasets overrepresent adults and underrepresent children and older adults, leading to poorer accuracy on these groups.

This is not because AI is malicious. It is because AI reflects the humans and systems that build it.

In AI, bias is a pattern that emerges when a model systematically treats certain inputs differently based on sensitive attributes (gender, ethnicity, age, geography, religion, disability, etc.).

As Palo Alto Networks puts it, “AI bias occurs when algorithms produce prejudiced or skewed results due to erroneous assumptions in the machine-learning process.” You’ll find that phrasing in Palo Alto’s breakdown of AI bias.

The important part is this: bias is not random. It is systematic. It is detectable. And it is measurable.

Mildly funny human bias example

Let’s revisit the “all dogs are named Buddy” scenario. This is an example of availability bias: you see a small pattern and believe it is universal.

Your brain takes four data points and builds a rule. This is almost charming when humans do it. It is decidedly less charming when an AI model does the same thing at scale.

Mildly funny computer vision bias example

A classic meme in computer vision research is the “muffin vs chihuahua” mixup. High-performing models have historically misclassified muffins as dogs and dogs as muffins. It is funny—until you imagine a system designed to detect animals in the wild misclassifying wildlife.

These examples are silly. But they warm us up for the more serious ones.

More serious computer vision biases

The real world has provided countless examples where biases in AI lead to misclassifications or harmful outcomes:

Skin tone bias: Darker-skinned individuals are consistently misidentified at higher rates in face recognition systems. This is documented in multiple research efforts and industry audits.
Gendered occupation bias: Models trained on internet-scale data often associate men with leadership or technical roles and women with domestic or subordinate roles.
Cultural misclassification: Clothing, religious symbols, and even regional artifacts can be misinterpreted when the training data overrepresents certain cultures and underrepresents others.

AI models learn from patterns in data. When the data is skewed, the model is skewed.

So where do these biases come from?

The sources of bias in AI

According to the European Data Protection Board’s overview of AI bias evaluation, biases tend to fall into four major categories (see EDPB guidelines):

1. Human bias

Humans label data, choose training sets, write code, and make decisions about what matters. As Seldon’s analysis on bias and fairness notes, “human assumptions inevitably influence the systems they create.”

If annotators come from one cultural background, their labels will reflect that worldview.

2. Societal bias

This is the big one. Models trained on internet-scale data ingest societal patterns. And society, as we know, has inequities.

As Nature writes, “large language models are biased because the data they learn from encodes longstanding societal structures.” You’ll find this described in Nature’s discussion on local AI bias initiatives.

LLMs do not know what is fair or unfair. They only know what is frequent.

3. Algorithmic bias

Some architectures amplify certain correlations. Some tokenize or embed words in ways that distort relationships. Some sampling methods over-select dominant patterns.

As one arXiv paper notes, “bias can emerge from model architecture, optimization dynamics, or training procedures even when datasets are controlled.” This is highlighted in Bias in Large Language Models: Origin, Evaluation, and Mitigation.

Bias can arise even before any real-world data is added.

4. Data bias

This is the most documented form of bias.

If your dataset contains:

More men than women
More faces of lighter skin tones
More Western imagery
More English text
More urban environments

…the model will perform better on those categories.

As Crescendo.ai’s guide puts it, “data imbalance is one of the most common and most impactful sources of AI bias.” Their overview is available at 14 Real AI Bias Examples.

Understanding the sources helps us detect and mitigate bias in practice.

How to detect bias

Most detection methods revolve around one idea:

Change only the sensitive attribute.

Keep everything else constant.

Observe the difference.

This is sometimes called counterfactual testing.

Examples of paired prompts:

“Approve or deny a loan for a 30-year-old man from Madrid…”
“Approve or deny a loan for a 30-year-old woman from Madrid…”

If the output flips when only gender changes, that’s a red flag.

Models like AI Fairness 360 and Fairlearn provide institutional methods for deeper evaluation. Researchers also use toxicity classifiers, sentiment analysis, and stereotype detectors to evaluate open-ended text. These tools are referenced across multiple analyses, including model bias testing and fairness tool comparisons such as Comparing Bias Detection Frameworks for LLMs.

But those tools require expertise.

So what can non-experts do?

This is where the two-model test shines.

How to mitigate bias using the two-model test

The two-model test is a simple, powerful method described across several research summaries. It works because you hold everything constant—except the model.

The idea is explained clearly in papers like Bias Similarity Across Large Language Models, which notes that controlling prompts, parameters, and sensitive attributes allows us to “directly compare systematic disparities.” You can review the methodology in Bias Similarity Across LLMs.

Here’s how to run the test yourself.

Step 1: Define your task and sensitive attributes

Pick a concrete task:

Rate candidate suitability
Approve/deny a loan
Suggest an occupation
Provide a summary
Complete a sentence

Then pick 1–3 sensitive attributes:

Gender
Age
Ethnicity
Religion
Geography

As the testRigor guide explains, defining a task and attributes upfront ensures you “measure bias consistently across comparable cases,” noted in AI Model Bias: How to Detect and Mitigate.

Step 2: Create paired prompts

Example:

“A 30-year-old man from Buenos Aires applies for a loan. Should he be approved?”
“A 30-year-old woman from Buenos Aires applies for a loan. Should she be approved?”

Only change the sensitive attribute. Everything else stays identical.

This technique is aligned with counterfactual testing frameworks such as CrowS-Pairs, highlighted in Bias Detection in LLM Outputs.

Step 3: Send the prompts to

Choose Two different models: Maybe Claude and Gemini

As recommended in fairness research, keep all sampling parameters constant, a best practice described in Bias Similarity Across LLMs.

Step 4: Compare responses

Questions to ask:

Did one gender consistently get lower ratings?
Did one ethnicity get more “negative” descriptions?
Did sentiment shift when only the protected attribute changed?
Did one model show larger disparities than the other?

You don’t need advanced math. You only need to look for patterns.

As Indium’s testing guide puts it, “When two models behave differently across counterfactual pairs, the disparities themselves are evidence of differential bias.” This insight appears in Unmasking Hidden Biases in AI.

Step 5: Pick the model that performs better

Bias is not eliminated—but reduced.

In practice, you will often find that:

One model is more conservative
One produces more stereotyped language
One flips outcomes based on attributes
One keeps results more balanced

The two-model test gives you a fast, accessible way to choose the safer option.

More methods for Bias Mitigation

The research mentions several alternatives:

Toxicity scoring using classifiers
Sentiment difference analysis across groups
Fairness metrics like demographic parity and disparate impact
Bayesian hypothesis tests for statistical confidence
Benchmark suites like CALM or AI Fairness 360

These are valuable but require technical skill. The two-model test does not. That’s why it is perfect for teams learning how to use AI safely.

Yes, biases exist. And yes, you should still use AI.

Biases are real. They’re well documented. You need to learn to live with it because they are not going away any time soon.

And none of this should scare you away from tools like ChatGPT, Gemini or Claude.

Understanding bias is not about avoiding AI. It is about using AI wisely.

The entire point of this article is to give you:

Awareness
Vocabulary
A simple method to test models
Confidence in choosing the right tool

LLMs are here to assist you, not replace your judgment. Bias does not make AI useless—it makes AI something we must understand and supervise.

The good news? Now you know how to expose AI bias in 60 seconds.

If you try the two-model test with your team, I would love to see what patterns you uncover.