AI Bias Types: Glossary

A

  • Anchoring Bias: The model relies too heavily on the initial piece of information it receives, even if that information is irrelevant or misleading.
    • Example: A hiring algorithm might overvalue candidates from certain universities simply because those universities were ranked highly in an outdated ranking system.
  • Algorithm Bias: Bias embedded within the algorithm itself, often due to the choices made during the design and development process.
    • Example: A facial recognition system that uses an algorithm optimized for lighter skin tones might have higher error rates for individuals with darker skin.
  • Automation Bias: Occurs when users over-rely on the output of an AI system, even when it is incorrect or incomplete, because they assume the system is always right.
    • Example: A social media platform’s algorithm flags a user’s post as inappropriate based on keywords, and a human moderator removes it without careful review.
  • Availability Bias: The model favors information that is easily accessible or readily recalled, leading to an overrepresentation of certain patterns.
    • Example: A news aggregator trained on data from popular sources might over-represent mainstream opinions and under-represent minority viewpoints.

C

  • Confirmation Bias: Occurs when the AI model is designed or trained in a way that reinforces pre-existing beliefs or hypotheses while ignoring evidence that contradicts them.
    • Example: A search engine that personalizes results based on a user’s past behavior might only show information that confirms their existing views.

D

  • Data Bias: A general term encompassing various types of bias that arise from the data used to train the AI model. This can include sampling bias, measurement bias, and prejudice bias.
    • Example: Any dataset that does not accurately reflect the real-world population it is meant to represent.

E

  • Exclusion Bias: Occurs when certain groups or data points are excluded from the training data, leading to a model that is not representative of the entire population.
    • Example: A language translation model trained mainly on formal written text might struggle to accurately translate informal spoken language.
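
A minimal sketch of how exclusion surfaces at inference time, using a toy vocabulary in place of a real translation model (all strings below are invented):

```python
# A "model" whose training data excluded informal language cannot cover
# slang at inference time, shown here as an out-of-vocabulary rate.
formal_training_text = "we regret to inform you that the meeting is postponed"
informal_input = "ngl that meeting got pushed lol gonna grab lunch instead"

vocab = set(formal_training_text.split())    # vocabulary seen in training
tokens = informal_input.split()
oov = [t for t in tokens if t not in vocab]  # words the "model" never saw

print(f"out-of-vocabulary rate: {len(oov) / len(tokens):.0%}")  # 80%
print("unseen tokens:", oov)
```

The same effect appears at scale when entire dialects, registers, or demographic groups are missing from a training corpus.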

G

  • Group Attribution Bias: Assigning characteristics to an individual based on their perceived membership in a particular group.
    • Example: An AI system used for security screening might flag individuals from a certain ethnic group as higher risk based solely on their ethnicity.

I

  • Implicit Bias: Unconscious biases that developers or data annotators may introduce into the AI system.
    • Example: A team developing an emotion recognition AI might unknowingly label facial expressions differently based on the perceived gender of the person in the image.

M

  • Measurement Bias: Arises from the way data is collected or measured. Systematic errors in the measurement process can lead to biased data and, consequently, biased models.
    • Example: Using social media sentiment to gauge public opinion on a political issue might be biased because certain demographics are more active on social media.
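
A minimal sketch of this example, with invented population shares, approval rates, and posting rates:

```python
# Hypothetical numbers: each group's population share, true approval rate,
# and rate of posting about the issue on social media.
groups = {
    "18-29": (0.20, 0.70, 0.60),
    "30-49": (0.35, 0.55, 0.30),
    "50+":   (0.45, 0.40, 0.10),
}

# Ground truth: approval weighted by population share.
true_approval = sum(share * approval for share, approval, _ in groups.values())

# What the measurement sees: approval weighted by post volume instead,
# because more active groups contribute more data points.
post_volume = {g: share * rate for g, (share, _, rate) in groups.items()}
total_posts = sum(post_volume.values())
measured = sum(post_volume[g] / total_posts * approval
               for g, (_, approval, _) in groups.items())

print(f"true population approval: {true_approval:.2f}")  # ~0.51
print(f"social-media estimate:    {measured:.2f}")       # ~0.59
```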

O

  • Outlier Bias: Outliers or extreme data points in the training set disproportionately influence the model’s learning process, leading to skewed results.
    • Example: A model predicting house prices might be skewed by a few extremely expensive mansions in the dataset.
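
A minimal sketch with invented prices. Many regression losses (e.g. squared error) behave like the mean, so a few mansions pull the estimate far above a typical home, while the median stays robust:

```python
from statistics import mean, median

# Six typical homes plus two extreme mansions (invented prices).
prices = [250_000, 275_000, 260_000, 240_000, 255_000, 265_000,
          12_000_000, 15_000_000]

print(f"mean price:   ${mean(prices):,.0f}")    # $3,568,125, pulled upward
print(f"median price: ${median(prices):,.0f}")  # $262,500, a typical home
```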

P

  • Prejudice Bias: Reflects existing societal prejudices and stereotypes, leading to discriminatory outcomes.
    • Example: A loan application system might deny loans to people from certain neighborhoods based on historical data reflecting past discriminatory lending practices.

R

  • Reporting Bias: Occurs when certain types of data are more likely to be reported or collected than others, leading to an incomplete or skewed picture of reality.
    • Example: A crime prediction model based on police reports might be biased towards areas with higher police presence.
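
A minimal sketch of this feedback loop, with invented incident and detection rates:

```python
import random

random.seed(0)

# Two areas with the SAME true incident rate, but different police
# presence, so incidents are recorded at very different rates.
TRUE_INCIDENTS_PER_DAY = 5
detection_rate = {"area_A": 0.9, "area_B": 0.2}

reports = {area: 0 for area in detection_rate}
for _ in range(365):
    for area, rate in detection_rate.items():
        for _ in range(TRUE_INCIDENTS_PER_DAY):
            if random.random() < rate:
                reports[area] += 1

# A model trained on `reports` would "learn" that area_A is far more
# dangerous, even though the underlying incident rates are identical.
print(reports)  # roughly {'area_A': 1640, 'area_B': 365}
```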

S

  • Sampling Bias: Occurs when the data used to train an AI model does not accurately represent the real-world population it is intended to model.
    • Example: A self-driving car trained primarily on data from urban environments might not perform well in rural areas.
  • Selection Bias: A type of sampling bias where the selection of data points is not random, leading to a non-representative sample.
    • Example: A medical study that only recruits participants from a specific hospital might not generalize to the wider population.
  • Survivorship Bias: Focusing only on successful examples while ignoring failures, leading to an overly optimistic or skewed perspective.
    • Example: An AI model trained to predict stock market success based only on companies that are currently successful.
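
A minimal sketch of the survivorship example, with invented company returns:

```python
from statistics import mean

# Averaging only over companies that still exist overstates performance.
histories = [
    {"name": "A", "annual_return": 0.12, "still_listed": True},
    {"name": "B", "annual_return": 0.08, "still_listed": True},
    {"name": "C", "annual_return": -0.60, "still_listed": False},  # went bust
    {"name": "D", "annual_return": -0.45, "still_listed": False},  # delisted
    {"name": "E", "annual_return": 0.10, "still_listed": True},
]

survivors_only = mean(h["annual_return"] for h in histories if h["still_listed"])
all_companies = mean(h["annual_return"] for h in histories)

print(f"survivors only: {survivors_only:+.1%}")  # +10.0%
print(f"all companies:  {all_companies:+.1%}")   # -15.0%
```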

T

  • Temporal Bias: Occurs when data collected over a specific time period is not representative of other time periods.
    • Example: A model trained on pre-pandemic data might make inaccurate predictions about consumer behavior in the post-pandemic world.
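
A minimal sketch of the pre/post-pandemic example, with invented visit counts:

```python
from statistics import mean

# A naive "model" that predicts the historical average of in-store visits.
pre_2020_visits  = [100, 105, 98, 102, 101]  # training window
post_2020_visits = [60, 55, 63, 58, 61]      # deployment window

prediction = mean(pre_2020_visits)
error = mean(abs(prediction - v) for v in post_2020_visits)

print(f"predicted daily visits: {prediction:.0f}")          # ~101
print(f"mean absolute error after the shift: {error:.0f}")  # ~42
```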

This glossary, while extensive, is not exhaustive. Bias in AI is an ongoing concern, and researchers are constantly working to identify and address new forms of bias as they emerge.